Tree Encoding of Speech Signals at Low Bit Rates

M.Eng. Thesis, March 1986

Supervisor: P. Kabal

The use of a delayed decision multi-path tree quantizer in a
Code Excited Linear Predictive (CELP) coder is studied. In a CELP coder, the
predictable information is efficiently removed from the input speech signals
using a cascade of two time varying predictor filters. The first exploits
near-sample correlations while the second exploits far-sample correlations
related to the pitch excitation. The resulting prediction residual signals are
low in amplitude and noise-like in appearance. The delayed decision multi-path
tree encoder implemented by the (*M*,*L*)-algorithm realizes waveform coding of the
prediction residual. Knowledge of human speech perception is used to define a
frequency weighted error measure. This criterion is used both inside and outside the
tree quantizer to increase the fidelity of the reconstructed speech signals. The
quantizer, which takes into account the past history, is capable of
approximating the prediction residual using a large set of de-structured
codewords at fractional encoding bit rates (1/4 bit per sample). The vector
quantizer used in the original CELP system studied by Schroeder and Atal is a
special case of the tree quantizer. When controlled by proper combinations of
five relevant parameters, the tree quantizer has a superior performance to the
original vector quantizer in terms of objective performance, subjective
performance, computational complexity, and memory requirement. The effects of
pre-emphasizing the input signals and de-emphasizing the output signals on
overall system performance are also studied. Potential further studies on the
system are proposed.

Pitch Filtering in Adaptive Predictive Coding of Speech

M.Eng. Thesis, March 1986

Supervisor: P. Kabal

This thesis investigates the stability of pitch filters in speech coding. The focus is on adaptive predictive coders that employ pitch predictors.

A new algorithm that estimates the pitch period is coupled with the covariance formulation of determining the pitch predictor coefficients in order to realize a transversal structured filter. Since this approach does not guarantee the stability of the corresponding synthesis filter, a computationally efficient stability test based on a simple but tight sufficient condition is formulated. This is much less computationally demanding than utilizing a set of necessary and sufficient conditions derived from known stability tests. From the sufficient condition, a stabilization technique that ensures a stable pitch filter is introduced. An alternate method of deriving the filter coefficients such that stability is guaranteed at the outset is obtained by applying the Burg algorithm in realizing a lattice structured filter. For a lattice predictor, the pitch period is estimated in a different way than for a transversal predictor. The effect of the presence of unstable pitch filters on decoded speech is also investigated.
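The contrast between the two coefficient derivations can be illustrated with a minimal single-tap sketch. It assumes a known pitch lag and a one-coefficient predictor, and the function names `covariance_tap` and `burg_tap` are hypothetical; it is not the thesis's algorithm, only an illustration of why a covariance solution can yield an unstable synthesis filter while a Burg-style (forward plus backward error) solution is bounded in magnitude by one.

```python
def covariance_tap(x, lag):
    """Single-tap pitch predictor coefficient from the covariance formulation.

    Minimizes sum (x[n] - beta * x[n - lag])**2 over the frame.
    The result is not bounded, so 1 / (1 - beta * z**-lag) may be unstable.
    """
    num = sum(x[n] * x[n - lag] for n in range(lag, len(x)))
    den = sum(x[n - lag] ** 2 for n in range(lag, len(x)))
    return num / den if den > 0.0 else 0.0


def burg_tap(x, lag):
    """Burg-style single-tap coefficient (illustrative assumption).

    Minimizes the sum of forward and backward prediction error energies.
    Since 2*|a*b| <= a*a + b*b, the magnitude never exceeds 1.
    """
    num = 2.0 * sum(x[n] * x[n - lag] for n in range(lag, len(x)))
    den = sum(x[n] ** 2 + x[n - lag] ** 2 for n in range(lag, len(x)))
    return num / den if den > 0.0 else 0.0


# A frame whose amplitude grows by 20% per pitch period (lag of 4 samples).
base = [1.0, -0.5, 0.25, 0.75]
frame = [base[n % 4] * 1.2 ** (n // 4) for n in range(24)]

beta_cov = covariance_tap(frame, 4)   # close to 1.2: synthesis filter unstable
beta_burg = burg_tap(frame, 4)        # magnitude at most 1
```

On a frame with rising energy, such as a voicing onset, the covariance coefficient tracks the growth and exceeds one, which is the situation the stability test and stabilization technique are meant to handle.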

At the analysis stage, the formant and pitch predictors may be placed in either order. Both configurations are compared with regard to the stability and performance of pitch filters. Recommendations for future research are given.

Tap Leakage Applied to Echo Cancellation

M.Eng. Project, November 1985

Supervisor: P. Kabal

This study presents a tap-leakage adjustment algorithm to control the tap drifting problem in an adaptive echo canceller. A nonrecursive transversal filter structure and the stochastic gradient adaptation algorithm are first studied. On the basis of these studies, the effect of tap drift when the input spectrum does not cover the full band is presented. The tap-leakage algorithm, which has been used in fractionally spaced equalizers and in speech coding, is then introduced and applied to an echo canceller. In addition, a least-squares lattice filter is proposed to overcome slow convergence problems due to narrowband inputs. Finally, simulation results for the stochastic gradient, tap-leakage and least-squares lattice algorithms are compared.
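As a rough illustration of the tap-leakage idea, the sketch below uses the common leaky form of the stochastic gradient (LMS) update, in which every tap is shrunk slightly toward zero before the gradient correction is added. The function names, the step size `mu`, the leakage factor `leak`, and the white-noise test input are assumptions chosen for the example, not values taken from the project.

```python
import random


def leaky_lms_step(taps, window, desired, mu, leak):
    """One leaky stochastic-gradient update of a transversal filter.

    taps    : current tap weights
    window  : input samples aligned with the taps (newest first)
    desired : reference sample (here, the echo to be cancelled)
    Returns the updated taps and the residual error before the update.
    """
    output = sum(w * x for w, x in zip(taps, window))
    error = desired - output
    # The (1 - mu * leak) factor is the leakage term that counters tap drift.
    new_taps = [(1.0 - mu * leak) * w + mu * error * x
                for w, x in zip(taps, window)]
    return new_taps, error


def simulate_echo_canceller(echo_path, num_samples=2000, mu=0.05,
                            leak=0.001, seed=0):
    """Adapt to a fixed, noise-free echo path driven by white noise."""
    rng = random.Random(seed)
    taps = [0.0] * len(echo_path)
    window = [0.0] * len(echo_path)
    for _ in range(num_samples):
        window = [rng.uniform(-1.0, 1.0)] + window[:-1]
        echo = sum(h * x for h, x in zip(echo_path, window))
        taps, _ = leaky_lms_step(taps, window, echo, mu, leak)
    return taps


estimated = simulate_echo_canceller([0.5, -0.3])
```

Setting `leak` to zero gives the plain stochastic gradient update. A nonzero leak biases the taps slightly toward zero, which is the price paid for keeping them from drifting when the input does not excite the full band.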

An Adaptive Prefilter for Timing Recovery

M.Eng. Thesis, October 1985

Supervisor: P. Kabal

This thesis presents a new technique for improving the
performance of the timing recovery scheme for baseband synchronous pulse
amplitude modulation (PAM) data signals. This technique uses adaptive
prefiltering to shape the pulses entering the timing path. A review
of the timing recovery problem and of timing jitter in the PAM system is first
presented. Then, a discussion of the properties of the timing wave,
including the effects of the prefilter, is presented. The rest of the thesis
describes the design, implementation, and performance of the adaptive prefilter
with tap spacing of one-quarter of the symbol time interval *T*. An
analysis of an adaptive algorithm for adjusting the tap weights of a
tapped-delay line to minimize the mean square distortion is given. The intention
is to see the effects of the number of taps and the step size on the speed of
convergence. Attention is focused on the convergence of the mean coefficient
vector. The thesis concludes with a computer simulation implementation
that examines the techniques. Comparison with results obtained from some specific
examples shows that the convergence of the mean coefficient vector also leads to
fast convergence of the output mean square error.

Design of Filter Banks for Subband Coding Systems

M.Eng. Thesis, June 1985

Supervisor: P. Kabal

The performance of subband coders relies on the ability of analysis and reconstruction filter banks to provide good isolation between contiguous frequency bands of speech signals. In general, analysis/reconstruction filter banks introduce, to some degree, aliasing, amplitude distortion and phase distortion to the reconstructed signal. These impairments as well as the overall system delay and implementation complexity are the major issues in the design of filter banks for subband coding systems.

This study presents a detailed discussion of the different filter families that deal with the above issues. The discussion includes linear phase quadrature mirror filters (QMF), IIR-QMF filters, Pseudo-QMF filters and nonlinear phase time-reversed QMF filters. Emphasis is given to nonlinear phase time-reversed QMF filters since they can be designed to remove all three types of distortion from the reconstructed signal. These filters are designed using the McClellan-Parks algorithm. Experimental results show that the amplitude distortion introduced by time-reversed QMF filters when implemented with finite precision arithmetic is negligible.

The Stability of Pitch Synthesis Filters in Speech Coding

M.Eng. Thesis, June 1985

Supervisor: P. Kabal

This thesis studies the problem of instability in pitch synthesis filters found in Adaptive Predictive Coding of speech. The performance of such coders is often improved by adding, in the analysis stage, a pitch predictor which removes the redundancy due to the pitch periodicity in the speech signal. The pitch synthesis filter used to restore this periodicity is known to be quite susceptible to instability, causing distortion in the decoded speech. The system function of the synthesis filter has a denominator polynomial of relatively high degree, ranging from 20 to 120 for a signal sampled at 8 kHz. Testing the stability of the filter by solving for the roots of the polynomial is time consuming and impractical for real time applications.

This study establishes a simple criterion for checking stability in a given frame of speech, proposes several stabilization schemes, and examines the effects of stabilizing the filter on the decoded speech. The criterion determines the filter stability by checking the sum of the magnitudes of the predictor coefficients against unity. It introduces a negligible delay and is shown to be a sufficient condition for the stability of the pitch synthesis filter.
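The sum-of-magnitudes check described above can be sketched in a few lines. The test itself follows the abstract; the `scale_to_stable` helper is a hypothetical example of one possible stabilization scheme (uniform scaling of the coefficients) and is not claimed to be one of the schemes proposed in the thesis. The `margin` parameter is likewise an assumption.

```python
def is_pitch_filter_stable(coeffs):
    """Sufficient (not necessary) stability test for a pitch synthesis filter.

    If the predictor coefficient magnitudes sum to less than one, the
    denominator 1 - sum(b_k * z**-(M + k)) cannot vanish on or outside the
    unit circle, so all poles lie inside it. A False result only means the
    test is inconclusive; the filter may still be stable.
    """
    return sum(abs(c) for c in coeffs) < 1.0


def scale_to_stable(coeffs, margin=0.99):
    """Hypothetical stabilization: scale all coefficients by a common factor
    so that their magnitudes sum to `margin` (assumed to be below 1)."""
    total = sum(abs(c) for c in coeffs)
    if total < 1.0:
        return list(coeffs)
    factor = margin / total
    return [c * factor for c in coeffs]


frame_ok = [0.2, 0.5, 0.2]       # magnitudes sum to 0.9
frame_suspect = [0.4, 0.7, 0.3]  # magnitudes sum to 1.4
stabilized = scale_to_stable(frame_suspect)
```

The check costs one pass over the coefficients per frame, in contrast to root finding on a polynomial of degree 20 to 120.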
