Saturday 8 June 2013

Electronics Projects: Speech Coder

Implementation of Speech Coder Using Sub-Bands:

    Subband processing is based on splitting the frequency range into M segments (subbands) that together cover the entire range. Each subband is processed independently, as the specific application requires. After processing, the subbands are recombined to form an output signal whose bandwidth occupies the entire frequency range.
    In subband coding, the speech is first split into frequency bands using a filter bank, that is, a collection of band-pass filters that all process the same input signal. The individual band-pass signals are then decimated and encoded for transmission. The important parameters of a subband coder are the number of frequency bands, the frequency coverage of the system, and the way the subband signals are coded.
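    The split, decimate and recombine steps can be sketched with a two-band filter bank. This is only an illustration: the two-tap Haar filter pair and the function names `analysis` and `synthesis` are assumptions, since the post does not say which filters were used.

```python
import math

S = 1 / math.sqrt(2)  # Haar filter coefficient (assumed filter pair)

def analysis(x):
    """Split an even-length signal into low and high bands, each decimated by 2.

    Each output sample is a two-tap filter output evaluated at every second
    input position, so filtering and decimation happen in one step.
    """
    half = len(x) // 2
    low = [S * (x[2 * n] + x[2 * n + 1]) for n in range(half)]
    high = [S * (x[2 * n] - x[2 * n + 1]) for n in range(half)]
    return low, high

def synthesis(low, high):
    """Recombine the two decimated bands into a full-rate signal."""
    y = []
    for l, h in zip(low, high):
        y.append(S * (l + h))  # even-indexed sample
        y.append(S * (l - h))  # odd-indexed sample
    return y

x = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
low, high = analysis(x)
y = synthesis(low, high)
```

    With no quantization between the two stages, `y` matches `x` up to rounding error, and each band holds half as many samples as the input.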
    There are two kinds of subband structures: uniform band structures, where all the bands have equal widths, and octave band structures, where each band is half as wide as the adjacent higher band and twice as wide as the adjacent lower band.
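    As a rough illustration of the two structures, the band edges can be computed for a 0 to 4000 Hz range split into four bands. The function names are hypothetical, and the octave layout assumes a tree-structured split in which the lowest band extends down to 0 Hz, so the two lowest bands end up equally wide.

```python
def uniform_bands(f_max, m):
    """Return m equal-width (low, high) band edges covering 0..f_max."""
    width = f_max / m
    return [(k * width, (k + 1) * width) for k in range(m)]

def octave_bands(f_max, m):
    """Return m band edges where each band is half as wide as the one above.

    The lowest band is the leftover 0..hi range (assumption), so it has the
    same width as its upper neighbour.
    """
    bands = []
    hi = float(f_max)
    for _ in range(m - 1):
        lo = hi / 2
        bands.append((lo, hi))
        hi = lo
    bands.append((0.0, hi))
    return bands[::-1]  # lowest band first

uniform = uniform_bands(4000, 4)
# [(0.0, 1000.0), (1000.0, 2000.0), (2000.0, 3000.0), (3000.0, 4000.0)]
octave = octave_bands(4000, 4)
# [(0.0, 500.0), (500.0, 1000.0), (1000.0, 2000.0), (2000.0, 4000.0)]
```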
    This project digitizes the speech signal and represents it with a digital bit stream, aiming to produce the highest possible speech quality at the lowest possible bit rate. A speech coder generally consists of three components: speech analysis, parameter quantization and parameter coding. After analysis, the samples must be quantized to reduce the number of bits required. The output of the quantizer is passed to the coder, which assigns a unique binary code to each possible quantized representation. These binary codes are packed together for efficient transmission.
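    The quantize, code and pack chain can be sketched as follows. This is a minimal example assuming a uniform quantizer on samples in the range -1 to 1 and fixed-length binary codes; the names `quantize`, `dequantize` and `pack` are hypothetical and not taken from the project.

```python
import math

def quantize(sample, bits):
    """Map a sample in [-1, 1] to a level index in 0 .. 2**bits - 1."""
    levels = 2 ** bits
    index = math.floor((sample + 1.0) / 2.0 * levels)
    return min(max(index, 0), levels - 1)  # clamp out-of-range samples

def dequantize(index, bits):
    """Return the midpoint of the quantization interval for a level index."""
    levels = 2 ** bits
    return (index + 0.5) / levels * 2.0 - 1.0

def pack(indices, bits):
    """Assign each index a fixed-length binary code and concatenate the codes."""
    return "".join(format(i, "0{}b".format(bits)) for i in indices)

samples = [-1.0, -0.3, 0.0, 0.6]
indices = [quantize(s, 3) for s in samples]   # [0, 2, 4, 6]
bitstream = pack(indices, 3)                  # "000010100110"
decoded = [dequantize(i, 3) for i in indices]
```

    With 3 bits the step size is 0.25, so each decoded value lies within 0.125 of the original sample.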
    Quantizers are generally divided into two groups: uniform or nonuniform quantizers, and adaptive quantizers. A nonuniform quantizer followed by an encoder that assigns a code to each quantization level is called companding pulse code modulation (companding PCM); an adaptive quantizer followed by such an encoder is called adaptive pulse code modulation (APCM).
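    One common way to build a nonuniform quantizer is to compress the sample, quantize it uniformly, and expand it again. The sketch below assumes the mu-law characteristic with mu = 255; the post does not say which companding law the project used, and all function names here are hypothetical.

```python
import math

MU = 255.0  # assumed companding parameter

def compress(x, mu=MU):
    """Mu-law compressor for x in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def expand(y, mu=MU):
    """Inverse of compress."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)

def uniform_round(v, bits):
    """Uniform quantizer on [-1, 1]; returns the interval midpoint."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(max(math.floor((v + 1.0) / step), 0), levels - 1)
    return (index + 0.5) * step - 1.0

def companded_quantize(x, bits):
    """Nonuniform quantizer: compress, quantize uniformly, then expand."""
    return expand(uniform_round(compress(x), bits))

quiet = 0.01  # a low-amplitude sample
uniform_error = abs(uniform_round(quiet, 8) - quiet)
companded_error = abs(companded_quantize(quiet, 8) - quiet)
```

    For this low-amplitude sample the companded quantizer gives the smaller error, because the compressor spends more of the quantization levels on small amplitudes.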
    First, we built the subband coder by creating its analysis and synthesis sections. This step does not necessarily add distortion to the voice, and the output is equal to the source except for some loss of the higher frequencies. We recorded our voice using Sound Recorder at a sampling rate of 8 kHz. This means the highest frequency component is 4 kHz, which corresponds to a normalized frequency of 1 in MATLAB. Taking the FFT of the entire signal is computationally very intensive, so a better choice is to take a finite number of samples, say 36000, for computing the FFT. We implemented the code by two methods:
a) Without noble identities: In this method we passed the input signal (speech) through a low-pass and a high-pass filter in the first stage. After that we decimated the two signals (lower band and upper band). This has the disadvantage of computing samples that are finally thrown away. However, the end result was good, with no aliasing in the final signal.
b) Using noble identities: In this method we decimated the signals first and then passed them through a low-pass and a high-pass filter in the first stage. We continued with this approach until we had the four bands in the analysis section. However, the end result was aliasing in the final signal.
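
    The two orderings can be compared on a small example. The sketch below assumes a two-tap averaging filter and hypothetical helpers `fir`, `decimate` and `upsample_taps`; it is not the project's code. The noble identity for decimation states that filtering with H(z^2) and then decimating by 2 gives the same result as decimating by 2 and then filtering with H(z). Moving the decimator in front of the same filter H(z) is a different operation, which may help explain the aliasing seen in method b.

```python
def fir(x, h):
    """Causal FIR filter; the output has the same length as x."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def decimate(x, m):
    """Keep every m-th sample, starting with the first."""
    return x[::m]

def upsample_taps(h, m):
    """Insert m - 1 zeros between filter taps: H(z) becomes H(z**m)."""
    out = []
    for tap in h[:-1]:
        out.append(tap)
        out.extend([0.0] * (m - 1))
    out.append(h[-1])
    return out

x = [1.0, 4.0, -2.0, 3.0, 0.5, -1.0, 2.0, 5.0]
h = [0.5, 0.5]  # assumed two-tap averaging filter

# Noble identity: these two paths are equivalent.
filter_then_decimate = decimate(fir(x, upsample_taps(h, 2)), 2)
decimate_then_filter = fir(decimate(x, 2), h)

# Method a: the same filter H(z) applied before decimation.
method_a = decimate(fir(x, h), 2)
```

    Here `filter_then_decimate` and `decimate_then_filter` are both [0.5, -0.5, -0.75, 1.25], while `method_a` gives different values. The decimate-first path also computes only half as many filter outputs, which is the saving the identity is meant to provide.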
