1.1 Speech Signals
Human speech in its pristine form is an acoustic signal. For the purpose of communication and storage, it is necessary to convert it into an electrical signal. This is accomplished with the help of instruments called 'transducers'. This electrical representation of speech has certain properties:
1. It is a one-dimensional signal, with time as its independent variable.
2. It is random in nature.
3. It is non-stationary, i.e. the frequency spectrum is not constant in time.
4. Although human beings have an audible frequency range of 20 Hz to 20 kHz, human speech has significant frequency components only up to 4 kHz, a property that is exploited in the compression of speech.
1.1.1 Digital representation of speech
With the advent of digital computing machines, it was proposed to exploit their power for the processing of speech signals. This required a digital representation of speech. To achieve this, the analog signal is sampled at some frequency and then quantized to discrete levels. Thus, the parameters of digital speech are
1. Sampling rate
2. Bits per sample
3. Number of channels.
Sound files can be stored and played on digital computers. Various formats have been proposed by different manufacturers, for example '.wav' and '.au', to name a few. In this thesis, the '.wav' format is used extensively due to the convenience of recording it with the 'Sound Recorder' software shipped with the Windows OS.
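The sampling, quantization, and '.wav' storage steps described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in this thesis: the 440 Hz test tone, the 8 kHz sampling rate (twice the 4 kHz speech bandwidth, following the Nyquist criterion), the 16-bit sample width, and the helper names synthesize_tone, write_wav, and read_wav_parameters are all assumptions chosen for the example. It uses only the standard-library wave and struct modules.

```python
import io
import math
import struct
import wave


def synthesize_tone(freq_hz=440.0, duration_s=0.5, sample_rate=8000, bits=16):
    """Sample a sine wave and quantize each sample to a signed integer level.

    All default values are illustrative assumptions, not values from the thesis.
    """
    peak = 2 ** (bits - 1) - 1  # largest level of a signed `bits`-bit sample
    n_samples = int(duration_s * sample_rate)
    return [
        round(peak * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(n_samples)
    ]


def write_wav(target, samples, sample_rate=8000, channels=1):
    """Store signed 16-bit samples as PCM data in a .wav container."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        # "<h" = little-endian signed 16-bit, the layout used for 16-bit PCM
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))


def read_wav_parameters(source):
    """Read back the parameters of digital speech from a .wav file."""
    with wave.open(source, "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "bits_per_sample": 8 * wav.getsampwidth(),
            "channels": wav.getnchannels(),
            "frames": wav.getnframes(),
        }


# An in-memory buffer stands in for a file on disk.
buffer = io.BytesIO()
write_wav(buffer, synthesize_tone())
buffer.seek(0)
params = read_wav_parameters(buffer)
print(params)
```

A file path such as "speech.wav" (a hypothetical name) can be passed in place of the in-memory buffer. The uncompressed bit rate follows from the parameters as sample_rate * bits_per_sample * channels, which gives 128 kbit/s for the values assumed here.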
1.2 Compression - An Overview
In recent years, large-scale information transfer by remote computing and the development of massive storage and retrieval systems have grown tremendously. To cope with the growth in the size of databases, additional storage devices need to be installed, and modems and multiplexers have to be continuously upgraded to permit large amounts of data transfer between computers and remote terminals. This leads to an increase in