How it works
LPC starts with the assumption that a speech signal is produced by a buzzer at the end of a tube. The glottis (the space between the vocal cords) produces the buzz, which is characterized by its intensity (loudness) and frequency (pitch). The vocal tract (the throat and mouth) forms the tube, which is characterized by its resonances, which are called formants. LPC analyzes the speech signal by estimating the formants, removing their effects from the speech signal, and estimating the intensity and frequency of the remaining buzz. The process of removing the formants is called inverse filtering, and the remaining signal is called the residue. The numbers which describe the formants and the residue can be stored or transmitted somewhere else. LPC synthesizes the speech signal by reversing the process: use the residue to create a source signal, use the formants to create a filter (which represents the tube), and run the source through the filter, resulting in speech. Because speech signals vary with time, this process is done on short chunks of the speech signal, which are called frames; generally 30 to 50 frames per second give intelligible speech with good compression.
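The analysis and synthesis steps described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a reference implementation: it assumes the autocorrelation method with the Levinson-Durbin recursion for estimating the filter (one common choice among several), and the function names, the frame length of 240 samples, the predictor order of 8, and the synthetic test frame are all hypothetical choices made for the example.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Estimate A(z) = 1 + a1 z^-1 + ... + ap z^-p for one frame.

    Assumes the autocorrelation method solved by Levinson-Durbin.
    Returns the coefficient array (a[0] == 1) and the prediction error energy.
    """
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this stage
        k = -np.dot(a[:i], r[i:0:-1]) / err
        # Update the predictor from order i-1 to order i
        a[:i + 1] = a[:i + 1] + k * a[:i + 1][::-1]
        err *= 1.0 - k * k
    return a, err

def inverse_filter(frame, a):
    """Remove the formant structure: residue[n] = sum_j a[j] * frame[n - j]."""
    return np.convolve(frame, a)[:len(frame)]

def synthesize(residue, a):
    """Run a source signal through the all-pole filter 1 / A(z)."""
    order = len(a) - 1
    out = np.zeros(len(residue))
    for n in range(len(residue)):
        acc = residue[n]
        for j in range(1, min(order, n) + 1):
            acc -= a[j] * out[n - j]
        out[n] = acc
    return out

# Hypothetical test frame: two sinusoids plus a little noise, Hamming-windowed.
rng = np.random.default_rng(0)
t = np.arange(240)
frame = (np.sin(2 * np.pi * 0.05 * t)
         + 0.5 * np.sin(2 * np.pi * 0.12 * t)
         + 0.05 * rng.standard_normal(240))
windowed = frame * np.hamming(240)

a, err = lpc_coefficients(windowed, order=8)
residue = inverse_filter(windowed, a)   # analysis: what is left after removing the resonances
rebuilt = synthesize(residue, a)        # synthesis: source through the filter
```

Here the exact residue is fed back into the filter, so `rebuilt` reproduces the windowed frame. A real coder would instead describe the residue by a few numbers (such as the intensity and pitch mentioned above) and regenerate an approximate source from them, which is where the compression comes from.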
LPC is frequently used for transmitting spectral envelope information, and as such it has to be tolerant of transmission errors. Transmitting the filter coefficients directly (see linear prediction for the definition of the coefficients) is undesirable, since they are very sensitive to errors: a very small error can distort the whole spectrum or, worse, make the prediction filter unstable.
More advanced representations include log area ratios (LAR), line spectrum pairs (LSP) decomposition and reflection coefficients. Of these, LSP decomposition in particular has gained popularity, since it ensures stability of the predictor, and spectral errors are local for small coefficient deviations.
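To make the stability argument concrete, the following sketch converts between reflection coefficients, direct predictor coefficients and log area ratios. It assumes the convention A(z) = 1 + a1 z^-1 + ... + ap z^-p and the LAR definition log((1 + k) / (1 - k)); sign conventions differ between texts, and the function names and example values are hypothetical. LSP decomposition is not covered here.

```python
import numpy as np

def reflection_to_predictor(k):
    """Step-up recursion: reflection coefficients -> predictor polynomial."""
    a = np.array([1.0])
    for ki in k:
        a = np.append(a, 0.0)
        a = a + ki * a[::-1]
    return a

def predictor_to_reflection(a):
    """Step-down recursion: predictor polynomial (a[0] == 1) -> reflection coefficients."""
    a = np.asarray(a, dtype=float)
    k = []
    while len(a) > 1:
        ki = a[-1]
        k.append(ki)
        a = (a - ki * a[::-1])[:-1] / (1.0 - ki * ki)
    return np.array(k[::-1])

def reflection_to_lar(k):
    """Log area ratios under the assumed convention log((1 + k) / (1 - k))."""
    k = np.asarray(k, dtype=float)
    return np.log((1.0 + k) / (1.0 - k))

def lar_to_reflection(lar):
    """Inverse of reflection_to_lar; the result always lies strictly in (-1, 1)."""
    return np.tanh(np.asarray(lar, dtype=float) / 2.0)

def is_stable(a):
    """The synthesis filter 1 / A(z) is stable if all poles lie inside the unit circle."""
    return bool(np.all(np.abs(np.roots(a)) < 1.0))

# Hypothetical reflection coefficients, all with magnitude below 1.
k = np.array([0.5, -0.3, 0.8])
a = reflection_to_predictor(k)

# Coarsely quantize the LARs (step of 0.5, chosen arbitrarily) and convert back.
lar_q = np.round(reflection_to_lar(k) / 0.5) * 0.5
a_q = reflection_to_predictor(lar_to_reflection(lar_q))
```

Because any finite LAR maps back to a reflection coefficient of magnitude below 1, the filter rebuilt from the quantized values stays stable however coarse the quantization is. Quantizing the entries of `a` directly offers no such guarantee.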
LPC is generally used for speech resynthesis. It is used as a form of voice compression by phone companies (e.g. by GSM telephones), and electronic music composers have used it to impressive effect in their compositions via its cross-synthesis ability.
See also: Audio compression