Title :
Compression of acoustic features for speech recognition in network environments
Author :
Ramaswamy, G.N. ; Gopalakrishnan, Ponani S.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
In this paper, we describe a new compression algorithm for encoding acoustic features used in typical speech recognition systems. The proposed algorithm uses a combination of simple techniques, such as linear prediction and multi-stage vector quantization, and the current version of the algorithm encodes the acoustic features at a fixed rate of 4.0 kbit/s. The compression algorithm can be used very effectively for speech recognition in network environments, such as those employing a client-server model, or to reduce storage in general speech recognition applications. The algorithm has also been tuned for practical implementations, so that the computational complexity and memory requirements are modest. We have successfully tested the compression algorithm against many test sets from several different languages, and the algorithm performed very well, with no significant change in the recognition accuracy due to compression.
Keywords :
computational complexity; linear predictive coding; speech coding; speech recognition; vector quantisation; 4.0 kbit/s; acoustic features compression; client-server model; compression algorithm; linear prediction; memory requirements; multi-stage vector quantization; network environments; recognition accuracy; Acoustic testing; Bandwidth; Compression algorithms; Computer networks; Intelligent networks; Network servers; Performance evaluation; Robustness; Vector quantization
Conference_Title :
Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing
Conference_Location :
Seattle, WA
Print_ISBN :
0-7803-4428-6
DOI :
10.1109/ICASSP.1998.675430