DocumentCode :
454710
Title :
Profile Based Compression of N-Gram Language Models
Author :
Olsen, Jesper ; Oria, Daniela
Author_Institution :
Lab. of Multimedia Technol., Nokia Res. Center, Helsinki
Volume :
1
fYear :
2006
fDate :
14-19 May 2006
Abstract :
A profile-based technique for compression of n-gram language models is presented. The technique is intended to be used in combination with existing size-reduction techniques for n-gram language models such as pruning, quantisation and word-class modelling. It is evaluated here on a large-vocabulary embedded dictation task. When combined with quantisation, the technique can reduce the memory needed for storing probabilities by a factor of 10 or more with only a small degradation in word accuracy. The structure of the language model is well suited to "best-first" decoding styles and is used here to guide an isolated-word recogniser by predicting likely continuations at word boundaries.
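The abstract names quantisation as one of the size-reduction techniques the profile-based method is combined with. As a minimal illustration of that step, the sketch below uniformly quantises a toy set of n-gram log-probabilities into a small codebook, so each probability is stored as a short index rather than a full float. The codebook construction, bit width, and bigram values are all invented for illustration and are not the paper's method.

```python
# Hypothetical sketch: quantising n-gram log-probabilities into a small
# codebook, one common size-reduction step mentioned in the abstract.

def build_codebook(values, bits=4):
    """Uniform quantiser: 2**bits evenly spaced representative levels."""
    lo, hi = min(values), max(values)
    n = 2 ** bits
    step = (hi - lo) / (n - 1) if hi > lo else 1.0
    codebook = [lo + i * step for i in range(n)]
    return codebook, step, lo

def quantise(value, step, lo, n):
    """Map a float to the index of its nearest codebook level."""
    idx = round((value - lo) / step)
    return max(0, min(n - 1, idx))

# Toy bigram log10 probabilities (invented values).
logprobs = [-0.31, -1.25, -2.8, -0.9, -3.4, -1.7]
codebook, step, lo = build_codebook(logprobs, bits=4)
indices = [quantise(v, step, lo, len(codebook)) for v in logprobs]
recon = [codebook[i] for i in indices]

# Each probability is now a 4-bit index instead of a 32-bit float,
# at the cost of a bounded reconstruction error of at most step/2.
max_err = max(abs(a - b) for a, b in zip(logprobs, recon))
```

With 4-bit indices the per-probability storage drops by a factor of 8 relative to 32-bit floats; the paper's reported factor of 10 or more comes from combining quantisation with the profile-based structure itself.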
Keywords :
data compression; decoding; natural languages; speech coding; speech recognition; best-first type decoding; large vocabulary embedded dictation task; n-gram language models; profile based compression; pruning; quantisation; word class modelling; word recogniser; Decoding; Degradation; Predictive models; Quantization; Vocabulary; dictation; embedded systems; n-gram language models;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location :
Toulouse
ISSN :
1520-6149
Print_ISBN :
1-4244-0469-X
Type :
conf
DOI :
10.1109/ICASSP.2006.1660202
Filename :
1660202