DocumentCode :
1064989
Title :
An extended clustering algorithm for statistical language models
Author :
Ueberla, J.P.
Author_Institution :
DRA Malvern
Volume :
4
Issue :
4
fYear :
1996
fDate :
7/1/1996
Firstpage :
313
Lastpage :
316
Abstract :
An existing clustering algorithm is extended to deal with higher order N-grams and a faster heuristic version is developed. Even though results are not comparable to back-off trigram models, they outperform back-off bigram models when many million words of training data are not available.
Keywords :
grammars; natural languages; speech processing; statistical analysis; back-off bigram models; extended clustering algorithm; heuristic algorithm; higher order N-grams; statistical language models; training data; Clustering algorithms; Convergence; Probability distribution; Standards publication; Training data; Vocabulary;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.506936
Filename :
506936