DocumentCode :
353717
Title :
Enhanced language modelling with phonologically constrained morphological analysis
Author :
Fang, A.C. ; Huckvale, M.
Author_Institution :
Dept. of Phonetics & Linguistics, Univ. Coll. London, UK
Volume :
3
fYear :
2000
fDate :
2000
Firstpage :
1711
Abstract :
Phonologically constrained morphological analysis (PCMA) is the decomposition of words into their component morphemes conditioned by both orthography and pronunciation. The article describes PCMA and its application in large-vocabulary continuous speech recognition to enhance recognition performance in some tasks. Our experiments, based on the British National Corpus and the LOB Corpus for training data and WSJCAM0 for test data, show clearly that PCMA leads to smaller lexicon size, smaller language models, superior word lattices and a decrease in word error rates. PCMA seems to show most benefit in open-vocabulary tasks, where the productivity of a morph unit lexicon makes a substantial reduction in out-of-vocabulary rates.
Keywords :
linguistics; modelling; speech recognition; word processing; British National Corpus; LOB Corpus; PCMA; WSJCAM0; component morphemes; enhanced language modelling; language models; large-vocabulary continuous speech recognition; lexicon size; morph unit lexicon productivity; open-vocabulary tasks; orthography; out-of-vocabulary rates; phonologically constrained morphological analysis; pronunciation; recognition performance; test data; word decomposition; word error rates; word lattices; Educational institutions; Error analysis; Hidden Markov models; Lattices; Speech recognition; Statistical analysis; Statistics; Testing; Training data; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
ISSN :
1520-6149
Print_ISBN :
0-7803-6293-4
Type :
conf
DOI :
10.1109/ICASSP.2000.862081
Filename :
862081