DocumentCode :
2436126
Title :
Boosting of Speech Recognition Performance by Language Model Adaptation
Author :
Korkmazsky, Filipp ; Jojic, Oliver ; Shevade, Bageshree
Author_Institution :
Comcast Inc., Washington
fYear :
2007
fDate :
3-10 March 2007
Firstpage :
1
Lastpage :
10
Abstract :
This paper presents a novel approach to language model adaptation for speech recognition. We define mutual information histograms which account for different semantic and syntactic relations between words in text data. We introduce a novel word distance measure which is based on mutual information histograms. By using this measure we were able to create linguistically meaningful word clusters composed of words obtained in first-pass speech recognition. Words included in the clusters were used to adapt language models. Adapted language models were used for a second pass of speech recognition. We conducted experiments on the Fisher speech corpus of telephone conversations. Mutual information histograms for word pairs were estimated from the Fisher data as well as from data extracted from a corpus of New York Times articles. Results showed that word clusters conveyed significant information and could be helpful in improving speech recognition accuracy.
Keywords :
feature extraction; speech recognition; Fisher speech corpus; New York Times articles; data extraction; first-pass speech recognition; language model adaptation; mutual information histograms; semantic-syntactic relations; telephone conversations; word clusters; Adaptation model; Boosting; Clustering algorithms; Histograms; Information resources; Lattices; Mutual information; Natural languages; Speech recognition; Strontium
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Aerospace Conference, 2007 IEEE
Conference_Location :
Big Sky, MT
ISSN :
1095-323X
Print_ISBN :
1-4244-0524-6
Electronic_ISBN :
1095-323X
Type :
conf
DOI :
10.1109/AERO.2007.352980
Filename :
4161420