Title :
Unsupervised acoustic and language model training with small amounts of labelled data
Author :
Novotney, Scott ; Schwartz, Richard ; Ma, Jeff
Author_Institution :
BBN Technologies, Cambridge, MA
Abstract :
We measure the effect of a weak language model, estimated from as little as 100k words of text, on unsupervised acoustic model training, and then explore the best method of using word confidences to estimate n-gram counts for unsupervised language model training. Even with only 100k words of text and 10 hours of training data, unsupervised acoustic model training remains robust, recovering 50% of the gain of supervised training. For language model training, multiplying the word confidences within each n-gram to form a weighted count performs best, reducing WER by 2% over the baseline language model and by 0.5% absolute over using unweighted transcripts. Oracle experiments show that a larger gain is possible, but better confidence estimation techniques are needed to identify correct n-grams.
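Code_Sketch :
The confidence-weighting scheme the abstract reports as best, multiplying the word confidences within each n-gram to form a fractional count, can be sketched as below. This is a minimal illustration under stated assumptions: decoded hypotheses are represented as (words, confidences) pairs, and the function name weighted_ngram_counts and the toy data are hypothetical, not taken from the paper.

from collections import defaultdict

def weighted_ngram_counts(hypotheses, n=3):
    """Accumulate n-gram counts from ASR hypotheses, weighting each
    n-gram by the product of its word confidences (the weighting the
    abstract describes). Illustrative sketch, not the authors' code."""
    counts = defaultdict(float)
    for words, confidences in hypotheses:
        for i in range(len(words) - n + 1):
            ngram = tuple(words[i:i + n])
            # Product of the confidences of the words in this n-gram.
            weight = 1.0
            for c in confidences[i:i + n]:
                weight *= c
            counts[ngram] += weight
    return counts

# Toy usage: two decoded utterances with per-word confidences.
hyps = [
    (["i", "think", "so"], [0.9, 0.7, 0.95]),
    (["i", "think", "not"], [0.9, 0.8, 0.4]),
]
for ngram, count in weighted_ngram_counts(hyps, n=2).items():
    print(ngram, round(count, 3))

A low-confidence word thus discounts every n-gram it appears in, so unreliable regions of the automatic transcripts contribute little to the language model counts.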
Keywords :
acoustic signal processing; estimation theory; natural language processing; speech processing; language model training; n-gram count; unsupervised acoustic model training; word confidence estimation; Acoustic measurements; Decoding; Labeling; Natural languages; Robustness; Speech recognition; Telephony; Terminology; Training data; Vocabulary; Conversational Telephone Speech; Language Modeling; Unsupervised Training; Word Confidence;
Conference_Title :
2009 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009)
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-2353-8
Electronic_ISSN :
1520-6149
DOI :
10.1109/ICASSP.2009.4960579