Title :
Improving Pronunciation Inference using N-Best List, Acoustics and Orthography
Author :
Anumanchipalli, Gopala K. ; Ravishankar, Mosur ; Reddy, Raj
Author_Institution :
Language Technol. Res. Center, International Inst. of Inf. Technol., Hyderabad
Abstract :
In this paper, we tackle the problem of pronunciation inference and out-of-vocabulary (OOV) enrollment in automatic speech recognition (ASR) applications. We combine linguistic and acoustic information about the OOV word, using its spelling and a single spoken instance, to derive an appropriate phonetic baseform. The novelty of the approach lies in its orthography-driven generation of n-best pronunciation hypotheses and its strategy for rescoring these alternatives. We use decision trees and heuristic tree search to construct and score the n-best hypothesis space. We then rescore the hypotheses and refine the baseforms using acoustic alignment likelihood, which captures the empirical evidence, and phone transition cost, which captures phonotactic priors.
Keywords :
linguistics; speech recognition; vocabulary; acoustic alignment likelihood; acoustic information; acoustics orthography; automatic speech recognition; decision trees; heuristic tree search; linguistic information; n-best list; orthography-driven n-best hypothesis; out-of-vocabulary; phone transition cost; pronunciation inference; rescoring strategy; Acoustic applications; Acoustic signal detection; Acoustic transducers; Automatic speech recognition; Costs; Decision trees; Decoding; Inference algorithms; Viterbi algorithm; Vocabulary; Out-of-Vocabulary; automatic pronunciation learning; letter-to-sound rules; n-best list; pronunciation modeling;
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0727-3
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2007.367222