Title :
Probabilistic Pronunciation Variation Model Based on Bayesian Network for Conversational Speech Recognition
Author :
Sakti, Sakriani ; Markov, Konstantin ; Nakamura, Satoshi
Author_Institution :
National Institute of Information and Communications Technology
Abstract :
This paper reports on an ongoing study of pronunciation variation modeling for conversational speech recognition, in which the mapping from canonical pronunciations (baseforms) to the actually realized phonemes (surface forms) is modeled by a Bayesian network. The advantage of this graphical model framework is that the probabilistic relationships between baseforms, surface forms, and any additional knowledge sources can be learned in a unified manner, so knowledge sources from different domains can be incorporated easily. In this preliminary study, we investigate the dependency of surface forms on the current, preceding, and succeeding baseform phonemes, the position of the current baseform phoneme in the word, and whether or not the preceding surface phoneme was deleted. The performance of the proposed method was evaluated using spontaneous telephone conversations from a portion of the Switchboard corpus. Experimental results show that this method provides a consistent improvement in word accuracy over the standard pronunciation dictionary.
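The dependency described in the abstract can be illustrated with a minimal sketch, assuming aligned baseform/surface phoneme pairs are available. The abstract does not give the network structure or training procedure, so the sketch below simply estimates one full conditional table P(surface | preceding baseform, current baseform, succeeding baseform, word position, preceding-surface-deleted) by relative frequency. The function name `estimate_cpt`, the context tuple layout, the "-" deletion marker, and the toy data are all hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def estimate_cpt(aligned):
    """Estimate P(surface | context) by relative frequency.

    `aligned` is an iterable of (context, surface) pairs, where context is
    the tuple (prev_base, base, next_base, position, prev_deleted).
    This layout is an assumption made for illustration only.
    """
    counts = defaultdict(Counter)
    for context, surface in aligned:
        counts[context][surface] += 1

    cpt = {}
    for context, surface_counts in counts.items():
        total = sum(surface_counts.values())
        # Normalize the counts for this context into a distribution.
        cpt[context] = {s: n / total for s, n in surface_counts.items()}
    return cpt


# Toy aligned data, invented for illustration (not from Switchboard).
# "-" marks a deleted surface phoneme.
toy = [
    (("n", "t", "er", "medial", False), "t"),
    (("n", "t", "er", "medial", False), "-"),
    (("n", "t", "er", "medial", False), "-"),
    (("n", "t", "er", "medial", False), "dx"),
    (("s", "t", "aa", "initial", False), "t"),
]
cpt = estimate_cpt(toy)
```

A full table like this becomes sparse as more knowledge sources are added, so a practical model would likely need smoothing or a factored structure; that choice is outside what the abstract specifies.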
Keywords :
belief networks; probability; speech recognition; Bayesian network; Switchboard corpus; baseform phonemes; conversational speech recognition; knowledge sources; probabilistic pronunciation variation model; spontaneous telephone conversations; surface phoneme; Automatic control; Automatic speech recognition; Bayesian methods; Communications technology; Control systems; Dictionaries; Graphical models; Natural languages; Telephony; Pronunciation modeling; incorporating knowledge sources
Conference_Title :
Universal Communication, 2008. ISUC '08. Second International Symposium on
Conference_Location :
Osaka
Print_ISBN :
978-0-7695-3433-6
DOI :
10.1109/ISUC.2008.33