Title :
Syllable: A self-contained unit to model pronunciation variation
Author :
Ng, Raymond W M ; Hirose, Keikichi
Author_Institution :
Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
Abstract :
In this paper, we demonstrate the potential of incorporating syllable-level information in acoustic modeling. The syllable is not rigorously defined as a unit, which complicates its use. In this study, we derive syllable structures from the sonorant-band intensity profile of the speech signal. We analyze the error statistics of a phone-based context-dependent speech recognizer and find that phone errors occur mainly inside syllables rather than at syllable boundaries. Pronunciation variation can thus be regarded as the replacement of phonetic elements within the time span of a solitary syllable. We apply simple rules to model this pronunciation variation: a lexical modeling approach modifies the bi-phone transcriptions in the dictionary, which leads to a significant increase in phone correctness. The results point to a more intuitive and direct approach to modeling pronunciation variation within the scope of syllables.
Keywords :
error statistics; speech processing; acoustic modeling; bi-phone transcription; lexical modeling; phone correctness; phone errors; phone-based context-dependent speech recognizer; phonetic elements; pronunciation variation phenomenon; self-contained unit; solitary syllable; sonorant-band intensity profile; speech signal; syllable boundaries; syllable-level information; Accuracy; Acoustics; Databases; Dictionaries; Hidden Markov models; Speech; Speech recognition; Syllable; pronunciation variation
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288909