Title :
Pronunciation ambiguity vs. pronunciation variability in speech recognition
Author :
Saraclar, Murat ; Khudanpur, Sanjeev
Author_Institution :
Centre for Language & Speech Processing, Johns Hopkins Univ., Baltimore, MD, USA
Abstract :
It is widely acknowledged that pronunciations in spontaneous speech differ significantly from citation form. For this reason, pronunciation modeling has received considerable attention in recent automatic speech recognition literature. Most of the attention, however, has focused on describing an alternate pronunciation as a different sequence of phonetic units, using the same inventory of phones that describe canonical pronunciations. Analysis of manual phonetic transcription of conversational speech reveals a large number (>20%) of cases of genuine ambiguity: instances where human labelers disagree on the identity of the surface form. The authors investigate and characterize the acoustic evidence in the context of this ambiguity. They show that when a pronunciation change occurs, it is often the case that neither the canonical nor the alternate phone represents the acoustics very well. Based on this analysis, two methods for accommodating pronunciation ambiguity are developed. The first method attempts to resolve the ambiguity by separately modeling each baseform/surface-form pair. The second method treats the surface form as a hidden variable and “averages out” the ambiguity.
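Illustration (not from the paper): the second method's idea of treating the surface form s as a hidden variable can be read as marginalizing it out of the acoustic likelihood, p(x | b) = sum over s of P(s | b) p(x | b, s), where b is the baseform phone. The minimal sketch below assumes this reading. The names `surface_prior`, `pair_models`, and `log_marginal_likelihood` are hypothetical, the numbers are toy values, and single 1-D Gaussians merely stand in for whatever acoustic models the authors actually use. Keying the models by (baseform, surface form) pairs loosely mirrors the first method as well.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_marginal_likelihood(x, baseform, surface_prior, pair_models):
    """Return log p(x | b) = log sum_s P(s | b) * p(x | b, s).

    surface_prior: dict baseform -> {surface form: P(s | b)}  (hypothetical)
    pair_models:   dict (baseform, surface form) -> (mean, var)  (hypothetical)
    """
    terms = [
        math.log(p_s) + log_gaussian(x, *pair_models[(baseform, s)])
        for s, p_s in surface_prior[baseform].items()
        if p_s > 0.0
    ]
    # Log-sum-exp for numerical stability.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


# Toy setup: baseform /t/ is realized as [t] or [d] (assumed values).
surface_prior = {"t": {"t": 0.6, "d": 0.4}}
pair_models = {("t", "t"): (0.0, 1.0), ("t", "d"): (2.0, 1.0)}

score = log_marginal_likelihood(1.0, "t", surface_prior, pair_models)
```

With a single surface form of probability 1, the sum collapses to the ordinary likelihood of that one pair model, so the marginalized score reduces to a conventional single-pronunciation score.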
Keywords :
acoustic signal processing; linguistics; speech recognition; acoustic evidence; alternate phone; alternate pronunciation; ambiguity; automatic speech recognition; baseform/surface-form pair; canonical pronunciations; citation form; conversational speech; hidden variable; human labelers; manual phonetic transcription; phonetic units; pronunciation ambiguity; pronunciation change; pronunciation modeling; pronunciation variability; spontaneous speech; surface form; Acoustics; Automatic speech recognition; Context modeling; Humans; Natural languages; Predictive models; Speech analysis; Speech processing; Speech recognition; Surface treatment;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.862073