Title :
Improving speech recognition by explicit modeling of phone deletions
Author :
Ko, Tom ; Mak, Brian
Author_Institution :
Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
Abstract :
Greenberg reported in 1998 that, in conversational speech, the phone deletion rate may be as high as 12%, whereas the syllable deletion rate is about 1%. This finding prompted a new research direction of syllable modeling for speech recognition, but the syllable approach has not yet fulfilled its promise. On the other hand, there have been few attempts to model phone deletions explicitly in current ASR systems. In this paper, fragmented word models were derived from well-trained cross-word triphone models, and phone deletion was implemented by skip arcs for words consisting of at least four phonemes. An evaluation on the CSR-II WSJ1 Hub2 5K task shows that, even with this limited implementation of phone deletions in read speech, we obtained a word error rate reduction of 6.73%.
Keywords :
speech processing; speech recognition; conversational speech; phone deletions; syllable deletion; well-trained cross-word triphone models; Automatic speech recognition; Computer science; Context modeling; Councils; Degradation; Error analysis; Explosions; Paper technology; Speech analysis; acoustic modeling; fragmented word model; skip arc; syllable
Conference_Title :
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495131