Title :
Vowel Onset Point Detection for Low Bit Rate Coded Speech
Author :
Vuppala, Anil Kumar ; Yadav, Jainath ; Chakrabarti, Saswat ; Rao, K. Sreenivasa
Author_Institution :
G.S. Sanyal Sch. of Telecommun., Indian Inst. of Technol., Kharagpur, India
Abstract :
In this paper, we propose a method for detecting vowel onset points (VOPs) in low bit rate coded speech. A VOP is the instant at which the onset of a vowel takes place in the speech signal. VOPs play an important role in applications such as consonant-vowel (CV) unit recognition and speech rate modification. The proposed VOP detection method is based on the spectral energy present in the glottal closure region of the speech signal. The speech coders considered in this study are Global System for Mobile Communications (GSM) full rate, code-excited linear prediction (CELP), and mixed-excitation linear prediction (MELP). The TIMIT database and CV units collected from the broadcast news corpus are used for evaluation. The performance of the proposed method is compared with that of an existing method, which uses a combination of evidence from the excitation source, spectral peaks energy, and modulation spectrum. The proposed VOP detection method shows a significant improvement in performance over the existing method for both clean and coded speech. The effectiveness of the proposed VOP detection method is also analyzed in CV recognition, using the VOP as an anchor point.
Keywords :
cellular radio; database management systems; speech coding; vocoders; CELP; CV recognition; GSM; Global System for Mobile Communications full rate; MELP; TIMIT database; VOP detection method; anchor point; broadcast news corpus; code-excited linear prediction; consonant-vowel unit recognition; excitation source; low bit rate coded speech; mixed-excitation linear prediction; modulation spectrum; spectral peaks energy; speech coders; speech rate modification; speech signal; vowel onset point detection; Cavity resonators; Lungs; Modulation; Speech; Speech coding; Speech enhancement; Speech recognition; Glottal closure region; spectral energy; speech coders; vowel onset point (VOP);
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2012.2191284