Title :
Significance of speech enhancement and sonorant regions of speech for robust language identification
Author :
Kumar Vuppala, Anil ; Mounika, K.V. ; Vydana, Hari Krishna
Author_Institution :
Speech & Vision Lab., Int. Inst. of Inf. Technol. - Hyderabad, Hyderabad, India
Abstract :
A high degree of robustness is a prerequisite for operating speech and language processing systems in practical environments. The performance of these systems is strongly influenced by varying and mixed background environments. In this paper, we put forward a robust method for automatic language identification (LID) in various background environments. A combined temporal and spectral processing method is used as a preprocessing technique for enhancing the degraded speech. Language-discriminative information in high-sonority regions of speech is used for the task of language identification. Sonority regions are regions of speech whose signal energy is high, and these regions are less influenced by background environments. The spectral energy of formants in the glottal closure regions is employed as an acoustic correlate for the detection of sonority regions of speech. The performance of the LID system is studied in various background environments such as clean room, car factory, high frequency, pink and white noise environments. The Indian Institute of Technology Kharagpur - Multi Lingual Indian Language Speech Corpus (IITKGP-MLILSC) is used for building the language identification system. Noise samples from the NOISEX database are employed in the present study. The performance of the proposed method is quite satisfactory compared to existing approaches.
Keywords :
natural language processing; speech enhancement; speech recognition; Indian Institute of Technology Kharagpur; LID system; Multi Lingual Indian Language Speech Corpus; NOISEX database; automatic language identification; language discriminative information; language processing system; robust language identification; sonorant regions; spectral processing method; temporal processing method; Noise measurement; Robustness; Signal to noise ratio; Speech; Testing; combined temporal and spectral processing; formant frequencies; glottal closure region; sonority regions; various background environments;
Conference_Title :
Signal Processing, Informatics, Communication and Energy Systems (SPICES), 2015 IEEE International Conference on
Conference_Location :
Kozhikode
DOI :
10.1109/SPICES.2015.7091438