DocumentCode :
258690
Title :
Towards improving the performance of language identification system for Indian languages
Author :
Anto, Abitha ; Sreekumar, K.T. ; Kumar, C. Santhosh ; Raj, P. C. Reghu
Author_Institution :
Dept. of Comput. Sci. & Eng., Gov. Eng. Coll., Palakkad, India
fYear :
2014
fDate :
17-18 Dec. 2014
Firstpage :
42
Lastpage :
46
Abstract :
In this paper, we present the details of a phonotactic language identification (LID) system developed for five Indian languages: English (Indian), Hindi, Malayalam, Tamil and Kannada. Since there are no publicly available speech databases for English, Malayalam and Kannada, we developed the database for each of the target languages by downloading audio files from YouTube videos and manually removing the non-speech signals. The system was tested using a test data set consisting of 40 utterances with durations of 30, 10, and 3 secs in each of the 5 target languages. The performance evaluation was done according to the NIST benchmarking sessions, separately for 30s, 10s and 3s segments. For the baseline system, we got an overall EER of 10.41 %, 19.56 % and 31.45 % for 30, 10, and 3 secs segments when tested with a 3-gram language model. The use of a 4-gram language model has helped enhance the performance of the LID system to 9.81 %, 19.38 % and 32.77 %, respectively, for 30, 10, and 3 secs test segments. Further, by using n-gram smoothing, we were able to improve the EER of the LID system to 9.02 %, 18.70 % and 29.24 % for 3-gram language models and 8.88 %, 16.46 % and 32.03 % for 4-gram language models, respectively, for 30, 10, and 3 sec test segments. The study shows that the use of 4-gram language models can help enhance the performance of LID systems for Indian languages.
Keywords :
natural language processing; speech processing; 3-gram language model; 4-gram language model; English language; Hindi language; Indian languages; Kannada language; LID system; Malayalam language; NIST benchmarking; Tamil language; YouTube videos; audio file downloading; baseline system; language identification system; n-gram smoothing; nonspeech signal removal; overall EER improvement; performance improvement; phonotactic language identification system; target languages; test data set; test segments; utterance duration; Acoustics; Computational modeling; Data models; Databases; Smoothing methods; Speech; Speech recognition; Language Model; Phone Recognition followed by Language Modeling (PRLM); Phonotactic features; n-gram;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Systems and Communications (ICCSC), 2014 First International Conference on
Conference_Location :
Trivandrum
Print_ISBN :
978-1-4799-6012-5
Type :
conf
DOI :
10.1109/COMPSC.2014.7032618
Filename :
7032618