DocumentCode :
1845915
Title :
Unsupervised clustering of syllables for language identification
Author :
Dey, Subhadeep ; Murthy, Hema
Author_Institution :
Dept. of Comput. Sci. & Eng., IIT Madras, Chennai, India
fYear :
2012
fDate :
27-31 Aug. 2012
Firstpage :
325
Lastpage :
329
Abstract :
Automatic language recognition makes extensive use of phonotactics for identifying a language. The accuracy of phonotactic information depends upon the amount of data available for training. State-of-the-art approaches capture the phonotactics in terms of cross-lingual GMM tokens. The accuracy of such tokenisers depends crucially upon the availability of specific corpora. In this paper, we suggest an alternative to GMM tokens, namely, syllable-based tokens. Syllables implicitly capture the phonotactics across phonemes in a language. Unsupervised syllable tokenisation for language identification requires (a) segmentation of speech into syllable-like units, and (b) unsupervised modeling of the syllable tokens by Hidden Markov Models. The first issue is addressed by segmenting the waveform into syllable-like units using a well-established group delay based segmentation algorithm. To address the second issue, two different solutions are proposed, namely, (i) a top-down clustering approach, which is robust and does not require significant parameter tuning, and (ii) a universal syllable approach. In the latter, the syllable models for every language are obtained by adapting universal syllable models. Experimental results on the OGI 1992 multilingual corpus and the NIST 2003 LRE corpus show that the proposed approaches do not require significant tuning of parameters and that their performance is comparable to that of a well-tuned baseline syllable tokenisation system.
Keywords :
hidden Markov models; natural language processing; pattern clustering; speech recognition; unsupervised learning; NIST 2003 LRE corpus; OGI 1992 multilingual corpus; automatic language recognition; cross-lingual GMM tokens; group delay based segmentation algorithm; hidden Markov models; language identification; phonotactic information; syllable based tokens; top down clustering approach; universal syllable approach; unsupervised clustering; unsupervised syllable tokenisation; Adaptation models; Clustering algorithms; Databases; Hidden Markov models; NIST; Speech; Training; syllable segmentation; top down syllable clustering; universal syllable models; unsupervised clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing Conference (EUSIPCO), 2012 Proceedings of the 20th European
Conference_Location :
Bucharest
ISSN :
2219-5491
Print_ISBN :
978-1-4673-1068-0
Type :
conf
Filename :
6333800