DocumentCode
2109086
Title
Approaches to Language Identification Using Gaussian Mixture Model and Linear Discriminant Analysis
Author
Zeng, Xiuhua ; Yang, Jian ; Xu, Dan
Author_Institution
Sch. of Inf. Sci. & Eng., Yunnan Univ., Kunming
fYear
2008
fDate
21-22 Dec. 2008
Firstpage
1109
Lastpage
1112
Abstract
The baseline PRLM system achieves the best performance on NIST language recognition evaluation tasks. However, it requires orthographically or phonetically transcribed utterances, which cannot easily be obtained for Chinese dialects and minority languages, so PRLM cannot be applied to these languages. To overcome this limitation, we present a Gaussian mixture model recognizer followed by language-dependent language models (GMM-LM) as an approach to language identification. In this paper, we focus on finding the optimum number of frames for training each GMM parameter and on comparing two back-end processing approaches in the GMM-LM system. The experiments show that the LDA back-end processing approach achieves an average accuracy of 78%, a 45% relative improvement over the simple approach on 30 s test data.
Keywords
Gaussian processes; natural language processing; Chinese dialects; Gaussian mixture model; NIST language recognition evaluation tasks; back-end processing approach; language identification; language-dependent language model; linear discriminant analysis; minority languages; Feature extraction; Information retrieval; Information science; Information security; Information technology; Linear discriminant analysis; NIST; National security; Natural languages; Testing; GMM-LM; LDA; language identification;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Information Technology Application Workshops, 2008. IITAW '08. International Symposium on
Conference_Location
Shanghai
Print_ISBN
978-0-7695-3505-0
Type
conf
DOI
10.1109/IITA.Workshops.2008.212
Filename
4732132