Title :
Large Vocabulary Continuous Speech Recognition in Uyghur: Data Preparation and Experimental Results
Author :
Tursun, Nasirjan ; Silamu, Wushour
Abstract :
Uyghur is an agglutinative language and one of the least studied languages in the area of speech recognition. In this work, we present our research on Uyghur large vocabulary continuous speech recognition based on hidden Markov models (HMMs). The paper introduces the process of data collection (text corpus and speech corpus), the selection of units for speech recognition, and the creation of acoustic and language models for Uyghur. It also presents experimental results of Uyghur continuous speech recognition with different recognition units.
Keywords :
hidden Markov models; speech recognition; Uyghur language; acoustic model; data collection; data preparation; hidden Markov model; language model; speech corpus; text corpus; vocabulary continuous speech recognition; Character recognition; Data engineering; Databases; Educational institutions; Hidden Markov models; Natural languages; Speech processing; Speech recognition; Speech synthesis; Vocabulary;
Conference_Title :
Chinese Spoken Language Processing, 2008. ISCSLP '08. 6th International Symposium on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-2942-4
Electronic_ISBN :
978-1-4244-2943-1
DOI :
10.1109/CHINSL.2008.ECP.61