Title :
Effect of Gaussian densities and amount of training data on grapheme-based acoustic modeling for Arabic
Author :
Elmahdy, Mohamed ; Gruhn, Rainer ; Minker, Wolfgang ; Abdennadher, Slim
Author_Institution :
Fac. of Eng. & Comput. Sci., Univ. of Ulm, Ulm, Germany
Abstract :
Grapheme-based acoustic modeling for Arabic is a demanding research area because highly accurate phonetic transcription remains an unsolved problem. In this paper, we study a pure grapheme-based approach that uses Gaussian mixture models to implicitly model missing diacritics, and we investigate the effect of the number of Gaussian densities and the amount of training data on speech recognition accuracy. Two transcription systems were built: a phoneme-based system and a grapheme-based system. Several acoustic models were created with each system by varying the number of Gaussian densities and the amount of training data. Results show that, as the number of Gaussian densities or the amount of training data increases, accuracy improves faster with the grapheme-based approach than with the phoneme-based approach. Hence, the accuracy gap between the two approaches can be compensated for by increasing either the number of Gaussian densities or the amount of training data.
Keywords :
Gaussian processes; acoustic signal processing; learning (artificial intelligence); natural languages; speech processing; speech recognition; Arabic language; Gaussian density; grapheme-based acoustic modeling; machine learning; missing diacritics modeling; phoneme-based system; phonetic transcription; Acoustical engineering; Automatic speech recognition; Books; Computer science; Context modeling; Data engineering; Productivity; Training data; Acoustic modeling; Graphemic modeling
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2009. NLP-KE 2009. International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4244-4538-7
Electronic_ISBN :
978-1-4244-4540-0
DOI :
10.1109/NLPKE.2009.5313727