Title :
Grapheme Gaussian model and prosodic syllable based Tamil speech recognition system
Author :
Ganesh, Akila A. ; Ravichandran, C.
Author_Institution :
Dept. of Comput. Sci., D.J. Acad. for Manage. Excellence, Coimbatore, India
Abstract :
Automatic speech recognition (ASR) is an active field of research that identifies speech patterns and produces the equivalent text. The challenges in ASR start with choosing the appropriate recognition unit for the language being dealt with. Candidate units include the word, phoneme, triphone, syllable, demisyllable, senone and morpheme. This paper uses a syllable-based speech model, which reduces the vocabulary size and is also easier to align. It also suits Tamil, which is a syllable-based language. In the model described in this paper, connected-word inputs are segmented into individual words using short-term energy, and the isolated words are further broken down into characters using the Varied-Length Maximum Likelihood (VLML) algorithm. A Gaussian Mixture Model (GMM), a speaker-independent model suitable for large sets of data, is used to classify the characters for later pattern matching against the trained syllables. This paper introduces the VLML algorithm, which identifies the boundary of each character, and explains the proposed speech recognition process for the Tamil language.
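The word-segmentation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, frame length, hop size, energy threshold and minimum-silence rule below are all assumptions chosen for illustration, and the VLML character-boundary step is not reproduced because the abstract does not specify it.

```python
from typing import List, Tuple


def short_term_energy(samples: List[float], frame_len: int, hop: int) -> List[float]:
    """Return the sum of squared amplitudes for each analysis frame."""
    energies = []
    # Always produce at least one frame, even for very short inputs.
    for start in range(0, max(len(samples) - frame_len + 1, 1), hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def segment_words(
    samples: List[float],
    frame_len: int = 4,           # assumed frame length in samples
    hop: int = 4,                 # assumed hop size in samples
    threshold: float = 0.1,       # assumed energy threshold
    min_silence_frames: int = 2,  # assumed pause length that ends a word
) -> List[Tuple[int, int]]:
    """Return (start_sample, end_sample) spans of high-energy regions.

    A span is closed only after `min_silence_frames` consecutive
    low-energy frames, so brief dips inside a word do not split it.
    """
    energies = short_term_energy(samples, frame_len, hop)
    segments = []
    start_frame = None
    silence_run = 0
    for i, energy in enumerate(energies):
        if energy > threshold:
            if start_frame is None:
                start_frame = i
            silence_run = 0
        elif start_frame is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                last_voiced = i - silence_run
                segments.append((start_frame * hop, last_voiced * hop + frame_len))
                start_frame = None
                silence_run = 0
    if start_frame is not None:
        # The signal ended while a word was still open.
        last_voiced = len(energies) - 1 - silence_run
        end = min(last_voiced * hop + frame_len, len(samples))
        segments.append((start_frame * hop, end))
    return segments


# Toy signal: silence, a loud burst, a pause, a quieter burst, silence.
signal = [0.0] * 8 + [1.0] * 8 + [0.0] * 12 + [0.5] * 8 + [0.0] * 8
words = segment_words(signal)
```

On real speech the frame length would typically be tens of milliseconds of samples, and the threshold would be tuned to the recording conditions. Each returned span would then be passed on to the character-boundary and GMM classification stages described in the abstract.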
Keywords :
Gaussian processes; Unified Modeling Language; mixture models; natural language processing; pattern matching; speech recognition; GMM; Gaussian mixture model; Tamil language; VLML algorithm; automatic speech recognition; demisyllable unit; grapheme Gaussian model; isolated words; morpheme unit; phoneme unit; prosodic syllable based Tamil speech recognition system; senone unit; short term energy; speaker-independent model; speech patterns; syllable based language; syllable based speech model; triphone unit; varied-length maximum likelihood algorithm; vocabulary size; word unit; Classification algorithms; Feature extraction; Hidden Markov models; Speech; Speech recognition; Formants; Gaussian Mixture Model (GMM); Hidden Markov Toolkit (HTK); Prosodic Syllable; Varied Length Maximum Likelihood (VLML)
Conference_Title :
Signal Processing and Communication (ICSC), 2013 International Conference on
Conference_Location :
Noida
Print_ISBN :
978-1-4799-1605-4
DOI :
10.1109/ICSPCom.2013.6719821