Grapheme Gaussian model and prosodic syllable based Tamil speech recognition system

Author

Ganesh, Akila A. ; Ravichandran, C.

Author_Institution

Dept. of Comput. Sci., D.J. Acad. for Manage. Excellence, Coimbatore, India

fYear

2013

fDate

12-14 Dec. 2013

Firstpage

401

Lastpage

406

Abstract

Automatic speech recognition (ASR) is an active field of research that identifies speech patterns and produces the equivalent text. The challenges in ASR begin with choosing an appropriate recognition unit for the language being dealt with. Candidate units include the word, phoneme, triphone, syllable, demisyllable, senone and morpheme. This work uses a syllable-based speech model, which reduces the vocabulary size and is easier to align. It also suits Tamil, which is a syllable-based language. In the model described in this paper, connected-word input is segmented into individual words using short-term energy, and the isolated words are further broken down into characters using the Varied-Length Maximum Likelihood (VLML) algorithm. A Gaussian Mixture Model (GMM), a speaker-independent model suitable for large sets of data, classifies the characters for later pattern matching against the trained syllables. The paper introduces the VLML algorithm, which identifies the boundary of each character, and explains the proposed speech recognition process for the Tamil language.
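As a rough illustration of the word-segmentation step mentioned in the abstract, the sketch below splits a signal into high-energy regions using short-term energy. It is not the authors' implementation: the function names, the frame length, the hop size, the relative threshold (`ratio`) and the silence length (`min_gap`) are all assumptions chosen for the example, and the input is a synthetic two-burst tone rather than real Tamil speech.

```python
import math


def short_term_energy(samples, frame_len, hop):
    """Return the sum of squared samples for each analysis frame."""
    energies = []
    last_start = max(len(samples) - frame_len + 1, 1)
    for start in range(0, last_start, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def segment_words(samples, frame_len=200, hop=100, ratio=0.1, min_gap=2):
    """Return (start_sample, end_sample) pairs for high-energy regions.

    ratio   : threshold as a fraction of the peak frame energy (assumed value)
    min_gap : consecutive low-energy frames that close a region (assumed value)
    """
    if not samples:
        return []
    energies = short_term_energy(samples, frame_len, hop)
    peak = max(energies)
    if peak == 0:
        return []  # pure silence: nothing to segment
    threshold = ratio * peak

    segments = []
    start = None   # index of the first frame of the current region
    low_run = 0    # number of low-energy frames seen since the last loud frame
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            low_run = 0
        elif start is not None:
            low_run += 1
            if low_run >= min_gap:
                last_loud = i - low_run
                segments.append((start * hop, last_loud * hop + frame_len))
                start, low_run = None, 0
    if start is not None:
        # Signal ended while still inside a region.
        last_loud = len(energies) - 1 - low_run
        end = min(last_loud * hop + frame_len, len(samples))
        segments.append((start * hop, end))
    return segments


# Synthetic input: two 440 Hz bursts separated by silence (8 kHz sampling assumed).
silence = [0.0] * 1000
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(2000)]
signal = silence + tone + silence + tone + silence

segments = segment_words(signal)
print(segments)
```

A fixed fraction of the peak energy is the simplest possible threshold; real recordings would likely need a noise-floor estimate and some smoothing, and the character-level VLML step described in the abstract is not covered here.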

Keywords

Gaussian processes; Unified Modeling Language; mixture models; natural language processing; pattern matching; speech recognition; GMM; Gaussian mixture model; Tamil language; VLML algorithm; automatic speech recognition; demisyllable unit; grapheme Gaussian model; isolated words; morpheme unit; pattern matching; phoneme unit; prosodic syllable based Tamil speech recognition system; senone unit; short term energy; speaker-independent model; speech patterns; syllable based language; syllable based speech model; triphone unit; varied-length maximum likelihood algorithm; vocabulary size; word unit; Classification algorithms; Feature extraction; Gaussian mixture model; Hidden Markov models; Speech; Speech recognition; Formants; Gaussian Mixture Model (GMM); Hidden Markov Toolkit (HTK); Prosodic Syllable; Varied Length Maximum Likelihood (VLML);

fLanguage

English

Publisher

ieee

Conference_Titel

Signal Processing and Communication (ICSC), 2013 International Conference on

Conference_Location

Noida

Print_ISBN

978-1-4799-1605-4

Type

conf

DOI

10.1109/ICSPCom.2013.6719821

Filename

6719821