  • DocumentCode
    677147
  • Title
    Grapheme Gaussian model and prosodic syllable based Tamil speech recognition system
  • Author
    Ganesh, Akila A.; Ravichandran, C.

  • Author_Institution
    Dept. of Comput. Sci., D.J. Acad. for Manage. Excellence, Coimbatore, India
  • fYear
    2013
  • fDate
    12-14 Dec. 2013
  • Firstpage
    401
  • Lastpage
    406
  • Abstract
    Automatic speech recognition (ASR) is an active field of research that identifies speech patterns in order to produce the equivalent text. The challenges in ASR begin with choosing the appropriate unit for the language being handled. Candidate units include the word, phoneme, triphone, syllable, demisyllable, senone and morpheme. A syllable-based speech model is used, which reduces the vocabulary size and is also easier to align. It also suits Tamil, which is a syllable-based language. In the model described in this paper, connected-word inputs are segmented into individual words using short-term energy, and the isolated words are further broken down into characters using the Varied-Length Maximum Likelihood (VLML) algorithm. A Gaussian Mixture Model (GMM), a speaker-independent model suitable for large data sets, is used to classify the characters for later pattern matching against the trained syllables. This paper introduces the VLML algorithm, which identifies the boundary of each character, and explains the proposed speech recognition process for the Tamil language.
  • Keywords
    Gaussian processes; Unified Modeling Language; mixture models; natural language processing; pattern matching; speech recognition; GMM; Gaussian mixture model; Tamil language; VLML algorithm; automatic speech recognition; demisyllable unit; grapheme Gaussian model; isolated words; morpheme unit; pattern matching; phoneme unit; prosodic syllable based Tamil speech recognition system; senone unit; short term energy; speaker-independent model; speech patterns; syllable based language; syllable based speech model; triphone unit; varied-length maximum likelihood algorithm; vocabulary size; word unit; Classification algorithms; Feature extraction; Gaussian mixture model; Hidden Markov models; Speech; Speech recognition; Formants; Gaussian Mixture Model (GMM); Hidden Markov Toolkit (HTK); Prosodic Syllable; Varied Length Maximum Likelihood (VLML);
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing and Communication (ICSC), 2013 International Conference on
  • Conference_Location
    Noida
  • Print_ISBN
    978-1-4799-1605-4
  • Type
    conf
  • DOI
    10.1109/ICSPCom.2013.6719821
  • Filename
    6719821
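
The abstract's first processing step, splitting connected-word input into individual words using short-term energy, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the frame length, the hop size, the energy threshold, and the minimum-silence rule are all hypothetical choices, and the VLML character segmentation and GMM classification stages are not covered because the record gives no details of them.

```python
from typing import List, Tuple


def short_term_energy(samples: List[float], frame_len: int, hop: int) -> List[float]:
    """Return the sum of squared samples for each (possibly overlapping) frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def segment_words(
    samples: List[float],
    frame_len: int = 4,            # hypothetical frame length, in samples
    hop: int = 4,                  # hypothetical hop size, in samples
    threshold: float = 0.1,        # hypothetical energy threshold
    min_silence_frames: int = 2,   # hypothetical pause length that ends a word
) -> List[Tuple[int, int]]:
    """Return (start_sample, end_sample) spans whose frame energy exceeds the threshold.

    A word is closed once min_silence_frames consecutive low-energy frames follow it.
    """
    energies = short_term_energy(samples, frame_len, hop)
    segments = []
    start_frame = None
    silence_run = 0
    for i, energy in enumerate(energies):
        if energy > threshold:
            if start_frame is None:
                start_frame = i          # first voiced frame of a new word
            silence_run = 0
        elif start_frame is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                end_frame = i - silence_run  # last voiced frame before the pause
                segments.append((start_frame * hop, end_frame * hop + frame_len))
                start_frame = None
                silence_run = 0
    if start_frame is not None:
        # Input ended while a word was still open.
        end_frame = len(energies) - 1 - silence_run
        segments.append((start_frame * hop, end_frame * hop + frame_len))
    return segments


# Toy signal: silence, a loud "word", a pause, a quieter "word", silence.
signal = [0.0] * 8 + [1.0] * 8 + [0.0] * 12 + [0.5] * 8 + [0.0] * 8
segments = segment_words(signal)
print(segments)  # [(8, 16), (28, 36)]
```

In practice the threshold would have to be tuned to the recording conditions, and real speech would use frames of tens of milliseconds rather than the four-sample frames used in this toy signal.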