• DocumentCode
    2661364
  • Title
    Acoustic modeling in mandarin speech recognition of minority accent in Yunnan
  • Author
    Peishan, Wu ; Jian, Yang
  • Author_Institution
    Sch. of Inf. Sci. & Technol., Yunnan Univ., Kunming
  • fYear
    2008
  • fDate
    16-18 July 2008
  • Firstpage
    526
  • Lastpage
    530
  • Abstract
    Dialectal and non-native accents pose a challenge to the development and deployment of Mandarin speech recognition systems. This paper describes an integrated approach that combines a rule-based data-driven (DD) method with experts' knowledge to build acoustic models for automatic speech recognition (ASR). The aim is to derive regular pronunciation-variation pairs statistically. On this basis, we can construct a preliminary scheme for a Mandarin multi-pronunciation dictionary for minority accents in Yunnan. The combined method consists of the following steps. First, baseline hidden Markov models (HMMs) were trained on the Project 863 standard Mandarin corpus. Second, non-native speech data from the Dai, Lisu, and Naxi areas in Yunnan was transcribed with the baseline HMMs. The transcription results were then aligned with the reference transcriptions through dynamic programming. After computing the confusion matrix, we analyze the error pairs arising from substitution errors at the level of base syllables, initials, and finals. Next, we examine the regular Mandarin pronunciation variation of the national language in Yunnan. Our experiments revealed many interesting and useful linguistic phenomena that are necessary for advancing non-native Mandarin speech recognition technology.
  • Keywords
    dynamic programming; hidden Markov models; matrix algebra; natural languages; speech recognition; Dai area; Lisu area; Mandarin speech recognition; Naxi area; Yunnan; acoustic modeling; automatic speech recognition; confusion matrix; minority accent; rule-based data-driven method; Automatic speech recognition; Dictionaries; Dynamic programming; Error analysis; Hidden Markov models; Loudspeakers; Matrices; Natural languages; Speech recognition; Statistics; Multi-pronunciation Dictionary; National Language in Yunnan; Pronunciation Variation; Speech Recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Control Conference, 2008. CCC 2008. 27th Chinese
  • Conference_Location
    Kunming
  • Print_ISBN
    978-7-900719-70-6
  • Electronic_ISBN
    978-7-900719-70-6
  • Type
    conf
  • DOI
    10.1109/CHICC.2008.4605230
  • Filename
    4605230
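
The alignment and confusion-counting steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a plain Levenshtein (unit-cost) dynamic-programming alignment, the function names `align` and `substitution_counts` are hypothetical, and the toy initial/final sequences are made up for demonstration rather than taken from the paper's data.

```python
from collections import Counter


def align(ref, hyp):
    """Align two unit sequences with unit-cost edit distance (assumed cost model).

    Returns a list of (ref_unit, hyp_unit) pairs; None marks a gap
    (a deletion when hyp_unit is None, an insertion when ref_unit is None).
    """
    n, m = len(ref), len(hyp)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((ref[i - 1], None))
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def substitution_counts(utterances):
    """Count (reference, recognized) substitution pairs over (ref, hyp) utterances."""
    counts = Counter()
    for ref, hyp in utterances:
        for r, h in align(ref, hyp):
            if r is not None and h is not None and r != h:
                counts[(r, h)] += 1
    return counts


if __name__ == "__main__":
    # Toy data (invented): reference vs. recognized initials/finals.
    toy = [(["zh", "i", "sh", "i"], ["z", "i", "s", "i"])]
    for pair, count in substitution_counts(toy).items():
        print(pair, count)
```

The off-diagonal counts collected this way correspond to the substitution entries of a confusion matrix; frequently recurring pairs would be the candidates for entries in a multi-pronunciation dictionary.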