  • DocumentCode
    3425258
  • Title
    Improving letter-to-sound conversion performance with automatically generated new words
  • Author
    You, Jia-Li ; Chen, Yi-Ning ; Soong, Frank K. ; Wang, Jin-Lin
  • Author_Institution
    Inst. of Acoust., Chinese Acad. of Sci., Beijing
  • fYear
    2008
  • fDate
    March 31 2008-April 4 2008
  • Firstpage
    4653
  • Lastpage
    4656
  • Abstract
    We propose a novel way to alleviate the data sparseness problem in training letter-to-sound (LTS) N-gram models by adding automatically generated new words to the training set. The proposed method consists of two procedures: (1) generating a large pool of new words automatically; (2) selecting good new word candidates from the pool via semi-supervised learning. The new words are created by replacing stressed syllables of an existing word with other stressed syllables under specified contextual constraints. New words are selected by semi-supervised learning based on consistent pronunciation predictions by different LTS models. After adding the new words to the training set, the performance of LTS conversion is significantly improved. For the NetTalk dictionary, a 21.6% relative word error rate reduction is obtained compared with the N-gram baseline model. For the CMU dictionary, 9.1% and 5.6% relative word error rate reductions are obtained with and without considering stress, respectively.
  • Keywords
    learning (artificial intelligence); natural language processing; speech processing; N-gram baseline model; NetTalk dictionary; consistent pronunciation predictions; data sparseness problem; letter-to-sound N-gram models; letter-to-sound conversion performance; new word pool; new word selection; semi-supervised learning; Acoustics; Asia; Decision trees; Dictionaries; Error analysis; Hidden Markov models; Predictive models; Semisupervised learning; Speech; Stress; Letter-to-Sound; artificial data; data sparseness; semi-supervised learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-1483-3
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2008.4518694
  • Filename
    4518694
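
The selection step described in the abstract (keeping new words whose pronunciation is predicted consistently by different LTS models) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `LTSModel` interface, the function name `select_consistent_words`, the unanimous-agreement rule, and the toy words and phoneme strings are all assumptions made for the example; the abstract does not specify how agreement is measured or which models are used.

```python
from typing import Callable, Dict, Iterable, Sequence, Tuple

# Hypothetical types: a pronunciation is a tuple of phoneme symbols, and an
# LTS model is any callable mapping a spelling to a pronunciation.
Pronunciation = Tuple[str, ...]
LTSModel = Callable[[str], Pronunciation]


def select_consistent_words(
    candidates: Iterable[str],
    models: Sequence[LTSModel],
) -> Dict[str, Pronunciation]:
    """Keep candidates for which all models predict the same pronunciation.

    Assumption: "consistent" is taken to mean unanimous agreement.
    """
    selected: Dict[str, Pronunciation] = {}
    for word in candidates:
        # Collect the distinct pronunciations predicted across models.
        predictions = {model(word) for model in models}
        if len(predictions) == 1:
            selected[word] = predictions.pop()
    return selected


# Toy stand-ins for two trained LTS models (illustrative data only).
table_a = {
    "banata": ("B", "AH", "N", "AA", "T", "AH"),
    "bonita": ("B", "OW", "N", "IY", "T", "AH"),
}
table_b = {
    "banata": ("B", "AH", "N", "AA", "T", "AH"),
    "bonita": ("B", "AH", "N", "IY", "T", "AH"),
}
toy_models = [table_a.__getitem__, table_b.__getitem__]

# Only "banata" is kept, because the two toy models disagree on "bonita".
selected = select_consistent_words(["banata", "bonita"], toy_models)
```

In a setup like this, the selected words and their agreed pronunciations would be appended to the training set before retraining the N-gram model.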