• DocumentCode
    2423100
  • Title
    Automatic language identification using support vector machines and phonetic N-gram
  • Author
    Deng, Yan ; Liu, Jia
  • Author_Institution
    Dept. of Electron. Eng., Tsinghua Univ., Beijing
  • fYear
    2008
  • fDate
    7-9 July 2008
  • Firstpage
    71
  • Lastpage
    74
  • Abstract
    In this paper, we describe two approaches to language identification (LID) using support vector machines (SVMs) and phonetic n-grams. The first uses the language model scores of phone sequences for SVM training. The second uses the n-gram probabilities of those phones to train SVM models. For the second approach, we propose a new, effective normalization method. In experiments with 30 s tests on 5 languages, the new normalization method yields a relative reduction of 15.8% in equal error rate (EER) compared with the traditional one. With it, the system using the second approach reaches an EER of 2.4%, a relative reduction of about 35.5% compared with the first approach. Implementation details and experimental results are presented in this paper.
  • Keywords
    learning (artificial intelligence); speech processing; speech recognition; support vector machines; SVM training; automatic language identification; equal error rate; n-gram probability; phone sequence; phonetic n-gram; support vector machine; Error analysis; Fuses; Information science; Laboratories; Natural languages; Pattern recognition; Speech; Support vector machine classification; Support vector machines; Testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Audio, Language and Image Processing, 2008. ICALIP 2008. International Conference on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-1723-0
  • Electronic_ISBN
    978-1-4244-1724-7
  • Type
    conf
  • DOI
    10.1109/ICALIP.2008.4590023
  • Filename
    4590023