• DocumentCode
    2308616
  • Title
    Integrating Acoustic, Prosodic and Phonotactic Features for Spoken Language Identification
  • Author
    Tong, Rong ; Ma, Bin ; Zhu, Donglai ; Li, Haizhou ; Chng, Eng Siong
  • Author_Institution
    Inst. for Infocomm Res.
  • Volume
    1
  • fYear
    2006
  • fDate
    14-19 May 2006
  • Abstract
    The fundamental issue in automatic language identification is to find effective discriminative cues for languages. This paper studies the fusion of five features at different levels of abstraction for language identification: spectrum, duration, pitch, n-gram phonotactic, and bag-of-sounds features. We build a system and report test results on the NIST 1996 and 2003 LRE datasets. The system was also built to participate in the NIST 2005 LRE. The experimental results show that different levels of information provide complementary language cues. The prosodic features are more effective for shorter utterances, while the phonotactic features work better for longer utterances. For the 12-language task, the system fusing all five features achieved 2.38% EER for 30-sec speech segments on the NIST 1996 dataset.
  • Keywords
    natural languages; speech recognition; 2003 LRE datasets; NIST 1996; acoustic features; automatic language identification; bag-of-sounds features; n-gram phonotactic; phonotactic features; prosodic features; spoken language identification; Acoustical engineering; Character recognition; Computational complexity; Data mining; Humans; NIST; Natural languages; Speech recognition; System testing; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
  • Conference_Location
    Toulouse
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0469-X
  • Type
    conf
  • DOI
    10.1109/ICASSP.2006.1659993
  • Filename
    1659993