  • DocumentCode
    2181291
  • Title
    A nativeness classifier for TED Talks
  • Author
    Lopes, José ; Trancoso, Isabel ; Abad, Alberto
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    5672
  • Lastpage
    5675
  • Abstract
    This paper presents a nativeness classifier for English. The detector was developed and tested with TED Talks collected from the web, where the major non-native cues are in terms of segmental aspects and prosody. The first experiments were made using only acoustic features, with Gaussian supervectors for training a classifier based on support vector machines. These experiments resulted in an equal error rate of 13.11%. The following experiments based on prosodic features alone did not yield good results. However, a fused system, combining acoustic and prosodic cues, achieved an equal error rate of 10.58%. A small human benchmark was conducted, showing an inter-rater agreement of 0.88. This value is also very close to the agreement value between humans and the best fused system.
  • Keywords
    Internet; natural language processing; signal classification; speech processing; support vector machines; vectors; English; Gaussian supervectors; TED talks; Web; automatic speech classification; equal error rate; nativeness classifier; segmental aspects; segmental prosody; Accuracy; Acoustics; Adaptation models; Feature extraction; Hidden Markov models; Humans; Speech; Non-native accent; pronunciation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947647
  • Filename
    5947647