Title :
A nativeness classifier for TED Talks
Author :
Lopes, José ; Trancoso, Isabel ; Abad, Alberto
Abstract :
This paper presents a nativeness classifier for English. The detector was developed and tested on TED Talks collected from the web, in which the main non-native cues are segmental and prosodic. The first experiments used only acoustic features, with Gaussian supervectors used to train a support vector machine classifier, and yielded an equal error rate of 13.11%. Subsequent experiments based on prosodic features alone did not yield good results. However, a fused system combining acoustic and prosodic cues achieved an equal error rate of 10.58%. A small human benchmark showed an inter-rater agreement of 0.88, a value very close to the agreement between humans and the best fused system.
Keywords :
Internet; natural language processing; signal classification; speech processing; support vector machines; vectors; English; Gaussian supervectors; TED talks; Web; automatic speech classification; equal error rate; nativeness classifier; segmental aspects; segmental prosody; Accuracy; Acoustics; Adaptation models; Feature extraction; Hidden Markov models; Humans; Speech; Non-native accent; pronunciation
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947647