DocumentCode
2181291
Title
A nativeness classifier for TED Talks
Author
Lopes, José ; Trancoso, Isabel ; Abad, Alberto
fYear
2011
fDate
22-27 May 2011
Firstpage
5672
Lastpage
5675
Abstract
This paper presents a nativeness classifier for English. The detector was developed and tested on TED Talks collected from the web, in which the main non-native cues are segmental and prosodic. The first experiments used only acoustic features, with Gaussian supervectors used to train a classifier based on support vector machines, and resulted in an equal error rate of 13.11%. Subsequent experiments based on prosodic features alone did not yield good results. However, a fused system combining acoustic and prosodic cues achieved an equal error rate of 10.58%. A small human benchmark showed an inter-rater agreement of 0.88, a value very close to the agreement between humans and the best fused system.
Keywords
Internet; natural language processing; signal classification; speech processing; support vector machines; vectors; English; Gaussian supervectors; TED talks; Web; automatic speech classification; equal error rate; nativeness classifier; segmental aspects; segmental prosody; Accuracy; Acoustics; Adaptation models; Feature extraction; Hidden Markov models; Humans; Speech; Non-native accent; pronunciation
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location
Prague
ISSN
1520-6149
Print_ISBN
978-1-4577-0538-0
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2011.5947647
Filename
5947647