DocumentCode :
3585032
Title :
Effective data-driven feature learning for detecting name errors in automatic speech recognition
Author :
Ji He ; Marin, Alex ; Ostendorf, Mari
Author_Institution :
Electr. Eng., Univ. of Washington, Seattle, WA, USA
fYear :
2014
Firstpage :
230
Lastpage :
235
Abstract :
This paper addresses the problem of detecting name errors in automatic speech recognition (ASR) output. The highly skewed label distributions (i.e., name errors are infrequent), sparse training data, and large number of potential lexical features pose significant challenges for training name error classification systems. Data-driven feature learning is needed for handling multiple languages but is sensitive to overfitting. We address the problem by designing aggregate features using a related (sentence-level name detection) task and by reducing the dimensionality of the lexical features using word classes. Experiments on conversational domain data in both English and Iraqi Arabic show that the best results are obtained using all feature mapping methods plus feature selection using L1 regularization.
Keywords :
feature selection; learning (artificial intelligence); natural language processing; pattern classification; speech recognition; English; Iraqi Arabic; L1 regularization; automatic speech recognition output; data-driven feature learning; feature selection; lexical feature dimensionality; name error detection problem; sentence-level name detection; skewed label distributions; sparse training data; training name error classification systems; word classes; Context; Data models; Feature extraction; Recurrent neural networks; Training; Training data; Vocabulary; auxiliary tasks; feature learning; name error detection; word classes;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Spoken Language Technology Workshop (SLT), 2014 IEEE
Type :
conf
DOI :
10.1109/SLT.2014.7078579
Filename :
7078579