Regional Information Center for Science and Technology - Neighbour selection and adaptation for rapid speaker-dependent ASR

DocumentCode :

672329

Title :

Neighbour selection and adaptation for rapid speaker-dependent ASR

Author :

Nallasamy, Udhyakumar ; Fuhs, Mark ; Woszczyna, Monika ; Metze, Florian ; Schultz, Tanja

Author_Institution :

Language Technol. Inst., Carnegie Mellon Univ., Pittsburgh, PA, USA

fYear :

2013

fDate :

8-12 Dec. 2013

Firstpage :

Lastpage :

Abstract :

Speaker-dependent (SD) ASR systems have significantly lower word error rates (WER) than speaker-independent (SI) systems. However, SD systems require sufficient training data from the target speaker, which is impractical to collect in a short time. We present a technique for training SD models using just a few minutes of the speaker's data. We compensate for the lack of adequate speaker-specific data by selecting neighbours: speakers from an existing database who are acoustically close to the target speaker. These neighbours provide ample training data, which is used to adapt the SI model to obtain an initial SD model for the new speaker with significantly lower WER. We evaluate various neighbour selection algorithms on a large-scale medical transcription task and report a significant reduction in WER using only 5 minutes of speaker-specific data. We conduct a detailed analysis of various factors, such as gender and accent, in neighbour selection. Finally, we study neighbour selection and adaptation in the context of discriminative objective functions.

Keywords :

learning (artificial intelligence); speaker recognition; SD training model; SI system; WER; discriminative objective function; large-scale medical transcription task; neighbour adaptation algorithm; neighbour selection algorithm; rapid speaker-dependent ASR system; speaker independent system; speaker-specific data compensation; target speaker database; time 5 min; word error rate; Adaptation models; Computational modeling; Data models; Manuals; Silicon; Training; Training data; Speech recognition; acoustic modeling; data selection approaches; speaker adaptation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on

Conference_Location :

Olomouc

Type :

conf

DOI :

10.1109/ASRU.2013.6707706

Filename :

6707706

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=672329