Regional Information Center for Science and Technology - Multilingual A-stabil: A new confidence score for multilingual unsupervised training

DocumentCode :

2329972

Title :

Multilingual A-stabil: A new confidence score for multilingual unsupervised training

Author :

Vu, Ngoc Thang ; Kraus, Franziska ; Schultz, Tanja

Author_Institution :

Cognitive Syst. Lab., Karlsruhe Inst. of Technol. (KIT), Karlsruhe, Germany

fYear :

2010

fDate :

12-15 Dec. 2010

Firstpage :

183

Lastpage :

188

Abstract :

This paper presents our work in Automatic Speech Recognition (ASR) in the context of multilingual unsupervised training with application to Czech. Starting without any transcribed acoustic training data, we built a Czech ASR system by combining cross-language bootstrapping and confidence-based unsupervised training. We present a new method for computing confidence scores, called “multilingual A-stabil”, and explore the relative effectiveness of acoustic models from more than one language, such as Russian, Bulgarian, Polish and Croatian, for unsupervised training. Conventional confidence measures such as gamma and A-stabil work well with well-trained acoustic models but have problems with poorly estimated ones, whereas our new method works well in both cases. We describe our multilingual unsupervised training framework, which gives very promising results in our experiments. We were able to select 80.5% of the audio training data (18.5 hours) with a transcription word error rate (WER) of 14.5% when using a small amount of untranscribed data (only about 23 hours). The final best WER on Czech is 23.6% on the development set and 22.9% on the evaluation set using cross-lingual bootstrapping, which is very close to the performance of the Czech ASR system trained with 23 hours of audio data with manual transcriptions (23.1% on the development set and 22.3% on the evaluation set).
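
The abstract does not define how the multilingual A-stabil score is computed, so the following is only a minimal sketch of confidence-based data selection under stated assumptions. It assumes that each utterance has been decoded once per source-language acoustic model, that a word's confidence is the fraction of the other hypotheses containing the same word at the same position (a crude stand-in for a proper word alignment), and that an utterance is kept when its mean word confidence reaches a threshold. The function names, the language codes used as model keys, the toy hypotheses and the threshold are all hypothetical and not taken from the paper.

```python
from typing import Dict, List


def agreement_scores(reference: List[str], alternatives: List[List[str]]) -> List[float]:
    """Score each reference word by the fraction of alternative hypotheses
    that have the same word at the same position (assumed scoring rule)."""
    if not alternatives:
        return [0.0] * len(reference)
    scores = []
    for i, word in enumerate(reference):
        hits = sum(1 for alt in alternatives if i < len(alt) and alt[i] == word)
        scores.append(hits / len(alternatives))
    return scores


def select_utterances(
    hyps: Dict[str, Dict[str, List[str]]],
    reference_model: str,
    threshold: float,
) -> List[str]:
    """Return the IDs of utterances whose mean word confidence is at least
    `threshold`. `hyps` maps utterance ID -> model name -> word sequence."""
    selected = []
    for utt_id, by_model in hyps.items():
        reference = by_model[reference_model]
        alternatives = [h for m, h in by_model.items() if m != reference_model]
        scores = agreement_scores(reference, alternatives)
        if scores and sum(scores) / len(scores) >= threshold:
            selected.append(utt_id)
    return selected


# Made-up hypotheses from four hypothetical source-language models.
hypotheses = {
    "utt1": {
        "ru": ["dobry", "den"],
        "bg": ["dobry", "den"],
        "pl": ["dobry", "den"],
        "hr": ["dobry", "ten"],
    },
    "utt2": {
        "ru": ["a", "b", "c"],
        "bg": ["x", "b", "y"],
        "pl": ["x", "z", "c"],
        "hr": ["q", "b", "w"],
    },
}

selected = select_utterances(hypotheses, reference_model="ru", threshold=0.7)
print(selected)
```

In this toy run the models largely agree on the first utterance and disagree on the second, so only the first would be passed on as automatically transcribed training data. A real system would align hypotheses before comparing words and would tune the threshold on held-out data.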

Keywords :

speech recognition; unsupervised learning; automatic speech recognition; confidence score; cross language bootstrapping; multilingual A-stabil; multilingual unsupervised training; transcribed acoustic training data; multilingual ASR; unsupervised training

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Spoken Language Technology Workshop (SLT), 2010 IEEE

Conference_Location :

Berkeley, CA

Print_ISBN :

978-1-4244-7904-7

Electronic_ISBN :

978-1-4244-7902-3

Type :

conf

DOI :

10.1109/SLT.2010.5700848

Filename :

5700848

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2329972