Regional Information Center for Science and Technology (RICEST) - Vietnamese large vocabulary continuous speech recognition

DocumentCode :

2974153

Title :

Vietnamese large vocabulary continuous speech recognition

Author :

Vu, Ngoc Thang ; Schultz, Tanja

Author_Institution :

Cognitive Syst. Lab. (CSL), Univ. of Karlsruhe, Karlsruhe, Germany

fYear :

2009

fDate :

Nov. 13 2009-Dec. 17 2009

Firstpage :

333

Lastpage :

338

Abstract :

We report on our recent efforts toward a large vocabulary Vietnamese speech recognition system. In particular, we describe the Vietnamese text and speech database recently collected as part of our GlobalPhone corpus. The data was complemented by a large collection of text data crawled from various Vietnamese websites. To bootstrap the Vietnamese speech recognition system we used our Rapid Language Adaptation scheme applying a multilingual phone inventory. After initialization we investigated the peculiarities of the Vietnamese language and achieved significant improvements by implementing different tone modeling schemes, extended by pitch extraction, handling multiwords to address the monosyllable structure of Vietnamese, and featuring language modeling based on 5-grams. Furthermore, we addressed the issue of dialectal variations between South and North Vietnam by creating dialect dependent pronunciations and including dialect in the context decision tree of the recognizer. Our currently best recognition system achieves a word error rate of 11.7% on read newspaper speech.

Keywords :

database management systems; decision trees; speech recognition; GlobalPhone corpus; Vietnamese large vocabulary continuous speech recognition; Vietnamese text; Vietnamese websites; context decision tree; multilingual phone inventory; pitch extraction; rapid language adaptation scheme; speech database; Data mining; Databases; Decision trees; Error analysis; Natural languages; Speech analysis; Speech processing; Speech recognition; Text recognition; Vocabulary;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on

Conference_Location :

Merano

Print_ISBN :

978-1-4244-5478-5

Electronic_ISBN :

978-1-4244-5479-2

Type :

conf

DOI :

10.1109/ASRU.2009.5373424

Filename :

5373424

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2974153