A study on multilingual acoustic modeling for large vocabulary ASR

Author

Lin, Hui ; Deng, Li ; Yu, Dong ; Gong, Yi-fan ; Acero, Alex ; Lee, Chin-Hui

Author_Institution

Univ. of Washington, Seattle, WA

fYear

2009

fDate

19-24 April 2009

Firstpage

4333

Lastpage

4336

Abstract

We study key issues related to multilingual acoustic modeling for automatic speech recognition (ASR) through a series of large-scale ASR experiments. Our study explores shared structures embedded in a large collection of speech data spanning a number of spoken languages in order to establish a common set of universal phone models that can be used for large vocabulary ASR of all languages, whether seen or unseen during training. Language-universal and language-adaptive models are compared with language-specific models, and the results show that in many cases it is possible to build general-purpose language-universal and language-adaptive acoustic models that outperform language-specific ones, provided that the set of shared units, the structure of shared states, and the shared acoustic-phonetic properties among different languages are properly utilized. Specifically, our results demonstrate that when the context coverage is poor in language-specific training, we can use one tenth of the adaptation data to achieve equivalent performance in cross-lingual speech recognition.

Keywords

computational linguistics; speech recognition; vocabulary; acoustic-phonetic property; automatic speech recognition; cross-lingual speech recognition; language-adaptive acoustic model; language-specific model; language-universal acoustic model; large vocabulary ASR; multilingual acoustic modeling; universal phone model; Adaptation model; Automatic speech recognition; Context modeling; Impurities; Large-scale systems; Natural languages; Speech recognition; Training data; Uninterruptible power systems; Vocabulary; Multilingualism; acoustic modeling; language adaptation; universal phone models

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on

Conference_Location

Taipei

ISSN

1520-6149

Print_ISBN

978-1-4244-2353-8

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2009.4960588

Filename

4960588