Combination of data borrowing strategies for low-resource LVCSR

Author

Yanmin Qian ; Kai Yu ; Jia Liu

Author_Institution

Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China

fYear

2013

fDate

8-12 Dec. 2013

Firstpage

404

Lastpage

409

Abstract

Large vocabulary continuous speech recognition (LVCSR) is particularly difficult for low-resource languages, where only very limited manually transcribed data are available. However, it is often feasible to obtain large amounts of untranscribed data in the low-resource target language, or sufficient transcribed data in some non-target languages. Borrowing data from these additional sources to help LVCSR for low-resource languages has become an important research direction. This paper presents an integrated data borrowing framework for this scenario. Three data borrowing approaches are first investigated in detail, operating at the feature, model and data corpus levels. They borrow data from the additional sources at different levels, and each yields a substantial performance improvement. As these strategies work independently, the obtained gains are likely additive. The three strategies are then combined to form an integrated data borrowing framework. Experiments show that the integrated framework achieves an absolute WER reduction of more than 10% over a conventional baseline. In particular, the gain under the extremely limited low-resource scenario is 16%.

Keywords

speech recognition; vocabulary; LVCSR; data borrowing strategies; data corpus; integrated data borrowing framework; large vocabulary continuous speech recognition; low-resource languages; manually transcribed data; untranscribed data; Data models; Detectors; Feature extraction; Hidden Markov models; Speech; Speech recognition; Training; Articulatory feature; Data borrowing; Low resource speech recognition; Subspace Gaussian mixture models; Unsupervised training;

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on

Conference_Location

Olomouc

Type

conf

DOI

10.1109/ASRU.2013.6707764

Filename

6707764