Acoustic model training using committee-based active and semi-supervised learning for speech recognition

Author

Tsutaoka, Takanori ; Shinoda, Kazuma

Author_Institution

Dept. of Comput. Sci., Tokyo Inst. of Technol., Tokyo, Japan

fYear

2012

fDate

3-6 Dec. 2012

Firstpage

1

Lastpage

4

Abstract

We propose an acoustic model training method that combines committee-based active learning and semi-supervised learning for large vocabulary continuous speech recognition. In this method, each untranscribed training utterance is examined by a committee of multiple speech recognizers, and the degree of disagreement within the committee on its transcription is used to select utterances. Utterances on which the committee members disagree are transcribed for active learning, while those on which they agree are used for semi-supervised learning. We evaluated the method on the Corpus of Spontaneous Japanese, where it achieved higher recognition accuracy at lower transcription cost than random sampling, active learning alone, and semi-supervised learning alone. We also propose a new data selection method for semi-supervised learning, called middle selection.
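
The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how disagreement is measured, so everything specific here is an assumption: disagreement is taken as the mean pairwise word-level edit distance between committee hypotheses, and the names `edit_distance`, `disagreement`, `split_pool` and the thresholds `high` and `low` are hypothetical. Middle selection is not sketched, since the abstract does not define it.

```python
from itertools import combinations


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def disagreement(hypotheses):
    """Mean pairwise edit distance, normalized by the longer hypothesis.

    Assumed measure for illustration; not taken from the paper.
    """
    pairs = list(combinations(hypotheses, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        longest = max(len(a), len(b))
        total += edit_distance(a, b) / longest if longest else 0.0
    return total / len(pairs)


def split_pool(pool, high=0.5, low=0.1):
    """Split untranscribed utterances by committee disagreement.

    pool maps an utterance id to the list of committee hypotheses
    (each a list of word tokens). The thresholds are hypothetical.
    Returns (ids to transcribe, ids to use with automatic labels).
    """
    to_transcribe, to_self_label = [], []
    for utt_id, hyps in pool.items():
        score = disagreement(hyps)
        if score >= high:
            to_transcribe.append(utt_id)   # committee disagrees: active learning
        elif score <= low:
            to_self_label.append(utt_id)   # committee agrees: semi-supervised learning
    return to_transcribe, to_self_label


pool = {
    "u1": [["a", "b", "c"], ["a", "b", "c"], ["a", "b", "c"]],
    "u2": [["a", "b", "c"], ["x", "y", "z"], ["a", "q", "c"]],
    "u3": [["a", "b", "c", "d"], ["a", "b", "c", "d"], ["a", "b", "c", "e"]],
}
to_transcribe, to_self_label = split_pool(pool)
```

With this toy pool, `u2` is sent for transcription, `u1` is kept with its automatic label, and `u3` falls between the two thresholds and is left unused.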

Keywords

acoustic signal processing; learning (artificial intelligence); speech recognition; acoustic model training method; committee-based active learning; semisupervised learning; untranscribed training utterance; vocabulary continuous speech recognition; Acoustics; Hidden Markov models; Semisupervised learning; Speech; Speech recognition; Training; Training data; LVCSR; active learning; query by committee; semi-supervised learning;

fLanguage

English

Publisher

IEEE

Conference_Titel

Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific

Conference_Location

Hollywood, CA

Print_ISBN

978-1-4673-4863-8

Type

conf

Filename

6412028