Acoustically discriminative training for language models

Author

Kurata, Gakuto ; Itoh, Nobuyasu ; Nishimura, Masafumi

Author_Institution

IBM Res., IBM Japan, Ltd., Yamato

fYear

2009

fDate

19-24 April 2009

Firstpage

4717

Lastpage

4720

Abstract

This paper introduces a discriminative training method for language models (LMs) that leverages phoneme similarities estimated from an acoustic model. Training an LM discriminatively requires the correct word sequences and the recognition results that automatic speech recognition (ASR) produces by processing utterances of those word sequences. However, sufficient utterances are not always available. We propose to generate the probable N-best lists that the ASR may produce directly from the correct word sequences by leveraging the phoneme similarities. We call this process "Pseudo-ASR". We train the LM discriminatively by comparing the correct word sequences with the corresponding N-best lists from the Pseudo-ASR. Experiments with real-life data from a Japanese call center showed that the LM trained with the proposed method improved the accuracy of the ASR.

Keywords

speech recognition; training; Japanese call center; Pseudo-ASR; acoustically discriminative training; automatic speech recognition; language models; Acoustic applications; Acoustic transducers; Automatic speech recognition; Decoding; Equations; Laboratories; Natural languages; Telephony; Testing; Discriminative Training; Finite State Transducer; Language Model; Phoneme Similarity

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on

Conference_Location

Taipei

ISSN

1520-6149

Print_ISBN

978-1-4244-2353-8

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2009.4960684

Filename

4960684