On jointly learning the parameters in a character-synchronous integrated speech and language model

Author

Chiang, Tung-Hui; Lin, Yi-Chung; Su, Keh-Yih

Author_Institution

Comput. & Commun. Res. Labs., Ind. Technol. Res. Inst., Hsinchu, Taiwan

Volume

4

Issue

3

fYear

1996

fDate

5/1/1996

Firstpage

167

Lastpage

189

Abstract

This paper proposes a joint learning algorithm to enhance an integrated speech and language model operating in character-synchronous mode. A character-synchronous approach is first proposed that integrates speech and language information, including morphology and parts of speech, to rank candidates immediately after each Chinese character is uttered. Because applying high-level knowledge early prunes the search space very efficiently, the character-synchronous score function enables the system to operate in real time. To further improve performance, a joint learning algorithm is then derived that adjusts all parameters of the speech and language processing modules simultaneously, according to their contributions to the overall discrimination power, so as to minimize the error rate. The proposed approaches are compared with a baseline system that directly couples an HMM-based speech recognizer with a bigram language model. On the task of recognizing 1000 spoken Chinese sentences with a very large vocabulary (90,495 words) in speaker-independent isolated-character mode, the baseline system achieves a character accuracy rate of 75.71%. The character accuracy rate improves to 88.26% with the character-synchronous approach and to 94.16% after the joint learning algorithm is applied.
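
The reported accuracy figures can be restated as error rates to show how much each step contributes. The sketch below is only illustrative arithmetic on the three numbers quoted in the abstract; the relative error reduction metric and all names in the code are assumptions introduced here, not quantities reported by the authors.

```python
# Character accuracy rates (percent) as quoted in the abstract.
accuracy = {
    "baseline": 75.71,
    "character_synchronous": 88.26,
    "joint_learning": 94.16,
}


def error_rate(acc_percent):
    """Character error rate in percent, taken here as 100 minus accuracy."""
    return 100.0 - acc_percent


def relative_error_reduction(acc_before, acc_after):
    """Fraction of the earlier system's errors removed by the later system.

    This is a derived metric (an assumption of this sketch), not a figure
    stated in the abstract.
    """
    before = error_rate(acc_before)
    after = error_rate(acc_after)
    return (before - after) / before


reductions = {
    "baseline_to_character_synchronous": relative_error_reduction(
        accuracy["baseline"], accuracy["character_synchronous"]
    ),
    "character_synchronous_to_joint_learning": relative_error_reduction(
        accuracy["character_synchronous"], accuracy["joint_learning"]
    ),
    "baseline_to_joint_learning": relative_error_reduction(
        accuracy["baseline"], accuracy["joint_learning"]
    ),
}

for name, value in reductions.items():
    print(f"{name}: {value:.1%}")
```

Under this reading, each of the two steps removes roughly half of the remaining character errors, and together they remove about three quarters of the baseline's errors.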

Keywords

error statistics; hidden Markov models; natural languages; parameter estimation; speech recognition; Chinese character; Chinese spoken sentences; character accuracy rate; character-synchronous approach; character-synchronous integrated speech-language model; character-synchronous score function; discrimination power; error rate; high-level knowledge; joint learning algorithm; language processing modules; morphology; parts-of-speech; search space; speaker-independent isolated-character mode; system performance; Error analysis; Hidden Markov models; Morphology; Natural languages; Power system modeling; Real time systems; Speech enhancement; Speech processing; Speech recognition; System performance

fLanguage

English

Journal_Title

Speech and Audio Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1063-6676

Type

jour

DOI

10.1109/89.496214

Filename

496214