Message-driven speech recognition and topic-word extraction

Author

Ohtsuki, K. ; Furui, S. ; Iwasaki, Akira ; Sakurai, N.

Author_Institution

NTT Human Interface Labs., Kanagawa, Japan

Volume

2

fYear

1999

fDate

15-19 Mar 1999

Firstpage

625

Abstract

This paper proposes a new formulation for speech recognition/understanding systems, in which the a posteriori probability of the message that a speaker intends to convey, given an observed acoustic sequence, is maximized. This is an extension of the current criterion that maximizes the probability of a word sequence. Among the various possible representations, we employ a co-occurrence score of words measured by mutual information as the conditional probability of a word sequence occurring in a given message. The word sequence hypotheses obtained by bigram and trigram language models are rescored using the co-occurrence score. Experimental results show that the word accuracy is improved by this method. Topic-words, which represent the content of a speech signal, are then extracted from the speech recognition results based on the significance score of each word. When five topic-words are extracted for each broadcast-news article, 82.8% of them are correct on average. This paper also proposes a verbalization-dependent language model which is useful for Japanese dictation systems.
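
The contrast between the two criteria named in the abstract can be sketched as follows. This is an illustrative reading only, not the paper's own notation: the symbols X (observed acoustic sequence), W (word sequence) and M (message) are assumed here, as is the simplification that X depends on M only through W.

```latex
% Conventional criterion: most probable word sequence W given acoustics X
% (P(X) is constant over W and is dropped)
\hat{W} = \arg\max_{W} P(W \mid X) = \arg\max_{W} P(X \mid W)\, P(W)

% Message-driven criterion: most probable message M given X,
% assuming P(X \mid W, M) = P(X \mid W)
\hat{M} = \arg\max_{M} P(M \mid X)
        = \arg\max_{M} \sum_{W} P(X \mid W)\, P(W \mid M)\, P(M)
```

Under this reading, P(W | M) is the term that the abstract represents with a mutual-information-based co-occurrence score of words.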

Keywords

dictation; natural languages; speech recognition; Japanese dictation systems; acoustic sequence; bigram; broadcast-news article; co-occurrence score; conditional probability; message-driven speech recognition; mutual information; posteriori probability; representations; significance score; speech signal; topic-word extraction; topic-words; trigram language model; understanding; verbalization-dependent language model; word sequence; Automatic speech recognition; Broadcasting; Hidden Markov models; Humans; Laboratories; Loudspeakers; Mutual information; Natural languages; Speech processing; Speech recognition;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on

Conference_Location

Phoenix, AZ

ISSN

1520-6149

Print_ISBN

0-7803-5041-3

Type

conf

DOI

10.1109/ICASSP.1999.759744

Filename

759744