Analysis on individual differences in automatic transcription of spontaneous presentations

Author

Shinozaki, Takahiro; Furui, Sadaoki

Author_Institution

Tokyo Institute of Technology, Department of Computer Science, 2-12-1 Ookayama, Meguro-ku, 152-8552 Japan

Volume

1

fYear

2002

fDate

13-17 May 2002

Abstract

This paper reports an analysis of individual differences in the recognition performance of spontaneous presentation speech. Ten minutes from each of the presentations given by 50 male speakers, 500 minutes in total, were automatically recognized for the analysis. Correlation and regression analyses were applied to the word recognition accuracy and various speaker attributes. A restricted set of speaker attributes, comprising the speaking rate, the out-of-vocabulary rate, and the repair rate, was found to contribute most significantly to individual differences in word accuracy. Unsupervised MLLR speaker adaptation improved the word accuracy but did not change the structure of the individual differences. Approximately half of the variance in word accuracy was explained by a regression model using this limited set of three attributes.

Keywords

Accuracy; Adaptation model; Hidden Markov models; Pragmatics; Silicon; Strontium; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on

Conference_Location

Orlando, FL, USA

ISSN

1520-6149

Print_ISBN

0-7803-7402-9

Type

conf

DOI

10.1109/ICASSP.2002.5743821

Filename

5743821