Improved HMM training and scoring strategies with application to accent classification

Author

Arslan, Levent M. ; Hansen, John H L

Author_Institution

Robust Speech Processing Lab., Duke Univ., Durham, NC, USA

Volume

2

fYear

1996

Firstpage

589

Abstract

We propose two methods to improve HMM speech recognition performance. The first method employs an adjustment in the training stage, whereas the second method employs it in the scoring stage. It is well known that the performance of a speech recognition system improves when the amount of labeled training data is large. However, due to factors such as inaccurate phonetic labeling, end-point detection, and voiced-unvoiced decisions, the labeling procedure can be prone to errors. We propose a selective hidden Markov model (HMM) training procedure in order to reduce the adverse influence of atypical training data on the generated models. To demonstrate its usefulness, selective training is applied to the problem of accent classification, resulting in a 9.4% improvement in classification error rate. The second goal is to improve HMM scoring performance. The objective of HMM training algorithms is to maximize the probability over the training tokens for each model. However, this does not guarantee a minimized error rate across the entire model set. Typically, biases in the confusion matrices can be observed. We propose a method for estimating the bias from input training data and incorporating it into the general scoring algorithm. Using this technique, a 9.8% improvement is achieved in accent classification error rate.
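
The abstract does not say how the bias is estimated or how it enters the scoring algorithm, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes the bias is a per-model additive offset on log-likelihood scores and that it is found by a simple coordinate search that lowers the error rate on the training tokens. The function names (`error_rate`, `estimate_bias`), the search grid, and the synthetic scores are all hypothetical.

```python
import numpy as np

def error_rate(scores, labels, bias):
    """Fraction of tokens whose best-scoring model (after adding bias) is wrong."""
    predicted = np.argmax(scores + bias, axis=1)
    return float(np.mean(predicted != labels))

def estimate_bias(scores, labels, grid=np.linspace(-5.0, 5.0, 41), sweeps=3):
    """Coordinate search for a per-model additive bias (an assumed form of bias).

    scores: (n_tokens, n_models) array of log-likelihoods on training tokens.
    labels: (n_tokens,) array of true model indices.
    A trial offset is kept only if it strictly lowers the training error.
    """
    n_models = scores.shape[1]
    bias = np.zeros(n_models)
    best = error_rate(scores, labels, bias)
    for _ in range(sweeps):
        for k in range(n_models):
            for offset in grid:
                trial = bias.copy()
                trial[k] = offset
                err = error_rate(scores, labels, trial)
                if err < best:
                    best, bias = err, trial
    return bias

# Synthetic stand-in for HMM scores: three models, 300 training tokens.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=300)
scores = rng.normal(size=(300, 3))
scores[np.arange(300), labels] += 1.5  # the true model tends to score higher
scores[:, 0] += 1.0                    # model 0 is systematically over-scored

bias = estimate_bias(scores, labels)
err_before = error_rate(scores, labels, np.zeros(3))
err_after = error_rate(scores, labels, bias)
print(f"training error before: {err_before:.3f}, after: {err_after:.3f}")
```

Because an offset is accepted only when it reduces the training error, the corrected scores can never do worse than the uncorrected ones on the training tokens; whether the gain carries over to unseen data would have to be checked on a held-out set.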

Keywords

hidden Markov models; matrix algebra; probability; speech recognition; HMM scoring; HMM training; HMM training algorithms; accent classification error rate; atypical training data; bias estimation; confusion matrices; end-point detection; input training data; labeled training data; phonetic labeling; scoring algorithm; scoring performance; selective hidden Markov model; speech recognition system performance; training tokens; voiced-unvoiced decisions; Artificial neural networks; Error analysis; Hidden Markov models; Labeling; Laboratories; Robustness; Speech processing; Speech recognition; System performance; Training data

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on

ISSN

1520-6149

Print_ISBN

0-7803-3192-3

Type

conf

DOI

10.1109/ICASSP.1996.543189

Filename

543189