DocumentCode :
2979192
Title :
ACID/HNN: a framework for hierarchical connectionist acoustic modeling
Author :
Fritsch, Jürgen
Author_Institution :
Karlsruhe Univ., Germany
fYear :
1997
fDate :
14-17 Dec 1997
Firstpage :
164
Lastpage :
171
Abstract :
We propose the ACID/HNN framework for context-dependent large vocabulary conversational speech recognition (LVCSR) using connectionist acoustic models. Our approach advocates the principles of modularity and hierarchy for the estimation of thousands of context-dependent posterior HMM state probabilities. We argue that a hierarchical organization of the acoustic model is crucial to obtaining competitive performance with connectionist estimators. We introduce ACID, an Agglomerative Clustering scheme based on Information Divergence, and use it to induce soft decision trees for hierarchical classification. A Hierarchy of Neural Networks (HNN) is then applied to the estimation of conditional posterior probabilities. We discuss the benefits of hierarchically structured acoustic models for speaker adaptation and scoring speed-up. Finally, we present experiments on the Switchboard conversational telephone speech corpus, currently a major focus of research in the LVCSR community.
Keywords :
hidden Markov models; natural languages; probability; speech recognition; trees (mathematics); ACID/HNN framework; Agglomerative Clustering scheme; Hierarchy of Neural Networks; LVCSR; Switchboard conversational telephone speech corpus; competitive performance; conditional posterior probabilities; connectionist acoustic models; connectionist estimators; context dependent large vocabulary conversational speech recognition; context dependent posterior HMM state probability estimation; hierarchical classification; hierarchical connectionist acoustic modeling; hierarchical organization; hierarchically structured acoustic models; information divergence; neural network hierarchy; soft decision trees; speaker adaptation; Adaptation model; Classification tree analysis; Context modeling; Decision trees; Hidden Markov models; Loudspeakers; Neural networks; Speech recognition; State estimation; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings
Conference_Location :
Santa Barbara, CA
Print_ISBN :
0-7803-3698-4
Type :
conf
DOI :
10.1109/ASRU.1997.659001
Filename :
659001