Title :
Random-forests-based phonetic decision trees for conversational speech recognition
Author :
Xue, Jian ; Zhao, Yunxin
Author_Institution :
Dept. of Comput. Sci., Univ. of Missouri, Columbia, MO
Date :
March 31 - April 4, 2008
Abstract :
In this paper we present a novel technique for constructing phonetic decision trees (PDTs) for acoustic modeling in conversational speech recognition. We use random forests (RF) to train a set of PDTs for each phone-state unit, obtain multiple acoustic models accordingly, and thereby extend PDT-based state tying to RF-based state tying. The acoustic scores of these models are combined at the model level during decoding search. We investigate several methods for estimating the combination weights: maximum likelihood estimation from training data, and dynamic estimation from online data using confidence scores based on p-values or relative entropy. Experimental results on a telemedicine automatic captioning task show that the proposed RF-PDT technique significantly improves word recognition accuracy.
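The abstract does not give the combination formula, so the following is only a minimal sketch of one plausible reading: the likelihoods of the individual acoustic models are combined linearly, and the weights are estimated by maximum likelihood using EM on training frames. The function names, the linear-combination form, and the use of EM are assumptions made for illustration, not the authors' stated method.

```python
import math

def combine_log_scores(log_likes, weights):
    """Return log(sum_k w_k * exp(l_k)) for one frame (assumed linear combination)."""
    # Work in the log domain and skip zero weights to avoid log(0).
    terms = [math.log(w) + l for w, l in zip(weights, log_likes) if w > 0]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def estimate_weights_em(frame_log_likes, n_iter=20):
    """ML estimate of the combination weights via EM (hypothetical helper).

    frame_log_likes: list of frames, each a list with one log-likelihood
    per acoustic model for that frame.
    """
    k = len(frame_log_likes[0])
    weights = [1.0 / k] * k  # start from uniform weights
    for _ in range(n_iter):
        counts = [0.0] * k
        for ll in frame_log_likes:
            total = combine_log_scores(ll, weights)
            for j in range(k):
                if weights[j] > 0:
                    # Posterior responsibility of model j for this frame.
                    counts[j] += math.exp(math.log(weights[j]) + ll[j] - total)
        n = sum(counts)
        weights = [c / n for c in counts]
    return weights
```

In this sketch the weights are fixed after training; the dynamic variants mentioned in the abstract would instead recompute them from confidence scores on the incoming data, which is not shown here.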
Keywords :
acoustic signal processing; maximum likelihood estimation; random processes; speech processing; speech recognition; acoustic modeling; random forests; random-forests-based phonetic decision trees; telemedicine automatic captioning task; weight parameter estimation; word recognition accuracy; Bagging; Classification tree analysis; Computer science; Decision trees; Maximum likelihood decoding; Radio frequency; Sampling methods; Training data; Voting; phonetic decision trees; score combination
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518573