DocumentCode :
3161832
Title :
Comparison and combination of different CRBE based MLP features for LVCSR
Author :
Tüske, Zoltán ; Schlüter, Ralf ; Ney, Hermann
Author_Institution :
Comput. Sci. Dept., RWTH Aachen Univ., Aachen, Germany
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
4081
Lastpage :
4084
Abstract :
Multilayer perceptron (MLP) features extracted from different types of critical band energies (CRBE), derived from the MFCC, GT, and PLP pipelines, are compared on a French broadcast news and conversational speech recognition task. Although the MLP structure is kept fixed, ROVER combination of the different CRBE-based systems leads to a 4% relative improvement. Furthermore, aiming to combine state-of-the-art features based on various signal analysis methods into a single stream, a combination technique based on the posterior feature space is proposed. The speaker-normalized features originating from the different CRBEs are merged after additional MLP training by the Dempster-Shafer rule. These posterior features, which unify the different CRBE-based features, outperform the best single CRBE-based posterior features by 6% relative. Further results reveal that the concatenated cepstral and unified posterior features perform nearly as well as the ROVER combination of the different CRBE-based systems.
Keywords :
cepstral analysis; feature extraction; inference mechanisms; multilayer perceptrons; speaker recognition; uncertainty handling; CRBE; Dempster-Shafer rule; French broadcast news; GT pipeline; LVCSR system; MFCC pipeline; MLP structure; PLP pipeline; ROVER; combination technique; concatenated cepstral feature; critical band energies; feature extraction; large vocabulary continuous speech recognition system; multilayer perceptron structure; posterior feature space; signal analysis method; speaker normalized feature; unified posterior feature; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Training; CRBE; Dempster-Shafer; GT; LVCSR; MFCC; MRASTA; PLP;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6288815
Filename :
6288815