DocumentCode :
417198
Title :
Enhanced standard compliant distributed speech recognition (Aurora encoder) using rate allocation
Author :
Srinivasamurthy, Naveen ; Ortega, Antonio ; Narayanan, Shrikanth
Author_Institution :
Integrated Media Syst. Center, Univ. of Southern California, Los Angeles, CA, USA
Volume :
1
fYear :
2004
fDate :
17-21 May 2004
Abstract :
The paper proposes modifications that improve the recognition performance obtainable with the ETSI standard distributed speech recognition encoder, Aurora (ES 201 108, 2000). The proposed modifications are standard compliant, i.e., they require no algorithmic changes to the Aurora operation. Performance improvements are achieved by distributing the available bit budget more efficiently among Aurora's seven (different) two-dimensional vector quantizers (VQs). The bit-allocation algorithm incorporates the importance of each sub-vector for recognition: a larger fraction of the available bits is allocated to the more important sub-vectors, with the aim of maximizing recognition accuracy. The algorithm is based on a novel mutual information (MI) measure, which quantifies the information shared between a sub-vector and the class label and is therefore a good indicator of the sub-vector's importance for recognition. It is shown that the proposed MI-based method outperforms both the standard Aurora encoder and an encoder designed using traditional mean-square-error-based bit allocation. For the TIDIGITS connected digits recognition task, a 15.2% relative decrease in word error rate (WER) is possible with the proposed MI-based Aurora encoder compared with the standard Aurora encoder.
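The idea in the abstract, scoring each sub-vector by its mutual information with the class label and then giving more bits to the higher-scoring sub-vectors, can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. The function names, the greedy rule with a diminishing-returns gain `importance / (bits + 1)`, the per-quantizer limits, and the 44-bit budget in the usage note are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(codes, labels):
    """Estimate I(X;Y) in bits from paired discrete samples.

    `codes` are quantizer indices for one sub-vector and `labels` are
    class labels; both are hypothetical inputs for this sketch.
    """
    n = len(codes)
    joint = Counter(zip(codes, labels))
    px = Counter(codes)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi


def allocate_bits(importance, total_bits, min_bits=1, max_bits=8):
    """Greedily hand out bits one at a time.

    Assumed heuristic (not from the paper): the gain of the next bit for
    sub-vector i is importance[i] / (bits[i] + 1), so each extra bit is
    worth less than the previous one.
    """
    bits = [min_bits] * len(importance)
    remaining = total_bits - sum(bits)
    if remaining < 0:
        raise ValueError("total_bits is smaller than the minimum allocation")
    while remaining > 0:
        candidates = [i for i, b in enumerate(bits) if b < max_bits]
        if not candidates:
            break  # every quantizer is already at max_bits
        best = max(candidates, key=lambda i: importance[i] / (bits[i] + 1))
        bits[best] += 1
        remaining -= 1
    return bits
```

For example, calling `allocate_bits` with seven illustrative MI scores (one per sub-vector) and an illustrative budget of 44 bits returns a list of seven per-quantizer bit counts that sums to 44, with higher-scoring sub-vectors never receiving fewer bits than lower-scoring ones.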
Keywords :
error statistics; optimisation; speech coding; speech recognition; vector quantisation; vocoders; Aurora encoder; WER; bit-allocation; connected digits recognition; enhanced distributed speech recognition encoder; mean square error; mutual information measure; rate allocation; recognition accuracy maximization; standard compliant distributed speech recognition encoder; vector quantizers; word error rate; Bandwidth; Cellular phones; Degradation; Error analysis; Mean square error methods; Mutual information; Personal digital assistants; Speech recognition; Systems engineering and theory; Telecommunication standards;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1326028
Filename :
1326028