Title :
Accuracy of MP3 speech recognition under real-world conditions: Experimental study
Author :
Pollak, Petr ; Behunek, Martin
Author_Institution :
Fac. of Electr. Eng., Czech Tech. Univ. in Prague, Prague, Czech Republic
Abstract :
This paper presents a study of speech recognition accuracy at different levels of MP3 compression. Special attention is paid to speech signals of differing quality, i.e. with different levels of background noise and channel distortion. The work was motivated by the possible use of automatic speech recognition (ASR) for offline automatic transcription of audio recordings collected by standard, widespread MP3 devices. The experiments showed that, although the MP3 format is not optimal for speech compression, it does not distort speech significantly, especially at high or moderate bit rates and with high-quality source data. Consequently, the accuracy of connected-digit ASR decreased only very slowly as the bit rate was reduced down to 24 kbps. In the best case, PLP parameterization in the close-talk channel, only a 3% decrease in recognition accuracy was observed while the compressed file was approximately 10% of the original size. All results were slightly worse in the presence of additive background noise and channel distortion, but the achieved accuracy remained acceptable in this case as well, especially for PLP features.
Keywords :
data compression; noise; speech coding; speech recognition; MP3 compression; MP3 devices; MP3 format; MP3 speech recognition; PLP parameterization; additive background noise; audio recordings; channel distortion; close-talk channel; compressed file; connected digits ASR; offline automatic transcription; speech compression; speech distortion; speech recognition accuracy; speech signal quality; speech signals processing; Accuracy; Digital audio players; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Speech recognition; Training; Channel distortion; MP3; MPEG compression; Noise robustness
Conference_Title :
Signal Processing and Multimedia Applications (SIGMAP), 2011 Proceedings of the International Conference on
Conference_Location :
Seville