DocumentCode
1749654
Title
Acoustic synthesis of training data for speech recognition in living room environments
Author
Stahl, Volker ; Fischer, Alexander ; Bippus, Rolf
Author_Institution
Philips Res. Lab., Aachen, Germany
Volume
1
fYear
2001
fDate
2001
Firstpage
285
Abstract
Despite continuous progress in robust automatic speech recognition, acoustic mismatch between training and test conditions is still a major problem. Consequently, large speech collections must be conducted in many environments. An alternative approach is to generate training data synthetically by filtering clean speech with impulse responses and/or adding noise signals from the target domain. We compare the performance of a speech recognizer trained on recorded speech in the target domain with that of a system trained on suitably transformed clean speech. To obtain comparable results, our experiments are based on two-channel recordings with a close-talk and a distant microphone, which produce the clean signal and the target domain signal, respectively. By filtering and adding noise, we obtain error rates that are only 10% higher for natural number recognition and 30% higher for a command recognition task than those obtained by training with target domain data.
Keywords
acoustic signal processing; architectural acoustics; filtering theory; microphones; speech recognition; speech synthesis; white noise; acoustic mismatch; acoustic synthesis; channel recordings; clean speech filtering; close talk microphone; colored noise; command recognition task; convolution; distant microphone; error rates; impulse response; living room environments; natural number recognition; noise signals; recognition accuracy; recorded speech; robust automatic speech recognition; speech recognizer performance; target domain signal; test conditions; training conditions; training data generation; transformed clean speech; whole word recognition; Acoustic testing; Automatic speech recognition; Automatic testing; Filtering; Noise robustness; Signal generators; Speech enhancement; Speech recognition; Target recognition; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location
Salt Lake City, UT
ISSN
1520-6149
Print_ISBN
0-7803-7041-4
Type
conf
DOI
10.1109/ICASSP.2001.940823
Filename
940823