Title :
Semi-blind speech extraction for robot using visual information and noise statistics
Author :
Saruwatari, Hiroshi ; Hirata, Nobuhisa ; Hatta, Toshiyuki ; Wakisaka, Ryo ; Shikano, Kiyohiro ; Takatani, Tomoya
Author_Institution :
Nara Inst. of Sci. & Technol., Nara, Japan
Abstract :
In this paper, we address the improvement of speech recognition accuracy for ICA-based multichannel noise reduction in a spoken-dialogue robot. First, a new permutation-solving method using a probability statistics model is proposed for realistic sound mixtures consisting of point-source speech and diffuse noise. Next, to achieve high recognition accuracy for the early utterance of the target speaker, we introduce a new rapid ICA initialization method combining robot video information and a prestored initial separation filter bank. Using this image information, an ICA initial filter fitted to the user's direction can be used to preserve the user's first utterance. The experimental results show that the proposed approaches can markedly improve the word recognition accuracy.
Keywords :
filtering theory; independent component analysis; robots; speech recognition; ICA based multichannel noise reduction; diffuse noise; filter bank; image information; noise statistics; point source speech; robot; robot video information; semiblind speech extraction; sound mixtures; speech recognition; spoken dialogue robot; visual information; Accuracy; Arrays; Noise; Robots; Shape; Speech; Speech recognition; Blind source extraction; ICA; Noise reduction; Robot; Speech recognition
Conference_Titel :
Signal Processing and Information Technology (ISSPIT), 2011 IEEE International Symposium on
Conference_Location :
Bilbao
Print_ISBN :
978-1-4673-0752-9
Electronic_ISBN :
978-1-4673-0751-2
DOI :
10.1109/ISSPIT.2011.6151571