DocumentCode :
2559956
Title :
Auditory and Visual Integration based Localization and Tracking of Multiple Moving Sounds in Daily-life Environments
Author :
Hyun-Don Kim ; Komatani, K. ; Ogata, Takaaki ; Okuno, Hiroshi G.
Author_Institution :
Speech Media Process. Group, Kyoto Univ., Kyoto, Japan
fYear :
2007
fDate :
26-29 Aug. 2007
Firstpage :
399
Lastpage :
404
Abstract :
This paper presents techniques that enable talker tracking for effective human-robot interaction. To track moving people in daily-life environments, robots must localize multiple moving sounds so that they can locate talkers. However, the conventional method requires an array of microphones and impulse response data. Therefore, we propose a way to integrate a cross-power spectrum phase analysis (CSP) method and an expectation-maximization (EM) algorithm. The CSP can localize sound sources using only two microphones and does not need impulse response data. Moreover, the EM algorithm increases the system's effectiveness and allows it to cope with multiple sound sources. We confirmed that the proposed method performs better than the conventional method. In addition, we added a particle filter to the tracking process to produce a reliable tracking path; the particle filter also integrates audio-visual information effectively. Furthermore, the applied particle filter can track people despite the various noises, including loud sounds, found in daily-life environments.
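The two-microphone CSP step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`dft`, `csp_delay`, `delay_to_angle`), the far-field angle formula, the speed of sound of 343 m/s, and the example microphone spacing and sample rate are all assumptions chosen for illustration. A naive DFT from the standard library is used only to keep the sketch self-contained; a practical system would use an FFT. The EM and particle-filter stages are not shown.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (standard library only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def csp_delay(left, right, eps=1e-12):
    """Estimate the lag (in samples) by which `right` trails `left`."""
    n = len(left)
    spec_l = dft(left)
    spec_r = dft(right)
    # Cross-power spectrum, normalized so only the phase remains.
    cross = [r * l.conjugate() for l, r in zip(spec_l, spec_r)]
    whitened = [c / (abs(c) + eps) for c in cross]
    # Back in the time domain, the peak marks the dominant delay.
    csp = [v.real for v in dft(whitened, inverse=True)]
    peak = max(range(n), key=lambda i: csp[i])
    # Indices above n/2 are negative lags (circular correlation).
    return peak - n if peak > n // 2 else peak


def delay_to_angle(lag, sample_rate, mic_distance, speed_of_sound=343.0):
    """Convert a lag to a far-field arrival angle in degrees (assumed model)."""
    ratio = lag * speed_of_sound / (sample_rate * mic_distance)
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))


# Usage: white noise reaching the second microphone 5 samples later.
random.seed(0)
left = [random.gauss(0.0, 1.0) for _ in range(64)]
right = left[-5:] + left[:-5]  # circular shift: right[t] = left[t - 5]
lag = csp_delay(left, right)
angle = delay_to_angle(lag, sample_rate=16000, mic_distance=0.3)
```

With several simultaneous talkers the CSP function shows several peaks rather than one, which is the situation the abstract says the EM algorithm is meant to handle.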
Keywords :
audio signal processing; expectation-maximisation algorithm; man-machine systems; particle filtering (numerical methods); robot vision; spectral analysis; auditory based localization; cross-power spectrum phase analysis; daily-life environment; expectation-maximization algorithm; human-robot interaction; impulse response data; microphone array; multiple moving sound tracking; particle filter; visual based localization; Acoustic noise; Algorithm design and analysis; Human robot interaction; Intelligent robots; Microphone arrays; Particle filters; Particle tracking; Streaming media; Surgery; Working environment noise;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Robot and Human interactive Communication, 2007. RO-MAN 2007. The 16th IEEE International Symposium on
Conference_Location :
Jeju
Print_ISBN :
978-1-4244-1634-9
Type :
conf
DOI :
10.1109/ROMAN.2007.4415117
Filename :
4415117