Title :
Sound source localization for video surveillance camera
Author :
Stachurski, Jacek ; Netsch, Lorin ; Cole, Robert
Author_Institution :
Embedded Signal Process. R&D Lab., Texas Instrum., Dallas, TX, USA
Abstract :
While video analytics used in surveillance applications performs well under normal conditions, it may not work as accurately under adverse circumstances. Taking advantage of the complementary aspects of video and audio can lead to a more effective analytics framework, resulting in increased system robustness. For example, sound scene analysis may indicate potential security risks outside the field of view, prompting the camera to point in that direction. This paper presents a robust low-complexity method for two-microphone estimation of sound direction. While the source localization problem has been studied extensively, a reliable low-complexity solution remains elusive. The proposed direction estimation is based on the Generalized Cross-Correlation with Phase Transform (GCC-PHAT) method. The novel aspects of our approach include band-selective processing and inter-frame filtering of the GCC-PHAT objective function prior to peak detection. The audio bandwidth, microphone spacing, angle resolution, processing delay and complexity can all be adjusted depending on the application requirements. The described algorithm can be used in a multi-microphone configuration for spatial sound localization by combining estimates from microphone pairs. It has been implemented as a real-time demo on a modified TI DM8127 IP camera. With the default 16 kHz audio sampling frequency, our fixed-point implementation requires about 5 MIPS of processing power. The test results show robust sound direction estimation under a variety of background noise conditions.
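Illustrative sketch (not the authors' implementation): the abstract names three stages ahead of the angle estimate, namely band-selective GCC-PHAT, inter-frame filtering of the GCC-PHAT function, and peak detection. The Python below is a minimal floating-point sketch of that pipeline under stated assumptions. The band limits, the first-order recursive smoother and its coefficient `alpha`, the 0.1 m microphone spacing, the far-field arcsine mapping, and all function names are hypothetical choices made for illustration; the abstract does not specify them. Only the 16 kHz sampling rate comes from the abstract.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def dft(x, inverse=False):
    """Naive O(N^2) DFT, adequate for short illustrative frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def gcc_phat(frame_a, frame_b, band=None):
    """GCC-PHAT function of two equal-length frames, indexed by circular lag.

    band: optional (low_bin, high_bin) pair; bins outside it are zeroed
    (one assumed form of band-selective processing).
    """
    n = len(frame_a)
    fa, fb = dft(frame_a), dft(frame_b)
    weighted = []
    for k in range(n):
        cross = fa[k].conjugate() * fb[k]
        mag = abs(cross)
        folded = min(k, n - k)  # treat negative frequencies like positive ones
        in_band = band is None or band[0] <= folded <= band[1]
        # PHAT weighting: keep only the phase of the cross-spectrum.
        weighted.append(cross / mag if in_band and mag > 1e-12 else 0j)
    return [v.real for v in dft(weighted, inverse=True)]

def smooth(previous, current, alpha=0.8):
    """First-order recursive average across frames (assumed inter-frame filter)."""
    if previous is None:
        return list(current)
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(previous, current)]

def peak_lag(gcc, max_lag):
    """Lag in [-max_lag, max_lag] at which the GCC function is largest."""
    n = len(gcc)
    return max(range(-max_lag, max_lag + 1), key=lambda m: gcc[m % n])

def lag_to_angle_deg(lag, fs_hz, spacing_m):
    """Far-field angle from broadside for a delay of `lag` samples."""
    s = SPEED_OF_SOUND * lag / (fs_hz * spacing_m)
    return math.degrees(math.asin(max(-1.0, min(1.0, s))))

def estimate_direction(frames_a, frames_b, fs_hz=16000, spacing_m=0.1,
                       band=(2, 24)):
    """Return (lag_in_samples, angle_in_degrees) for paired frame sequences."""
    # Largest physically possible delay for this spacing, in whole samples.
    max_lag = int(spacing_m * fs_hz / SPEED_OF_SOUND)
    smoothed = None
    for a, b in zip(frames_a, frames_b):
        smoothed = smooth(smoothed, gcc_phat(a, b, band))
    lag = peak_lag(smoothed, max_lag)
    return lag, lag_to_angle_deg(lag, fs_hz, spacing_m)
```

In this sketch a positive lag means the second microphone receives the sound later than the first, so the source lies toward the first microphone. Restricting the peak search to physically possible lags is a common way to reject spurious peaks, and the angle resolution follows from the sampling rate and microphone spacing, consistent with the abstract's remark that both are adjustable.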
Keywords :
acoustic generators; filtering theory; microphones; security; video surveillance; GCC-PHAT method; GCC-PHAT objective function; angle resolution; audio bandwidth; background noise conditions; band-selective processing; generalized cross-correlation with phase transform method; inter-frame filtering; low-complexity method; low-complexity solution; microphone pairs; microphone spacing; multimicrophone configuration; peak detection; processing delay; robust sound direction estimation; security risks; sound scene analysis; sound source localization; source localization problem; spatial sound localization; surveillance applications; two-microphone estimation; video analytics; video surveillance camera; Adaptive filters; Cameras; Estimation; Microphones; Noise measurement; Signal to noise ratio;
Conference_Title :
Advanced Video and Signal Based Surveillance (AVSS), 2013 10th IEEE International Conference on
Conference_Location :
Krakow
DOI :
10.1109/AVSS.2013.6636622