Title :
Sound annotation tool for multidirectional sounds based on spatial information extracted by HARK robot audition software
Author :
Sugiyama, Osamu ; Itoyama, Katsutoshi ; Nakada, Kaoru ; Okuno, Hiroshi G.
Author_Institution :
Grad. Sch. of Inf. Sci. & Eng., Tokyo Inst. of Technol., Tokyo, Japan
Abstract :
With the rise of inexpensive microphone array products and the robot audition software HARK, multidirectional sound sources can be recorded and analyzed easily. Combining a microphone array with this software makes it possible to separate, localize, and track multidirectional sound sources. Most existing solutions for accessing this separated sound source information provide clients that interpret simplified information about the separated sources, but they do not support semantic annotation directly. Because multidirectional sound annotation requires simultaneous labeling of separated sound sources and a multidirectional overview of those sources, an efficient means of annotation and an intuitive view of multidirectional sounds are essential. Our proposed sound annotation tool provides drag-and-drop annotation on a 3D sound source view, along with annotation autocompletion using an SVM trained on the user's annotation history. These features enable users to perform the annotation task intuitively and confirm its result. We also conducted an evaluation demonstrating the efficiency of annotation with the tool.
Keywords :
hearing; microphone arrays; multimedia computing; robots; support vector machines; user interfaces; 3D sound source view; HARK robot audition software; SVM; annotation autocompletion; annotation task; inexpensive microphone array products; multidirectional overview; multidirectional sound annotation; multidirectional sound sources; semantic annotations; simplified information; simultaneous labeling; sound annotation tool; sound source information; spatial information; user annotation history; Accuracy; Microphones; Orbits; Robots; Software; Support vector machines; Three-dimensional displays; Audio Annotation; Media Computing; User Interface Design;
Conference_Title :
Systems, Man and Cybernetics (SMC), 2014 IEEE International Conference on
Conference_Location :
San Diego, CA
DOI :
10.1109/SMC.2014.6974275