DocumentCode :
2014889
Title :
Audio-visual vibraphone transcription in real time
Author :
Tavares, Tiago F. ; Odowichuck, Gabrielle ; Zehtabi, Sonmaz ; Tzanetakis, George
Author_Institution :
Dept. of Comput. Sci., Univ. of Victoria, Victoria, BC, Canada
fYear :
2012
fDate :
17-19 Sept. 2012
Firstpage :
215
Lastpage :
220
Abstract :
Music transcription refers to the process of detecting musical events (typically consisting of notes, starting times and durations) from an audio signal. Most existing work in automatic music transcription has focused on offline processing. In this work, we describe our efforts in building a system for real-time music transcription for the vibraphone. We describe experiments with three audio-based methods for music transcription that are representative of the state of the art. One method is based on multiple pitch estimation and the other two methods are based on factorization of the audio spectrogram. In addition, we show how information from a video camera can be used to impose constraints on the symbol search space based on the gestures of the performer. Experimental results with various system configurations show that this multi-modal approach leads to a significant reduction of false positives and increases the overall accuracy. This improvement is observed for all three audio methods, and indicates that visual information is complementary to the audio information in this context.
Keywords :
audio signal processing; audio-visual systems; estimation theory; music; real-time systems; search problems; video cameras; video signal processing; audio information; audio methods; audio signal; audio spectrogram factorization; audio-based methods; audio-visual vibraphone transcription; automatic music transcription; false positives; multimodal approach; multiple pitch estimation; musical event detection; offline processing; real time music transcription; symbol search space; system configurations; video camera; visual information; Acoustics; Algorithm design and analysis; Cameras; Computer vision; Harmonic analysis; Instruments; Noise; Audiovisual; Music; Transcription;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Multimedia Signal Processing (MMSP), 2012 IEEE 14th International Workshop on
Conference_Location :
Banff, AB
Print_ISBN :
978-1-4673-4570-5
Electronic_ISBN :
978-1-4673-4571-2
Type :
conf
DOI :
10.1109/MMSP.2012.6343443
Filename :
6343443