Title :
Bayesian semi-supervised audio event transcription based on Markov Indian buffet process
Author :
Ohishi, Yasutake ; Mochihashi, Daichi ; Matsui, Takashi ; Nakano, M. ; Kameoka, Hirokazu ; Izumitani, Tomonori ; Kashino, Kunio
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Keihanna Science City, Japan
Abstract :
We present a novel generative model for audio event transcription that recognizes “events” in audio signals containing multiple kinds of overlapping sounds. In the proposed model, the overlapping audio events are first modeled by nonnegative matrix factorization, into which two Bayesian nonparametric approaches, the Markov Indian buffet process and the Chinese restaurant process, are incorporated. This approach allows us to transcribe the events automatically while avoiding the model selection problem, because it assumes a countably infinite number of possible audio events in the input signal. Then, Bayesian logistic regression annotates the audio frames with multiple event labels in a semi-supervised learning setup. Experimental results show that our model annotates an audio signal better than a baseline method. Additionally, we verify that our infinite generative model can also detect unknown audio events that are not included in the training data.
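The idea of assuming a countably infinite number of possible events can be illustrated with a draw from the standard (non-Markov) Indian buffet process prior. The sketch below is only an illustration under stated assumptions, not the model from the paper: it omits the Markov temporal dependency, the Chinese restaurant process, the nonnegative matrix factorization, and the logistic regression stage. The names sample_ibp, sample_poisson, num_frames, and alpha are hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's multiplication method (suited to small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_frames, alpha, seed=0):
    """Sample a binary frame-by-event activation matrix from a plain IBP prior.

    Assumption: frames are treated as exchangeable "customers"; the Markov
    variant named in the abstract would additionally couple adjacent frames.
    """
    rng = random.Random(seed)
    event_counts = []  # event_counts[k] = number of earlier frames that use event k
    rows = []
    for i in range(1, num_frames + 1):
        # Frame i reuses existing event k with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in event_counts]
        # Frame i also opens Poisson(alpha / i) previously unseen events.
        new_events = sample_poisson(alpha / i, rng)
        row.extend([1] * new_events)
        event_counts = [m + z for m, z in zip(event_counts, row)] + [1] * new_events
        rows.append(row)
    # Pad earlier rows with zeros so every row spans all events opened so far.
    width = len(event_counts)
    return [r + [0] * (width - len(r)) for r in rows]


if __name__ == "__main__":
    for frame in sample_ibp(num_frames=8, alpha=2.0, seed=1):
        print("".join(str(z) for z in frame))
```

The number of columns is not fixed in advance: it grows with the data, which is how a nonparametric prior of this kind sidesteps choosing the number of events beforehand.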
Keywords :
Bayes methods; audio signal processing; matrix decomposition; regression analysis; Bayesian logistic regression; Bayesian nonparametric approach; Bayesian semisupervised audio event transcription; Chinese restaurant process; Markov Indian buffet process; audio signals; nonnegative matrix factorization; overlapping sounds; Acoustics; Event detection; Hidden Markov models; Logistics; Markov processes; Speech; Audio event transcription; Generative model
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6638241