Regional Information Center for Science and Technology - Bayesian semi-supervised audio event transcription based on Markov Indian buffet process

DocumentCode :

1665946

Title :

Bayesian semi-supervised audio event transcription based on Markov Indian buffet process

Author :

Ohishi, Yasutake ; Mochihashi, Daichi ; Matsui, Takashi ; Nakano, M. ; Kameoka, Hirokazu ; Izumitani, Tomonori ; Kashino, Kunio

Author_Institution :

NTT Commun. Sci. Labs., NTT Corp., Keihanna Science City, Japan

Year :

2013

Firstpage :

3163

Lastpage :

3167

Abstract :

We present a novel generative model for audio event transcription that recognizes “events” in audio signals containing multiple kinds of overlapping sounds. In the proposed model, the overlapping audio events are first modeled by nonnegative matrix factorization, into which two Bayesian nonparametric approaches, the Markov Indian buffet process and the Chinese restaurant process, are incorporated. By assuming a countably infinite number of possible audio events in the input signal, this approach transcribes the events automatically while avoiding the model selection problem. Bayesian logistic regression then annotates the audio frames with multiple event labels in a semi-supervised learning setup. Experimental results show that our model annotates an audio signal better than a baseline method. Additionally, we verify that our infinite generative model can also detect unknown audio events that are not included in the training data.
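As a rough illustration of the factorization step the abstract builds on, the sketch below runs plain nonnegative matrix factorization with the classic multiplicative updates for the squared Euclidean cost. It is not the model described in the abstract: the number of components is fixed in advance rather than inferred through the Markov Indian buffet process, and no Chinese restaurant process or logistic regression stage is included. The function names, the parameter names, and the reading of V as a nonnegative frequency-by-frame matrix are assumptions made for this example.

```python
import numpy as np


def nmf_multiplicative(V, n_components, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (frequency x frames) as W @ H.

    Illustrative only: uses multiplicative updates with a fixed number
    of components, not the Bayesian nonparametric model of the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random nonnegative initialisation of bases W and activations H.
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative; eps guards
        # against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def reconstruction_error(V, W, H):
    """Frobenius norm of the residual V - W @ H."""
    return float(np.linalg.norm(V - W @ H))
```

In this simplified picture, each column of W plays the role of a spectral pattern for one sound event and each row of H gives its activation over time. The abstract's model instead lets the number of such events be unbounded.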

Keywords :

Bayes methods; audio signal processing; matrix decomposition; regression analysis; Bayesian logistic regression; Bayesian nonparametric approach; Bayesian semisupervised audio event transcription; Chinese restaurant process; Markov Indian buffet process; audio signals; nonnegative matrix factorization; overlapping sounds; Acoustics; Event detection; Hidden Markov models; Logistics; Markov processes; Speech; Audio event transcription; Generative model

Language :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on

Conference_Location :

Vancouver, BC

ISSN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2013.6638241

Filename :

6638241

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1665946