DocumentCode :
1668017
Title :
Soft nonnegative matrix co-factorization with application to multimodal speaker diarization
Author :
Seichepine, Nicolas ; Essid, Slim ; Fevotte, Cedric ; Cappe, Olivier
Author_Institution :
LTCI, Telecom ParisTech, Paris, France
fYear :
2013
Firstpage :
3537
Lastpage :
3541
Abstract :
This paper presents a new method for bimodal nonnegative matrix factorization (NMF). This method is well-suited to situations where two streams of data are concurrently analyzed and are expected to be related by loosely common factors. It allows for a soft co-factorization, which takes into account the relationship that exists between the modalities being processed, but returns different factors for distinct modalities. The data associated with each modality need not live in the same feature space, nor have the same dimensionality. The co-factorization is obtained via a majorization-minimization (MM) algorithm. The behavior of the method is illustrated on both synthetic and real-world data. In particular, we show that exploiting the correlation between audio and video modalities in edited talk-show videos improves speaker diarization results.
Keywords :
matrix decomposition; minimisation; speaker recognition; MM algorithm; audio modality; bimodal NMF; bimodal nonnegative matrix factorization; data streams; feature space; loosely common factors; majorization-minimization algorithm; multimodal speaker diarization; soft nonnegative matrix cofactorization; speaker diarization improvement; talk-show videos; video modality; Correlation; Cost function; Histograms; Joints; Minimization; Speech; Nonnegative matrix factorization; co-factorization; multimodality; speaker diarization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6638316
Filename :
6638316