Title :
Soft nonnegative matrix co-factorization with application to multimodal speaker diarization
Author :
Seichepine, Nicolas ; Essid, Slim ; Fevotte, Cedric ; Cappe, Olivier
Author_Institution :
LTCI, Telecom ParisTech, Paris, France
Abstract :
This paper presents a new method for bimodal nonnegative matrix factorization (NMF). The method is well suited to situations where two streams of data are analyzed concurrently and are expected to be related by loosely common factors. It allows for a soft co-factorization, which takes into account the relationship between the modalities being processed but returns different factors for distinct modalities. The data for each modality need not live in the same feature space, nor have the same dimensionality. The co-factorization is obtained via a majorization-minimization (MM) algorithm. The behavior of the method is illustrated on both synthetic and real-world data. In particular, we show that exploiting the correlation between audio and video modalities in edited talk-show videos improves speaker diarization results.
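As a rough illustration of the soft co-factorization idea in the abstract, the following NumPy sketch factorizes two nonnegative matrices that share a time axis and softly ties their activation matrices together. Everything specific here is an assumption made for illustration and is not taken from the paper: the squared Euclidean fitting cost, the quadratic coupling penalty `lam * ||H1 - H2||^2`, the multiplicative update form, and the names `soft_co_nmf`, `lam`, `W1`, `H1`, `W2`, `H2`. It also ignores the scale ambiguity between `W` and `H`, which a practical implementation would need to handle.

```python
import numpy as np


def soft_co_nmf(V1, V2, K, lam=1.0, n_iter=200, eps=1e-12, seed=0):
    """Sketch: V1 ~ W1 @ H1 and V2 ~ W2 @ H2 with H1 softly tied to H2.

    Assumed cost (not the paper's): 0.5*||V1 - W1 H1||^2
    + 0.5*||V2 - W2 H2||^2 + 0.5*lam*||H1 - H2||^2.
    """
    rng = np.random.default_rng(seed)
    F1, N = V1.shape
    F2, N2 = V2.shape
    assert N == N2, "both modalities must share the time axis"

    # Strictly positive random initialization.
    W1 = rng.random((F1, K)) + 0.1
    W2 = rng.random((F2, K)) + 0.1
    H1 = rng.random((K, N)) + 0.1
    H2 = rng.random((K, N)) + 0.1

    def cost():
        fit1 = 0.5 * np.sum((V1 - W1 @ H1) ** 2)
        fit2 = 0.5 * np.sum((V2 - W2 @ H2) ** 2)
        coupling = 0.5 * lam * np.sum((H1 - H2) ** 2)
        return fit1 + fit2 + coupling

    costs = [cost()]
    for _ in range(n_iter):
        # Dictionary updates: standard multiplicative NMF rules.
        W1 *= (V1 @ H1.T) / (W1 @ (H1 @ H1.T) + eps)
        W2 *= (V2 @ H2.T) / (W2 @ (H2 @ H2.T) + eps)
        # Activation updates: the coupling term pulls H1 and H2 together
        # without forcing them to be equal.
        H1 *= (W1.T @ V1 + lam * H2) / ((W1.T @ W1) @ H1 + lam * H1 + eps)
        H2 *= (W2.T @ V2 + lam * H1) / ((W2.T @ W2) @ H2 + lam * H2 + eps)
        costs.append(cost())
    return W1, H1, W2, H2, costs


# Toy usage: two modalities with different feature dimensions (20 and 8)
# generated from the same hidden activations over 50 frames.
rng = np.random.default_rng(1)
H_true = rng.random((3, 50))
V1 = rng.random((20, 3)) @ H_true
V2 = rng.random((8, 3)) @ H_true
W1, H1, W2, H2, costs = soft_co_nmf(V1, V2, K=3, lam=0.5)
```

Setting `lam=0` reduces this sketch to two independent factorizations, while a very large `lam` approaches a hard co-factorization with a single shared activation matrix; intermediate values give the "soft" behavior the abstract describes.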
Keywords :
matrix decomposition; minimisation; speaker recognition; MM algorithm; audio modality; bimodal NMF; bimodal nonnegative matrix factorization; data streams; feature space; loosely common factors; majorization-minimization algorithm; multimodal speaker diarization; soft nonnegative matrix cofactorization; speaker diarization improvement; talk-show videos; video modality; Correlation; Cost function; Histograms; Joints; Minimization; Speech; Nonnegative matrix factorization; co-factorization; multimodality; speaker diarization;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6638316