DocumentCode :
76970
Title :
Identification of Objectionable Audio Segments Based on Pseudo and Heterogeneous Mixture Models
Author :
Ziqiang Shi ; Jiqing Han ; Tieran Zheng ; Ji Li
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
Volume :
21
Issue :
3
fYear :
2013
fDate :
March 2013
Firstpage :
611
Lastpage :
623
Abstract :
In this paper, we generalize the Gaussian Mixture Model (GMM) in two ways: a) by introducing novel distance measures between two vectors, based on nonlinear maps, to obtain more general mixture models; b) by building mixture models from multiple different kinds of distributions. These two generalizations address different problems that arise in feature modeling. The mixture model obtained by the first method is called the pseudo Gaussian Mixture Model (pseudo GMM). Compared to the traditional GMM, the pseudo GMM with nonlinear maps performs better on nonlinear problems, while its computational complexity, judging by the iteration procedures, is almost the same as that of the Expectation-Maximization (EM) algorithm for the traditional GMM. The second generalization is motivated by the observation that practical learning problems often involve multiple, heterogeneous data sources, whereas classical mixture models are based on a single kind of distribution. In this work, we consider heterogeneous mixture models (hetMM) based on multiple different kinds of distributions. Different types of distributions in a hetMM may have quite different properties and may capture different features of the data. Component classifiers, including pseudo GMM and hetMM based classifiers, are employed in our task of erotic audio recognition. Online and off-line experiments with classifiers built on the pseudo GMM and hetMM demonstrate that the proposed approach is highly effective for erotic audio recognition.
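As a rough illustration of the heterogeneous mixture idea described in the abstract, the following minimal sketch evaluates a univariate mixture whose components come from three different families and computes the posterior component responsibilities for one observation (the quantity an EM E-step would use). It is not the authors' formulation: the choice of Gaussian, logistic, and Student's t components is only suggested by the keyword list, and the weights, parameters, and function names are hypothetical values picked for the example.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def logistic_pdf(x, mu, s):
    """Density of a logistic distribution with location mu and scale s."""
    z = (x - mu) / s
    # Equivalent to exp(-z) / (s * (1 + exp(-z))**2), written symmetrically.
    return 1.0 / (4.0 * s * math.cosh(z / 2.0) ** 2)

def student_t_pdf(x, mu, sigma, nu):
    """Density of a location-scale Student's t with nu degrees of freedom."""
    z = (x - mu) / sigma
    norm = math.gamma((nu + 1.0) / 2.0) / (
        math.gamma(nu / 2.0) * math.sqrt(nu * math.pi) * sigma
    )
    return norm * (1.0 + z * z / nu) ** (-(nu + 1.0) / 2.0)

# Hypothetical mixture: (weight, density) pairs; weights sum to 1.
COMPONENTS = [
    (0.5, lambda x: gaussian_pdf(x, 0.0, 1.0)),
    (0.3, lambda x: logistic_pdf(x, 2.0, 0.5)),
    (0.2, lambda x: student_t_pdf(x, -1.0, 1.0, 3.0)),
]

def mixture_pdf(x, components=COMPONENTS):
    """Weighted sum of component densities at x."""
    return sum(w * pdf(x) for w, pdf in components)

def responsibilities(x, components=COMPONENTS):
    """Posterior probability of each component given x (EM E-step quantity)."""
    weighted = [w * pdf(x) for w, pdf in components]
    total = sum(weighted)
    return [v / total for v in weighted]

if __name__ == "__main__":
    for x in (-1.0, 0.0, 2.0):
        print(x, mixture_pdf(x), responsibilities(x))
```

Because each component is just a weighted density, a different family can be swapped in without changing `mixture_pdf` or `responsibilities`; only the parameter updates in an M-step would depend on the family, and those are deliberately left out of this sketch.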
Keywords :
Gaussian distribution; audio signal processing; computational complexity; expectation-maximisation algorithm; iterative methods; learning (artificial intelligence); erotic audio recognition; expectation maximization algorithm; feature modeling; hetMM; heterogeneous data source; heterogeneous mixture model; iteration procedure; learning problem; nonlinear maps; nonlinear problem; objectionable audio segment identification; pseudo GMM; pseudo Gaussian mixture model; Computational modeling; Data models; Kernel; Speech; Speech processing; Ensemble classifiers; Gaussian mixture model; SVM; erotic audio; expectation-maximization (EM) algorithm; logistic distribution; Student's t-distribution; voiced fragment
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2012.2229980
Filename :
6362181