DocumentCode :
1576932
Title :
Mixed Type Audio Classification using Sinusoidal Parameters
Author :
Mahale, P. Mowlaee Begzade ; Sayadiyan, A. ; Faez, K.
Author_Institution :
Dept. of Electr. Eng., Amirkabir Univ. of Technol., Tehran
fYear :
2008
Firstpage :
1
Lastpage :
5
Abstract :
Audio applications such as music/speech separation, speech or speaker recognition, and audio transcription require a preprocessing stage that determines which class each frame belongs to: speech only, music only, or a mixture of the two. Such classification can significantly lower the computational complexity caused by the factorial search commonly used in model-based systems, including monaural separation and music transcription systems. In this paper, we propose a new approach for classifying mixed-type audio frames that applies a Support Vector Machine (SVM) and a Relevance Vector Machine (RVM) to sinusoidal parameters obtained with the fixed dimension modified sinusoidal model (FDMSM) proposed in [7]. The main challenge is finding the features that best discriminate the underlying classes. We therefore employ an unsupervised feature selection procedure to determine which features give the best results. Experimental results show that the proposed system achieves acceptable classification results and outperforms other classifiers, including k-Nearest Neighbor (k-NN) and Multi-Layer Perceptron (MLP).
Keywords :
audio signal processing; computational complexity; multilayer perceptrons; pattern classification; source separation; support vector machines; computational complexity; fixed dimension modified sinusoidal model; k nearest neighbor; mixed type audio classification; monaural separation systems; multi-layer perceptron; music transcription scenarios; relevance vector machine; support vector machine; Computational complexity; Content management; Information retrieval; Multilayer perceptrons; Music information retrieval; Nearest neighbor searches; Speaker recognition; Speech recognition; Support vector machine classification; Support vector machines; Feature selection; KNN; MLP; RBF; RVM; SVM; Sinusoidal parameters;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information and Communication Technologies: From Theory to Applications, 2008. ICTTA 2008. 3rd International Conference on
Conference_Location :
Damascus
Print_ISBN :
978-1-4244-1751-3
Electronic_ISBN :
978-1-4244-1752-0
Type :
conf
DOI :
10.1109/ICTTA.2008.4530061
Filename :
4530061