DocumentCode
1576932
Title
Mixed Type Audio Classification using Sinusoidal Parameters
Author
Mahale, P. Mowlaee Begzade ; Sayadiyan, A. ; Faez, K.
Author_Institution
Dept. of Electr. Eng., Amirkabir Univ. of Technol., Tehran
fYear
2008
Firstpage
1
Lastpage
5
Abstract
Every audio application, including music/speech separation, speech or speaker recognition, and audio transcription, requires a preprocessing stage that determines which class each frame belongs to: speech only, music only, or a mixture of the two. Such classification can significantly lower the computational complexity caused by the factorial search commonly used in many model-based systems, including monaural separation and music transcription systems. In this paper, we propose a new classification approach for mixed-type audio frames that applies a Support Vector Machine (SVM) and a Relevance Vector Machine (RVM) to sinusoidal parameters obtained with the fixed dimension modified sinusoidal model (FDMSM) proposed in [7]. The main challenge in this work is finding the features that best discriminate the underlying classes. We therefore employ an unsupervised feature selection procedure to determine which features yield the best results. The experimental results show that the proposed system achieves acceptable classification results and outperforms other classifiers, including k Nearest Neighbor (k-NN) and Multi-Layer Perceptron (MLP).
Keywords
audio signal processing; computational complexity; multilayer perceptrons; pattern classification; source separation; support vector machines; computational complexity; fixed dimension modified sinusoidal model; k nearest neighbor; mixed type audio classification; monaural separation systems; multi-layer perceptron; music transcription scenarios; relevance vector machine; support vector machine; Computational complexity; Content management; Information retrieval; Multilayer perceptrons; Music information retrieval; Nearest neighbor searches; Speaker recognition; Speech recognition; Support vector machine classification; Support vector machines; Feature selection; KNN; MLP; RBF; RVM; SVM; Sinusoidal parameters;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information and Communication Technologies: From Theory to Applications, 2008. ICTTA 2008. 3rd International Conference on
Conference_Location
Damascus
Print_ISBN
978-1-4244-1751-3
Electronic_ISBN
978-1-4244-1752-0
Type
conf
DOI
10.1109/ICTTA.2008.4530061
Filename
4530061