Mixed Type Audio Classification using Sinusoidal Parameters

Author

Mahale, P. Mowlaee Begzade ; Sayadiyan, A. ; Faez, K.

Author_Institution

Dept. of Electr. Eng., Amirkabir Univ. of Technol., Tehran

fYear

2008

Firstpage

1

Lastpage

5

Abstract

Audio applications such as music/speech separation, speech or speaker recognition, and audio transcription need a preprocessing stage that determines which class each frame belongs to: speech only, music only, or a mixture of the two. Such classification can significantly lower the computational complexity caused by the factorial search commonly used in model-based systems, including monaural separation and music transcription systems. This paper proposes a new approach for classifying mixed-type audio frames with a Support Vector Machine (SVM) and a Relevance Vector Machine (RVM), using sinusoidal parameters obtained from the fixed dimension modified sinusoidal model (FDMSM) proposed in [7]. The main challenge is finding the features that best discriminate the underlying classes, so an unsupervised feature selection procedure is used to choose them. Experimental results show that the proposed system achieves acceptable classification results and outperforms other classifiers, including k-Nearest Neighbor (k-NN) and Multi-Layer Perceptron (MLP).
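
The complexity argument in the abstract can be illustrated with a small counting sketch. It assumes, purely for illustration, that a model-based system searches a speech codebook and a music codebook, and that a mixture frame requires evaluating every speech/music pair (the factorial search). The codebook sizes, the label strings, and the function names `search_cost`, `total_cost`, and `exhaustive_cost` are hypothetical and do not come from the paper.

```python
# Illustrative cost model only: codebook sizes and labels are assumptions,
# not values reported in the paper.

def search_cost(label, n_speech, n_music):
    """Candidates evaluated for one frame, given its predicted class."""
    if label == "speech":
        return n_speech              # search the speech codebook only
    if label == "music":
        return n_music               # search the music codebook only
    if label == "mixture":
        return n_speech * n_music    # factorial search over every pair
    raise ValueError(f"unknown frame label: {label!r}")


def total_cost(labels, n_speech, n_music):
    """Total candidates evaluated when each frame is classified first."""
    return sum(search_cost(label, n_speech, n_music) for label in labels)


def exhaustive_cost(n_frames, n_speech, n_music):
    """Total candidates when every frame is treated as a mixture."""
    return n_frames * n_speech * n_music


if __name__ == "__main__":
    frame_labels = ["speech", "speech", "music", "mixture"]
    print(total_cost(frame_labels, 256, 256))            # 66304
    print(exhaustive_cost(len(frame_labels), 256, 256))  # 262144
```

Under these assumed sizes, only the frame labelled as a mixture pays the pairwise cost, so the saving grows with the share of single-source frames in the signal.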

Keywords

audio signal processing; computational complexity; multilayer perceptrons; pattern classification; source separation; support vector machines; fixed dimension modified sinusoidal model; k nearest neighbor; mixed type audio classification; monaural separation systems; multi-layer perceptron; music transcription scenarios; relevance vector machine; support vector machine; Content management; Information retrieval; Music information retrieval; Nearest neighbor searches; Speaker recognition; Speech recognition; Support vector machine classification; Feature selection; KNN; MLP; RBF; RVM; SVM; Sinusoidal parameters

fLanguage

English

Publisher

ieee

Conference_Titel

Information and Communication Technologies: From Theory to Applications, 2008. ICTTA 2008. 3rd International Conference on

Conference_Location

Damascus

Print_ISBN

978-1-4244-1751-3

Electronic_ISBN

978-1-4244-1752-0

Type

conf

DOI

10.1109/ICTTA.2008.4530061

Filename

4530061