• DocumentCode
    1576932
  • Title

    Mixed Type Audio Classification using Sinusoidal Parameters

  • Author

    Mahale, P. Mowlaee Begzade ; Sayadiyan, A. ; Faez, K.

  • Author_Institution
    Dept. of Electr. Eng., Amirkabir Univ. of Technol., Tehran
  • fYear
    2008
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    A preprocessing stage is inevitable in every audio application, including music/speech separation, speech or speaker recognition, and audio transcription, to determine which class each frame belongs to: speech only, music only, or a mixture of the two. Such classification can significantly lower the computational complexity caused by the factorial search commonly used in many model-based systems, including monaural separation and music transcription systems. In this paper, a new classification approach based on the Support Vector Machine (SVM) and the Relevance Vector Machine (RVM) is proposed to separate mixed-type audio frames, employing sinusoidal parameters obtained by the fixed dimension modified sinusoidal model (FDMSM) proposed in [7]. The challenging problem in this work is finding the most appropriate features to discriminate the underlying classes. To this end, we employ an unsupervised feature selection procedure to determine which features give the best results. The experimental results show that the proposed system achieves acceptable classification results and outperforms other classifiers, including k-Nearest Neighbor (k-NN) and Multi-Layer Perceptron (MLP).
  • Keywords
    audio signal processing; computational complexity; multilayer perceptrons; pattern classification; source separation; support vector machines; fixed dimension modified sinusoidal model; k nearest neighbor; mixed type audio classification; monaural separation systems; multi-layer perceptron; music transcription scenarios; relevance vector machine; support vector machine; Computational complexity; Content management; Information retrieval; Multilayer perceptrons; Music information retrieval; Nearest neighbor searches; Speaker recognition; Speech recognition; Support vector machine classification; Support vector machines; Feature selection; KNN; MLP; RBF; RVM; SVM; Sinusoidal parameters
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information and Communication Technologies: From Theory to Applications, 2008. ICTTA 2008. 3rd International Conference on
  • Conference_Location
    Damascus
  • Print_ISBN
    978-1-4244-1751-3
  • Electronic_ISBN
    978-1-4244-1752-0
  • Type

    conf

  • DOI
    10.1109/ICTTA.2008.4530061
  • Filename
    4530061