Regional Information Center for Science and Technology - Classification of audios containing speech and music

DocumentCode :

2160578

Title :

Classification of audios containing speech and music

Author :

Uzu, Erkam ; Sencar, Hüsrev Taha

Author_Institution :

Bilgisayar Muhendisligi Bolumu, TOBB Ekonomi ve Teknoloji Univ., Ankara, Turkey

fYear :

2012

fDate :

18-20 April 2012

Firstpage :

Lastpage :

Abstract :

We propose an automated technique that uses perceptual and non-perceptual audio quality measures to discriminate speech from music signals with high accuracy. The quality measures that characterize an audio signal are obtained by de-noising the original audio. The underlying idea is that the de-noising operation affects speech and music signals in different but consistent ways, and that these differences can be captured by the audio quality metrics. The resulting quality measures are then used with a machine learning classifier to statistically model speech and music signals. To determine the accuracy of the proposed method, tests were performed on different datasets with and without audio compression.
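
The pipeline outlined in the abstract (de-noise, measure how much the signal changed, classify on those measures) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, since the record does not name the de-noiser, the quality measures, or the classifier: a moving-average smoother stands in for the de-noiser, SNR and segmental SNR stand in for the quality measures, a nearest-centroid rule stands in for the machine learning classifier, and the synthetic low and high tones are placeholders for real recordings, not speech or music.

```python
import math
import random


def denoise(signal, width=5):
    """Moving-average smoother; a stand-in de-noiser (assumption)."""
    half = width // 2
    smoothed = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def snr_db(original, denoised):
    """Power of the original relative to what de-noising removed, in dB."""
    signal_power = sum(x * x for x in original)
    residual_power = sum((x - y) ** 2 for x, y in zip(original, denoised))
    return 10.0 * math.log10((signal_power + 1e-12) / (residual_power + 1e-12))


def segmental_snr_db(original, denoised, frame=64):
    """Mean of per-frame SNR values over non-overlapping frames."""
    scores = [
        snr_db(original[s:s + frame], denoised[s:s + frame])
        for s in range(0, len(original) - frame + 1, frame)
    ]
    return sum(scores) / len(scores)


def quality_features(signal):
    """Feature vector: quality measures between a signal and its de-noised copy."""
    cleaned = denoise(signal)
    return [snr_db(signal, cleaned), segmental_snr_db(signal, cleaned)]


def fit_centroids(feature_rows, labels):
    """Nearest-centroid training: average the feature vectors of each class."""
    sums, counts = {}, {}
    for row, label in zip(feature_rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: math.dist(centroids[label], row))


def synthetic_tone(rng, low, high, length=512):
    """Noisy sine at a random normalized frequency; placeholder data only."""
    freq = rng.uniform(low, high)
    return [math.sin(2 * math.pi * freq * n) + rng.gauss(0.0, 0.05)
            for n in range(length)]


rng = random.Random(0)
train = [("low_tone", synthetic_tone(rng, 0.005, 0.02)) for _ in range(10)]
train += [("high_tone", synthetic_tone(rng, 0.2, 0.3)) for _ in range(10)]
centroids = fit_centroids([quality_features(s) for _, s in train],
                          [label for label, _ in train])
print(predict(centroids, quality_features(synthetic_tone(rng, 0.005, 0.02))))
```

The smoother barely alters the low tones and strongly attenuates the high tones, so the two classes land far apart in the feature space. This mirrors, in toy form, the abstract's premise that de-noising affects different signal types in different but consistent ways.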

Keywords :

audio signal processing; learning (artificial intelligence); music; speech processing; audio quality measures; machine learning classifier; music signals; original audio denoising; speech classification; speech signals; Discrete wavelet transforms; Hidden Markov models; Multiple signal classification; Noise reduction; Speech; Speech recognition; Support vector machines;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signal Processing and Communications Applications Conference (SIU), 2012 20th

Conference_Location :

Mugla

Print_ISBN :

978-1-4673-0055-1

Electronic_ISBN :

978-1-4673-0054-4

Type :

conf

DOI :

10.1109/SIU.2012.6204616

Filename :

6204616

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2160578