Speech and music classification using hybrid form of spectrogram and Fourier transformation

Author

Neammalai, Piyawat ; Phimoltares, Suphakant ; Lursinsap, Chidchanok

Author_Institution

Dept. of Math. & Comput. Sci., Chulalongkorn Univ., Bangkok, Thailand

fYear

2014

fDate

9-12 Dec. 2014

Firstpage

1

Lastpage

6

Abstract

This paper presents a feature extraction technique for classifying audio data as speech or music by combining image processing and signal processing. The method has three main steps. First, the audio data is segmented and transformed into a spectrogram image, and image processing methods are applied to find the salient characteristics of that image. Second, the salient spectrogram image is transformed using the 2-dimensional Fourier transform, and the signal energy at specific frequencies is calculated to form the feature vector. Third, a Support Vector Machine is used as a binary classification tool. The method is tested on an audio database of 510 instances, each 1.5 seconds long. The experimental results show that the proposed technique achieves acceptable classification accuracy.
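
The first two steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the frame length, the image processing operations, or which frequencies are measured, so the percentile threshold in `salient_image`, the radial energy bands in `fft2_energy_features`, the 16 kHz sample rate, and all function and parameter names are assumptions chosen for illustration. Only the 1.5-second clip length comes from the abstract.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram (frequency bins x frames) from a Hann-windowed STFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def salient_image(spec, percentile=75.0):
    """Assumed stand-in for the image processing step: log-scale the
    spectrogram, then zero every pixel below a percentile threshold."""
    log_spec = np.log1p(spec)
    threshold = np.percentile(log_spec, percentile)
    return np.where(log_spec >= threshold, log_spec, 0.0)

def fft2_energy_features(image, n_bands=8):
    """Normalized energy of the centered 2-D Fourier spectrum in concentric
    radial bands (an assumed choice of 'specific frequencies')."""
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    rows, cols = power.shape
    y, x = np.indices(power.shape)
    radius = np.hypot((y - rows // 2) / rows, (x - cols // 2) / cols)
    # Small epsilon so the outermost pixel falls inside the last band.
    edges = np.linspace(0.0, radius.max() + 1e-12, n_bands + 1)
    energies = np.array([power[(radius >= lo) & (radius < hi)].sum()
                         for lo, hi in zip(edges[:-1], edges[1:])])
    return energies / energies.sum()

# Usage: a synthetic 1.5-second tone stands in for one audio instance.
sample_rate = 16000  # assumed; the abstract does not state a sample rate
t = np.arange(int(1.5 * sample_rate)) / sample_rate
clip = np.sin(2 * np.pi * 440.0 * t)
spec = spectrogram(clip)
features = fft2_energy_features(salient_image(spec))
```

In the pipeline the abstract describes, one such feature vector per audio instance would then be passed to a Support Vector Machine for the speech versus music decision. That step is left out here to keep the sketch dependent on NumPy alone.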

Keywords

Fourier transforms; audio signal processing; feature extraction; image classification; speech processing; 2D Fourier transform; Fourier transformation; audio database; bi-classification tool; feature extraction; hybrid form; image processing; music audio data; music classification; salient characteristics; salient spectrogram image; speech classification; support vector machine; Accuracy; Feature extraction; Multiple signal classification; Spectrogram; Speech; Support vector machines; Vectors; Fourier Transform; Spectrogram; Speech music classification;

fLanguage

English

Publisher

IEEE

Conference_Titel

Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)

Conference_Location

Siem Reap

Type

conf

DOI

10.1109/APSIPA.2014.7041658

Filename

7041658