Regional Information Center for Science and Technology - Discrimination between singing and speech in real-world audio

DocumentCode :

3585062

Title :

Discrimination between singing and speech in real-world audio

Author :

Thompson, Brian

Author_Institution :

MIT Lincoln Lab., Lexington, MA, USA

fYear :

2014

Firstpage :

407

Lastpage :

412

Abstract :

The performance of a spoken language system suffers when non-speech is incorrectly classified as speech. Singing is particularly difficult to discriminate from speech, since both are natural language. However, singing conveys a melody, whereas speech does not; in particular, a singer's fundamental frequency should not deviate significantly from an underlying sequence of notes, while a speaker's fundamental frequency is freer to deviate about a mean value. This work presents a novel approach to discrimination between singing and speech that exploits the distribution of such deviations. The melody in singing is typically not known a priori, so the distribution cannot be measured directly. Instead, an approximation to its Fourier transform is proposed that allows the unknown melody to be treated as multiplicative noise. This feature vector is shown to be highly discriminative between speech and singing segments when coupled with a simple maximum likelihood classifier, outperforming prior work on real-world data.
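
The idea in the abstract can be illustrated with a small sketch. This is not the paper's feature extraction, which this record does not describe in detail. It assumes pitch is measured in semitones, that sung notes lie on an equal-tempered grid with an unknown constant tuning offset, and that deviations are Gaussian. Under those assumptions, the empirical characteristic function of the pitch track, sampled at integer cycles per semitone, reduces the unknown note sequence to a unit-magnitude phase factor, so its magnitude depends only on the spread of the deviations. The function names, the synthetic data, and all parameter values below are hypothetical.

```python
import cmath
import math
import random


def hz_to_semitones(f0_hz, ref_hz=440.0):
    """Convert a fundamental frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def char_function_magnitudes(pitches_semitones, max_k=3):
    """Magnitude of the empirical characteristic function of the pitch values,
    sampled at k = 1..max_k cycles per semitone.

    If pitch = integer note + constant offset + deviation, then
    exp(2*pi*j*k*note) == 1, so the notes drop out and the offset only
    contributes a phase that the magnitude discards.
    """
    n = len(pitches_semitones)
    feats = []
    for k in range(1, max_k + 1):
        total = sum(cmath.exp(2j * math.pi * k * p) for p in pitches_semitones)
        feats.append(abs(total) / n)
    return feats


def synth_singing(rng, n_frames=2000, dev_sigma=0.1, tuning_offset=0.3):
    """Synthetic F0 track (Hz): random semitone notes plus small deviations.
    All values are illustrative assumptions, not measurements."""
    track = []
    for _ in range(n_frames):
        note = rng.randint(-12, 12)  # unknown melody, integer semitones
        semis = note + tuning_offset + rng.gauss(0.0, dev_sigma)
        track.append(440.0 * 2.0 ** (semis / 12.0))
    return track


def synth_speech(rng, n_frames=2000, mean_semis=-20.0, dev_sigma=2.0):
    """Synthetic F0 track (Hz): broad deviation about a single mean pitch."""
    return [440.0 * 2.0 ** ((mean_semis + rng.gauss(0.0, dev_sigma)) / 12.0)
            for _ in range(n_frames)]


rng = random.Random(0)
sing_feats = char_function_magnitudes(
    [hz_to_semitones(f) for f in synth_singing(rng)])
speech_feats = char_function_magnitudes(
    [hz_to_semitones(f) for f in synth_speech(rng)])

print("singing-like:", [round(v, 3) for v in sing_feats])
print("speech-like: ", [round(v, 3) for v in speech_feats])
```

On this synthetic data the singing-like magnitudes should be clearly larger than the speech-like ones, because tight deviations around a note grid leave the characteristic function close to 1 while broad deviations drive it toward 0. A classifier, such as the maximum likelihood classifier the abstract mentions, would then be trained on such feature vectors; that step is omitted here.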

Keywords :

Fourier transforms; maximum likelihood estimation; signal classification; speech processing; Fourier transform approximation; feature vector; maximum likelihood classifier; melody; multiplicative noise; singing segments; singing-speech discrimination; speech segments; Approximation methods; Discrete Fourier transforms; Histograms; Speech; Trajectory; Vectors; Audio classification; speech vs. singing discrimination;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Spoken Language Technology Workshop (SLT), 2014 IEEE

Type :

conf

DOI :

10.1109/SLT.2014.7078609

Filename :

7078609

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3585062