Regional Information Center for Science and Technology - Segmental Modeling for Audio Segmentation

DocumentCode :

2701893

Title :

Segmental Modeling for Audio Segmentation

Author :

Aronowitz, Hagai

Author_Institution :

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

Volume :

fYear :

2007

fDate :

15-20 April 2007

Abstract :

Trainable speech/non-speech segmentation and music detection algorithms usually consist of a frame-based scoring phase combined with a smoothing phase. This paper suggests a framework in which both phases are explicitly unified in a segment-based classifier. We suggest a novel segment-based generative model in which audio segments are modeled as supervectors and each class (speech, silence, music) is modeled by a distribution over the supervector space. Segmental speech classes can then be modeled by generative models such as GMMs or can be classified by SVMs. Our suggested framework leads to a significant reduction in error rate.
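
The segment-level idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and everything concrete in it is an assumption made for illustration: the two-component toy background model `TOY_UBM`, the 2-D features, the relevance factor, the use of MAP-adapted means as the supervector, and a single diagonal Gaussian per class standing in for the GMM or SVM mentioned above. The sketch maps each variable-length segment to one fixed-length supervector and classifies that vector directly, so no separate frame-smoothing step is needed.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def supervector(frames, ubm, relevance=1.0):
    """Stack MAP-adapted background-model means into one fixed-length vector.

    `ubm` is (weights, means, variances); `relevance` is an assumed prior weight.
    """
    weights, means, variances = ubm
    n_comp, dim = len(means), len(means[0])
    counts = [0.0] * n_comp
    sums = [[0.0] * dim for _ in range(n_comp)]
    for x in frames:
        # Posterior probability of each component for this frame.
        logp = [math.log(w) + log_gauss_diag(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logp)
        post = [math.exp(lp - top) for lp in logp]
        total = sum(post)
        for c in range(n_comp):
            gamma = post[c] / total
            counts[c] += gamma
            for d in range(dim):
                sums[c][d] += gamma * x[d]
    # MAP mean update: (sum of weighted frames + r * prior mean) / (count + r).
    return [(sums[c][d] + relevance * means[c][d]) / (counts[c] + relevance)
            for c in range(n_comp) for d in range(dim)]

def fit_class_model(supervectors, var_floor=1e-2):
    """Fit one diagonal Gaussian over the supervectors of a class (a simplification)."""
    n, dim = len(supervectors), len(supervectors[0])
    mean = [sum(sv[d] for sv in supervectors) / n for d in range(dim)]
    var = [max(sum((sv[d] - mean[d]) ** 2 for sv in supervectors) / n, var_floor)
           for d in range(dim)]
    return mean, var

def classify(frames, ubm, class_models):
    """Label a whole segment by the class whose model best explains its supervector."""
    sv = supervector(frames, ubm)
    return max(class_models, key=lambda name: log_gauss_diag(sv, *class_models[name]))

# Hypothetical toy data: 2-D "features", two background components.
TOY_UBM = ([0.5, 0.5], [[0.0, 0.0], [4.0, 4.0]], [[1.0, 1.0], [1.0, 1.0]])
TRAIN = {
    "silence": [[[-1.1, -0.9], [-0.9, -1.2], [-1.0, -1.0]],
                [[-1.2, -1.0], [-0.8, -0.9], [-1.0, -1.1]]],
    "speech": [[[5.1, 4.9], [4.9, 5.2], [5.0, 5.0]],
               [[5.2, 5.0], [4.8, 4.9], [5.0, 5.1]]],
}
models = {name: fit_class_model([supervector(seg, TOY_UBM) for seg in segs])
          for name, segs in TRAIN.items()}

print(classify([[-1.0, -0.9], [-1.1, -1.0]], TOY_UBM, models))
print(classify([[5.0, 5.1], [4.9, 5.0]], TOY_UBM, models))
```

In a realistic setting the frames would be cepstral features and the per-class model could be replaced by a GMM or an SVM trained on the same supervectors; the toy values here only show the data flow.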

Keywords :

Gaussian processes; audio signal processing; smoothing methods; speech processing; support vector machines; GMM; SVM; audio segmentation; detection algorithms; nonspeech segmentation; segment based generative model; segmental modeling; smoothing phase; Broadcasting; Detection algorithms; Error analysis; Hidden Markov models; Mel frequency cepstral coefficient; Natural languages; Smoothing methods; Speaker recognition; Speech; Testing; GMM supervectors; Speech segmentation; music detection; segmental modeling; voice activity detection;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on

Conference_Location :

Honolulu, HI

ISSN :

1520-6149

Print_ISBN :

1-4244-0727-3

Type :

conf

DOI :

10.1109/ICASSP.2007.366932

Filename :

4218120

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2701893