Regional Information Center for Science and Technology - Modelling Sound Dynamics Using Deformable Spectrograms: Segmenting the Spectrogram into Smooth Regions

DocumentCode :

2453858

Title :

Modelling Sound Dynamics Using Deformable Spectrograms: Segmenting the Spectrogram into Smooth Regions

Author :

Reyes-Gomez, Manuel J. ; Jojic, Nebojsa ; Ellis, Daniel P W

Author_Institution :

Microsoft Res., Columbia Univ., New York, NY

fYear :

2006

fDate :

Oct. 29 2006-Nov. 1 2006

Firstpage :

Lastpage :

Abstract :

Speech and other natural sounds show high temporal correlation and smooth spectral evolution punctuated by a few irregular and abrupt changes. We model successive spectra as transformations of their immediate predecessors, capturing the evolution of the signal energy through time. The speech production model is used to decompose the log-spectrogram into two additive layers, which separately explain and model the evolution of the harmonic excitation and the formant filtering of speech and similar sounds. We present results on a speech recognition task that suggest that the model discovers a global structure in the dynamics of the signal's energy that helps to alleviate the problems caused by noise interference. The model is also used to segment mixtures of speech into dominant-speaker regions in an unsupervised source separation task.

Keywords :

sound reproduction; speech processing; speech recognition; deformable spectrograms; harmonic excitation; high temporal correlation; log-spectrogram; signal energy; smooth spectral evolution; sound dynamics; speech formant filtering; speech production model; speech recognition task; Acoustic noise; Additives; Deformable models; Energy capture; Filtering; Noise generators; Power harmonic filters; Signal generators; Spectrogram; Speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Signals, Systems and Computers, 2006. ACSSC '06. Fortieth Asilomar Conference on

Conference_Location :

Pacific Grove, CA

ISSN :

1058-6393

Print_ISBN :

1-4244-0784-2

Electronic_ISBN :

1058-6393

Type :

conf

DOI :

10.1109/ACSSC.2006.356581

Filename :

4176510

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2453858