Regional Information Center for Science and Technology - Power function-based power distribution normalization algorithm for robust speech recognition

DocumentCode :

2970430

Title :

Power function-based power distribution normalization algorithm for robust speech recognition

Author :

Kim, Chanwoo ; Stern, Richard M.

Author_Institution :

Dept. of Electr. & Comput. Eng. & Language Technol. Inst., Carnegie Mellon Univ., Pittsburgh, PA, USA

fYear :

2009

fDate :

Nov. 13 2009-Dec. 17 2009

Firstpage :

188

Lastpage :

193

Abstract :

A novel algorithm that normalizes the distribution of spectral power coefficients is described in this paper. The algorithm, called power-function-based power distribution normalization (PPDN), is based on the observation that the ratio of arithmetic mean to geometric mean changes as speech is corrupted by noise; a parametric power function is used to equalize this ratio. We also observe that a longer "medium-duration" observation window (of approximately 100 ms) is better suited to parameter estimation for noise compensation than the briefer window typically used for automatic speech recognition. We also describe the implementation of an online version of PPDN based on exponentially weighted temporal averaging. Experimental results show that PPDN provides comparable or slightly better results than state-of-the-art algorithms such as vector Taylor series for speech recognition while requiring much less computation. Hence, the algorithm is suitable both for real-time speech communication and as a real-time preprocessing stage for speech recognition systems.
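
Illustrative note: the central observation in the abstract, that the ratio of arithmetic mean to geometric mean changes when noise is added and that a power function can equalize it, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the power values, the noise level, the plain exponent used as the "power function", and the helper names am_gm_ratio and matching_exponent are all hypothetical, since the abstract does not give the paper's parametric form.

```python
import math


def am_gm_ratio(powers):
    """Ratio of arithmetic mean to geometric mean of positive power values."""
    n = len(powers)
    arithmetic_mean = sum(powers) / n
    # Geometric mean computed in the log domain for numerical stability.
    geometric_mean = math.exp(sum(math.log(p) for p in powers) / n)
    return arithmetic_mean / geometric_mean


def matching_exponent(powers, target_ratio, lo=1.0, hi=4.0, iterations=60):
    """Bisection search for an exponent a such that the AM/GM ratio of
    p**a is close to target_ratio.

    Assumption: a plain exponent stands in for the paper's parametric
    power function. The ratio grows with a for a > 0, so bisection on
    [lo, hi] is valid when the target lies between the two end ratios.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if am_gm_ratio([p ** mid for p in powers]) < target_ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical data: "clean" powers with a wide dynamic range, plus a
# constant additive noise power that fills in the low-power values.
clean = [0.01, 0.1, 1.0, 10.0, 100.0]
noise_power = 1.0
noisy = [p + noise_power for p in clean]

clean_ratio = am_gm_ratio(clean)
noisy_ratio = am_gm_ratio(noisy)      # smaller: noise compresses the range

# Find the exponent that restores the clean ratio on the noisy values.
exponent = matching_exponent(noisy, clean_ratio)
equalized_ratio = am_gm_ratio([p ** exponent for p in noisy])

print(clean_ratio, noisy_ratio, exponent, equalized_ratio)
```

In this toy setting the added noise lowers the ratio, and an exponent greater than 1 brings it back to the clean value. A real system would estimate the ratio per frequency channel over a window of frames (the abstract suggests roughly 100 ms) rather than over five fixed numbers.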

Keywords :

algorithm theory; real-time systems; speech recognition; Taylor series speech recognition; automatic speech recognition; medium duration observation window; normalization algorithm; parameter estimation noise compensation; parametric power function; power function based power distribution; ratio arithmetic mean; real-time preprocessing stage; robust speech recognition; spectral power coefficients; weighted temporal averaging; Arithmetic; Automatic speech recognition; Noise robustness; Parameter estimation; Power distribution; Real time systems; Signal to noise ratio; Speech enhancement; Speech recognition; Taylor series; Power distribution; equalization; medium-duration window; ratio of arithmetic mean to geometric mean;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on

Conference_Location :

Merano

Print_ISBN :

978-1-4244-5478-5

Electronic_ISBN :

978-1-4244-5479-2

Type :

conf

DOI :

10.1109/ASRU.2009.5373233

Filename :

5373233

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2970430