Regional Information Center for Science and Technology - Iterative self-learning speaker and channel adaptation under various initial conditions

DocumentCode :

294660

Title :

Iterative self-learning speaker and channel adaptation under various initial conditions

Author :

Zhao, Yunxin

Author_Institution :

Dept. of Electr. & Comput. Eng., Illinois Univ., Urbana, IL, USA

Volume :

fYear :

1995

fDate :

9-12 May 1995

Firstpage :

712

Abstract :

A self-learning adaptation technique is presented which handles speaker- and channel-induced spectral variations without enrolment speech. At the acoustic level, the distortion spectral bias is estimated in two steps using unsupervised maximum likelihood estimation: in the first step, the probability distributions of the speech spectral features are assumed uniform for severely mismatched channels; in the second step, the spectral bias is reestimated assuming Gaussian distributions for the spectral features. At the phone unit level, unsupervised sequential adaptation is performed via Bayesian estimation from the online, bias-removed speech data, and iterative adaptation is further performed for dictation applications. Over four 198-sentence test sets, on a continuous speech recognition task with vocabulary size=853 and grammar perplexity=105, the largest increase of average word accuracy is 85.2% from the baseline accuracy of -0.3%, and the maximum average word accuracy is 89.4% from the baseline accuracy of 56.5%.
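The Gaussian reestimation step mentioned in the abstract can be illustrated with a generic sketch. This is not the paper's algorithm: it assumes a simple additive model, y_t = x_t + b, in which the clean features x_t come from a known diagonal-covariance Gaussian mixture, and it estimates the bias b by expectation-maximization without transcriptions. The function name `estimate_bias` and the parameters `means`, `variances`, `weights`, and `n_iter` are hypothetical names chosen for the illustration.

```python
import numpy as np

def estimate_bias(y, means, variances, weights, n_iter=10):
    """Unsupervised ML estimate of an additive spectral bias b.

    Assumed model (illustrative, not taken from the paper):
    y_t = x_t + b, with x_t drawn from a diagonal-covariance Gaussian
    mixture given by means (K, D), variances (K, D), and weights (K,).
    """
    y = np.asarray(y, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = np.asarray(weights, dtype=float)

    b = np.zeros(y.shape[1])  # start from "no bias"
    for _ in range(n_iter):
        # E-step: component posteriors for the bias-removed frames.
        diff = (y - b)[:, None, :] - means[None, :, :]           # (T, K, D)
        log_p = -0.5 * np.sum(
            diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
        )
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)                # numerical stability
        gamma = np.exp(log_p)
        gamma /= gamma.sum(axis=1, keepdims=True)                # (T, K)

        # M-step: precision-weighted average of the residuals y_t - mu_k.
        resid = y[:, None, :] - means[None, :, :]                # (T, K, D)
        num = np.sum(gamma[:, :, None] * resid / variances, axis=(0, 1))
        den = np.sum(gamma[:, :, None] / variances, axis=(0, 1))
        b = num / den
    return b
```

Subtracting the returned vector from every frame gives bias-removed features in the sense used by the abstract. The estimate is unsupervised in that the mixture component of each frame is treated as hidden rather than supplied by a transcript.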

Keywords :

Bayes methods; Gaussian distribution; acoustic signal processing; adaptive signal processing; dictation; iterative methods; maximum likelihood estimation; normal distribution; spectral analysis; speech processing; speech recognition; telecommunication channels; unsupervised learning; Bayesian estimation; Gaussian distributions; acoustic level; average word accuracy; continuous speech recognition; dictation applications; distortion spectral bias estimation; grammar perplexity; initial conditions; iterative self-learning channel adaptation; iterative self-learning speaker adaptation; mismatched channels; online bias-removed speech data; phone unit level; probability distributions; sentence test sets; spectral variations; speech spectral features; uniform distribution; unsupervised maximum likelihood estimation; unsupervised sequential adaptation; vocabulary size; Acoustic distortion; Bayesian methods; Delay estimation; Gaussian distribution; Hidden Markov models; Loudspeakers; Maximum likelihood decoding; Maximum likelihood estimation; Microphones; Probability distribution; Speech recognition; Testing;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on

Conference_Location :

Detroit, MI

ISSN :

1520-6149

Print_ISBN :

0-7803-2431-5

Type :

conf

DOI :

10.1109/ICASSP.1995.479793

Filename :

479793

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=294660