Regional Information Center for Science and Technology (RICeST) - Speaker adaptation applied to HMM and neural networks

DocumentCode :

3520990

Title :

Speaker adaptation applied to HMM and neural networks

Author :

Nakamura, Satoshi ; Shikano, Kiyohiro

Author_Institution :

ATR Interpreting Telephony Res. Lab., Kyoto, Japan

fYear :

1989

fDate :

23-26 May 1989

Firstpage :

Abstract :

The authors propose a speaker adaptation algorithm which does not depend on speech recognition algorithms. The proposed spectral mapping algorithm is based on three ideas: (1) accurate representation of the input vector by separate vector quantization and fuzzy vector quantization, (2) continuous spectral mapping from one speaker to another by fuzzy mapping, and (3) accurate establishment of spectral correspondence based on the fuzzy relationship of the membership function obtained from supervised training. Dynamic spectral features are also utilized. The algorithm is applied to hidden Markov models (HMMs) and neural networks and evaluated using a database of 216 phonetically balanced words and 5240 important Japanese words uttered by three speakers. The speaker-adapted HMM recognition rate for /b,d,g/ is 79.5%. The average recognition rate for the top-three choices is about 91%. The algorithm was applied to neural networks and resulted in almost the same performance. The algorithm was also applied to voice conversion, and a preference score of 65.6% was obtained.
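
Ideas (1) and (2) above can be illustrated with a minimal sketch. It is not the authors' implementation: the membership formula is the standard fuzzy c-means form, and the function names, the `fuzziness` parameter, the toy codebooks, and the assumption that a mapped codebook for the target speaker already exists are all hypothetical choices made for illustration. The record does not specify how the spectral correspondence in idea (3) is trained, so that step is not shown.

```python
import math


def fuzzy_memberships(x, codebook, fuzziness=2.0):
    """Return the membership of vector x in each codeword (values sum to 1).

    Assumption: fuzzy c-means style memberships,
    u_i = d_i^(-e) / sum_j d_j^(-e), with e = 2 / (fuzziness - 1).
    """
    dists = [math.dist(x, c) for c in codebook]
    # If x coincides with a codeword, fall back to a crisp assignment
    # to avoid dividing by zero.
    for i, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(codebook))]
    exponent = 2.0 / (fuzziness - 1.0)
    weights = [d ** (-exponent) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def fuzzy_map(x, source_codebook, mapped_codebook, fuzziness=2.0):
    """Map x from the source speaker's space to the target speaker's space.

    Assumption: mapped_codebook[i] is the target-speaker vector paired with
    source_codebook[i]. The output is the membership-weighted average of the
    mapped codewords, so it varies continuously with x.
    """
    u = fuzzy_memberships(x, source_codebook, fuzziness)
    dim = len(mapped_codebook[0])
    return [sum(u_i * c[k] for u_i, c in zip(u, mapped_codebook))
            for k in range(dim)]


# Toy two-codeword example (made-up numbers, not data from the paper).
source = [[0.0, 0.0], [2.0, 0.0]]
mapped = [[10.0, 10.0], [20.0, 10.0]]
midpoint = fuzzy_map([1.0, 0.0], source, mapped)
on_codeword = fuzzy_map([0.0, 0.0], source, mapped)
```

Because the output is a weighted average rather than a single nearest codeword, an input halfway between two source codewords maps halfway between their target counterparts, which is the sense in which such a mapping is continuous.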

Keywords :

Markov processes; neural nets; speech recognition; HMM; Japanese words; continuous spectral mapping; fuzzy mapping; fuzzy vector quantization; hidden Markov models; membership function; neural networks; phonetically balanced words; separate vector quantization; speaker adaptation algorithm; spectral correspondence; spectral mapping algorithm; supervised training; voice conversion; Auditory system; Histograms; Laboratories; Spatial databases; System testing; Telephony; Vector quantization

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on

Conference_Location :

Glasgow

ISSN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.1989.266370

Filename :

266370

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3520990