DocumentCode :
3333687
Title :
Segment-based speaker adaptation by neural network
Author :
Fukuzawa, Keiji ; Sawai, Hidefumi ; Sugiyama, Masahide
Author_Institution :
ATR Interpreting Telephony Res. Labs., Kyoto, Japan
fYear :
1991
fDate :
30 Sep-1 Oct 1991
Firstpage :
442
Lastpage :
451
Abstract :
The authors propose a segment-to-segment speaker adaptation technique using a feed-forward neural network with a time-shifted sub-connection architecture. Differences in voice individuality exist in both the spectral and temporal domains. It is generally known that frame-based speaker adaptation techniques cannot compensate for speaker individuality in the temporal domain. Segment-based speaker adaptation compensates for both the spectral and the temporal differences. The results of 23 Japanese phoneme recognition experiments using TDNN (time-delay neural network) show that the recognition rate with segment-based adaptation was 83.7%, 22.8% higher than the rate without adaptation.
Keywords :
feedforward neural nets; speech analysis and processing; speech recognition; Japanese phoneme recognition; feed-forward neural network; segment-to-segment speaker adaptation; spectral domains; temporal domains; time shifted sub-connection architecture; Feedforward neural networks; Feedforward systems; Laboratories; Neural networks; Performance evaluation; Research and development; Speech recognition; Telephony;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks for Signal Processing [1991], Proceedings of the 1991 IEEE Workshop
Conference_Location :
Princeton, NJ
Print_ISBN :
0-7803-0118-8
Type :
conf
DOI :
10.1109/NNSP.1991.239497
Filename :
239497