DocumentCode
3333687
Title
Segment-based speaker adaptation by neural network
Author
Fukuzawa, Keiji ; Sawai, Hidefumi ; Sugiyama, Masahide
Author_Institution
ATR Interpreting Telephony Res. Labs., Kyoto, Japan
fYear
1991
fDate
30 Sep-1 Oct 1991
Firstpage
442
Lastpage
451
Abstract
The authors propose a segment-to-segment speaker adaptation technique using a feed-forward neural network with a time-shifted sub-connection architecture. Differences in voice individuality exist in both the spectral and temporal domains. It is generally known that frame-based speaker adaptation techniques cannot compensate for speaker individuality in the temporal domain. Segment-based speaker adaptation compensates for both the spectral and the temporal differences. The results of 23 Japanese phoneme recognition experiments using a TDNN (time-delay neural network) show that the recognition rate with segment-based adaptation was 83.7%, 22.8% higher than the rate without adaptation.
Keywords
feedforward neural nets; speech analysis and processing; speech recognition; Japanese phoneme recognition; feed-forward neural network; segment-to-segment speaker adaptation; spectral domains; temporal domains; time shifted sub-connection architecture; Feedforward neural networks; Feedforward systems; Laboratories; Neural networks; Performance evaluation; Research and development; Speech recognition; Telephony
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks for Signal Processing [1991]., Proceedings of the 1991 IEEE Workshop
Conference_Location
Princeton, NJ
Print_ISBN
0-7803-0118-8
Type
conf
DOI
10.1109/NNSP.1991.239497
Filename
239497