• DocumentCode
    3333687
  • Title
    Segment-based speaker adaptation by neural network
  • Author
    Fukuzawa, Keiji; Sawai, Hidefumi; Sugiyama, Masahide
  • Author_Institution
    ATR Interpreting Telephony Res. Labs., Kyoto, Japan
  • fYear
    1991
  • fDate
    30 Sep-1 Oct 1991
  • Firstpage
    442
  • Lastpage
    451
  • Abstract
    The authors propose a segment-to-segment speaker adaptation technique using a feed-forward neural network with a time-shifted sub-connection architecture. Differences in voice individuality exist in both the spectral and temporal domains. It is generally known that frame-based speaker adaptation techniques cannot compensate for speaker individuality in the temporal domain. Segment-based speaker adaptation compensates for both the spectral and the temporal differences. The results of 23 Japanese phoneme recognition experiments using a TDNN (time-delay neural network) show that the recognition rate with segment-based adaptation was 83.7%, 22.8% higher than the rate without adaptation.
  • Keywords
    feedforward neural nets; speech analysis and processing; speech recognition; Japanese phoneme recognition; feed-forward neural network; segment-to-segment speaker adaptation; spectral domains; temporal domains; time shifted sub-connection architecture; Feedforward neural networks; Feedforward systems; Laboratories; Neural networks; Performance evaluation; Research and development; Speech recognition; Telephony
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks for Signal Processing [1991]., Proceedings of the 1991 IEEE Workshop
  • Conference_Location
    Princeton, NJ
  • Print_ISBN
    0-7803-0118-8
  • Type
    conf
  • DOI
    10.1109/NNSP.1991.239497
  • Filename
    239497