• DocumentCode
    1749214
  • Title

    An extended model for speech segregation

  • Author

    Hu, Guoning ; Wang, DeLiang

  • Author_Institution
    Biophys. Program, Ohio State Univ., Columbus, OH, USA
  • Volume
    2
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    1089
  • Abstract
    Speech segregation is an important task of auditory scene analysis (ASA), in which the speech of a target speaker is separated from interfering signals. Wang and Brown (1999) proposed a multistage neural model for speech segregation, the core of which is a two-layer oscillator network. We extend their model by adding further processes, based on psychoacoustic evidence, to improve performance. These processes include estimating the pitch of the target speech and using the estimated pitch to refine the generation of the target speech stream. Our model is systematically evaluated and compared with the Wang-Brown model, and it yields significantly better performance.
  • Keywords
    neural nets; physiological models; speech processing; speech synthesis; auditory scene analysis; extended model; multistage neural model; pitch estimation; psychoacoustic evidence; speech segregation; two-layer oscillator network; Automatic speech recognition; Image analysis; Oscillators; Psychoacoustic models; Psychology; Speech analysis; Speech enhancement; Speech processing; Speech synthesis; Wideband
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2001. Proceedings. IJCNN '01. International Joint Conference on
  • Conference_Location
    Washington, DC
  • ISSN
    1098-7576
  • Print_ISBN
    0-7803-7044-9
  • Type

    conf

  • DOI
    10.1109/IJCNN.2001.939512
  • Filename
    939512