• DocumentCode
    1910006
  • Title
    Combining Machine Learning and Computational Auditory Scene Analysis to Separate Monaural Speech of Two-Talker
  • Author
    Li, Peng ; Guan, Yong ; Liu, Wenju ; Xu, Bo
  • Author_Institution
    Digital Media Content Technol. Res. Center, Chinese Acad. of Sci., Beijing
  • fYear
    2007
  • fDate
    Aug. 30 2007-Sept. 1 2007
  • Firstpage
    280
  • Lastpage
    284
  • Abstract
    Monaural speech separation is one of the most difficult problems in speech signal processing. This paper proposes a new method that combines machine learning with computational auditory scene analysis (CASA) to separate the monaural speech of two talkers. Machine learning is used to learn the grouping cues from isolated clean data of single speakers. A factorial-max vector quantization model (MAXVQ) then infers the masking signals needed for resynthesis, which accomplishes the separation. Experimental results on a standard corpus show that the proposed method separates the mixed speech of two speakers very well and clearly improves the SNR of the separated speech.
  • Keywords
    learning (artificial intelligence); speaker recognition; speech processing; vector quantisation; computational auditory scene analysis; factorial-max vector quantization model; machine learning; monaural speech separation; speech signal processing; Automation; Humans; Image analysis; Machine learning; Pattern recognition; Prototypes; Speech analysis; Speech coding; Speech processing; Timbre;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-1610-3
  • Electronic_ISBN
    978-1-4244-1611-0
  • Type
    conf
  • DOI
    10.1109/NLPKE.2007.4368044
  • Filename
    4368044