• DocumentCode
    79429
  • Title
    Multi-Channel Linear Prediction-Based Speech Dereverberation With Sparse Priors
  • Author
    Jukic, Ante ; van Waterschoot, Toon ; Gerkmann, Timo ; Doclo, Simon
  • Author_Institution
    Dept. of Med. Phys. & Acoust., Univ. of Oldenburg, Oldenburg, Germany
  • Volume
    23
  • Issue
    9
  • fYear
    2015
  • fDate
    Sept. 2015
  • Firstpage
    1509
  • Lastpage
    1520
  • Abstract
    The quality of speech signals recorded in an enclosure can be severely degraded by room reverberation. In this paper, we focus on a class of blind batch methods for speech dereverberation in a noiseless scenario with a single source, which are based on multi-channel linear prediction in the short-time Fourier transform domain. Dereverberation is performed by maximum-likelihood estimation of the model parameters that are subsequently used to recover the desired speech signal. Contrary to the conventional method, we propose to model the desired speech signal using a general sparse prior that can be represented in a convex form as a maximization over scaled complex Gaussian distributions. The proposed model can be interpreted as a generalization of the commonly used time-varying Gaussian model. Furthermore, we reformulate both the conventional and the proposed method as an optimization problem with an lp-norm cost function, emphasizing the role of sparsity in the considered speech dereverberation methods. Experimental evaluation in different acoustic scenarios shows that the proposed approach results in improved performance compared to the conventional approach in terms of instrumental measures for speech quality.
  • Keywords
    Fourier transforms; Gaussian distribution; maximum likelihood estimation; optimisation; reverberation; speech processing; blind batch methods; general sparse prior; lp-norm cost function; maximization; maximum-likelihood estimation; multichannel linear prediction-based speech dereverberation; optimization problem; room reverberation; scaled complex Gaussian distributions; short-time Fourier transform domain; speech signals; time-varying Gaussian model; Acoustics; Estimation; Microphones; Optimization; Speech; Speech enhancement; Time-frequency analysis; Multi-channel linear prediction; sparse priors; speech dereverberation; speech enhancement
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    2329-9290
  • Type
    jour
  • DOI
    10.1109/TASLP.2015.2438549
  • Filename
    7113816