Single-channel speech separation using a sparse periodic decomposition

Author

Nakashizuka, Makoto ; Okumura, Hiroyuki ; Iiguni, Youji

Author_Institution

Grad. Sch. of Eng. Sci., Osaka Univ., Toyonaka, Japan

fYear

2009

fDate

24-28 Aug. 2009

Firstpage

218

Lastpage

222

Abstract

In this paper, we propose a single-channel speech separation method based on a sparse decomposition with a periodic signal model. The method approximates a mixture of speech signals with periodic signals that have time-varying amplitudes. The decomposition is performed under a sparsity penalty, so that a segment of the speech mixture is decomposed into periodic signals, each of which is a component of an individual speaker. For separation, the set of periodic signals is clustered with a K-means algorithm. After clustering, each cluster is assigned to its corresponding speaker using codebooks that contain spectral features of the speakers. In experiments, the method is compared with MaxVQ, which performs separation in the frequency-spectrum domain. The results in terms of signal-to-distortion ratio (SDR) show that our method outperforms MaxVQ at a lower computational cost for the assignment of speech components.
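
As an illustration of two notions named in the abstract, the following minimal Python sketch builds a periodic signal with time-varying amplitude and computes a signal-to-distortion ratio. The abstract does not give the exact signal model or SDR definition used in the paper, so both are assumptions: the periodic component is taken to be a repeated template scaled sample by sample, and SDR is taken to be the energy ratio between the reference and the estimation error, in dB. The function names `periodic_component` and `sdr_db` are hypothetical.

```python
import numpy as np


def periodic_component(template, amplitude):
    """Assumed model: x[n] = a[n] * t[n mod p], where p = len(template)."""
    template = np.asarray(template, dtype=float)
    amplitude = np.asarray(amplitude, dtype=float)
    n = np.arange(amplitude.size)
    # Repeat the one-period template and scale it by the amplitude envelope.
    return amplitude * template[n % template.size]


def sdr_db(reference, estimate):
    """Assumed definition: 10 * log10(||s||^2 / ||s - s_hat||^2), in dB."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    distortion = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(distortion ** 2))


if __name__ == "__main__":
    # One period of a toy template, repeated with a rising amplitude envelope.
    source = periodic_component([1.0, 0.5, -1.0, -0.5], np.linspace(0.5, 1.0, 16))
    # A toy "separated" estimate that is 10% too quiet everywhere.
    print(sdr_db(source, 0.9 * source))
```

A higher SDR means the separated signal is closer to the reference source; other definitions of SDR exist and would give different numbers for the same signals.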

Keywords

frequency-domain analysis; pattern clustering; speaker recognition; speech processing; K-means clustering algorithm; SDR; codebook usage; frequency spectrum domain; periodic signal model; signal-to-distortion ratio; single-channel speech separation method; sparse periodic decomposition; sparsity penalty; speaker spectral features; speech mixture segment; time-varying amplitude; Clustering algorithms; Dictionaries; Discrete Fourier transforms; Noise; Speech; Speech processing; Vectors;

fLanguage

English

Publisher

IEEE

Conference_Titel

Signal Processing Conference, 2009 17th European

Conference_Location

Glasgow

Print_ISBN

978-161-7388-76-7

Type

conf

Filename

7077713