Catalog-based single-channel speech-music separation for automatic speech recognition

Author

Cemil Demir;Mehmet Uğur Doğan;A. Taylan Cemgil;Murat Saraçlar

Author_Institution

TÜ

fYear

2012

fDate

4/1/2012 12:00:00 AM

Firstpage

1

Lastpage

4

Abstract

In this study, single-channel source separation is carried out to separate speech from background music, which degrades speech recognition performance, especially in broadcast news transcription systems. In the proposed method, assuming that a catalog of the background music is known, we develop a generative model for the superposed speech and music spectrograms. We represent the speech spectrogram by a Non-negative Matrix Factorization (NMF) model and the music spectrogram by a conditional mixture model. In this model, we assume that the background music is generated by repeating a jingle from the music catalog and changing its gain. We compare the performance of our system with that of the traditional NMF model and address the gain estimation problem of the catalog-based method. We show that the traditional NMF method outperforms the catalog-based method. However, using a Gamma Markov Chain (GMC) in the gain estimation improves the separation performance and yields better separation than the NMF model.
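The idea of combining an NMF speech model with a gain-scaled catalog jingle can be illustrated with a minimal sketch. It is not the authors' conditional mixture model and it omits the GMC prior on the gains. It assumes, for illustration only, that the jingle magnitude spectrogram `M` is already time-aligned with the mixture `V`, so that only the speech factors `W`, `H` and a per-frame gain `g` need to be estimated. The function names, the rank, and the use of KL-divergence multiplicative updates are assumptions, not details taken from the paper.

```python
import numpy as np


def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence between two non-negative spectrograms."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))


def separate(V, M, rank=4, n_iter=100, seed=0, eps=1e-12):
    """Fit V ~ W @ H + M * g with multiplicative updates (hypothetical helper).

    V : (F, T) mixture magnitude spectrogram
    M : (F, T) known jingle spectrogram, assumed already aligned with V
    Returns speech factors W, H, per-frame gains g, and the KL history.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, rank)) + 0.1   # speech spectral templates
    H = rng.random((rank, T)) + 0.1   # speech activations
    g = np.ones(T)                    # one music gain per frame
    history = []
    for _ in range(n_iter):
        # Update speech templates.
        V_hat = W @ H + M * g
        W *= ((V / (V_hat + eps)) @ H.T) / (H.sum(axis=1) + eps)
        # Update speech activations.
        V_hat = W @ H + M * g
        H *= (W.T @ (V / (V_hat + eps))) / (W.sum(axis=0)[:, None] + eps)
        # Update per-frame music gains.
        V_hat = W @ H + M * g
        g *= (M * (V / (V_hat + eps))).sum(axis=0) / (M.sum(axis=0) + eps)
        history.append(kl_divergence(V, W @ H + M * g))
    return W, H, g, history


def speech_estimate(V, W, H, M, g, eps=1e-12):
    """Soft-mask the mixture with the speech share of the fitted model."""
    speech = W @ H
    return V * speech / (speech + M * g + eps)
```

Given a mixture `V` and an aligned jingle `M`, calling `W, H, g, history = separate(V, M)` followed by `speech_estimate(V, W, H, M, g)` returns a masked speech spectrogram. In this sketch each frame's gain is estimated independently of its neighbours; the abstract reports that the gain estimation step is where a Gamma Markov Chain improved the separation.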

Publisher

IEEE

Conference_Titel

Signal Processing and Communications Applications Conference (SIU), 2012 20th

Print_ISBN

978-1-4673-0055-1

Type

conf

DOI

10.1109/SIU.2012.6204782

Filename

6204782