Fast NMF based approach and improved VQ based approach for speech recognition from mixed sound

Author

Nakano, Shunsuke ; Yamamoto, Koji ; Nakagawa, Sachiko

Author_Institution

Dept. of Comput. Sci. & Eng., Toyohashi Univ. of Technol., Toyohashi, Japan

fYear

2012

fDate

3-6 Dec. 2012

Firstpage

1

Lastpage

4

Abstract

We have considered a speech recognition method for mixed sound, consisting of speech and music, that removes only the music using vector quantization (VQ) and non-negative matrix factorization (NMF). This paper describes a fast calculation technique for NMF-based music removal and an improvement to the VQ-based method. For isolated word recognition using the clean speech model, a word error reduction rate of 46% was obtained compared with the case of not removing the music. Furthermore, a high recognition rate, close to that of clean speech recognition, was obtained at 10 dB. For the multi-condition case, our proposed method reduced the error rate by 50% compared with the multi-condition model.
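
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how NMF-based music removal is commonly set up, not the authors' fast method. It assumes a non-negative magnitude spectrogram `V` (frequency bins by frames) and pre-trained speech and music basis matrices; the names `W_speech`, `W_music`, `estimate_activations`, and `remove_music` are hypothetical, and the Euclidean multiplicative update and soft mask are generic choices rather than details taken from the paper.

```python
import numpy as np

def estimate_activations(V, W, n_iter=200, eps=1e-9, seed=0):
    """Estimate non-negative activations H so that V is approximately W @ H.

    W is kept fixed (assumed pre-trained); only H is updated.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        # Standard multiplicative update for the Euclidean cost (an assumption;
        # the paper may use a different cost or a faster scheme).
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def remove_music(V, W_speech, W_music, n_iter=200, eps=1e-9):
    """Return an estimate of the speech magnitude spectrogram from the mixture V."""
    W = np.hstack([W_speech, W_music])        # joint dictionary [speech | music]
    H = estimate_activations(V, W, n_iter, eps)
    k = W_speech.shape[1]
    speech_part = W_speech @ H[:k]            # speech reconstruction
    music_part = W_music @ H[k:]              # music reconstruction
    # Soft mask keeps the share of each time-frequency cell attributed to speech.
    mask = speech_part / (speech_part + music_part + eps)
    return mask * V

if __name__ == "__main__":
    # Synthetic toy data standing in for real spectrograms and trained bases.
    rng = np.random.default_rng(1)
    W_s, W_m = rng.random((16, 3)), rng.random((16, 2))
    V = W_s @ rng.random((3, 10)) + W_m @ rng.random((2, 10))
    print(remove_music(V, W_s, W_m).shape)
```

In a recognition pipeline, the masked spectrogram would then be converted to acoustic features before decoding; that step is omitted here.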

Keywords

matrix decomposition; speech recognition; vector quantisation; VQ based approach; fast NMF based approach; mixed sound; music; non-negative matrix factorization; vector quantization; Hidden Markov models; Noise measurement; Speech; Speech coding; Vectors

fLanguage

English

Publisher

IEEE

Conference_Titel

Signal & Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific

Conference_Location

Hollywood, CA

Print_ISBN

978-1-4673-4863-8

Type

conf

Filename

6411807