Regional Information Center for Science and Technology - HMM-supervised classification of the NMF components for robust speech recognition

DocumentCode :

3109501

Title :

HMM-supervised classification of the NMF components for robust speech recognition

Author :

Nengheng Zheng ; Xia Li ; Yi Cai

Author_Institution :

Shenzhen Key Lab. of Telecommun. & Inf. Process., Shenzhen Univ., Shenzhen, China

fYear :

2012

fDate :

9-12 Dec. 2012

Firstpage :

Lastpage :

Abstract :

This paper presents a nonnegative matrix factorization (NMF)-based source separation algorithm for robust speech recognition under music interference. NMF decomposes the mixture signal into a set of basis vectors and corresponding gain vectors, each belonging to either speech or music. Source separation is achieved by classifying the NMF components, i.e. the basis vectors and their corresponding gain vectors, into their respective classes. Hidden Markov models (HMMs) are incorporated to supervise the classification. More specifically, the likelihood score output by the Viterbi search, i.e. the probability of the input speech given the recognized word models, is adopted as the classification criterion, so that the separated speech consists of those NMF components that contribute positively to the Viterbi search score. As a result, the recognition output after the separation processing is the most confident one. Automatic speech recognition experiments demonstrate that the proposed source separation algorithm significantly improves the robustness of the recognition system under music interference.
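
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes Lee-Seung multiplicative updates with a Euclidean cost for the NMF step and a leave-one-out rule for deciding which components contribute positively to a score. The names `nmf`, `select_speech_components`, and `score_fn` are hypothetical; `score_fn` stands in for the recognizer's Viterbi likelihood score, which is not implemented here.

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (freq x time) as W @ H.

    Uses Lee-Seung multiplicative updates for the Euclidean cost.
    The cost function is an assumption; the abstract does not name one.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps  # basis vectors (columns)
    H = rng.random((rank, V.shape[1])) + eps  # gain vectors (rows)
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def select_speech_components(W, H, score_fn):
    """Return indices of components whose removal lowers score_fn.

    score_fn is a hypothetical callable mapping a reconstructed
    spectrogram to a scalar score (a stand-in for the Viterbi
    likelihood of the recognized word models). The leave-one-out
    rule is an illustrative assumption, not taken from the paper.
    """
    n = W.shape[1]
    full_score = score_fn(W @ H)
    kept = []
    for k in range(n):
        others = [j for j in range(n) if j != k]
        score_without_k = score_fn(W[:, others] @ H[others, :])
        # Positive contribution: the score drops when k is left out.
        if full_score - score_without_k > 0:
            kept.append(k)
    return kept


def reconstruct(W, H, kept):
    """Rebuild the spectrogram from the selected components only."""
    return W[:, kept] @ H[kept, :]
```

In use, `V` would be a magnitude spectrogram of the speech-plus-music mixture, and `score_fn` would wrap a call to an HMM recognizer; both are left to the reader because the record does not describe them.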

Keywords :

hidden Markov models; matrix decomposition; maximum likelihood estimation; signal classification; source separation; speech recognition; vectors; HMM-supervised classification; NMF component; NMF-based source separation algorithm; Viterbi search; basis vector; classification criterion; gain vector; hidden Markov model; input speech probability; likelihood score output; music interference; nonnegative matrix factorization; robust speech recognition; separation processing; Acoustics; Hidden Markov models; Source separation; Speech; Speech recognition; Support vector machine classification; Viterbi algorithm; Speech separation; nonnegative matrix factorization; speech and music; speech recognition; supervised classification;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Speech Database and Assessments (Oriental COCOSDA), 2012 International Conference on

Conference_Location :

Macau

Print_ISBN :

978-1-4673-2811-1

Electronic_ISBN :

978-1-4673-2812-8

Type :

conf

DOI :

10.1109/ICSDA.2012.6422467

Filename :

6422467

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3109501