Factor Analyzed Subspace Modeling and Selection

Author

Chien, Jen-Tzung ; Ting, Chuan-Wei

Author_Institution

Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan

Volume

16

Issue

1

fYear

2008

Firstpage

239

Lastpage

248

Abstract

We present a novel subspace modeling and selection approach for noisy speech recognition. In subspace modeling, we develop a factor analysis (FA) representation of noisy speech, which generalizes the signal subspace (SS) representation. Using FA, noisy speech is represented by the extracted common factors, the factor loading matrix, and the specific factors. The observation space of noisy speech is accordingly partitioned into a principal subspace, containing speech and noise, and a minor subspace, containing residual speech and residual noise. We minimize the energy of speech distortion in both the principal subspace and the minor subspace so as to estimate clean speech together with its residual information. We also address optimal subspace selection by solving hypothesis testing problems. We test the equivalence of eigenvalues in the minor subspace to select the subspace dimension. In keeping with the FA model, we also test the hypothesis that the specific factors (residual speech) are uncorrelated. The subspace is then partitioned at a consistent confidence level for rejecting the null hypothesis. Optimal solutions are obtained through likelihood ratio tests, whose test statistics approximately follow chi-square distributions. In experiments on the Aurora2 database, the FA model significantly outperforms the SS model for speech enhancement and recognition. Subspace selection via testing the correlation of residual speech achieves higher recognition accuracy than selection via testing the equivalence of eigenvalues in the minor subspace.
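
As a rough illustration of the eigenvalue-equivalence test mentioned in the abstract, the sketch below uses the classical textbook likelihood ratio statistic for equality of the trailing covariance eigenvalues, with its usual chi-square approximation. It is not taken from the paper and may differ from the authors' statistic. The function names, the frame count `n`, the significance level `alpha`, and the Wilson-Hilferty approximation for the chi-square quantile are all assumptions made for this example.

```python
import math
from statistics import NormalDist


def equal_tail_statistic(eigvals, k, n):
    """Classical LRT statistic for H0: all eigenvalues after the k largest are equal.

    eigvals: positive covariance eigenvalues; n: number of observed frames (assumed).
    Returns the statistic and its approximate chi-square degrees of freedom.
    """
    tail = sorted(eigvals, reverse=True)[k:]
    q = len(tail)
    arith_mean = sum(tail) / q
    log_geo_mean = sum(math.log(v) for v in tail) / q
    # n * q * log(arithmetic mean / geometric mean); zero when the tail is flat
    stat = n * q * (math.log(arith_mean) - log_geo_mean)
    dof = (q - 1) * (q + 2) // 2
    return stat, dof


def chi2_quantile(p, dof):
    """Wilson-Hilferty approximation to the chi-square quantile (not exact)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * math.sqrt(c)) ** 3


def select_dimension(eigvals, n, alpha=0.05):
    """Smallest principal-subspace dimension k whose remaining eigenvalues
    are not rejected as equal at significance level alpha."""
    p = len(eigvals)
    for k in range(p - 1):
        stat, dof = equal_tail_statistic(eigvals, k, n)
        if stat <= chi2_quantile(1.0 - alpha, dof):
            return k
    return p - 1
```

With hypothetical eigenvalues `[9.0, 4.0, 1.0, 1.0, 1.0]` and `n=200`, `select_dimension` keeps two principal directions, since the three trailing eigenvalues are exactly equal and the test on them cannot reject.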

Keywords

matrix algebra; signal representation; speech recognition; statistical distributions; chi-square distribution; factor analysis representation; factor loading matrix; likelihood ratio test; noisy speech recognition; signal subspace representation; speech distortion minimization; subspace modeling; subspace selection; Distortion; Eigenvalues and eigenfunctions; Hidden Markov models; Signal analysis; Speech analysis; Speech enhancement; Speech processing; Testing; Working environment noise; Factor analysis (FA); signal subspace (SS)

fLanguage

English

Journal_Title

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1558-7916

Type

jour

DOI

10.1109/TASL.2007.910790

Filename

4381233