DocumentCode :
1466510
Title :
Time–Frequency Cepstral Features and Heteroscedastic Linear Discriminant Analysis for Language Recognition
Author :
Zhang, Wei-Qiang ; He, Liang ; Deng, Yan ; Liu, Jia ; Johnson, Michael T.
Author_Institution :
Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
Volume :
19
Issue :
2
fYear :
2011
Firstpage :
266
Lastpage :
276
Abstract :
The shifted delta cepstrum (SDC) is a widely used feature extraction method for language recognition (LRE). With a wide temporal context obtained by incorporating multiple frames, SDC outperforms traditional delta and acceleration feature vectors. However, it also introduces correlation into the concatenated feature vector, which increases redundancy and may degrade the performance of backend classifiers. In this paper, we first propose a time-frequency cepstral (TFC) feature vector, which is obtained by performing a temporal discrete cosine transform (DCT) on the cepstrum matrix and selecting the transformed elements in a zigzag scan order. Beyond this, we increase discriminability through a heteroscedastic linear discriminant analysis (HLDA) on the full cepstrum matrix. By utilizing block diagonal matrix constraints, the large HLDA problem is then reduced to several smaller HLDA problems, creating a block diagonal HLDA (BDHLDA) algorithm which has much lower computational complexity. The BDHLDA method is finally extended to the GMM domain, using the simpler TFC features during re-estimation to provide significantly improved computation speed. Experiments on the NIST 2003 and 2007 LRE evaluation corpora show that TFC is more effective than SDC, and that the GMM-based BDHLDA results in a lower equal error rate (EER) and minimum average cost (Cavg) than either the TFC or SDC approaches.
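The TFC construction described in the abstract (a temporal DCT on the cepstrum matrix followed by zigzag selection) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `dct_ii_matrix`, `zigzag_indices`, and `tfc_vector`, the unnormalized DCT-II basis, the JPEG-style zigzag direction, and the number of retained elements `n_keep` are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def dct_ii_matrix(n):
    """Unnormalized DCT-II basis of size n x n (assumed form; scaling may differ in the paper)."""
    k = np.arange(n)[:, None]   # temporal DCT index
    t = np.arange(n)[None, :]   # frame index within the context window
    return np.cos(np.pi * k * (2 * t + 1) / (2 * n))

def zigzag_indices(n_rows, n_cols):
    """(row, col) pairs ordered along anti-diagonals, alternating direction (JPEG-style, assumed)."""
    pairs = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    # Odd diagonals run with increasing row index, even diagonals with decreasing row index.
    return sorted(pairs, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else -p[0]))

def tfc_vector(cep_block, n_keep):
    """Temporal DCT of a (n_ceps x n_frames) cepstrum block, then keep the first n_keep zigzag elements."""
    n_ceps, n_frames = cep_block.shape
    basis = dct_ii_matrix(n_frames)
    # Rows remain cepstral indices; columns become temporal DCT indices.
    tf = cep_block @ basis.T
    order = zigzag_indices(n_ceps, n_frames)[:n_keep]
    return np.array([tf[i, j] for i, j in order])
```

Because the zigzag scan visits low cepstral and low temporal-frequency indices first, truncating it at `n_keep` retains the slowly varying components of the block and discards the rest, which is one plausible reading of how the selection reduces redundancy relative to concatenating frames.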
Keywords :
Gaussian processes; discrete cosine transforms; matrix algebra; speech recognition; DCT; GMM; HLDA; LRE; SDC; TFC feature vector; cepstrum matrix; concatenated feature vector; feature extraction; heteroscedastic linear discriminant analysis; language recognition; shifted delta cepstrum; temporal discrete cosine transform; time-frequency cepstral feature; Acceleration; Cepstral analysis; Cepstrum; Concatenated codes; Discrete cosine transforms; Feature extraction; Linear discriminant analysis; Redundancy; Time frequency analysis; Vectors; Language recognition (LRE); block diagonal heteroscedastic linear discriminant analysis (BDHLDA); time–frequency cepstrum (TFC);
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2010.2047680
Filename :
5444973