DocumentCode
2017976
Title
DCT-based processing of dynamic features for robust speech recognition
Author
Lin, Wen-Chi ; Fan, Hao-Teng ; Hung, Jeih-weih
Author_Institution
Dept of Electr. Eng., Nat. Chi Nan Univ., Nantou, Taiwan
fYear
2010
fDate
Nov. 29 2010-Dec. 3 2010
Firstpage
12
Lastpage
17
Abstract
In this paper, we explore the properties of cepstral time coefficients (CTC) in speech recognition and propose several methods to refine the CTC construction process. We find that CTC are filtered versions of mel-frequency cepstral coefficients (MFCC), and that the filters used are taken from the discrete cosine transform (DCT) matrix. We modify these DCT-based filters by windowing, removing the DC gain, and varying the filter length. Recognition experiments on the Aurora-2 digit database show that the proposed methods improve recognition accuracy over the original CTC, with a relative error reduction of around 20%.
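The filtering view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dct_filters`, `refine_filters`, `temporal_filter`), the Hamming window, the filter length of 9, the edge padding, and the choice to leave the order-0 filter untouched are all assumptions made for illustration. Each row of a DCT-II matrix is treated as an FIR filter applied along the time axis of an MFCC sequence, then optionally windowed and shifted to have zero DC gain.

```python
import numpy as np

def dct_filters(length, num_filters):
    """Rows of an (unnormalized) DCT-II matrix, each used as a temporal FIR filter."""
    n = np.arange(length)
    k = np.arange(num_filters)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * length))

def refine_filters(filters, window=True, remove_dc=True):
    """Apply a window and remove DC gain (assumed forms of the refinements)."""
    h = filters.astype(float).copy()
    if window:
        # Hamming window is an assumption; the abstract does not name the window.
        h = h * np.hamming(h.shape[1])
    if remove_dc:
        # Row 0 is the averaging filter, so only rows k >= 1 are shifted
        # to make their coefficients sum to zero (assumption).
        h[1:] -= h[1:].mean(axis=1, keepdims=True)
    return h

def temporal_filter(mfcc, filters):
    """Filter each MFCC dimension along time.

    mfcc: array of shape (frames, coeffs)
    returns: array of shape (num_filters, frames, coeffs)
    """
    frames, dims = mfcc.shape
    pad = filters.shape[1] // 2
    # Repeat the first and last frames so the output keeps the input length.
    padded = np.pad(mfcc, ((pad, pad), (0, 0)), mode="edge")
    out = np.empty((filters.shape[0], frames, dims))
    for i, h in enumerate(filters):
        for d in range(dims):
            out[i, :, d] = np.correlate(padded[:, d], h, mode="valid")[:frames]
    return out

# Hypothetical usage: 9-tap filters of orders 0..2 on random stand-in features.
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((50, 13))
ctc_like = temporal_filter(mfcc, refine_filters(dct_filters(9, 3)))
```

Changing the first argument of `dct_filters` varies the filter length, the third refinement the abstract mentions. With DC gain removed, filters of order 1 and above give zero output for a constant feature trajectory.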
Keywords
discrete cosine transforms; matrix algebra; speech recognition; Aurora-2 digit database; CTC construction process; DCT-based processing; cepstral time coefficient; discrete cosine transform matrix; mel-frequency cepstral coefficient; windowing; Discrete cosine transforms; Frequency modulation; Frequency response; Mel frequency cepstral coefficient; Speech; Speech recognition; automatic speech recognition; discrete cosine transform; temporal filter
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
Conference_Location
Tainan
Print_ISBN
978-1-4244-6244-5
Type
conf
DOI
10.1109/ISCSLP.2010.5684893
Filename
5684893