Regional Information Center for Science and Technology

DocumentCode :

2875028

Title :

Fepstrum representation of speech signal

Author :

Tyagi, Vivek ; Wellekens, Christian

Author_Institution :

Inst. Eurecom, Sophia Antipolis

fYear :

2005

fDate :

27 Nov. 2005

Firstpage :

Lastpage :

Abstract :

Pole-zero spectral models in the frequency domain have been well studied and understood over the past several decades. Exploiting the duality between the temporal domain and the frequency domain, Kumaresan et al. (R. Kumaresan, et al., March 1999), (R. Kumaresan, October 1998) have shown that the pole-zero model of the analytic speech signal in the temporal domain leads to its characterization in terms of the positive amplitude modulation (AM) and positive instantaneous frequency (PIF). In this paper, we carefully define AM and frequency modulation (FM) signals in the context of automatic speech recognition (ASR). We show that for a theoretically meaningful estimation of the AM signal, it is necessary to decompose the speech signal into several narrow spectral bands, as opposed to the previous use of the speech modulation spectrum (V. Tyagi, et al., 2003), (M. Athineos and D. Ellis, 2003), (M. Athineos, et al., April 2004), (Q. Zhu and A. Alwan, 2000), (B. E. D. Kingsbury, et al., Aug. 1998), which was derived by decomposing the speech signal into increasingly wider spectral bands (such as critical, Bark or Mel). The estimated AM message signals are downsampled and their lower DCT coefficients are retained as speech features. These features carry information that is complementary to the MFCCs. A Tandem (H. Hermansky, 2003), (D. P. W. Ellis, et al., May 2001) combination of these two features is shown to improve recognition accuracy.
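The processing chain named in the abstract (narrow-band decomposition, AM estimation, downsampling, low-order DCT) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract builds on a pole-zero model of the analytic signal, whereas the sketch below swaps in a plain DFT-based analytic-signal envelope as the AM estimate. The function names, band edges, decimation factor and number of retained coefficients are all hypothetical choices made for illustration.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) forward DFT; adequate for short illustrative frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Naive O(n^2) inverse DFT."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def narrowband_envelope(x, fs, f_lo, f_hi):
    """AM envelope of one narrow band (assumed stand-in for the paper's estimator).

    Keeps only positive-frequency bins with f_lo <= f < f_hi and doubles them,
    so the inverse DFT is the band-limited analytic signal; its magnitude is
    taken as the AM message signal of that band.
    """
    n = len(x)
    X = dft(x)
    Y = [0j] * n
    for k in range(1, n // 2):
        f = k * fs / n
        if f_lo <= f < f_hi:
            Y[k] = 2 * X[k]
    return [abs(z) for z in idft(Y)]


def dct2(x, n_coef):
    """First n_coef coefficients of an unnormalized DCT-II."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
            for k in range(n_coef)]


def am_dct_features(x, fs, band_edges, decim, n_coef):
    """Per-band features: envelope -> plain decimation -> low-order DCT.

    Decimation here is simple slicing with no anti-alias filter, which assumes
    the envelope of each narrow band varies slowly relative to fs / decim.
    """
    feats = []
    for f_lo, f_hi in zip(band_edges[:-1], band_edges[1:]):
        env = narrowband_envelope(x, fs, f_lo, f_hi)
        feats.append(dct2(env[::decim], n_coef))
    return feats
```

For example, `am_dct_features(frame, 8000, [500.0, 625.0, 750.0], 8, 4)` would return one list of four DCT coefficients for each of two 125 Hz wide bands of a hypothetical `frame` sampled at 8 kHz. A practical system would use an FFT and a proper decimation filter instead of the naive transforms shown here.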

Keywords :

amplitude modulation; discrete cosine transforms; frequency modulation; poles and zeros; signal representation; speech processing; DCT coefficients; fepstrum representation; pole-zero spectral models; positive amplitude modulation; positive instantaneous frequency; speech modulation spectrum; speech signal; Automatic speech recognition; Frequency domain analysis; Low pass filters; Narrowband; Pattern classification; Signal analysis; Speech analysis; Wideband

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Automatic Speech Recognition and Understanding, 2005 IEEE Workshop on

Conference_Location :

San Juan

Print_ISBN :

0-7803-9478-X

Electronic_ISBN :

0-7803-9479-8

Type :

conf

DOI :

10.1109/ASRU.2005.1566475

Filename :

1566475

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2875028