Author_Institution :
Dept. of Inf. Technol., L.N. Gumilev Eurasian Nat. Univ., Astana, Kazakhstan
Abstract :
In this work, we investigate the effect of combining mel-frequency cepstral coefficients (MFCCs) with acoustic parameters (APs) in the task of segmenting continuous speech into sonorant and obstruent regions using Hidden Markov Models (HMMs) with Gaussian Mixture Models (GMMs). Along with the influence of the APs on the performance of the resulting model, we analyze the set of acoustic features extracted for each phoneme to see how robust they are to noise. All experiments were conducted on the TIMIT database. The results show that some APs have good separating properties and therefore improve the performance of a system when used with MFCCs, but they are not robust to noise. Other APs lack this separating property but are intrinsically stable in noisy conditions and, as a result, add some robustness to a system.
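As a rough illustration of the kind of pipeline the abstract describes, the sketch below concatenates per-frame MFCC and AP vectors and decodes a two-state (sonorant/obstruent) HMM with Viterbi. It is not the authors' implementation. Everything in it is an assumption made for illustration: the data are synthetic rather than TIMIT, the AP dimension (3) and the transition probability (0.95) are arbitrary, the names `log_gauss` and `viterbi` are hypothetical, and each state uses a single diagonal Gaussian in place of a full GMM.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Per-frame log-likelihood under a diagonal Gaussian (stand-in for a GMM)."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=1)

def viterbi(log_emis, log_trans, log_prior):
    """Most likely state sequence; log_trans[i, j] is log P(state j | state i)."""
    n_frames, n_states = log_emis.shape
    delta = np.empty((n_frames, n_states))
    back = np.zeros((n_frames, n_states), dtype=int)
    delta[0] = log_prior + log_emis[0]
    for t in range(1, n_frames):
        scores = delta[t - 1][:, None] + log_trans      # rows: from, cols: to
        back[t] = np.argmax(scores, axis=0)
        delta[t] = scores[back[t], np.arange(n_states)] + log_emis[t]
    path = np.empty(n_frames, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(n_frames - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path

# Synthetic frames: 0 = obstruent, 1 = sonorant (assumed labelling).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 0], 50)
mfcc = rng.normal(size=(150, 13)) + 1.0 * labels[:, None]   # 13 MFCCs per frame
ap = rng.normal(size=(150, 3)) + 2.0 * labels[:, None]      # 3 hypothetical APs
features = np.hstack([mfcc, ap])                            # combined MFCC + AP vector

# Supervised estimation of one diagonal Gaussian per class.
means = [features[labels == k].mean(axis=0) for k in (0, 1)]
variances = [features[labels == k].var(axis=0) for k in (0, 1)]
log_emis = np.column_stack(
    [log_gauss(features, means[k], variances[k]) for k in (0, 1)]
)

# "Sticky" transitions discourage rapid switching between regions.
log_trans = np.log(np.array([[0.95, 0.05], [0.05, 0.95]]))
log_prior = np.log(np.array([0.5, 0.5]))

path = viterbi(log_emis, log_trans, log_prior)
accuracy = np.mean(path == labels)
```

Dropping the `ap` columns from `features`, or adding noise to them before decoding, would give a toy analogue of the comparisons the abstract mentions, though on synthetic data this says nothing about the reported results.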
Keywords :
Gaussian processes; acoustic signal processing; hidden Markov models; speech synthesis; GMM; Gaussian mixture models; HMM; MFCC; TIMIT database; acoustic features extracted; acoustic parameters; continuous speech segmentation; intrinsic stability; mel-frequency cepstral coefficients; robust segmentation; sonorant; speech signal; Erbium; Noise; Noise measurement; Speech