DocumentCode
454661
Title
Automatic Speech Segmentation Combining an HMM-Based Approach and Recurrence Trend Analysis
Author
Yan, Runqiang ; Zu, Yiqing ; Zhu, Yisheng
Volume
1
fYear
2006
fDate
14-19 May 2006
Abstract
To improve the segmentation accuracy obtained from the standard HMM-based approach, this paper presents a nonlinear dynamical method for phoneme boundary adjustment that detects and measures the nonstationarity of speech dynamics. The dynamical systems underlying different phones exhibit distinct invariant attractor structures in phase space. Therefore, when adjacent phones are analyzed, there may exist a point at which the underlying dynamics change. In this study, the time-dependent recurrence trend (TDRT) is proposed to describe the local degree of change in the nonstationarity of speech dynamics as time progresses, and the largest paling slope in the windowed recurrence plots (RPs) is identified as the phoneme boundary. Experimental results show that TDRT correction yields a 9.41% increase in agreement within 20 ms on the TIMIT database.
Keywords
Biomedical engineering; Biomedical measurements; Databases; Hidden Markov models; Measurement standards; Neural networks; Speech analysis; Speech processing; Speech synthesis; Visualization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location
Toulouse
ISSN
1520-6149
Print_ISBN
1-4244-0469-X
Type
conf
DOI
10.1109/ICASSP.2006.1660141
Filename
1660141