DocumentCode :
738857
Title :
Unsupervised optimal phoneme segmentation: theory and experimental evaluation
Author :
Yu Qiao; Dean Luo; Nobuaki Minematsu
Author_Institution :
Shenzhen Key Lab. for CVPR, Shenzhen Inst. of Adv. Technol., Shenzhen, China
Volume :
7
Issue :
7
fYear :
2013
fDate :
9/1/2013 12:00:00 AM
Firstpage :
577
Lastpage :
586
Abstract :
Automatic phoneme segmentation of a speech sequence is a basic problem in speech engineering. This study investigates unsupervised phoneme segmentation that uses no prior information about the linguistic content or acoustic models of the input sequence. The authors formulate unsupervised segmentation as an optimisation problem based on maximum likelihood, and show that the optimal segmentation corresponds to minimising the coding length of the input sequence. Under different assumptions, five objective functions are developed: log determinant, rate distortion (RD), Bayesian log determinant, Mahalanobis distance and Euclidean distance. The authors prove that the optimal segmentations have transformation-invariant properties, introduce a time-constrained agglomerative clustering algorithm to find the optimal segmentations, and propose an efficient implementation of the algorithm using integration functions. Experiments on the TIMIT database compare the five objective functions. The results show that RD achieves the best performance and that the proposed method outperforms previous unsupervised segmentation methods.
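As a rough illustration of the time-constrained agglomerative clustering mentioned in the abstract, the sketch below merges only temporally adjacent segments, always choosing the merge that increases the objective least. It is not the authors' implementation. It assumes the Euclidean distance objective is the within-segment sum of squared deviations from the segment mean, and it reads "integration functions" as cumulative sums that let any segment cost be evaluated in constant time; both are assumptions, as are the hypothetical names `segment_cost` and `agglomerative_segmentation`.

```python
import numpy as np


def segment_cost(cum, cum_sq, start, end):
    """Sum of squared deviations from the mean over frames[start:end].

    Uses prefix sums, so the cost does not depend on segment length
    (assumed reading of the abstract's "integration functions").
    """
    n = end - start
    s = cum[end] - cum[start]          # vector sum of the frames
    sq = cum_sq[end] - cum_sq[start]   # sum of squared norms
    return sq - np.dot(s, s) / n


def agglomerative_segmentation(frames, num_segments):
    """Greedy bottom-up segmentation under a time constraint.

    frames: array of shape (T, d), one feature vector per frame.
    Returns boundary indices [0, ..., T]; segment k covers
    frames[bounds[k]:bounds[k + 1]].
    """
    frames = np.asarray(frames, dtype=float)
    T, d = frames.shape
    # Prefix sums with a leading zero row so cum[end] - cum[start] works.
    cum = np.vstack([np.zeros(d), np.cumsum(frames, axis=0)])
    cum_sq = np.concatenate([[0.0], np.cumsum(np.sum(frames ** 2, axis=1))])

    bounds = list(range(T + 1))  # start with one segment per frame
    while len(bounds) - 1 > num_segments:
        best_i, best_inc = None, np.inf
        # Time constraint: only adjacent segments are merge candidates.
        for i in range(1, len(bounds) - 1):
            a, b, c = bounds[i - 1], bounds[i], bounds[i + 1]
            inc = (segment_cost(cum, cum_sq, a, c)
                   - segment_cost(cum, cum_sq, a, b)
                   - segment_cost(cum, cum_sq, b, c))
            if inc < best_inc:
                best_i, best_inc = i, inc
        del bounds[best_i]  # removing a boundary merges its two neighbours
    return bounds
```

Calling `agglomerative_segmentation(features, K)` on a `(T, d)` feature matrix returns `K + 1` boundary indices. The other four objectives named in the abstract would replace `segment_cost`; their exact forms are not given here, so they are not sketched.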
Keywords :
Bayes methods; rate distortion theory; speech recognition; Bayesian log determinant objectives; Euclidean distance objectives; Mahalanobis distance; TIMIT database; acoustic models; automatic phoneme segmentation; coding length; linguistic contents; objective functions; rate distortion; speech engineering; speech sequence; time-constrained agglomerative clustering algorithm; unsupervised optimal phoneme segmentation
fLanguage :
English
Journal_Title :
Signal Processing, IET
Publisher :
IET
ISSN :
1751-9675
Type :
jour
DOI :
10.1049/iet-spr.2012.0191
Filename :
6606963