Title :
Towards building Indonesian viseme: A clustering-based approach
Author :
Arifin ; Muljono ; Sumpeno, Surya ; Hariadi, Mochamad
Author_Institution :
Dept. of Inf. Technol., Univ. Dian Nuswantoro, Semarang, Indonesia
Abstract :
Lip animation plays an important role in facial animation. Realistic lip animation requires synchronization of visemes (visual phonemes) with the spoken phonemes. This research aims to build an Indonesian viseme set by configuring viseme classes based on the result of clustering visual speech image data. The research used Subspace LDA, which is a combination of Principal Components Analysis (PCA) and Linear Discriminant Analysis (LDA), as the feature extraction method. The Subspace LDA method is expected to produce an optimal dimension reduction. The clustering process used the K-Means algorithm to split the data into a number of clusters. The quality of the clustering result is measured using the Sum of Squared Error (SSE) and the ratio of Between-Class Variation (BCV) to Within-Class Variation (WCV). From these measurements, we found that the best clustering quality occurs at k=9. The result of this research is an Indonesian viseme set consisting of 10 classes (9 classes from the clustering result and one neutral class). In future work, the result of this research can be used as a reference for an Indonesian viseme structure that is defined based on linguistic knowledge.
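The evaluation step described above (run K-Means for several values of k, then compare SSE and the BCV/WCV ratio) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the formulas, so the sketch assumes WCV equals the SSE and BCV is the size-weighted squared distance of each cluster mean from the overall mean. The data are synthetic stand-ins for the Subspace LDA features, and the names `kmeans` and `clustering_quality` are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-Means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members; leave empty clusters in place.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


def clustering_quality(X, labels):
    """Return (sse, bcv, bcv/wcv), with WCV assumed equal to the SSE."""
    overall_mean = X.mean(axis=0)
    sse = 0.0
    bcv = 0.0
    for j in np.unique(labels):
        members = X[labels == j]
        centre = members.mean(axis=0)
        # Within-cluster part: squared distances of members to their own mean.
        sse += ((members - centre) ** 2).sum()
        # Between-cluster part (assumed definition): weighted distance to the overall mean.
        bcv += len(members) * ((centre - overall_mean) ** 2).sum()
    ratio = bcv / sse if sse > 0 else float("inf")
    return sse, bcv, ratio


# Synthetic 2-D features standing in for the Subspace LDA output (assumption).
rng = np.random.default_rng(42)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centres])

results = {}
for k in range(2, 7):
    labels, _ = kmeans(X, k)
    results[k] = clustering_quality(X, labels)
    sse, bcv, ratio = results[k]
    print(f"k={k}  SSE={sse:.2f}  BCV/WCV={ratio:.2f}")
```

Under these assumed definitions, SSE and BCV always sum to the total variation of the data, so a lower SSE and a higher BCV/WCV ratio describe the same improvement from two angles. Choosing k then means looking for the point where further increases in k stop giving a meaningful gain.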
Keywords :
computer animation; face recognition; feature extraction; natural language processing; pattern clustering; principal component analysis; speech processing; BCV; Indonesian viseme structure; SSE; WCV; between-class variation; clustering process; clustering-based approach; facial animation; feature extraction method; k-means algorithms; linear discriminant analysis; linguistic knowledge; lips animation; optimal dimension reduction; principal components analysis; spoken phonemes; subspace LDA method; sum-of-squared error; viseme synchronization; visual phoneme; visual speech images data; within-class variation; Animation; Covariance matrices; Feature extraction; Lips; Principal component analysis; Speech; Visualization; K-Means; Sum of Squared Error; clustering; feature extraction; subspace LDA; viseme;
Conference_Title :
Computational Intelligence and Cybernetics (CYBERNETICSCOM), 2013 IEEE International Conference on
Conference_Location :
Yogyakarta
DOI :
10.1109/CyberneticsCom.2013.6865781