DocumentCode
2132633
Title
Learning distances to improve phoneme classification
Author
Curtin, Ryan ; Vasiloglou, Nikolaos ; Anderson, David V.
Author_Institution
Sch. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA, USA
fYear
2011
fDate
18-21 Sept. 2011
Firstpage
1
Lastpage
6
Abstract
In this work we aim to learn a Mahalanobis distance to improve the performance of phoneme classification using the standard 39-dimensional MFCC features. To learn and to evaluate the performance of our distance, we use the simple k-nearest-neighbors (k-NN) classifier. Although this classifier exhibits low performance relative to state-of-the-art phoneme classifiers, it can be used to determine a distance metric that is applicable to many other better-performing machine learning methods. We devise a novel optimization method that minimizes the error function of the k-NN classifier with respect to the covariance matrix of the Mahalanobis distance, based on finite-difference stochastic approximation (FDSA) gradient estimates combined with a random perturbation term to avoid local minima. We apply our method to the problem of phoneme classification with the k-NN classifier and show that our learned distance provides performance improvement of up to 8.19% over the standard k-NN classifier, and additionally outperforms other state-of-the-art distance learning methods by approximately 4 percentage points. We also find that the computational complexity of our method, while not optimal, is better than other distance learning methods. The performance improvements for individual phoneme classes are given. The distances learned are applicable to other scale-variant machine learning methods, such as support vector machines, multidimensional scaling, and maximum variance unfolding, as well as others.
Keywords
computational complexity; distance learning; gradient methods; learning (artificial intelligence); pattern classification; support vector machines; FDSA; MFCC features; Mahalanobis distance; distance learning methods; distance metric; finite difference stochastic approximation; gradient estimation; k-NN; k-nearest-neighbors; learning distances; machine learning methods; optimization method; phoneme classification; Accuracy; Computer aided instruction; Learning systems; Machine learning; Measurement; Mel frequency cepstral coefficient; Optimization
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning for Signal Processing (MLSP), 2011 IEEE International Workshop on
Conference_Location
Santander
ISSN
1551-2541
Print_ISBN
978-1-4577-1621-8
Electronic_ISBN
1551-2541
Type
conf
DOI
10.1109/MLSP.2011.6064601
Filename
6064601