DocumentCode :
2731934
Title :
GA-facilitated KNN classifier optimization with varying similarity measures
Author :
Peterson, Michael R. ; Doom, Travis E. ; Raymer, Michael L.
Author_Institution :
Dept. of Comput. Sci. & Eng., Wright State Univ., Dayton, OH, USA
Volume :
3
fYear :
2005
fDate :
2-5 Sept. 2005
Firstpage :
2514
Abstract :
Genetic algorithms (GAs) are powerful tools for k-nearest neighbors (KNN) classifier optimization. While traditional KNN classification techniques typically employ Euclidean distance to assess pattern similarity, other measures may also be utilized. Previous research demonstrates that GAs can improve predictive accuracy by searching for optimal feature weights and offsets for a cosine similarity-based KNN classifier. GA-selected weights determine the classification relevance of each feature, while offsets provide alternative points of reference when assessing angular similarity. Such optimized classifiers perform competitively with other contemporary classification techniques. This paper explores the effectiveness of GA weight and offset optimization for knowledge discovery using KNN classifiers with varying similarity measures. Using Euclidean distance, cosine similarity, and Pearson correlation, untrained classifiers are compared with weight-optimized classifiers for several datasets. Simultaneous weight and offset optimization experiments are also performed for cosine similarity and Pearson correlation. This type of optimization represents a novel technique for maximizing Pearson correlation-based KNN performance. While unoptimized cosine and Pearson classifiers often perform worse than their Euclidean counterparts, optimized cosine and Pearson classifiers typically show equivalent or improved performance over optimized Euclidean classifiers. In some cases, offset optimization provides further improvement for KNN classifiers employing cosine similarity or Pearson correlation.
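The role of the weights and offsets described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact transformation, so the form `weight * (value - offset)` used here is an assumption, and the names `weighted_offset_cosine` and `knn_predict` are hypothetical. The weights and offsets would be supplied by the GA search, which is not shown.

```python
import math
from collections import Counter


def weighted_offset_cosine(a, b, weights, offsets):
    """Cosine similarity after shifting each feature by its offset and
    scaling it by its weight (assumed form: w * (x - o))."""
    u = [w * (x - o) for x, w, o in zip(a, weights, offsets)]
    v = [w * (y - o) for y, w, o in zip(b, weights, offsets)]
    dot = sum(p * q for p, q in zip(u, v))
    norm = math.sqrt(sum(p * p for p in u)) * math.sqrt(sum(q * q for q in v))
    # A zero-length vector has no direction; treat it as dissimilar.
    return dot / norm if norm > 0 else 0.0


def knn_predict(query, train_x, train_y, weights, offsets, k=3):
    """Majority vote among the k training patterns most similar to query."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: weighted_offset_cosine(query, pair[0], weights, offsets),
        reverse=True,  # higher similarity means a nearer neighbor
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In this sketch, a weight of zero removes a feature from the comparison entirely, and a nonzero offset moves the origin from which the angle between two patterns is measured, so the same pair of patterns can look similar or dissimilar depending on the chosen reference point.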
Keywords :
data mining; genetic algorithms; pattern classification; Euclidean distance; GA facilitated KNN classifier optimization; Pearson correlation; classification relevance; cosine similarity; k-nearest neighbors; knowledge discovery; pattern similarity; varying similarity measures; Accuracy; Biological cells; Data analysis; Feature extraction; Gene expression; Genetic mutations; Pattern recognition; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Evolutionary Computation, 2005. The 2005 IEEE Congress on
Print_ISBN :
0-7803-9363-5
Type :
conf
DOI :
10.1109/CEC.2005.1555009
Filename :
1555009