Regional Information Center for Science and Technology - An Optimization Algorithm of K-NN Classification

DocumentCode :

2892885

Title :

An Optimization Algorithm of K-NN Classification

Author :

Zhan, Yan ; Chen, Hao ; Zhang, Guo-chun

Author_Institution :

Machine Learning Center, Hebei Univ.

fYear :

2006

fDate :

13-16 Aug. 2006

Firstpage :

2246

Lastpage :

2251

Abstract :

The K-nearest neighbor (K-NN) algorithm is a classification method based on statistical theory. It usually uses the Euclidean distance as the similarity measure, so the distance between two instances is computed over all attributes of the instance, regardless of how relevant each attribute is. One approach to overcoming this problem is to weight each attribute differently when calculating the distance between two instances, so that feature weight learning determines how much each feature contributes. Another issue is that the value of K still has to be chosen by testing different values. To avoid searching for K in nearest-neighbor experiments and to improve accuracy and efficiency, this paper puts forward a validity function for judging clustering when the classification of the data set is known, and applies it to classification problems such as K-NN in combination with supervised classification. As a result, selecting only the nearest neighbor (1-NN) both achieves more precise classification and avoids the search for K, which greatly reduces query complexity and improves the efficiency of the nearest-neighbor algorithm. In addition, the nearest-neighbor algorithm is one of the most basic case-based reasoning (CBR) problems, and case-base maintenance (CBM) is an important issue in obtaining efficient case bases in a CBR system. This paper proposes a new approach to selecting representative cases based on the generalization capability of cases. With this method, most redundant cases, which affect solution accuracy, can be deleted, improving indexing efficiency when searching for near neighbors.
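
The attribute-weighting idea described in the abstract can be illustrated with a minimal Python sketch of 1-NN classification under a weighted Euclidean distance. This record does not give the paper's weight-learning procedure or its validity function, so the weights below are hand-picked placeholders, and the names `weighted_distance` and `predict_1nn` as well as the toy data are assumptions made for illustration only.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance: sqrt(sum_j w_j * (a_j - b_j)^2)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def predict_1nn(query, cases, labels, weights):
    """Return the label of the single nearest stored case (1-NN)."""
    nearest = min(
        range(len(cases)),
        key=lambda i: weighted_distance(query, cases[i], weights),
    )
    return labels[nearest]

# Toy data (hypothetical): the first attribute separates the classes,
# the second attribute is treated as irrelevant noise.
cases = [(3.0, -5.0), (2.0, -4.0), (8.0, 9.0), (9.0, 7.0)]
labels = ["a", "a", "b", "b"]
query = (4.0, 8.0)

# Equal weights reproduce the plain Euclidean distance over all attributes.
plain = predict_1nn(query, cases, labels, weights=(1.0, 1.0))

# Placeholder weights that switch off the noisy second attribute.
weighted = predict_1nn(query, cases, labels, weights=(1.0, 0.0))
```

With equal weights the noisy second attribute dominates the distance and pulls the query toward class "b"; with the placeholder weights only the first attribute counts and the nearest case belongs to class "a". In the paper's setting the weights would come from feature weight learning rather than being set by hand.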

Keywords :

case-based reasoning; generalisation (artificial intelligence); learning (artificial intelligence); optimisation; pattern classification; pattern clustering; statistical analysis; CBR problems; Euclidean distance; K-NN classification algorithm; case generalization capability; case-base maintenance; case-base reasoning problems; optimization algorithm; representative cases; similarity measure; statistical theory; Classification algorithms; Clustering algorithms; Computer science; Cybernetics; Electronic mail; Euclidean distance; Machine learning; Machine learning algorithms; Mathematics; Nearest neighbor searches; Optimization methods; Testing; K-NN algorithm; case-base maintenance; clustering validity; feature weight; generalization capability; similarity metrics;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Machine Learning and Cybernetics, 2006 International Conference on

Conference_Location :

Dalian, China

Print_ISBN :

1-4244-0061-9

Type :

conf

DOI :

10.1109/ICMLC.2006.258667

Filename :

4028438

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2892885