DocumentCode :
1626034
Title :
A new fuzzy rule-based initialization method for K-Nearest neighbor classifier
Author :
Chua, TeckWee ; Tan, WoeiWan
Author_Institution :
Dept. of Electr. & Comput. Eng., Nat. Univ. of Singapore, Singapore, Singapore
fYear :
2009
Firstpage :
415
Lastpage :
420
Abstract :
The performance of conventional crisp and fuzzy K-nearest neighbor (K-NN) algorithms trained using finite samples tends to be poor. With "holes" in the training data, it is unlikely that the decision area formed can actually represent the underlying data distribution. There is a need to capture more useful information from the limited training samples; therefore, we propose a new fuzzy rule-based K-NN algorithm. A fuzzy rule-based initialization procedure differentiates our proposed algorithm from the conventional fuzzy K-NN algorithm. The new initialization procedure allows us to handle the imprecise inputs (neighborhood density and distance) through the natural framework of a fuzzy logic system. Unlike in conventional K-NN algorithms, the ability to fine-tune the membership functions can lead to a highly versatile decision boundary. Thus, the new algorithm can be specifically tuned for different problems to achieve better results. The advantage is demonstrated on a synthetic data set in two-dimensional space. In addition, we adopt a weighted Euclidean distance measure to overcome the curse of dimensionality. The Euclidean distance weights and the parameters of the fuzzy rule-based system are then optimized simultaneously with a genetic algorithm (GA). The practical applicability of the proposed algorithm is verified on four UCI data sets (Bupa liver disorders, Glass, Pima Indians diabetes and Wisconsin breast cancer) and a Ford automotive data set, with an average improvement of 3.42% in classification rate.
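As a rough illustration of two ingredients named in the abstract, the sketch below combines a weighted Euclidean distance with the conventional fuzzy K-NN vote (inverse-distance weighting of neighbor labels). It is not the authors' fuzzy rule-based initialization: the function names, the crisp neighbor labels, the fuzzifier `m`, and the hand-picked feature weights are all assumptions made for the example, and the weights would be tuned by a GA in the paper rather than set by hand.

```python
import math


def weighted_euclidean(a, b, weights):
    """Euclidean distance with one non-negative weight per feature (assumed form)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def fuzzy_knn_memberships(query, samples, labels, weights, k=3, m=2.0):
    """Class memberships of `query` from its k nearest neighbors.

    Uses the conventional fuzzy K-NN vote with crisp neighbor labels:
    each neighbor contributes 1 / d^(2/(m-1)) to its own class.
    The fuzzy rule-based initialization from the abstract is NOT modeled.
    """
    scored = [
        (weighted_euclidean(query, s, weights), lab)
        for s, lab in zip(samples, labels)
    ]
    # Sort by distance only, so labels never need to be comparable.
    neighbors = sorted(scored, key=lambda pair: pair[0])[:k]

    eps = 1e-12  # avoids division by zero when the query equals a sample
    votes = {}
    for dist, lab in neighbors:
        vote = 1.0 / (dist ** (2.0 / (m - 1.0)) + eps)
        votes[lab] = votes.get(lab, 0.0) + vote

    total = sum(votes.values())
    return {lab: v / total for lab, v in votes.items()}


if __name__ == "__main__":
    # Toy two-dimensional data, invented for illustration only.
    samples = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
    labels = ["a", "a", "b", "b"]
    weights = (1.0, 1.0)  # hypothetical; GA-optimized in the paper
    print(fuzzy_knn_memberships((0.2, 0.1), samples, labels, weights, k=3))
```

Setting a feature's weight to zero removes it from the distance entirely, which is the mechanism by which learned weights can suppress uninformative dimensions.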
Keywords :
decision theory; fuzzy logic; fuzzy set theory; fuzzy systems; genetic algorithms; knowledge based systems; learning (artificial intelligence); pattern classification; statistical distributions; Euclidean distance measurement; Ford automotive data set; K-nearest neighbor classifier; UCI data set; data distribution; decision boundary; finite sample; fuzzy logic system; fuzzy rule-based initialization method; genetic algorithm; machine learning; membership function; synthetic data set; two-dimensional space; Diabetes; Euclidean distance; Fuzzy logic; Fuzzy systems; Genetic algorithms; Glass; Knowledge based systems; Liver; Training data; Weight measurement;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems, 2009. FUZZ-IEEE 2009. IEEE International Conference on
Conference_Location :
Jeju Island
ISSN :
1098-7584
Print_ISBN :
978-1-4244-3596-8
Electronic_ISBN :
1098-7584
Type :
conf
DOI :
10.1109/FUZZY.2009.5277215
Filename :
5277215