Regional Information Center for Science and Technology - An amalgam KNN to predict diabetes mellitus

DocumentCode :

605911

Title :

An amalgam KNN to predict diabetes mellitus

Author :

NirmalaDevi, M. ; Appavu, Subramanian ; Swathi, U.V.

Author_Institution :

Dept. of IT, Thiagarajar Coll. of Eng., Madurai, India

fYear :

2013

fDate :

25-26 March 2013

Firstpage :

691

Lastpage :

695

Abstract :

Medical data mining extracts hidden patterns from medical data. This paper presents an amalgam model for classifying the Pima Indian diabetic database (PIDD). The model combines k-means clustering with k-Nearest Neighbor (KNN) classification and multi-step preprocessing. Many researchers have found that the KNN algorithm performs very well in experiments on different data sets. In the amalgam model, data quality is improved by removing noisy data, which helps improve the accuracy and efficiency of the KNN algorithm. k-means clustering is used to identify and eliminate incorrectly classified instances. Missing values are replaced by means and medians. A fine-tuned classification is then done using KNN, taking the correctly clustered instances with the preprocessed subset as inputs. The best choice of k depends on the data. Generally, larger values of k reduce the effect of noise on the classification. A good k is selected by cross-validation. The aim of this paper is to determine the value of k that gives better classification accuracy for PIDD using the amalgam KNN. Experimental results indicate that the proposed amalgam KNN with preprocessing produces the best results across different k values. With larger k values, the proposed model obtained a classification accuracy of 97.4%. Ten-fold cross-validation with a larger k value produces better classification accuracy for PIDD. The results are also compared with simple KNN and with cascaded k-means and KNN for the same k values.
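The pipeline described in the abstract (impute missing values, use k-means to drop instances that land in the wrong cluster, then pick k for KNN by cross-validation) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data is synthetic rather than PIDD, only median imputation is shown, a "correctly clustered" instance is assumed to be one whose label matches the majority label of its cluster, and all function names are hypothetical.

```python
import random
from collections import Counter
from statistics import median

def impute_median(rows):
    """Replace None entries in each column with that column's median."""
    meds = [median(v for v in col if v is not None) for col in zip(*rows)]
    return [[meds[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(X, n_clusters, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster index per row."""
    centers = rng.sample(X, n_clusters)
    assign = [0] * len(X)
    for _ in range(iters):
        assign = [min(range(n_clusters), key=lambda c: dist2(x, centers[c]))
                  for x in X]
        for c in range(n_clusters):
            members = [x for x, a in zip(X, assign) if a == c]
            if members:  # leave the center unchanged if its cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def drop_misclustered(X, y, assign):
    """Keep rows whose label equals the majority label of their cluster
    (assumed meaning of 'correctly clustered instance')."""
    majority = {
        c: Counter(l for l, a in zip(y, assign) if a == c).most_common(1)[0][0]
        for c in set(assign)
    }
    keep = [i for i in range(len(X)) if y[i] == majority[assign[i]]]
    return [X[i] for i in keep], [y[i] for i in keep]

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training rows."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist2(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def cv_accuracy(X, y, k, folds=10):
    """Accuracy of KNN under k-fold cross-validation with interleaved folds."""
    correct = 0
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        test = [i for i in range(len(X)) if i % folds == f]
        tX, ty = [X[i] for i in train], [y[i] for i in train]
        correct += sum(knn_predict(tX, ty, X[i], k) == y[i] for i in test)
    return correct / len(X)

# Synthetic stand-in for PIDD: two overlapping 2-D classes, ~5% missing cells.
rng = random.Random(0)
rows, labels = [], []
for i in range(200):
    label = i % 2
    center = 0.0 if label == 0 else 4.0
    row = [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
    if rng.random() < 0.05:
        row[rng.randrange(2)] = None
    rows.append(row)
    labels.append(label)

X = impute_median(rows)
X_clean, y_clean = drop_misclustered(X, labels, kmeans(X, 2, rng))
scores = {k: cv_accuracy(X_clean, y_clean, k) for k in (1, 3, 5, 7, 9)}
best_k = max(scores, key=scores.get)
print(len(X), len(X_clean), best_k, scores[best_k])
```

Note that evaluating on data from which hard instances were already filtered tends to raise the measured accuracy, so figures from such a sketch are not comparable to accuracy on the unfiltered data.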

Keywords :

diseases; learning (artificial intelligence); medical computing; pattern classification; pattern clustering; PIDD; Pima Indian diabetic database classification; amalgam KNN algorithm; amalgam model; cross-validation technique; data quality improvement; diabetes mellitus prediction; fine tuned classification; k-means clustering; k-nearest neighbor algorithm; multistep preprocessing; noisy data removal; Accuracy; Classification algorithms; Clustering algorithms; Data mining; Diabetes; Diseases; Prediction algorithms; Amalgam KNN algorithm; Diabetes Mellitus Disease; KNN algorithm; k-means;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Emerging Trends in Computing, Communication and Nanotechnology (ICE-CCN), 2013 International Conference on

Conference_Location :

Tirunelveli

Print_ISBN :

978-1-4673-5037-2

Type :

conf

DOI :

10.1109/ICE-CCN.2013.6528591

Filename :

6528591

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=605911