DocumentCode :
605911
Title :
An amalgam KNN to predict diabetes mellitus
Author :
NirmalaDevi, M. ; Appavu, Subramanian ; Swathi, U.V.
Author_Institution :
Dept. of IT, Thiagarajar Coll. of Eng., Madurai, India
fYear :
2013
fDate :
25-26 March 2013
Firstpage :
691
Lastpage :
695
Abstract :
Medical data mining extracts hidden patterns from medical data. This paper presents the development of an amalgam model for classifying the Pima Indian diabetic database (PIDD). The amalgam model combines k-means with k-Nearest Neighbor (KNN) and multi-step preprocessing. Many researchers have found that the KNN algorithm achieves very good performance in experiments on different data sets. In this amalgam model, the quality of the data is improved by removing noisy data, which helps improve the accuracy and efficiency of the KNN algorithm. k-means clustering is used to identify and eliminate incorrectly classified instances. Missing values are replaced by means and medians. A fine-tuned classification is then performed with KNN, taking the correctly clustered instances with the preprocessed subset as inputs. The best choice of k depends on the data. Generally, larger values of k reduce the effect of noise on the classification. A good k is selected by cross-validation. The aim of this paper is to determine the value of k for PIDD that gives better classification accuracy using amalgam KNN. Experimental results indicate that the proposed amalgam KNN with preprocessing produces the best result across different k values. For larger k values, the proposed model obtained a classification accuracy of 97.4%. Ten-fold cross-validation with a larger k value produces better classification accuracy for PIDD. The results are also compared with simple KNN and with cascaded k-means and KNN for the same k values.
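The pipeline outlined in the abstract (impute missing values, drop instances that k-means places in the "wrong" cluster, then choose k for KNN by cross-validation) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it runs on synthetic two-feature data rather than PIDD, imputes with medians only, and assumes that an "incorrectly classified instance" is one whose label differs from the majority label of its cluster. All function names are hypothetical.

```python
import random
from collections import Counter
from statistics import median

def impute_median(rows):
    """Replace None entries in each column with that column's median."""
    meds = [median(v for v in col if v is not None) for col in zip(*rows)]
    return [[meds[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, n_clusters, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row."""
    centers = random.Random(seed).sample(rows, n_clusters)
    assign = [0] * len(rows)
    for _ in range(iters):
        assign = [min(range(n_clusters), key=lambda c: sq_dist(r, centers[c]))
                  for r in rows]
        for c in range(n_clusters):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def filter_by_cluster(rows, labels, assign):
    """Assumed rule: keep rows whose label equals their cluster's majority label."""
    majority = {c: Counter(l for l, a in zip(labels, assign) if a == c)
                .most_common(1)[0][0] for c in set(assign)}
    keep = [i for i, a in enumerate(assign) if majority[a] == labels[i]]
    return [rows[i] for i in keep], [labels[i] for i in keep]

def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def cv_accuracy(rows, labels, k, folds=10):
    """Accuracy of KNN under simple interleaved k-fold cross-validation."""
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, len(rows), folds))
        tx = [rows[i] for i in range(len(rows)) if i not in test_idx]
        ty = [labels[i] for i in range(len(rows)) if i not in test_idx]
        correct += sum(knn_predict(tx, ty, rows[i], k) == labels[i] for i in test_idx)
    return correct / len(rows)

# Synthetic stand-in data (NOT PIDD): two well-separated Gaussian blobs.
rng = random.Random(1)
rows, labels = [], []
for i in range(60):
    y = i % 2
    rows.append([rng.gauss(5.0 * y, 1.0), rng.gauss(5.0 * y, 1.0)])
    labels.append(y)
rows[0][1] = None          # simulate a missing value
labels[3] = 1 - labels[3]  # simulate a noisy label

clean = impute_median(rows)
assign = kmeans(clean, n_clusters=2)
fx, fy = filter_by_cluster(clean, labels, assign)
scores = {k: cv_accuracy(fx, fy, k) for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
print(len(fx), scores, best_k)
```

Odd values of k are used here so that a two-class majority vote cannot tie; which k scores best depends on the data, as the abstract notes.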
Keywords :
diseases; learning (artificial intelligence); medical computing; pattern classification; pattern clustering; PIDD; Pima Indian diabetic database classification; amalgam KNN algorithm; amalgam model; cross-validation technique; data quality improvement; diabetes mellitus prediction; fine tuned classification; k-means clustering; k-nearest neighbor algorithm; multistep preprocessing; noisy data removal; Accuracy; Classification algorithms; Clustering algorithms; Data mining; Diabetes; Diseases; Prediction algorithms; Amalgam KNN algorithm; Diabetes Mellitus Disease; KNN algorithm; k-means;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Emerging Trends in Computing, Communication and Nanotechnology (ICE-CCN), 2013 International Conference on
Conference_Location :
Tirunelveli
Print_ISBN :
978-1-4673-5037-2
Type :
conf
DOI :
10.1109/ICE-CCN.2013.6528591
Filename :
6528591