Title :
Investigation of effect of reducing dataset's size on classification algorithms
Author :
Singhal, Neelam ; Ashraf, Mohd
Author_Institution :
Sch. of ICT, Gautam Buddha Univ., Noida, India
Abstract :
Data mining is now one of the most active fields of research. Extracting nuggets of information from data is becoming crucial, and one of its important techniques is classification, which groups data into predefined classes. Various classification techniques exist, each classifying the data with a different algorithm, and each algorithm has its own areas of best and worst performance. This paper concentrates on four of the most famous algorithms, namely Decision Tree, Naïve Bayes, K Nearest Neighbour and Genetic Programming, and on how their running time and accuracy change when the number of instances is incrementally decreased. It also investigates how the results differ between binary-class and multiclass datasets, and suggests which algorithms to use for particular kinds of dataset.
Keywords :
Bayes methods; data mining; decision trees; genetic algorithms; pattern classification; binary class datasets; classification algorithms; decision tree; genetic programming; k nearest neighbour; multiclass datasets; naïve Bayes; Accuracy; Breast cancer; Timing; K-Nearest Neighbor
Conference_Title :
Computing for Sustainable Global Development (INDIACom), 2015 2nd International Conference on
Conference_Location :
New Delhi
Print_ISBN :
978-9-3805-4415-1