Title :
MGI: A New Heuristic for classifying continuous attributes in decision trees
Author :
Anuradha ; Gupta, Gaurav
Author_Institution :
CSE/IT Dept., ITM Univ., Gurgaon, India
Abstract :
A new heuristic for generating height-balanced, compact decision trees with reduced path length is proposed in this paper. The proposed node split measure (MGI) modifies the attribute selection process of the Gini index by favoring sub-partitions of almost equal size while also counting the frequently occurring target classes in those sub-partitions. This results in less skewed decision trees while maintaining accuracy similar to that of the Gini index. As part of the data preprocessing step, MGI introduces a new approach to handling missing values and also reduces the initial data set size. Performance is evaluated on a few sample data sets taken from the UCI Machine Learning Repository. The experimental comparison of the MGI node split measure against the Gini index shows that the proposed measure produces height-balanced decision trees of reduced height, which in turn reduces the number of antecedents in the generated rules.
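For orientation, the baseline that MGI modifies can be sketched as follows. This is a minimal illustration of the standard Gini index split on a single continuous attribute, not the MGI measure itself, whose formula is not given in this record. The function names `gini` and `best_gini_split` and the midpoint-threshold convention are assumptions made for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_split(values, labels):
    """Return (threshold, weighted_gini) for the best binary split.

    Candidate thresholds are midpoints between consecutive distinct
    sorted values (an assumed convention, common in CART-style trees).
    Returns (None, inf) if no split is possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Size-weighted impurity of the two sub-partitions.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score
```

The plain Gini index scores a split only by the size-weighted impurity of its sub-partitions, so it can prefer a very uneven split when that split is slightly purer. According to the abstract, MGI additionally gives weight to sub-partitions of nearly equal size.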
Keywords :
decision trees; pattern classification; MGI; UCI machine learning repository; attribute selection process; continuous attributes classification; data preprocessing; data set size reduction; gini index; height balanced compact decision trees; heuristic; missing values; node split measure; reduced path length; subpartitions; Accuracy; Classification algorithms; Equations; Gain; Indexes; Vegetation; Classification; Height
Conference_Title :
Computing for Sustainable Global Development (INDIACom), 2014 International Conference on
Conference_Location :
New Delhi
Print_ISBN :
978-93-80544-10-6
DOI :
10.1109/IndiaCom.2014.6828146