Title :
REPMAC: A New Hybrid Approach to Highly Imbalanced Classification Problems
Author :
Ahumada, Hernán ; Grinblat, Guillermo L. ; Uzal, Lucas C. ; Granitto, Pablo M. ; Ceccatto, Alejandro
Author_Institution :
CIFASIS, CONICET, Rosario
Abstract :
The class imbalance problem (when one of the classes has far fewer samples than the others) is of great importance in machine learning because it arises in many critical applications. In this work we introduce the recursive partitioning of the majority class (REPMAC) algorithm, a new hybrid method for imbalanced problems. Using a clustering method, REPMAC recursively splits the majority class into several subsets, creating a decision tree, until the resulting sub-problems are balanced or easy to solve. At that point, a classifier is fitted to each sub-problem. We evaluate the new method on 7 datasets from the UCI repository, finding that REPMAC is more efficient than other methods usually applied to imbalanced datasets.
Keywords :
decision trees; learning (artificial intelligence); pattern classification; pattern clustering; REPMAC algorithm; class imbalance problem; clustering method; decision tree; imbalanced classification problem; machine learning; majority class recursive partitioning algorithm; Clustering algorithms; Decision trees; Hybrid intelligent systems; Learning systems; Logistics; Machine learning; Machine learning algorithms; Sampling methods; Support vector machine classification; Support vector machines; class imbalance; clustering; hybrid method; partitioning
Conference_Title :
Hybrid Intelligent Systems, 2008. HIS '08. Eighth International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-0-7695-3326-1
Electronic_ISBN :
978-0-7695-3326-1
DOI :
10.1109/HIS.2008.142
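
Illustrative sketch :
The recursive scheme outlined in the abstract can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the clustering method, the leaf classifier, or the stopping rule, so a tiny deterministic 2-means, a nearest-centroid leaf classifier, and a hypothetical `max_ratio` balance threshold stand in for them. Routing minority samples to the nearer majority cluster is also an assumption, and all function names are made up for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def nearer(x, c0, c1):
    """Index (0 or 1) of the centre closest to x; ties go to c0."""
    return 0 if sq_dist(x, c0) <= sq_dist(x, c1) else 1

def two_means(points, iters=20):
    """Stand-in clustering: 2-means with a deterministic farthest-point start."""
    c0 = points[0]
    c1 = max(points, key=lambda p: sq_dist(p, c0))
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            groups[nearer(p, c0, c1)].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split; the caller turns this node into a leaf
        c0, c1 = centroid(groups[0]), centroid(groups[1])
    return c0, c1

def make_leaf(majority, minority):
    """Stand-in leaf classifier: nearest class centroid (assumption)."""
    return {"leaf": True, "maj": centroid(majority),
            "min": centroid(minority) if minority else None}

def build(majority, minority, max_ratio=5.0, max_depth=4):
    """Recursively split the majority class until a sub-problem is balanced
    (hypothetical max_ratio test) or trivially easy (no minority samples)."""
    balanced = len(majority) <= max_ratio * len(minority)
    if balanced or not minority or max_depth == 0 or len(majority) < 2:
        return make_leaf(majority, minority)
    c0, c1 = two_means(majority)
    maj_parts, min_parts = ([], []), ([], [])
    for p in majority:
        maj_parts[nearer(p, c0, c1)].append(p)
    if not maj_parts[0] or not maj_parts[1]:
        return make_leaf(majority, minority)
    for p in minority:  # assumption: minority follows the nearer cluster
        min_parts[nearer(p, c0, c1)].append(p)
    children = [build(maj_parts[k], min_parts[k], max_ratio, max_depth - 1)
                for k in (0, 1)]
    return {"leaf": False, "centers": (c0, c1), "children": children}

def predict(node, x):
    """Walk down the tree, then classify at the leaf (1 = minority class)."""
    while not node["leaf"]:
        node = node["children"][nearer(x, *node["centers"])]
    if node["min"] is None:
        return 0
    return 1 if sq_dist(x, node["min"]) < sq_dist(x, node["maj"]) else 0

# Toy data (made up): two majority blobs and a small minority group.
blob_a = [(i * 0.1, j * 0.1) for i in range(5) for j in range(4)]
blob_b = [(10 + i * 0.1, 10 + j * 0.1) for i in range(5) for j in range(4)]
minority = [(10.2, 12 + k * 0.1) for k in range(4)]
tree = build(blob_a + blob_b, minority, max_ratio=5.0)
```

On this toy data the root splits the 40 majority samples into the two blobs. The blob with no nearby minority samples becomes a trivial leaf, and the other blob, now at a 5:1 ratio, gets its own leaf classifier. In practice the stand-ins would be swapped for whichever clustering method and classifier suit the problem.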