Title :
A preliminary study on overlapping and data fracture in imbalanced domains by means of Genetic Programming-based feature extraction
Author :
Moreno-Torres, Jose G. ; Herrera, Francisco
Author_Institution :
Dept. of Comput. Sci. & Artificial Intell., Univ. de Granada, Granada, Spain
Date :
Nov. 29 2010-Dec. 1 2010
Abstract :
The classification of imbalanced data is a well-studied topic in data mining. However, there is still a lack of understanding of the factors that make the problem difficult. In this work, we study the two main reasons that make the classification of imbalanced datasets complex: overlapping and data fracture. We present a Genetic Programming-based feature extraction method, driven by Rough Set Theory, that helps visualize the data in a bidimensional graph in order to better understand how the presence of overlapping and data fractures affects classification performance.
Keywords :
data mining; feature extraction; genetic algorithms; pattern classification; rough set theory; bidimensional graph; data fracture; genetic programming-based feature extraction; imbalanced data classification; genetic programming; imbalanced data; overlapping
Conference_Title :
Intelligent Systems Design and Applications (ISDA), 2010 10th International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-1-4244-8134-7
DOI :
10.1109/ISDA.2010.5687214