Title :
Machine learning from imbalanced data sets for astronomical object classification
Author :
de la Calleja, Jorge ; Benitez, Antonio ; Medina, Ma Auxilio ; Fuentes, Olac
Author_Institution :
Dept. de Inf., Univ. Politec. de Puebla, Puebla, Mexico
Abstract :
In this paper we present an experimental study of machine learning from imbalanced data sets applied to the difficult problem of astronomical object classification in multi-spectral wide-field images. The imbalanced data set problem is very common in several domains, and occurs when there are many more examples of some classes than of others; as a result, classifiers perform poorly on these data sets. In order to improve the performance of machine learning algorithms over minority class examples, we propose to create new instances using a modification of the well-known SMOTE technique, but only from those examples misclassified by an ensemble of classifiers. Our preliminary experimental results on small data sets show that the proposed approach obtains values above 0.700 for recall, precision and f-measure, the metrics used for evaluation.
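The idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' modification of SMOTE, so the code below uses plain SMOTE-style interpolation between a seed example and one of its nearest minority-class neighbours; the function names, the three threshold rules standing in for an ensemble, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
import random

def majority_vote(classifiers, x):
    """Label predicted by most classifiers (ties go to the first-seen label)."""
    votes = [clf(x) for clf in classifiers]
    return max(votes, key=votes.count)

def misclassified_minority(X, y, classifiers, minority_label):
    """Indices of minority-class examples that the ensemble labels incorrectly."""
    return [i for i, (x, label) in enumerate(zip(X, y))
            if label == minority_label and majority_vote(classifiers, x) != label]

def smote_like(X, y, seeds, minority_label, k=3, n_new=5, rng=None):
    """Plain SMOTE-style interpolation, seeded only from the given indices."""
    rng = rng or random.Random(0)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    synthetic = []
    if not seeds or len(minority) < 2:
        return synthetic
    for _ in range(n_new):
        seed = X[rng.choice(seeds)]
        # k nearest minority-class neighbours of the seed (Euclidean distance)
        neighbours = sorted((m for m in minority if m is not seed),
                            key=lambda m: math.dist(seed, m))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment seed -> neighbour
        synthetic.append([s + gap * (n - s) for s, n in zip(seed, neighbour)])
    return synthetic

# Hypothetical toy data: class 0 is the majority, class 1 the minority.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.0], [0.0, 0.3], [0.2, 0.3],
     [1.0, 1.0], [0.9, 1.1], [0.4, 0.5]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

# Hypothetical stand-in for an ensemble: three simple threshold rules.
ensemble = [lambda x: int(x[0] > 0.5),
            lambda x: int(x[1] > 0.6),
            lambda x: int(x[0] + x[1] > 1.2)]

seeds = misclassified_minority(X, y, ensemble, minority_label=1)
synthetic = smote_like(X, y, seeds, minority_label=1)
```

In this toy setup only the minority example at index 8 is misclassified, so every synthetic point lies on a segment between it and one of the other two minority examples. Restricting the seeds this way concentrates the new instances where the ensemble currently fails rather than spreading them over the whole minority class.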
Keywords :
astronomical image processing; data analysis; learning (artificial intelligence); pattern classification; sampling methods; SMOTE technique; astronomical object classification; f-measure; imbalanced data set problem; imbalanced data sets; machine learning algorithms; minority class examples; multispectral wide-field images; Accuracy; Linear regression; Machine learning; Measurement; Neural networks; Pattern recognition; Training; computer applications; learning systems; machine learning;
Conference_Title :
Soft Computing and Pattern Recognition (SoCPaR), 2011 International Conference of
Conference_Location :
Dalian
Print_ISBN :
978-1-4577-1195-4
DOI :
10.1109/SoCPaR.2011.6089283