DocumentCode :
1800016
Title :
Error signal distribution as an indicator of imbalanced data
Author :
Furundzic, Drasko ; Stankovic, Stevan ; Dimic, Goran
Author_Institution :
Mihajlo Pupin Inst., Belgrade, Serbia
fYear :
2014
fDate :
25-27 Nov. 2014
Firstpage :
189
Lastpage :
194
Abstract :
This paper defines criteria for assessing the imbalance of datasets used to train predictive learning models. The most important criterion is the distribution of the error signal over the space of a local measure of distances between the points of the training set. The paper analyzes this indicator for sets with various distributions and shows that, for the identity mapping of data sets from the real domain, the greatest information potential is contained in data whose internal distribution is uniform.
Keywords :
data handling; learning (artificial intelligence); statistical distributions; data sets; error signal distribution; imbalanced data; internal distribution; local measure; predictive learning models; training set; Approximation methods; Data mining; Data models; Electronic mail; Entropy; Predictive models; Training; Imbalanced data; imbalanced learning; predictive models
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Network Applications in Electrical Engineering (NEUREL), 2014 12th Symposium on
Conference_Location :
Belgrade
Print_ISBN :
978-1-4799-5887-0
Type :
conf
DOI :
10.1109/NEUREL.2014.7011503
Filename :
7011503