DocumentCode
245142
Title
Learning from Imbalanced Data in Relational Domains: A Soft Margin Approach
Author
Yang, Shuo ; Khot, Tushar ; Kersting, Kristian ; Kunapuli, Gautam ; Hauser, Kris ; Natarajan, Sriraam
Author_Institution
Sch. of Inf. & Comput., Indiana Univ. - Bloomington, Bloomington, IN, USA
fYear
2014
fDate
14-17 Dec. 2014
Firstpage
1085
Lastpage
1090
Abstract
We consider the problem of learning probabilistic models from relational data. One of the key issues with relational data is class imbalance, where negative examples far outnumber positive examples. The common approach to this problem is to sub-sample the negative examples. We instead consider a soft margin approach that explicitly trades off false positives against false negatives. We apply this approach to the recently successful formalism of relational functional gradient boosting. Specifically, we modify the objective function of the learning problem to explicitly include the trade-off between false positives and false negatives. We show empirically that this approach handles class imbalance more successfully than the original framework, which weighted all examples equally.
Keywords
data mining; gradient methods; probability; sampling methods; class imbalance; imbalanced data; probabilistic model; relational data; relational functional gradient boosting; soft margin approach; subsampling method; Boosting; Computational modeling; Cost function; Electronic mail; Measurement; Probabilistic logic; Standards
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2014 IEEE International Conference on
Conference_Location
Shenzhen
ISSN
1550-4786
Print_ISBN
978-1-4799-4303-6
Type
conf
DOI
10.1109/ICDM.2014.152
Filename
7023451