DocumentCode
3228656
Title
A Data Complexity Analysis on Imbalanced Datasets and an Alternative Imbalance Recovering Strategy
Author
Weng, Cheng G. ; Poon, Josiah
Author_Institution
Sch. of Inf. Technol., Sydney Univ., NSW
fYear
2006
fDate
18-22 Dec. 2006
Firstpage
270
Lastpage
276
Abstract
The imbalanced dataset problem arises in many domains, such as Web page search and scam site detection. In this paper, we propose an alternative re-sampling approach to deal with imbalanced datasets. We demonstrate this approach with a concrete implementation, which shows promising results when compared to other standard approaches that deal with imbalanced datasets. We have also performed an analysis of data complexity to help understand imbalanced datasets, which has also been shown to be a promising approach.
Keywords
data analysis; learning (artificial intelligence); sampling methods; support vector machines; Web page search; data complexity analysis; imbalanced dataset recovering strategy; re-sampling approach; scam sites detection; Australia; Boosting; Cancer detection; Costs; Data analysis; Information analysis; Information technology; Intrusion detection; Support vector machines; Web pages;
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence, 2006. WI 2006. IEEE/WIC/ACM International Conference on
Conference_Location
Hong Kong
Print_ISBN
0-7695-2747-7
Type
conf
DOI
10.1109/WI.2006.9
Filename
4061376