• DocumentCode
    3228656
  • Title

    A Data Complexity Analysis on Imbalanced Datasets and an Alternative Imbalance Recovering Strategy

  • Author

    Weng, Cheng G. ; Poon, Josiah

  • Author_Institution
    Sch. of Inf. Technol., Sydney Univ., NSW
  • fYear
    2006
  • fDate
    18-22 Dec. 2006
  • Firstpage
    270
  • Lastpage
    276
  • Abstract
    The imbalanced dataset problem arises in many domains, such as Web page search and scam site detection. In this paper, we propose an alternative re-sampling approach to deal with imbalanced datasets. We demonstrate this approach with a concrete implementation, which shows promising results when compared with other standard approaches that deal with imbalanced datasets. We have also performed a data complexity analysis to help understand imbalanced datasets, which has also been shown to be a promising approach.
  • Keywords
    data analysis; learning (artificial intelligence); sampling methods; support vector machines; Web page search; data complexity analysis; imbalanced dataset recovering strategy; re-sampling approach; scam sites detection; Australia; Boosting; Cancer detection; Costs; Data analysis; Information analysis; Information technology; Intrusion detection; Support vector machines; Web pages;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, 2006. WI 2006. IEEE/WIC/ACM International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    0-7695-2747-7
  • Type

    conf

  • DOI
    10.1109/WI.2006.9
  • Filename
    4061376