• DocumentCode
    3104845
  • Title
    Active Learning to Maximize Area Under the ROC Curve
  • Author
    Culver, Matt ; Kun, Deng ; Scott, Stephen
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Nebraska, Lincoln, NE
  • fYear
    2006
  • fDate
    18-22 Dec. 2006
  • Firstpage
    149
  • Lastpage
    158
  • Abstract
    In active learning, a machine learning algorithm is given an unlabeled set of examples U, and is allowed to request labels for a relatively small subset of U to use for training. The goal is then to judiciously choose which examples in U to have labeled in order to optimize some performance criterion, e.g. classification accuracy. We study how active learning affects AUC. We examine two existing algorithms from the literature and present our own active learning algorithms designed to maximize the AUC of the hypothesis. One of our algorithms was consistently the top performer, and Closest Sampling from the literature often came in second behind it. When good posterior probability estimates were available, our heuristics were by far the best.
  • Keywords
    learning (artificial intelligence); active learning algorithms; closest sampling; machine learning algorithm; performance criterion; receiver operating curve analysis; Algorithm design and analysis; Computer science; Labeling; Machine learning; Machine learning algorithms; Robustness; Sampling methods; Support vector machine classification; Support vector machines; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2006. ICDM '06. Sixth International Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1550-4786
  • Print_ISBN
    0-7695-2701-7
  • Type
    conf
  • DOI
    10.1109/ICDM.2006.12
  • Filename
    4053043
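
The abstract refers to two quantities that a short example may make concrete: the AUC of a hypothesis, and the choice of which unlabeled example to query next. The sketch below is illustrative only and is not taken from the paper. It assumes that AUC is computed as the fraction of (positive, negative) pairs that the scores rank correctly, and that Closest Sampling means querying the unlabeled example whose score lies nearest the decision threshold. The function names `auc` and `closest_query` and the `threshold` parameter are hypothetical.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def closest_query(scores: Sequence[float], threshold: float = 0.0) -> int:
    """Index of the unlabeled example whose score is nearest the threshold.

    Assumption: this stands in for Closest Sampling; the paper's exact
    selection rule is not given in this record.
    """
    if not scores:
        raise ValueError("no unlabeled examples to choose from")
    return min(range(len(scores)), key=lambda i: abs(scores[i] - threshold))


if __name__ == "__main__":
    # Scores on a small labeled set (1 = positive, 0 = negative).
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
    # Scores of the current hypothesis on the unlabeled pool U.
    print(closest_query([-2.0, 0.4, 1.5, -0.1]))
```

In an active learning loop, one would retrain after each queried label and recompute the pool scores before calling `closest_query` again. The pairwise AUC above is quadratic in the sample size, which is fine for illustration but not for large evaluation sets.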