• DocumentCode
    19587
  • Title

    Active Learning for Ranking through Expected Loss Optimization

  • Author

    Bo Long; Jiang Bian; Olivier Chapelle; Ya Zhang; Yoshiyuki Inagaki; Yi Chang

  • Author_Institution
    LinkedIn Inc., Mountain View, CA, USA
  • Volume
    27
  • Issue
    5
  • fYear
    2015
  • fDate
    May 1 2015
  • Firstpage
    1180
  • Lastpage
    1191
  • Abstract
    Learning to rank arises in many data mining applications, ranging from web search engines and online advertising to recommendation systems. In learning to rank, the performance of a ranking model is strongly affected by the number of labeled examples in the training set; on the other hand, obtaining labeled examples for training data is very expensive and time-consuming. This creates a strong need for active learning approaches that select the most informative examples for learning to rank; however, very little work in the literature addresses active learning for ranking. In this paper, we propose a general active learning framework, expected loss optimization (ELO), for ranking. The ELO framework is applicable to a wide range of ranking functions. Under this framework, we derive a novel algorithm, expected discounted cumulative gain (DCG) loss optimization (ELO-DCG), to select the most informative examples. We then investigate both query-level and document-level active learning for ranking and propose a two-stage ELO-DCG algorithm that incorporates both query and document selection into active learning. Furthermore, we show that the algorithm can flexibly handle the skewed grade distribution problem by modifying the loss function. Extensive experiments on real-world web search data sets demonstrate the potential and effectiveness of the proposed framework and algorithms.
  • Keywords
    data mining; learning (artificial intelligence); optimisation; query processing; ELO framework; Web search engine; active learning; data mining applications; document level active learning; document selection; expected discounted cumulative gain loss optimization; expected loss optimization; online advertising; query selection; ranking model; real-world Web search data sets; recommendation system; skewed grade distribution problem; two-stage ELO-DCG algorithm; Bayes methods; Electronic mail; Optimization; Training; Training data; Uncertainty; Web search; ranking
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2014.2365785
  • Filename
    6940296