DocumentCode :
3739214
Title :
Positive-Unlabeled Learning in the Face of Labeling Bias
Author :
Noah Youngs;Dennis Shasha;Richard Bonneau
Author_Institution :
CY Data Sci., New York, NY, USA
fYear :
2015
Firstpage :
639
Lastpage :
645
Abstract :
Positive-Unlabeled (PU) learning scenarios are a class of semi-supervised learning in which only a fraction of the data is labeled, and all available labels are positive. The goal is to assign correct positive and negative labels to as much of the data as possible. Several important learning problems fall into the PU-learning domain, since in many cases obtaining negative examples is prohibitively costly or infeasible. In addition to the positive-negative disparity, the overall cost of labeling these datasets typically leads to situations where unlabeled examples greatly outnumber labeled ones. Accordingly, we perform several experiments, on both synthetic and real-world datasets, examining the performance of state-of-the-art PU-learning algorithms when there is significant bias in the labeling process. We propose novel PU algorithms and demonstrate that they outperform the current state of the art on a variety of benchmarks. Lastly, we present a methodology for removing the costly parameter-tuning step from a popular PU algorithm.
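Note: the abstract does not name the specific algorithms studied. For context only, the sketch below illustrates one widely used PU-learning baseline (an Elkan-Noto style relabeling), whose label-frequency constant c = P(s=1 | y=1) is estimated under the "selected completely at random" assumption that labeling bias violates. This is not the authors' proposed method; all function and variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def elkan_noto_pu(X, s):
    """Classic PU baseline (Elkan & Noto style), shown for illustration:
    1. Fit a probabilistic classifier g(x) ~ P(s=1 | x) separating
       labeled-positive examples (s=1) from unlabeled examples (s=0).
    2. Estimate c = P(s=1 | y=1) as the mean of g(x) over the labeled set;
       this step assumes labels are selected completely at random,
       i.e. no labeling bias.
    3. Recover P(y=1 | x) = g(x) / c.
    """
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X, s)
    g = clf.predict_proba(X)[:, 1]
    c = g[s == 1].mean()             # estimated label frequency
    return np.clip(g / c, 0.0, 1.0)  # corrected positive-class probabilities

# Usage sketch on synthetic data: only 20% of true positives carry a label.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+1.0, 1.0, (500, 2)),
               rng.normal(-1.0, 1.0, (500, 2))])
y = np.array([1] * 500 + [0] * 500)                       # true labels (hidden in practice)
s = np.where((y == 1) & (rng.random(1000) < 0.2), 1, 0)   # observed PU labels

p_hat = elkan_noto_pu(X, s)
print("recovered positives:", int((p_hat > 0.5).sum()), "of", int(y.sum()))
```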
Keywords :
"Labeling","Training","Proteins","Algorithm design and analysis","Support vector machines","Tagging","Probability"
Publisher :
IEEE
Conference_Title :
2015 IEEE International Conference on Data Mining Workshop (ICDMW)
Electronic_ISSN :
2375-9259
Type :
conf
DOI :
10.1109/ICDMW.2015.207
Filename :
7395727