DocumentCode
28378
Title
Multicategory Crowdsourcing Accounting for Variable Task Difficulty, Worker Skill, and Worker Intention
Author
Kurve, Aditya ; Miller, David J. ; Kesidis, George
Author_Institution
Dept. of Electr. Eng., Pennsylvania State Univ., University Park, PA, USA
Volume
27
Issue
3
fYear
2015
fDate
March 1 2015
Firstpage
794
Lastpage
809
Abstract
Crowdsourcing allows instant recruitment of workers on the web to annotate image, webpage, or document databases. However, worker unreliability prevents taking a worker's responses at “face value”. Thus, responses from multiple workers are typically aggregated to more reliably infer ground-truth answers. We study two approaches for crowd aggregation on multicategory answer spaces: stochastic modeling-based and deterministic objective function-based. Our stochastic model for answer generation plausibly captures the interplay between worker skills, intentions, and task difficulties and captures a broad range of worker types. Our deterministic objective-based approach aims to maximize the average aggregate confidence of weighted plurality crowd decision making. In both approaches, we explicitly model the skill and intention of individual workers, which is exploited for improved crowd aggregation. Our methods are applicable in both unsupervised and semi-supervised settings, and also when the batch of tasks is heterogeneous, i.e., from multiple domains, with task-dependent answer spaces. As observed experimentally, the proposed methods can defeat “tyranny of the masses”, i.e., they are especially advantageous when there is an (a priori unknown) minority of skilled workers amongst a large crowd of unskilled (and malicious) workers.
Keywords
Internet; database management systems; decision making; document handling; information retrieval; outsourcing; recruitment; unsupervised learning; Webpage annotation; answer generation; average aggregate confidence maximization; crowd aggregation; deterministic objective function-based multicategory answer space; deterministic objective-based approach; document database annotation; face value; ground-truth answers; image annotation; instant worker recruitment; multicategory crowdsourcing accounting; semi-supervised settings; stochastic modeling-based multicategory answer space; task-dependent answer spaces; tyranny-of-the-masses; unsupervised settings; variable task difficulty; weighted plurality crowd decision making; worker intention; worker skill; Accuracy; Aggregates; Computational modeling; Data models; Probes; Stochastic processes; Crowdsourcing; ensemble classification; expectation-maximization; inference; multicategory
fLanguage
English
Journal_Title
Knowledge and Data Engineering, IEEE Transactions on
Publisher
IEEE
ISSN
1041-4347
Type
jour
DOI
10.1109/TKDE.2014.2327026
Filename
6823710