Title : 
Recursive Fact-Finding: A Streaming Approach to Truth Estimation in Crowdsourcing Applications
         
        
            Author : 
Dong Wang ; Abdelzaher, Tarek ; Kaplan, Lance ; Aggarwal, Charu C.
         
        
            Author_Institution : 
Dept. of Comput. Sci., Univ. of Illinois, Urbana, IL, USA
         
        
        
        
        
        
            Abstract : 
This paper presents a streaming approach to solve the truth estimation problem in crowdsourcing applications. We consider a category of crowdsourcing applications where a group of individuals volunteer (or are recruited to) share certain observations or measurements about the physical world. Examples include reporting locations of gas stations that remain operational after a natural disaster or reporting locations of potholes on city streets. We call such applications social sensing. Ascertaining the correctness of reported observations is a key challenge in such applications, referred to as the truth estimation problem. This problem is made difficult by the fact that the reliability of individual sources is usually unknown a priori, since any concerned citizen may, in principle, participate. Moreover, the timescales of crowdsourcing campaigns of interest can be as small as a few hours or days, which does not offer enough history for a reputation system to converge. Instead, recent prior work, including our own, developed fact-finding algorithms to solve this problem by iteratively assessing the credibility of sources and their claims in the absence of reputation scores. Such algorithms, however, operate on the entire dataset of reported observations in a batch fashion, which makes them less suited to applications where new observations arrive continually. In this paper, we describe a streaming fact-finder that recursively updates previous estimates based on new data. The recursive algorithm solves an expectation maximization (EM) problem to determine the odds of correctness of different observations. We compare the performance of our recursive EM algorithm to a batch EM algorithm, as well as to several state-of-art fact-finders through extensive simulations. We also demonstrate convergence of the recursive algorithm to the results of the batch version through a real social sensing experiment. Our evaluation shows that the proposed approach can process data streams- much more efficiently while keeping the truth estimation accuracy close to that of the (much slower) batch algorithm. Ours is therefore the first fact-finder developed with explicit consideration to the continuous update needs of crowd-sourcing applications.
         
        
            Keywords : 
data mining; estimation theory; expectation-maximisation algorithm; social networking (online); EM problem; crowdsourcing application; expectation maximization problem; recursive fact-finding; social sensing; streaming fact-finder; truth estimation problem; Atmospheric measurements; Equations; Estimation; Mathematical model; Particle measurements; Reliability; Sensors; real-time; recursive expectation maximization; social sensing; streaming data; truth discovery;
         
        
        
        
            Conference_Titel : 
Distributed Computing Systems (ICDCS), 2013 IEEE 33rd International Conference on
         
        
            Conference_Location : 
Philadelphia, PA
         
        
        
        
            DOI : 
10.1109/ICDCS.2013.54