• DocumentCode
    170256
  • Title

    Reaching Consensus in Crowdsourced Transcription of Biocollections Information

  • Author

    Matsunaga, Andrea ; Mast, Austin ; Fortes, Jose A. B.

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of Florida, Gainesville, FL, USA
  • Volume
    1
  • fYear
    2014
  • fDate
    20-24 Oct. 2014
  • Firstpage
    57
  • Lastpage
    64
  • Abstract
    Crowdsourcing can be a cost-effective method for tackling the problem of digitizing historical biocollections data, and a number of crowdsourcing platforms have been developed to facilitate interaction with the public and to design simple "Human Intelligence Tasks". However, reaching consensus on the response of the crowd is still challenging for tasks for which a simple majority vote is inadequate. This paper (a) describes the challenges faced when trying to reach consensus on data transcribed by different workers, (b) offers consensus algorithms for textual data and a consensus-based controller to assign a dynamic number of workers per task, and (c) proposes further enhancements of future crowdsourcing tasks in order to minimize the need for complex consensus algorithms. Experiments using the proposed algorithms show up to a 45-fold increase in the ability to reach consensus when compared to majority voting using exact string matching. In addition, the controller is able to decrease the crowdsourcing cost by 55% when compared to a strategy that uses a fixed number of workers.
  • Keywords
    bioinformatics; string matching; biocollections information; complex consensus algorithm; consensus-based controller; crowd sourcing platform; crowd sourcing task; crowdsourced transcription; crowdsourcing; digitizing historical bio collections data; human intelligence task; majority voting; textual data; Accuracy; Approximation algorithms; Heuristic algorithms; Materials; Optical character recognition software; Reliability; biocollections infrastructure; biodiversity; consensus algorithms; digitization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    e-Science (e-Science), 2014 IEEE 10th International Conference on
  • Conference_Location
    Sao Paulo
  • Print_ISBN
    978-1-4799-4288-6
  • Type

    conf

  • DOI
    10.1109/eScience.2014.30
  • Filename
    6972249