• DocumentCode
    635206
  • Title

    Boa: A language and infrastructure for analyzing ultra-large-scale software repositories

  • Author

    Dyer, R. ; Hoan Anh Nguyen ; Rajan, Hridesh ; Nguyen, Tuan N.

  • Author_Institution
    Iowa State Univ., Ames, IA, USA
  • fYear
    2013
  • fDate
    18-26 May 2013
  • Firstpage
    422
  • Lastpage
    431
  • Abstract
    In today's software-centric world, ultra-large-scale software repositories, e.g. SourceForge (350,000+ projects), GitHub (250,000+ projects), and Google Code (250,000+ projects) are the new library of Alexandria. They contain an enormous corpus of software and information about software. Scientists and engineers alike are interested in analyzing this wealth of information both for curiosity as well as for testing important hypotheses. However, systematic extraction of relevant data from these repositories and analysis of such data for testing hypotheses is hard, and best left for mining software repository (MSR) experts! The goal of Boa, a domain-specific language and infrastructure described here, is to ease testing MSR-related hypotheses. We have implemented Boa and provide a web-based interface to Boa's infrastructure. Our evaluation demonstrates that Boa substantially reduces programming efforts, thus lowering the barrier to entry. We also see drastic improvements in scalability. Last but not least, reproducing an experiment conducted using Boa is just a matter of re-running small Boa programs provided by previous researchers.
  • Keywords
    Internet; software packages; Alexandria new library; Boa; Boa infrastructure; GitHub; Google code; MSR related hypotheses; SourceForge; domain specific language; mining software repository; software centric world; systematic extraction; ultra-large-scale software repositories analysis; Data mining; Java; Libraries; Protocols; Runtime; Software; ease of use; lower barrier to entry; mining; repository; reproducible; scalable; software;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering (ICSE), 2013 35th International Conference on
  • Conference_Location
    San Francisco, CA
  • Print_ISBN
    978-1-4673-3073-2
  • Type

    conf

  • DOI
    10.1109/ICSE.2013.6606588
  • Filename
    6606588