  • DocumentCode
    731529
  • Title
    A Dataset of the Activity of the Git Super-repository of Linux in 2012
  • Author
    German, Daniel M. ; Adams, Bram ; Hassan, Ahmed E.
  • Author_Institution
    Univ. of Victoria, Victoria, BC, Canada
  • fYear
    2015
  • fDate
    16-17 May 2015
  • Firstpage
    470
  • Lastpage
    473
  • Abstract
    This dataset documents the activity in the public portion of the git Super-repository of the Linux kernel during 2012. In a distributed version control system, such as git, the Super-repository is the collection of all the repositories (repos) used for development. In such a Super-repository, some repos are accessible only by their owners (they are private and located in places that other users cannot reach), while others are available to other members of the team. The latter, public repositories are the avenues through which commits flow from one developer to another. During the last six weeks of 2011, we automatically discovered the public portion of the Super-repository of Linux. Then, in 2012, each of these public repositories was queried every 3 hours to see what new commits it had and what commits had disappeared from it, using a process we call continuous mining. This resulted in the identification of 533,513 different commits across 451 different public repositories and of how they propagated through the Linux Super-repository, including the repository of Linus Torvalds (i.e., the main repository of the Linux kernel). This information could help us understand how kernel contributors use git, how they collaborate, and how commits are integrated into the Linux kernel and into the repositories of organizations that distribute the kernel.
  • Keywords
    Linux; data mining; Linux kernel; continuous mining; distributed version control system; git super-repository; public repositories; Control systems; Electronic mail; Kernel; Metadata; Mining software repositories; dataset; git
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/MSR.2015.66
  • Filename
    7180120
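
The continuous mining process summarized in the abstract (polling each public repository at a fixed interval and recording which commits appeared and which disappeared) can be illustrated with a minimal sketch. This is not the authors' tooling: the function names `diff_snapshots` and `poll_all`, the `fetch_commits` callback, and the representation of a repository snapshot as a set of commit hashes are all assumptions made for illustration.

```python
from typing import Callable, Dict, Set, Tuple

Snapshot = Set[str]  # assumed representation: the commit hashes visible in one repo


def diff_snapshots(previous: Snapshot, current: Snapshot) -> Tuple[Snapshot, Snapshot]:
    """Return (new_commits, disappeared_commits) between two polls of one repo."""
    new_commits = current - previous
    disappeared_commits = previous - current
    return new_commits, disappeared_commits


def poll_all(
    last_seen: Dict[str, Snapshot],
    fetch_commits: Callable[[str], Snapshot],
) -> Dict[str, Tuple[Snapshot, Snapshot]]:
    """Poll every known repo once and record what changed since the last poll.

    `fetch_commits` is a hypothetical callback that returns the set of commit
    hashes currently reachable in the given repo; how it talks to the repo is
    left open here.
    """
    changes = {}
    for repo, previous in last_seen.items():
        current = fetch_commits(repo)
        changes[repo] = diff_snapshots(previous, current)
        last_seen[repo] = current  # the next poll compares against this snapshot
    return changes
```

In a setup like the one the abstract describes, `poll_all` would be invoked once per polling interval (every 3 hours in the dataset), with `last_seen` persisted between runs.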