DocumentCode
3407045
Title
The GHTorrent dataset and tool suite
Author
Gousios, Georgios
Author_Institution
Software Eng. Res. Group, Delft Univ. of Technol., Delft, Netherlands
fYear
2013
fDate
18-19 May 2013
Firstpage
233
Lastpage
236
Abstract
During the last few years, GitHub has emerged as a popular project hosting, mirroring and collaboration platform. GitHub provides an extensive REST API, which enables researchers to retrieve high-quality, interconnected data. The GHTorrent project has been collecting data for all public projects available on GitHub for more than a year. In this paper, we present the dataset details and construction process and outline the challenges and research opportunities emerging from it.
Keywords
application program interfaces; groupware; information resources; information retrieval; software engineering; GHTorrent dataset; GitHub; collaboration platform; extensive REST API; high-quality interconnected data retrieval; hosting platform; mirroring platform; tool suite; Collaboration; Data collection; Data mining; Databases; History; Organizations; Software engineering; GitHub; dataset; repository
fLanguage
English
Publisher
IEEE
Conference_Titel
Mining Software Repositories (MSR), 2013 10th IEEE Working Conference on
Conference_Location
San Francisco, CA
ISSN
2160-1852
Print_ISBN
978-1-4799-0345-0
Type
conf
DOI
10.1109/MSR.2013.6624034
Filename
6624034