Title :
The Maven repository dataset of metrics, changes, and dependencies
Author :
Raemaekers, Steven ; Van Deursen, Arie ; Visser, Joost
Author_Institution :
Software Improvement Group, Amsterdam, Netherlands
Abstract :
We present the Maven Dependency Dataset (MDD), which contains metrics, changes, and dependencies of 148,253 jar files. Metrics and changes have been calculated at the level of individual methods, classes, and packages of multiple library versions. We also present a complete call graph that includes call, inheritance, containment, and historical relationships between all units of the entire repository. In this paper, we describe our dataset and the methodology used to obtain it. We present different conceptual views of MDD and describe limitations and data quality issues that researchers using this data should be aware of.
Keywords :
data mining; software libraries; software metrics; software packages; MDD; Maven repository dataset; complete call graph; data quality issues; jar file changes; jar file dependencies; jar file metrics; library version packages; Indexes; Java; Libraries; Measurement; Software; Supercomputers; Dataset; Maven repository
Conference_Title :
Mining Software Repositories (MSR), 2013 10th IEEE Working Conference on
Conference_Location :
San Francisco, CA
Print_ISBN :
978-1-4799-0345-0
DOI :
10.1109/MSR.2013.6624031