Title :
A process to mining issues of software repositories
Author :
Bautista, Ana Maria ; San Feliu, Tomas
Author_Institution :
Dept. Lenguajes y Sist. Inf. e Ing. de Software, Univ. Politec. de Madrid, Madrid, Spain
Abstract :
Public software repositories offer a great opportunity for researchers. GitHub is a repository with more than 10 million projects, and it includes a defect tracking system. This paper describes the process developed to extract defects from GitHub, one of the most widely used public repositories. Besides the process itself, the paper presents the difficulties that arose during data mining. The data obtained are intended to be used with neural networks for defect prediction.
Keywords :
data mining; neural nets; program diagnostics; software engineering; GitHub repository; defect prediction; defect tracking system; issue mining process; neural networks; public software repository; Benchmark testing; Biological neural networks; Encoding; Internet; Media; Software; Defect Tracking; Defects Prediction; Repositories
Conference_Title :
Information Systems and Technologies (CISTI), 2015 10th Iberian Conference on
Conference_Location :
Aveiro
DOI :
10.1109/CISTI.2015.7170552