Title :
The Firefox Temporal Defect Dataset
Author :
Habayeb, Mayy ; Miranskyy, Andriy ; Murtaza, Syed Shariyar ; Buchanan, Leotis ; Bener, Ayse
Author_Institution :
Data Sci. Lab., Ryerson Univ., Toronto, ON, Canada
Abstract :
The bug tracking repositories of software projects capture initial defect (bug) reports and the history of interactions among developers, testers, and customers. Extracting and mining information from these repositories is time-consuming and daunting. Researchers have focused mostly on analyzing the frequency of occurrence of defects and their attributes (e.g., the number of comments and lines of code changed, and the count of developers). However, the counting process eliminates information about the temporal alignment of the events that lead to changes in these attribute counts. Software quality teams could plan and prioritize their work more efficiently if they were aware of these temporal sequences and knew their frequency of occurrence. In this paper, we introduce a novel dataset mined from the Firefox bug repository (Bugzilla) which contains information about the temporal alignment of developer interactions. Our dataset covers eight years of data from the Firefox project on activities throughout the project's lifecycle. Some of these activities have not been reported in frequency-based or other temporal datasets. The dataset we mined from the Firefox project contains new activities, such as reporter experience, file exchange events, code-review process activities, and setting of milestones. We believe that this new dataset will improve analysis of bug reports and enable mining of temporal relationships so that practitioners can enhance their bug-fixing process.
Keywords :
data mining; program debugging; project management; search engines; Bugzilla; Firefox bug repository; Firefox temporal defect dataset; attribute count; bug report analysis improvement; bug tracking repositories; bug-fixing process; code lines; code-review process activities; defect attributes; defect occurrence; developer count; developer interactions; file exchange events; frequency analysis; frequency-based activities; information extraction; information mining; occurrence frequency; project lifecycle; reporter experience; software projects; software quality teams; temporal datasets; temporal event alignment; temporal relationship mining; temporal sequences; Communities; Computer bugs; Data mining; Feature extraction; History; Software; Software engineering; Bug reports; Bug repositories; Dataset; Defect tracking; Temporal activities;
Conference_Title :
Mining Software Repositories (MSR), 2015 IEEE/ACM 12th Working Conference on
Conference_Location :
Florence
DOI :
10.1109/MSR.2015.73