DocumentCode :
1974899
Title :
On mining data across software repositories
Author :
Anbalagan, Prasanth ; Vouk, Mladen
Author_Institution :
Dept. of Comput. Sci., North Carolina State Univ., Raleigh, NC
fYear :
2009
fDate :
16-17 May 2009
Firstpage :
171
Lastpage :
174
Abstract :
Software repositories provide an abundance of valuable information about open source projects. With the increase in the size of the data maintained by the repositories, automated extraction of such data from individual repositories, as well as of linked information across repositories, has become a necessity. In this paper we describe a framework that uses web scraping to automatically mine repositories and link information across repositories. We discuss two implementations of the framework. In the first implementation, we automatically identify and collect security problem reports from project repositories that deploy the Bugzilla bug tracker using related vulnerability information from the National Vulnerability Database. In the second, we collect security problem reports for projects that deploy the Launchpad bug tracker along with related vulnerability information from the National Vulnerability Database. We have evaluated our tool on various releases of Fedora, Ubuntu, Suse, RedHat, and Firefox projects. The percentage of security bugs identified using our tool is consistent with that reported by other researchers.
Keywords :
Internet; data mining; program debugging; public domain software; Bugzilla bug tracker; Launchpad bug tracker; Web scraping; open source projects; software repositories; Computer bugs; Computer science; Data security; Databases; Government; Information retrieval; Information security; National security; Open source software
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Mining Software Repositories, 2009. MSR '09. 6th IEEE International Working Conference on
Conference_Location :
Vancouver, BC
Print_ISBN :
978-1-4244-3493-0
Type :
conf
DOI :
10.1109/MSR.2009.5069498
Filename :
5069498