DocumentCode :
240463
Title :
Using regular expressions for mining data in large software repositories
Author :
Awang Abu Bakar, Normi Sham
Author_Institution :
Dept. of Comput. Sci., Int. Islamic Univ. Malaysia, Kuala Lumpur, Malaysia
fYear :
2014
fDate :
17-18 Nov. 2014
Firstpage :
1
Lastpage :
6
Abstract :
Using data mining techniques to collect data from software repositories involves extracting both basic and value-added information from those repositories. Regular expressions (Regex) provide a mechanism for selecting specific strings from a set of character strings. In this paper, we discuss how regular expressions are used to create a data mining tool known as OSSGrab. We developed the tool using Python scripting in combination with Regex; as a result, the time spent on data collection is reduced significantly.
Keywords :
authoring languages; data mining; formal languages; software engineering; OSSGrab; Python scripting; Regex; basic information extraction; character strings; data collection; data mining technique; large-software repositories; regular expressions; value-added information extraction; Decision support systems; Data mining; empirical software engineering; open source; regular expressions; software repositories;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information and Communication Technology for The Muslim World (ICT4M), 2014 The 5th International Conference on
Conference_Location :
Kuching
Type :
conf
DOI :
10.1109/ICT4M.2014.7020649
Filename :
7020649