DocumentCode :
2115913
Title :
Using Association Rules to Identify Similarities between Software Datasets
Author :
Anwar, Sohel ; Rana, Z.A. ; Shamail, Shafay ; Awais, Mian M.
Author_Institution :
SSE, LUMS, Lahore, Pakistan
fYear :
2012
fDate :
3-6 Sept. 2012
Firstpage :
114
Lastpage :
119
Abstract :
A number of verification and validation (V&V) datasets are publicly available. These datasets contain software measurements and defectiveness information for software modules. To facilitate V&V, numerous defect prediction studies have used these datasets and have detected defective modules effectively. Software developers and managers can use the existing studies to avoid analogous defects and mistakes if they can find similarities between their own software and the software represented by the public datasets. This paper identifies similar datasets by comparing association patterns in the datasets. The proposed approach mines association rules from each dataset and identifies the rules that overlap among the 100 strongest rules of each of the two datasets being compared. The average support and average confidence of the overlapping rules are then calculated to determine the strength of the similarity between the datasets. The study compares eight public datasets, and the results show that KC2 and PC2 have the highest similarity (83%), with 97% support and 100% confidence. Datasets with similar attributes and almost the same number of attributes show higher similarity than the other datasets.
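The comparison step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the rule representation, the ranking of "strongest" rules by confidence and then support, the definition of similarity as the overlap count divided by k, and the averaging over both datasets' values are all assumptions, and the names `strongest` and `compare` are hypothetical.

```python
from statistics import mean

# Assumed representation: each dataset's rules are a dict mapping
# (antecedent, consequent) -> (support, confidence).

def strongest(rules, k=100):
    """Return the k strongest rules, ranked by confidence, then support (assumed ordering)."""
    ranked = sorted(rules.items(), key=lambda kv: (kv[1][1], kv[1][0]), reverse=True)
    return dict(ranked[:k])

def compare(rules_a, rules_b, k=100):
    """Compare two datasets by the overlap of their k strongest rules."""
    top_a, top_b = strongest(rules_a, k), strongest(rules_b, k)
    shared = top_a.keys() & top_b.keys()  # rules present in both top-k sets
    if not shared:
        return {"similarity": 0.0, "avg_support": 0.0, "avg_confidence": 0.0}
    # Assumption: averages are taken over both datasets' values for each shared rule.
    supports = [d[r][0] for r in shared for d in (top_a, top_b)]
    confidences = [d[r][1] for r in shared for d in (top_a, top_b)]
    return {
        "similarity": len(shared) / k,  # assumed definition: overlap fraction of k
        "avg_support": mean(supports),
        "avg_confidence": mean(confidences),
    }
```

The rule sets themselves would come from any association rule miner; that step is outside this sketch.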
Keywords :
data mining; software engineering; KC2; PC2; V&V datasets; association patterns; association rules; software datasets; software defectiveness information; software developers; software measurements; dataset similarity; defect prediction; software measures
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Quality of Information and Communications Technology (QUATIC), 2012 Eighth International Conference on the
Conference_Location :
Lisbon
Print_ISBN :
978-1-4673-2345-1
Type :
conf
DOI :
10.1109/QUATIC.2012.66
Filename :
6511790