DocumentCode :
3694203
Title :
Evaluating clone detection tools with BigCloneBench
Author :
Jeffrey Svajlenko;Chanchal K. Roy
Author_Institution :
Department of Computer Science, University of Saskatchewan, Canada
fYear :
2015
Firstpage :
131
Lastpage :
140
Abstract :
Many clone detection tools have been proposed in the literature. However, our knowledge of their performance in real software systems is limited, particularly their recall. In this paper, we use our big data clone benchmark, BigCloneBench, to evaluate the recall of ten clone detection tools. BigCloneBench is a collection of eight million validated clones within IJaDataset-2.0, a big data software repository containing 25,000 open-source Java systems. BigCloneBench contains both intra-project and inter-project clones of the four primary clone types. We use this benchmark to evaluate the recall of the tools per clone type and across the entire range of clone syntactical similarity. We evaluate the tools for both single-system and cross-project detection scenarios. Using multiple clone-matching metrics, we evaluate the quality of the tools' reporting of the benchmark clones with respect to refactoring and automatic clone analysis use-cases. We compare these real-world results against our Mutation and Injection Framework, a synthetic benchmark, to reveal deeper understanding of the tools. We found that the tools have strong recall for Type-1 and Type-2 clones, as well as Type-3 clones with high syntactical similarity. The tools have weaker detection of clones with lower syntactical similarity.
Keywords :
"Cloning","Benchmark testing","Software systems","Big data","Java","Data mining"
Publisher :
IEEE
Conference_Title :
Software Maintenance and Evolution (ICSME), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/ICSM.2015.7332459
Filename :
7332459