DocumentCode
1890738
Title
A Framework for Studying Clones In Large Software Systems
Author
Jiang, Zhen Ming ; Hassan, Ahmed E.
Author_Institution
Univ. of Victoria, Victoria
fYear
2007
fDate
Sept. 30 2007-Oct. 1 2007
Firstpage
203
Lastpage
212
Abstract
Clones are code segments that have been created by copying and pasting from other code segments. Clones occur often in large software systems. It is reported that 5% to 50% of the source code of a large software system is cloned. A major challenge when studying code cloning in large software systems is handling the large number of clone candidates produced by leading-edge clone detection tools. For example, the CCFinder clone detection tool produces over 7 million pairs of clone candidates for the Linux kernel (which consists of over 4 MLOC). Moreover, the output of clone detection tools grows rapidly as a software system evolves. Researchers and developers need tools to help them study the large amount of clone data in order to better understand the clone phenomena in large systems. In this paper, we propose a data mining framework to help researchers cope with the large amount of data produced by clone detection tools. We propose techniques to reduce, abstract, and highlight the most interesting data produced by clone detection tools. Our framework also introduces a visualization tool which allows users to query and explore clone data at various abstraction levels. We demonstrate our framework on a case study of the clone phenomena in the Linux kernel.
Keywords
Linux; data mining; data visualisation; Linux kernel; data visualization tool; edge clone detection tool; large software system; source code segmentation; Cloning; Data visualization; Kernel; Software systems
fLanguage
English
Publisher
IEEE
Conference_Titel
Source Code Analysis and Manipulation, 2007. SCAM 2007. Seventh IEEE International Working Conference on
Conference_Location
Paris
Print_ISBN
978-0-7695-2880-9
Type
conf
DOI
10.1109/SCAM.2007.19
Filename
4362914