Title :
Code clustering workbench
Author :
Annervaz, K.M. ; Kaulgud, Vikrant ; Misra, Janardan ; Sengupta, Sabyasachi ; Titus, Geevarghese ; Munshi, Amit
Author_Institution :
Accenture Technol. Labs., Bangalore, India
Abstract :
Source code clustering is an important technique used in software development and maintenance to understand the modular structure of code. An array of clustering algorithms is available, such as simulated annealing based search. Source code has different kinds of features, such as structural or textual features. Collecting these different types of source code features and computing the relevant feature metrics is a difficult task. Further, clustering algorithms can run on metrics based on different types of source code features or their combinations. This flexibility makes it non-trivial to test the effectiveness of clustering algorithms on a given code base. In this paper, we present a highly configurable clustering workbench that allows the user to collect the various source code features and then to select the code features used for clustering, the clustering algorithm, and its various parameters. The workbench computes clustering quality metrics, which allow comparison of algorithm output across different combinations of code features and algorithms. We also present the specific contribution made in multi-dimensional feature analysis and clustering. The tool hides the algorithm complexity from the user, thus allowing complete focus on understanding the 'effect' of the configuration choices. We have also applied this tool in real-life maintenance projects, where the users found it useful for tuning the clustering techniques to the peculiarities of the source code.
Keywords :
pattern clustering; search problems; simulated annealing; software maintenance; software metrics; software quality; algorithm complexity; modular structure; multi-dimensional feature analysis; quality metrics clustering; real-life maintenance projects; simulated annealing based search; software development; software maintenance; source code clustering workbench; source code features; Algorithm design and analysis; Clustering algorithms; Feature extraction; Java; Measurement; Partitioning algorithms; Vectors; Clustering; Experimental Workbench; Semantic Indexing; Source Code Analysis;
Conference_Title :
Source Code Analysis and Manipulation (SCAM), 2013 IEEE 13th International Working Conference on
Conference_Location :
Eindhoven
DOI :
10.1109/SCAM.2013.6648181