Title :
Clustering software systems to identify subsystem structures using knowledgebase
Author :
Adnan, Md Nasim ; Islam, Md Rashedul ; Hossain, Sazzad
Author_Institution :
CSE Dept., Univ. of Liberal Arts Bangladesh (ULAB), Dhaka, Bangladesh
Abstract :
The structure of a software system deteriorates as a result of continuous maintenance activity. For software reengineering or reverse engineering, software engineers often have only the source code as the most up-to-date source of information, because current documentation is lacking and the original designers are rarely or never available. Applying clustering techniques to software systems to discover feature-oriented, meaningful subsystems can help these engineers understand the high-level features those subsystems provide. Research in recent years has continued to address different issues in the software clustering problem. Our software clustering approach introduces the use of a Knowledgebase, which leads to considerable improvement over existing approaches. Similarity measurement is the key to successful clustering. Similarity measurement in existing approaches shares a common drawback: it does not account for the diversity of software systems. Our approach uses a Knowledgebase that acts as a repository of information about the internal structure of generic types of software systems and provides guidelines on similarity measurement criteria and their respective weights. The final clustering is done by populating automatically generated subsystems along with the known subsystems (provided by the Knowledgebase). We have developed a tool named “ULAB Cluster 1.0” that implements this clustering approach. The tool has been evaluated on several well-known software systems using the “MoJo distance” benchmark. The experimental results show that our approach generates more appropriate subsystems than other existing clustering approaches and outperforms them in different dimensions of software clustering quality.
Keywords :
feature extraction; knowledge based systems; pattern clustering; pattern matching; reverse engineering; software maintenance; MoJo distance; ULAB Cluster 1.0; clustering software system; continuous maintenance activity; feature-oriented subsystem; high-level feature; information repository; knowledge base system; nonexistent availability; similarity measurement; similarity measurement criteria; software clustering quality; software reengineering; subsystem structure identification; Clustering algorithms; Libraries; Software algorithms; Software measurement; Software systems; Weight measurement; Knowledgebase; reengineering; software clustering; software engineering
Conference_Title :
2011 5th Malaysian Conference in Software Engineering (MySEC)
Conference_Location :
Johor Bahru
Print_ISBN :
978-1-4577-1530-3
DOI :
10.1109/MySEC.2011.6140714