• DocumentCode
    748260
  • Title
    Comparative study of clustering algorithms and abstract representations for software remodularisation
  • Author
    Anquetil, N.; Lethbridge, T.C.
  • Author_Institution
    Catholic Univ. of Brasilia, Brazil
  • Volume
    150
  • Issue
    3
  • fYear
    2003
  • fDate
    6/24/2003 12:00:00 AM
  • Firstpage
    185
  • Lastpage
    201
  • Abstract
    As valuable software systems become older, reverse engineering becomes increasingly important to companies that have to maintain the code. Clustering is a key activity in reverse engineering that is used to discover improved designs of systems or to extract significant concepts from code. Clustering is an old, highly sophisticated activity that offers many methods to meet different needs. The various methods have been well documented in the past; however, conclusions from the general clustering literature may not apply entirely to the reverse engineering domain. In the paper, the authors study three decisions that need to be made when clustering: the choice of (i) abstract descriptions of the entities to be clustered, (ii) metrics to compute coupling between the entities, and (iii) clustering algorithms. For each decision, the objective is to understand which choices are best when performing software remodularisation. The experiments were conducted on three public domain systems (gcc, Linux and Mosaic) and a real-world legacy system (2 million LOC). Among other things, the authors confirm the importance of a proper description scheme for the entities being clustered, list a few effective coupling metrics and characterise the quality of different clustering algorithms. They also propose description schemes not directly based on the source code, and advocate better formal evaluation methods for the clustering results.
  • Keywords
    reverse engineering; software engineering; abstract descriptions; clustering; formal evaluation methods; public domain systems; real world legacy system; software remodularisation; valuable software systems
  • fLanguage
    English
  • Journal_Title
    IEE Proceedings - Software
  • Publisher
    IET
  • ISSN
    1462-5970
  • Type
    jour
  • DOI
    10.1049/ip-sen:20030581
  • Filename
    1214742