• DocumentCode
    1968923
  • Title
    A Metrics-Based Data Mining Approach for Software Clone Detection
  • Author
    Abd-El-Hafiz, Salwa K.
  • Author_Institution
    Eng. Math. Dept., Cairo Univ., Cairo, Egypt
  • fYear
    2012
  • fDate
    16-20 July 2012
  • Firstpage
    35
  • Lastpage
    41
  • Abstract
    Detecting function clones in software systems is valuable for maintenance activities such as code adaptation and error checking. This paper presents an efficient metrics-based data mining approach to clone detection. First, metrics are collected for all functions in the software system. A data mining algorithm, fractal clustering, is then used to partition the software system into a relatively small number of clusters. Each resulting cluster contains functions that lie within a specific proximity of each other in the metrics space. Finally, clone classes, rather than clone pairs, are extracted directly from the resulting clusters. For large software systems, the approach is very space efficient and linear in the size of the data set. The approach is evaluated on medium and large open source software systems, and the evaluation investigates how the chosen metrics affect detection precision.
  • Keywords
    data mining; software metrics; code adaptation; error checking; fractal clustering; metrics based data mining approach; metrics space; open source software systems; software clone detection; software systems; Cloning; Clustering algorithms; Complexity theory; Fractals; Measurement; clone detection; clustering; fractal dimension
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Software and Applications Conference (COMPSAC), 2012 IEEE 36th Annual
  • Conference_Location
    Izmir
  • ISSN
    0730-3157
  • Print_ISBN
    978-1-4673-1990-4
  • Electronic_ISBN
    0730-3157
  • Type
    conf
  • DOI
    10.1109/COMPSAC.2012.14
  • Filename
    6340252