• DocumentCode
    635201
  • Title
    Data clone detection and visualization in spreadsheets
  • Author
    Hermans, Felienne ; Sedee, Ben ; Pinzger, Martin ; Van Deursen, Arie
  • Author_Institution
    Software Eng. Res. Group, Delft Univ. of Technol., Delft, Netherlands
  • fYear
    2013
  • fDate
    18-26 May 2013
  • Firstpage
    292
  • Lastpage
    301
  • Abstract
    Spreadsheets are widely used in industry: it is estimated that end-user programmers outnumber programmers by a factor of 5. However, spreadsheets are error-prone, and numerous companies have lost money because of spreadsheet errors. One of the causes of spreadsheet problems is the prevalence of copy-pasting. In this paper, we study this cloning in spreadsheets. Based on existing text-based clone detection algorithms, we have developed an algorithm to detect data clones in spreadsheets: formulas whose values are copied as plain text to a different location. To evaluate the usefulness of the proposed approach, we conducted two evaluations: a quantitative evaluation in which we analyzed the EUSES corpus, and a qualitative evaluation consisting of two case studies. The results of the evaluations clearly indicate that 1) data clones are common, 2) data clones pose threats to spreadsheet quality, and 3) our approach supports users in finding and resolving data clones.
  • Keywords
    data visualisation; software performance evaluation; software quality; spreadsheet programs; EUSES corpus; copy-pasting; data clone detection; data clone visualization; end-user programmers; qualitative evaluation; spreadsheet errors; spreadsheet quality; text-based clone detection algorithms; Algorithm design and analysis; Cloning; Clustering algorithms; Companies; Data visualization; Detection algorithms; Educational institutions; clone detection; code smells; spreadsheet smells; spreadsheets;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering (ICSE), 2013 35th International Conference on
  • Conference_Location
    San Francisco, CA
  • Print_ISBN
    978-1-4673-3073-2
  • Type
    conf
  • DOI
    10.1109/ICSE.2013.6606575
  • Filename
    6606575