• DocumentCode
    501656
  • Title

    Temporally Robust Software Features for Authorship Attribution

  • Author

    Burrows, Steven ; Uitdenbogerd, Alexandra L. ; Turpin, Andrew

  • Author_Institution
    Sch. of Comput. Sci. & Inf. Technol., RMIT Univ., Melbourne, VIC, Australia
  • Volume
    1
  • fYear
    2009
  • fDate
    20-24 July 2009
  • Firstpage
    599
  • Lastpage
    606
  • Abstract
    Authorship attribution is used to determine the creator of works among many candidates, playing a vital role in software forensics, authorship disputes and academic integrity investigations. The evolving coding style of individuals may degrade the performance of systems that attribute authorship of source code, but this has not been previously studied. This paper uses a collection of six programming assignments with guaranteed relative timestamps from 272 students to examine the evolution of coding style. We find that the problem domain of the software developed has a large effect on the ability to attribute authorship, and that coding style does change over time regardless of the requirements that are coded. The outcomes suggest that it takes at least three programming tasks for coding style to settle, and that at least one piece of code in the same problem domain as the code to classify is necessary for accurate authorship attribution. In the final part of the paper we analyze low-level code features to discover simple features that appear immune to the evolution of coding style, and use them to improve the effectiveness of our system from 79% to 82% (p < 0.01, z-test).
  • Keywords
    programming; adversarial information retrieval; authorship attribution; coding style; programming assignment; programming task; software forensics; source code; Algorithm design and analysis; Application software; Computer applications; Information retrieval; Law; Phase detection; Plagiarism; Robustness; Software maintenance; Writing; Adversarial Information Retrieval; Authorship Attribution; Coding Style Evolution;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Software and Applications Conference, 2009. COMPSAC '09. 33rd Annual IEEE International
  • Conference_Location
    Seattle, WA
  • ISSN
    0730-3157
  • Print_ISBN
    978-0-7695-3726-9
  • Type

    conf

  • DOI
    10.1109/COMPSAC.2009.85
  • Filename
    5254209