DocumentCode :
147578
Title :
Choosing a profile length in the SCAP method of source code authorship attribution
Author :
Tennyson, Matthew F. ; Mitropoulos, Frank J.
Author_Institution :
Dept. of Comput. Sci. & Inf. Syst., Bradley Univ., Peoria, IL, USA
fYear :
2014
fDate :
13-16 March 2014
Firstpage :
1
Lastpage :
6
Abstract :
Source code authorship attribution is the task of determining the author of source code whose author is not explicitly known. One specific method of source code authorship attribution that has been shown to be extremely effective is the SCAP method. This method, however, relies on a parameter L that has heretofore been quite nebulous. In the SCAP method, each candidate author's known work is represented as a profile of that author, where the parameter L defines the profile's maximum length. In this study, alternative approaches for selecting a value for L were investigated. Several alternative approaches were found to perform better than the baseline approach used in the SCAP method. The approach that performed the best was empirically shown to improve the performance from 91.0% to 97.2% measured as a percentage of documents correctly attributed using a data set consisting of 7,231 programs written in Java and C++.
Keywords :
C++ language; Java; source code (software); Java language; SCAP method; data set; profile length; source code authorship attribution; Frequency control; Frequency measurement; RNA; authorship attribution; information retrieval; plagiarism detection; software forensics
fLanguage :
English
Publisher :
IEEE
Conference_Title :
SOUTHEASTCON 2014, IEEE
Conference_Location :
Lexington, KY
Type :
conf
DOI :
10.1109/SECON.2014.6950705
Filename :
6950705