DocumentCode
3277378
Title
AST-based multi-language plagiarism detection method
Author
Li ping Zhang ; Dong sheng Liu
Author_Institution
Coll. of Comput. & Inf. Eng., Inner Mongolia Normal Univ., Hohhot, China
fYear
2013
fDate
23-25 May 2013
Firstpage
738
Lastpage
742
Abstract
To detect plagiarism in programming courses, a plagiarism detection method based on the Abstract Syntax Tree (AST) is proposed. First, a syntax analyzer parses the source code of each program into its corresponding AST, and a biological sequence matching algorithm is used to calculate the similarity between programs. Second, AST features are extracted from the similar parts of the programs, and feature space vectors are obtained from them. Finally, “copy clusters” are found by clustering the vectors. Experimental results show that the method is effective at detecting plagiarism and can also find the “copy clusters” accurately.
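The first step of the abstract (parse to an AST, then score similarity with a sequence alignment) might be sketched roughly as follows. This is only an illustration under several assumptions, not the authors' implementation: Python's built-in `ast` module stands in for the paper's C/Java syntax analyzer, a Smith-Waterman-style local alignment stands in for the unnamed "biology sequence matching algorithm", and the function names, scoring weights, and normalization are all hypothetical.

```python
import ast

# Assumed scoring weights; the record does not state the ones used in the paper.
MATCH, MISMATCH, GAP = 2, -1, -1


def node_sequence(source):
    """Flatten a program's AST into a pre-order list of node type names.

    Identifier names and literal values are not AST nodes, so they drop out,
    which makes the sequence insensitive to simple renaming.
    """
    seq = []

    def visit(node):
        seq.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return seq


def local_alignment_score(a, b):
    """Best Smith-Waterman local alignment score between two sequences."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (MATCH if x == y else MISMATCH)
            score = max(0, diag, prev[j] + GAP, cur[j - 1] + GAP)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def similarity(src_a, src_b):
    """Alignment score normalized to [0, 1] by the shorter sequence (assumed)."""
    a, b = node_sequence(src_a), node_sequence(src_b)
    if not a or not b:
        return 0.0
    return local_alignment_score(a, b) / (MATCH * min(len(a), len(b)))


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def acc(values):\n    r = 0\n    for v in values:\n        r += v\n    return r\n"
unrelated = "print('hi')\n"

print(similarity(original, renamed))
print(similarity(original, unrelated))
```

Because the sequence keeps only node types, the renamed copy aligns perfectly with the original, while the unrelated program scores lower. The feature extraction and clustering steps are left out here, since the record gives no detail on how they are done.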
Keywords
C language; Java; computational linguistics; computer science education; educational courses; grammars; pattern clustering; program diagnostics; software engineering; AST-based multilanguage plagiarism detection method; abstract syntax tree; biology sequence matching algorithm; copy cluster; feature space vectors; program similarity calculation; programming course; source code parsing; syntax analyzer; vector clustering; Biological information theory; AST; Cluster; Plagiarism detection; Sequence alignment;
fLanguage
English
Publisher
IEEE
Conference_Titel
Software Engineering and Service Science (ICSESS), 2013 4th IEEE International Conference on
Conference_Location
Beijing
ISSN
2327-0586
Print_ISBN
978-1-4673-4997-0
Type
conf
DOI
10.1109/ICSESS.2013.6615411
Filename
6615411