• DocumentCode
    3277378
  • Title
    AST-based multi-language plagiarism detection method
  • Author
    Li ping Zhang; Dong sheng Liu
  • Author_Institution
    Coll. of Comput. & Inf. Eng., Inner Mongolia Normal Univ., Hohhot, China
  • fYear
    2013
  • fDate
    23-25 May 2013
  • Firstpage
    738
  • Lastpage
    742
  • Abstract
    To detect plagiarism in programming courses, a plagiarism detection method based on the Abstract Syntax Tree (AST) is proposed. First, a syntax analyzer parses the source code of each program into its corresponding AST, and a biological sequence matching algorithm is used to calculate the similarity between programs. Second, the AST features of the similar parts of the programs are extracted, and feature space vectors are built from them. Finally, “copy clusters” are found by clustering the vectors. Experimental results show that the method detects plagiarism effectively and can also find the “copy clusters” accurately.
  • Keywords
    C language; Java; computational linguistics; computer science education; educational courses; grammars; pattern clustering; program diagnostics; software engineering; AST-based multilanguage plagiarism detection method; abstract syntax tree; biology sequence matching algorithm; copy cluster; feature space vectors; program similarity calculation; programming course; source code parsing; syntax analyzer; vector clustering; Biological information theory; AST; Cluster; Plagiarism detection; Sequence alignment;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Software Engineering and Service Science (ICSESS), 2013 4th IEEE International Conference on
  • Conference_Location
    Beijing
  • ISSN
    2327-0586
  • Print_ISBN
    978-1-4673-4997-0
  • Type
    conf
  • DOI
    10.1109/ICSESS.2013.6615411
  • Filename
    6615411
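
The pipeline outlined in the abstract (parse to AST, align the resulting sequences, group similar programs) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: it parses Python source with the standard `ast` module instead of C or Java, uses Smith-Waterman local alignment as one example of a biological sequence matching algorithm, and replaces the feature-vector clustering step with simple threshold-based grouping (connected components). The names `node_sequence`, `local_alignment_score`, `similarity`, and `copy_clusters`, the scoring constants, and the 0.8 threshold are all hypothetical.

```python
import ast

# Scoring constants are illustrative assumptions, not values from the paper.
MATCH, MISMATCH, GAP = 2, -1, -1


def node_sequence(source):
    """Flatten a program's AST into a pre-order sequence of node-type names."""
    seq = []

    def visit(node):
        # Skip Load/Store/Del context markers; they carry no structure.
        if not isinstance(node, ast.expr_context):
            seq.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return seq


def local_alignment_score(a, b):
    """Smith-Waterman local alignment score of two token sequences."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (MATCH if x == y else MISMATCH)
            score = max(0, diag, prev[j] + GAP, curr[j - 1] + GAP)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def similarity(src_a, src_b):
    """Alignment score normalised to [0, 1] by the best achievable score."""
    a, b = node_sequence(src_a), node_sequence(src_b)
    if not a or not b:
        return 0.0
    return local_alignment_score(a, b) / (MATCH * min(len(a), len(b)))


def copy_clusters(sources, threshold=0.8):
    """Group programs whose pairwise similarity reaches the threshold.

    Stand-in for the paper's vector clustering: plain union-find over
    the pairs that pass the threshold.
    """
    names = list(sources)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if similarity(sources[a], sources[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return [g for g in groups.values() if len(g) > 1]


submissions = {
    "a": "def f(x):\n    return x + 1\n",
    "b": "def g(y):\n    return y + 1\n",   # same structure, renamed identifiers
    "c": "for i in range(3):\n    print(i)\n",
}
clusters = copy_clusters(submissions)
```

Because only node types are compared, renaming identifiers does not change the sequence, so submissions "a" and "b" align perfectly while "c" does not. A real multi-language tool would need a parser per language and a more robust clustering step.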