DocumentCode :
2641429
Title :
Plagiarism Detection Using the Levenshtein Distance and Smith-Waterman Algorithm
Author :
Su, Zhan ; Ahn, Byung-Ryul ; Eom, Ki-yol ; Kang, Min-koo ; Kim, Jin-Pyung ; Kim, Moon-Kyun
Author_Institution :
Dept. of Artificial Intell., Sungkyunkwan Univ., Suwon
fYear :
2008
fDate :
18-20 June 2008
Firstpage :
569
Lastpage :
569
Abstract :
Plagiarism in texts is an issue of increasing concern to the academic community. Most text plagiarism today is carried out through a variety of minor alterations, such as the insertion, deletion, or substitution of words. Detecting such simple changes, however, requires an excessive number of string comparisons. In this paper, we present a hybrid plagiarism detection method. We investigate the use of a diagonal line, derived from the Levenshtein distance, together with a simplified Smith-Waterman algorithm, a classical tool for identifying and quantifying local similarities in biological sequences, with a view to applying them to plagiarism detection. Our approach avoids globally involved string comparisons and takes psychological factors into account, which, according to the experimental results, can yield a significant speed-up. Based on the results, we indicate the practicality of this improvement using the Levenshtein distance and the Smith-Waterman algorithm and illustrate the efficiency gains. In the future, it would be interesting to explore appropriate heuristics in the area of text comparison.
Keywords :
text analysis; Levenshtein distance; Smith-Waterman algorithm; academic community; text plagiarism detection; Artificial intelligence; Computer science; Costs; Dynamic programming; Information theory; Intellectual property; Plagiarism; Psychology; Sequences; Writing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Innovative Computing Information and Control, 2008. ICICIC '08. 3rd International Conference on
Conference_Location :
Dalian, Liaoning
Print_ISBN :
978-0-7695-3161-8
Electronic_ISBN :
978-0-7695-3161-8
Type :
conf
DOI :
10.1109/ICICIC.2008.422
Filename :
4603758