DocumentCode :
260389
Title :
Performance efficiency in plagiarism indication detection system using indexing method with data structure 2–3 tree
Author :
Suryana, Annisa Fitriana ; Wibowo, Agung Toto ; Romadhony, Ade
Author_Institution :
Sch. of Comput., Telkom Univ., Bandung, Indonesia
fYear :
2014
fDate :
28-30 May 2014
Firstpage :
403
Lastpage :
408
Abstract :
Plagiarism is a widespread form of cheating, and one way to prevent it is to build an anti-plagiarism system. A system that compares a query document against every document in its database takes a very long time, and the more irrelevant documents the database contains, the more of that time is wasted. This paper discusses a plagiarism detection system that uses indexing to eliminate irrelevant documents, reducing the set of database documents that must be matched against the query document. The query document is matched against database documents with the Longest Common Subsequence (LCS) algorithm. Irrelevant documents are eliminated with an inverted index stored in a 2-3 tree data structure. The index is built by inserting document fingerprints, which are extracted with the winnowing algorithm. Results show that running 1 query against a 10000-document corpus, most of which is not relevant, takes 59 seconds with indexing and 134 seconds without it. The F-measure, which combines precision and recall, is 0.7387 with indexing (using 0.15 as the index elimination threshold) and 0.000428 without indexing.
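The pipeline named in the abstract (winnowing fingerprints, an inverted index used to filter candidates, then LCS matching) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: a Python dict stands in for the 2-3 tree, the k-gram length k, window size w, CRC32 hash, and all function names are hypothetical choices, and the way the 0.15 threshold is applied (fraction of query fingerprints shared with a document) is an assumption, since the abstract does not define it.

```python
import zlib
from collections import defaultdict


def winnow(text, k=5, w=4):
    """Return the set of winnowing fingerprints of text.

    Hashes every k-gram of the normalized text, then keeps the minimum
    hash of each window of w consecutive hashes. k, w and the CRC32 hash
    are illustrative assumptions.
    """
    s = "".join(ch.lower() for ch in text if ch.isalnum())
    hashes = [zlib.crc32(s[i:i + k].encode()) for i in range(len(s) - k + 1)]
    if not hashes:
        return set()
    # At least one window, even when there are fewer than w hashes.
    return {min(hashes[i:i + w]) for i in range(max(1, len(hashes) - w + 1))}


def build_index(docs, k=5, w=4):
    """Inverted index: fingerprint -> set of document ids.

    A dict is used here in place of the 2-3 tree described in the abstract.
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for fp in winnow(text, k, w):
            index[fp].add(doc_id)
    return index


def candidates(query, index, threshold=0.15, k=5, w=4):
    """Ids of documents sharing at least `threshold` of the query fingerprints.

    This interpretation of the elimination threshold is an assumption.
    """
    q = winnow(query, k, w)
    if not q:
        return []
    hits = defaultdict(int)
    for fp in q:
        for doc_id in index.get(fp, ()):
            hits[doc_id] += 1
    return sorted(d for d, n in hits.items() if n / len(q) >= threshold)


def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


docs = [
    "The quick brown fox jumps over the lazy dog today",
    "zzzz yyyy xxxx wwww vvvv uuuu",
]
query = "the quick brown fox jumps over the lazy dog"
index = build_index(docs)
# Only documents that survive the index filter are passed to the costly LCS step.
for doc_id in candidates(query, index):
    print(doc_id, lcs_length(query, docs[doc_id]))
```

The point of the filter is that LCS, which costs time proportional to the product of the two document lengths, runs only on the surviving candidates rather than on the whole corpus.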
Keywords :
copyright; document handling; indexing; query processing; tree data structures; LCS algorithm; antiplagiarism system; data structure 2-3 tree; document database reduction; document fingerprint; indexing elimination threshold; indexing method; inverted index; irrelevant documents elimination; longest common subsequence algorithm; performance efficiency; plagiarism indication detection system; query document; winnowing algorithm; Communications technology; Educational institutions; Fingerprint recognition; Indexing; Plagiarism; 2–3 tree; corpus; fingerprint; indexing; plagiarism detection system;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information and Communication Technology (ICoICT), 2014 2nd International Conference on
Conference_Location :
Bandung
Type :
conf
DOI :
10.1109/ICoICT.2014.6914096
Filename :
6914096