Title :
Winnowing-Based Similar Text Positioning Method
Author :
Du Zou ; Long, WeiJiang ; Ling, Zhang
Author_Institution :
Sch. of Comput. Sci. & Eng., South China Univ. of Technol., Guangzhou, China
Abstract :
Similar text positioning is a key step in plagiarism detection: it determines where similar passages occur in the documents. A 2-step approximate merging method is proposed as follows: heuristic approximate merging is used to reduce errors in processing the sampled text fingerprints; clustering is used to reduce the influence of noisy information on text positioning when merging adjacent segments; and a non-overlapping reverse index is used to position the similar texts. The method is applied in the homework plagiarism module of a learning platform for higher education. With the PAN'09 public plagiarism corpus as the benchmark, the principal performance indexes are better than those of reported fingerprint-based methods and commercial software.
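For readers unfamiliar with the underlying technique, the following is a minimal sketch of generic winnowing fingerprint selection, followed by a simple gap-based merge of matched positions. It is not the authors' implementation: the abstract does not specify the hash function, the k-gram length, the window size, or the merging heuristic, so `zlib.crc32`, `k`, `w`, `max_gap`, and the function names below are all illustrative assumptions.

```python
import zlib


def kgram_hashes(text, k):
    """Return (hash, position) for every k-gram in text (CRC32 is an assumed hash)."""
    return [(zlib.crc32(text[i:i + k].encode("utf-8")), i)
            for i in range(len(text) - k + 1)]


def winnow(text, k=5, w=4):
    """Generic winnowing: in each window of w consecutive k-gram hashes,
    select the minimum hash (rightmost on ties) and record it once."""
    hashes = kgram_hashes(text, k)
    fingerprints = []
    for start in range(max(len(hashes) - w + 1, 0)):
        window = hashes[start:start + w]
        # Smallest hash wins; among equal hashes the larger position wins.
        selected = min(window, key=lambda hp: (hp[0], -hp[1]))
        # Selected positions never move left, so comparing with the last
        # recorded fingerprint is enough to avoid duplicates.
        if not fingerprints or fingerprints[-1] != selected:
            fingerprints.append(selected)
    return fingerprints


def merge_positions(positions, max_gap):
    """Hypothetical stand-in for approximate merging: join matched positions
    whose distance is at most max_gap into (start, end) segments."""
    segments = []
    for p in sorted(positions):
        if segments and p - segments[-1][1] <= max_gap:
            segments[-1][1] = p
        else:
            segments.append([p, p])
    return [tuple(s) for s in segments]
```

Under these assumptions, two documents that share a passage of at least `w + k - 1` characters are guaranteed to share at least one fingerprint hash. The positions of the shared fingerprints can then be passed to `merge_positions` to obtain candidate similar segments.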
Keywords :
further education; merging; pattern clustering; security of data; text analysis; word processing; 2-step approximate merging method; PAN'09 public plagiarism corpus; clustering method; error reduction; finger sampling; higher education; learning platform; performance index; plagiarism detection; text processing; winnowing based similar text positioning method; Computer science; Conferences; Educational institutions; Fingerprint recognition; Merging; Plagiarism; Software;
Conference_Title :
Internet Technology and Applications, 2010 International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-5142-5
Electronic_ISBN :
978-1-4244-5143-2
DOI :
10.1109/ITAPP.2010.5566138