Title :
Plagiarism detection in Chinese based on chunk and paragraph weight
Author :
Wang, Tao ; Fan, Xiao-Zhong ; Liu, Jie
Author_Institution :
Dept. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing
Abstract :
To detect plagiarism in Chinese academic papers, this paper proposes a chunk-based plagiarism detection algorithm, with chunks extracted at either the character or the word level. Because different parts of a document differ in importance, it also proposes two paragraph weight algorithms and defines three paragraph weight functions. The best chunk lengths are determined experimentally. The experiments show that using paragraph weights improves detection performance.
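The abstract does not give the chunk lengths, the paragraph weight functions, or the scoring formula, so the following is only a minimal sketch of the general idea under stated assumptions. It extracts character-level chunks as overlapping n-grams and averages the per-paragraph overlap with a source text, weighted by paragraph position. The names `char_chunks`, `position_weight`, and `weighted_overlap`, the default chunk length of 3, and the rule that the first and last paragraphs count double are all hypothetical and are not taken from the paper.

```python
def char_chunks(text, n):
    """Return the set of overlapping character n-grams (chunks) in text.

    Whitespace is dropped first; Chinese text needs no word segmentation
    for character-level chunks.
    """
    chars = [c for c in text if not c.isspace()]
    return {"".join(chars[i:i + n]) for i in range(len(chars) - n + 1)}


def position_weight(index, total):
    """Hypothetical weight function: first and last paragraphs count double.

    This is an illustrative assumption, not one of the paper's three functions.
    """
    return 2.0 if index in (0, total - 1) else 1.0


def weighted_overlap(suspect_paragraphs, source_text, n=3):
    """Weighted average of the share of each paragraph's chunks found in the source."""
    source_chunks = char_chunks(source_text, n)
    total = len(suspect_paragraphs)
    score = 0.0
    weight_sum = 0.0
    for i, paragraph in enumerate(suspect_paragraphs):
        chunks = char_chunks(paragraph, n)
        if not chunks:
            continue  # paragraph shorter than one chunk contributes nothing
        w = position_weight(i, total)
        score += w * len(chunks & source_chunks) / len(chunks)
        weight_sum += w
    return score / weight_sum if weight_sum else 0.0
```

A score near 1 means most chunks of the heavily weighted paragraphs also occur in the source; a word-level variant would replace `char_chunks` with n-grams over segmented words.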
Keywords :
natural language processing; text analysis; Chinese language; chunk extraction; paragraph weight; plagiarism detection; Cybernetics; Detection algorithms; Fingers; Information retrieval; Machine learning; Paper technology; Plagiarism; Printing; Probability; Space technology; Text chunk
Conference_Title :
Machine Learning and Cybernetics, 2008 International Conference on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-2095-7
Electronic_ISBN :
978-1-4244-2096-4
DOI :
10.1109/ICMLC.2008.4620842