DocumentCode :
476209
Title :
Plagiarism detection in Chinese based on chunk and paragraph weight
Author :
Wang, Tao ; Fan, Xiao-Zhong ; Liu, Jie
Author_Institution :
Dept. of Comput. Sci. & Technol., Beijing Inst. of Technol., Beijing
Volume :
5
fYear :
2008
fDate :
12-15 July 2008
Firstpage :
2574
Lastpage :
2579
Abstract :
To detect plagiarism in Chinese academic papers, the authors propose a chunk-based detection algorithm in which chunks are extracted at either the character or the word level. Because different parts of a document differ in importance, they also propose two paragraph-weighting algorithms and define three paragraph-weight functions. The best chunk lengths are determined experimentally, and the experiments show that paragraph weighting improves detection performance.
Keywords :
natural language processing; text analysis; Chinese language; chunk extraction; paragraph weight; plagiarism detection; Cybernetics; Detection algorithms; Fingers; Information retrieval; Machine learning; Paper technology; Plagiarism; Printing; Probability; Space technology; Text chunk
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics, 2008 International Conference on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-2095-7
Electronic_ISBN :
978-1-4244-2096-4
Type :
conf
DOI :
10.1109/ICMLC.2008.4620842
Filename :
4620842
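Illustration :
The abstract names character-level chunk extraction and paragraph weighting without giving formulas, so the following is only a minimal sketch of the general idea and not the authors' algorithm. The function names, the chunk length n, and the rule that the first and last paragraphs count double are all assumptions made for illustration; the paper's own two weighting algorithms and three weight functions are not reproduced here.

```python
from typing import Callable, List, Set


def char_chunks(text: str, n: int = 4) -> Set[str]:
    """Return the set of overlapping character n-grams (whitespace removed)."""
    s = "".join(text.split())
    if len(s) < n:
        # Text shorter than one chunk: treat the whole string as a single chunk.
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def position_weight(index: int, total: int) -> float:
    """Hypothetical weight function: first and last paragraphs count double."""
    return 2.0 if index in (0, total - 1) else 1.0


def weighted_overlap(
    suspect: List[str],
    source: List[str],
    n: int = 4,
    weight: Callable[[int, int], float] = position_weight,
) -> float:
    """Weighted share of the suspect's chunks that also occur in the source.

    Each suspect paragraph contributes the fraction of its chunks found
    anywhere in the source, scaled by that paragraph's weight.
    """
    source_chunks: Set[str] = set()
    for paragraph in source:
        source_chunks |= char_chunks(paragraph, n)

    weighted_sum = 0.0
    weight_total = 0.0
    for i, paragraph in enumerate(suspect):
        chunks = char_chunks(paragraph, n)
        if not chunks:
            continue  # skip empty paragraphs
        w = weight(i, len(suspect))
        weighted_sum += w * len(chunks & source_chunks) / len(chunks)
        weight_total += w
    return weighted_sum / weight_total if weight_total else 0.0
```

Calling weighted_overlap(suspect_paragraphs, source_paragraphs, n=4) returns a score between 0 and 1. Passing a different weight callable lets alternative paragraph-weight functions be compared on the same data, and varying n mirrors the abstract's point that the best chunk length has to be found by experiment.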