DocumentCode :
2286431
Title :
Original content extraction oriented to anti-plagiarism
Author :
Shen, Yang ; Cheng, Ming ; Yao, Xing ; Wei, Wei
Author_Institution :
Sch. of Inf. Manage., Wuhan Univ., Wuhan, China
fYear :
2009
fDate :
14-16 Sept. 2009
Firstpage :
17
Lastpage :
22
Abstract :
To reduce the impact of citations and references on plagiarism detection in academic theses, and to extract the original content, the authors propose three methods for removing cited material: 1) removal of normative citations by symbol features; 2) removal of tacit citations by a minimum-risk Bayesian method combined with thesis structure; 3) removal of common knowledge based on a domain public knowledge base. The results show that, during extraction of original content, precision decreases and recall increases as the risk coefficient increases. Overall performance is best when the risk coefficient is 60. Extracting the original content before plagiarism detection reduces the fault rate from 9.09% to 4.52%.
Keywords :
belief networks; citation analysis; information retrieval; Bayesian method; content extraction; normative citations removal; plagiarism detection; tacit citations removal; Conference management; Content management; Data mining; Engineering management; Knowledge management; Plagiarism; Prototypes; Risk management; Software libraries; Web pages; Bayes; citation removal; content extraction; plagiarism; thesis structure
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Management Science and Engineering, 2009. ICMSE 2009. International Conference on
Conference_Location :
Moscow
Print_ISBN :
978-1-4244-3970-6
Electronic_ISBN :
978-1-4244-3971-3
Type :
conf
DOI :
10.1109/ICMSE.2009.5317530
Filename :
5317530