DocumentCode :
1627597
Title :
Intrinsic Plagiarism Detection Using Latent Semantic Indexing and Stylometry
Author :
Alsallal, Muna ; Iqbal, Rahat ; Amin, Saad ; James, Anne
Author_Institution :
Fac. of Eng. & Comput., Coventry Univ., Coventry, UK
fYear :
2013
Firstpage :
145
Lastpage :
150
Abstract :
Plagiarism has grown steadily over the last few years, driven by the rapid proliferation of information through the World Wide Web (WWW). In this paper, we present an integrated approach to intrinsic plagiarism detection based on Latent Semantic Indexing (LSI) and stylometry. LSI is applied to the term-document matrix of the dataset, whereas stylometry is used to approximate human writing style intrinsically. We conducted a series of experiments on a small corpus to investigate the effect of the dimensionality reduction (DR) parameter, which is central to the LSI technique. We then carried out a comparative evaluation of our approach by applying LSI and stylometry separately and then together. Our results show that performance improved when the integrated approach combining LSI and stylometry was applied.
Keywords :
Internet; document handling; indexing; DR parameter; LSI; WWW; World Wide Web; dimensionality reduction parameter; human writing style; intrinsic approximation; intrinsic plagiarism detection; latent semantic indexing; stylometry; term document matrix; Indexing; Large scale integration; Matrix decomposition; Plagiarism; Semantics; Writing; Extrinsic Plagiarism; Intrinsic Plagiarism; Latent Semantic Indexing (LSI); Plagiarism; Stylometry Technique; Text Misuse;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Developments in eSystems Engineering (DeSE), 2013 Sixth International Conference on
Conference_Location :
Abu Dhabi
ISSN :
2161-1343
Print_ISBN :
978-1-4799-5263-2
Type :
conf
DOI :
10.1109/DeSE.2013.34
Filename :
7041107