Title :
Copy detection mechanism for documents using position based weighted scheme
Author :
Sharma, Ritu ; Sharma, Divya
Author_Institution :
Dept. of CEA, GLA Univ., Mathura, India
Abstract :
Nowadays, most information is readily available in digital form as documents on the Internet, which makes those documents easy to access and copy. Plagiarism occurs when someone's content or ideas are copied without permission or citation. This is a serious problem because it undermines the sharing of important information among legitimate users. In this paper, we develop a system that detects pirated or copied documents using a position-based weighted scheme. The scheme first extracts the keywords from every sentence and then assigns a weight to each keyword. To cover the whole sentence, we apply a first- and last-position-based scheme to all extracted keywords. Next, the weight of each extracted keyword is assigned to its related synonyms or antonyms with the help of the WordNet 2.0 dataset. Finally, we apply the cosine similarity measure to detect similarity between two documents, using a static database. In experiments on the Pan Plagiarism Corpus-9 datasets, the results show better time efficiency and the best performance compared with existing copy detection mechanisms in terms of Precision, Recall, and F-measure.
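The pipeline described in the abstract (keyword extraction, position-based weighting, synonym mapping, cosine similarity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the stop-word list, the `edge_bonus` weight for first and last keywords, and the hand-written `synonyms` dictionary standing in for a WordNet lookup are all assumptions made for illustration, and antonym handling is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the paper's keyword extractor is not specified here.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "on", "it"}


def sentence_keywords(sentence):
    """Lowercase the sentence, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def weighted_vector(text, edge_bonus=2.0, synonyms=None):
    """Build a keyword-weight vector for a document.

    Assumption: the first and last keyword of each sentence get `edge_bonus`,
    all other keywords get 1.0. `synonyms` maps a word to a canonical form and
    is a stand-in for a WordNet lookup.
    """
    synonyms = synonyms or {}
    weights = Counter()
    for sentence in re.split(r"[.!?]+", text):
        keywords = sentence_keywords(sentence)
        last = len(keywords) - 1
        for i, kw in enumerate(keywords):
            weight = edge_bonus if i in (0, last) else 1.0
            # Synonyms share one vector dimension, so they share the weight.
            weights[synonyms.get(kw, kw)] += weight
    return weights


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse weight vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    syn = {"car": "automobile"}  # hypothetical synonym map
    original = weighted_vector("A fast car won the race.", synonyms=syn)
    suspect = weighted_vector("A fast automobile won the race.", synonyms=syn)
    print(cosine_similarity(original, suspect))
```

In this sketch a score near 1 suggests that the suspect document reuses the same weighted keywords as the original, while a score of 0 means the two share no keywords. Any decision threshold would have to be tuned on a labelled corpus.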
Keywords :
Internet; computer crime; copy protection; information retrieval; text analysis; F-measure; Internet; Pan Plagiarism Corpus-9 datasets; Word Net dataset 2.0; antonyms; copy detection mechanism; cosine similarity measure; digital form; documents access; documents copy; information sharing; keywords extraction; piracy detection; plagiarism; position based weighted scheme; sentences; similarity detection; static database; synonyms; Databases; Feature extraction; Information technology; Next generation networking; Plagiarism; Pragmatics; Semantics; Copy detection mechanism; keyword extraction; similarity measure; weighted scheme;
Conference_Title :
Confluence The Next Generation Information Technology Summit (Confluence), 2014 5th International Conference -
Conference_Location :
Noida
Print_ISBN :
978-1-4799-4237-4
DOI :
10.1109/CONFLUENCE.2014.6949382