Title :
A PROV-O based approach to web content provenance
Author_Institution :
Economic Management Department, Beijing Institute of Petrochemical Technology, 102617, China
Date :
7/1/2015
Abstract :
Data provenance is an active research topic, yet many webpages still lack provenance annotation. PROV-O is a W3C Recommendation that expresses the PROV provenance data model as an ontology. In this paper, by analyzing how web documents are derived from one another, we define a document as an entity and extract a number of semantic properties describing its features. A semantic similarity clustering method is used to determine the relationships among documents as they change. With the aid of PROV-O, variations in feature words and the person responsible for them can then be identified. Finally, we verify the proposed approach using “genetically modified” news webpages as test documents.
Keywords :
"Semantics","Metadata","Vocabulary","Ontologies","Feature extraction","Dictionaries","Clustering algorithms"
Conference_Title :
Logistics, Informatics and Service Sciences (LISS), 2015 International Conference on
DOI :
10.1109/LISS.2015.7369688