A PROV-O based approach to web content provenance

Author

Ni Jing

Author_Institution

Economic Management Department, Beijing Institute of Petrochemical Technology, 102617, China

fYear

2015

fDate

7/1/2015 12:00:00 AM

Firstpage

Lastpage

Abstract

Data provenance is an active research topic, yet many webpages still lack provenance annotations. PROV-O is a W3C Recommendation that provides an ontology for the PROV provenance data model. In this paper, by analyzing how web documents are derived from one another, we model each document as an entity and extract a set of semantic properties describing its features. A semantic similarity clustering method is used to determine how documents are related as they change. With the aid of PROV-O, variations in feature words and the person responsible for each change can then be identified. We verify the proposed approach using “genetically modified” news webpages as test documents.
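
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the clustering method, so plain Jaccard overlap of feature words stands in for semantic similarity, and the function names, the `ex:` identifiers, and the 0.3 threshold are all assumptions made for illustration. Only the terms `prov:Entity`, `prov:wasAttributedTo`, and `prov:wasDerivedFrom` come from PROV-O itself.

```python
# Illustrative sketch only: Jaccard overlap is a stand-in for the paper's
# (unspecified) semantic similarity clustering; names and threshold are assumed.

def jaccard(a, b):
    """Jaccard overlap between two collections of feature words."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def derivation_triples(docs, threshold=0.3):
    """Build PROV-O style (subject, predicate, object) triples.

    docs: list of dicts with 'uri', 'author', 'words', ordered oldest first.
    Each document becomes a prov:Entity attributed to its author; it is linked
    to the most similar earlier document if the overlap reaches the threshold.
    """
    triples = []
    for i, doc in enumerate(docs):
        triples.append((doc["uri"], "rdf:type", "prov:Entity"))
        triples.append((doc["uri"], "prov:wasAttributedTo", doc["author"]))
        best, best_score = None, threshold
        for earlier in docs[:i]:
            score = jaccard(doc["words"], earlier["words"])
            if score >= best_score:
                best, best_score = earlier, score
        if best is not None:
            triples.append((doc["uri"], "prov:wasDerivedFrom", best["uri"]))
    return triples


def word_variation(old_words, new_words):
    """Return (added, removed) feature words between two document versions."""
    old, new = set(old_words), set(new_words)
    return sorted(new - old), sorted(old - new)
```

Given a chronologically ordered list of documents, `derivation_triples` yields the attribution and derivation statements, and `word_variation` reports which feature words changed between a document and the one it was derived from.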

Keywords

"Semantics","Metadata","Vocabulary","Ontologies","Feature extraction","Dictionaries","Clustering algorithms"

Publisher

IEEE

Conference_Title

Logistics, Informatics and Service Sciences (LISS), 2015 International Conference on

Type

conf

DOI

10.1109/LISS.2015.7369688

Filename

7369688

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3721408