Title :
Big RDF data cleaning
Author_Institution :
Qatar Comput. Res. Inst., Doha, Qatar
Abstract :
Data cleaning has long played an important part in data management and data analytics. High-quality data is crucial for businesses that rely on data-driven decision making, especially in the information age and the era of big data. Resource Description Framework (RDF) is a standard model for data interchange on the semantic web. However, RDF data is known to be dirty, since much of it is automatically extracted from the web. In this paper, we first revisit the data quality problems that appear in RDF data. Although much effort has been put into cleaning RDF data, most of it unfortunately relies on laborious manual evaluation. We also describe possible solutions that shed light on (semi-)automatically cleaning (big) RDF data.
Keywords :
Big Data; data analysis; database management systems; decision making; electronic data interchange; semantic Web; big RDF data cleaning; data analytics; data driven decision making; data interchange; data management; data quality problem; information age; resource description framework; Cleaning; Conferences; Data mining; Databases; Knowledge based systems; Ontologies; Resource description framework
Conference_Title :
2015 31st IEEE International Conference on Data Engineering Workshops (ICDEW)
Conference_Location :
Seoul
DOI :
10.1109/ICDEW.2015.7129549