DocumentCode
1804925
Title
Approximate SPARQL for error tolerant queries on the DBpedia knowledge base
Author
Tauer, Gregory ; Rudnicki, Ronald ; Sudit, Moises
Author_Institution
CUBRC, Buffalo, NY, USA
fYear
2013
fDate
9-12 July 2013
Firstpage
850
Lastpage
856
Abstract
The Resource Description Framework (RDF), a language for describing resources, is increasingly used in information fusion systems. SPARQL is a standard query language that enables knowledge extraction from data encoded in RDF. A SPARQL query is, in essence, an exact subgraph matching problem. Unfortunately, many of the techniques that produce data in RDF (such as manual data entry, social network analysis, and natural language processing) make annotation mistakes, which result in dirty RDF data. SPARQL performs suboptimally on RDF data containing errors since, as an exact graph matching tool, it is not designed to cope with noisy data. To improve knowledge extraction under these conditions, we propose an extension to SPARQL that permits approximate graph matches. This allows queries to cope with errors in the RDF graph, both at the attribute level (such as misspelled names) and at the structural level (missing or extra edges). We use the TruST heuristic algorithm to solve the underlying approximate graph matching problem and demonstrate the benefit it brings to answering questions on the DBpedia knowledge base.
Keywords
SQL; database management systems; graph theory; knowledge acquisition; knowledge based systems; pattern matching; sensor fusion; DBpedia knowledge base; RDF graph; RDF language; Resource Description Framework; SPARQL query; TruST heuristic algorithm; annotation mistakes; approximate graph matching problem; attribute level; dirty RDF data; error tolerant queries; graph matching tool; information fusion systems; knowledge extraction; noisy data; standard query language; structural level; subgraph matching problem; Cities and towns; Inductors; Power generation; Resource description framework; Rivers; Sociology; Statistics;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Fusion (FUSION), 2013 16th International Conference on
Conference_Location
Istanbul
Print_ISBN
978-605-86311-1-3
Type
conf
Filename
6641082