Domain-Specific Knowledge Base Enrichment Using Wikipedia Tables

Author

Chenwei Ran;Wei Shen;Jianyong Wang;Xuan Zhu

Author_Institution

Dept. of Comput. Sci. &

fYear

2015

Firstpage

349

Lastpage

358

Abstract

A knowledge base is a machine-readable collection of knowledge. A growing number of large-scale, multi-domain knowledge bases have emerged in recent years, and they play an essential role in many information systems and semantic annotation tasks. However, no knowledge base is perfect, and perhaps none ever will be, because every knowledge base has limited coverage while new knowledge continues to emerge. Populating and enriching existing knowledge bases has therefore become an important task. Traditional knowledge base population usually leverages information embedded in unstructured free text. Recently, researchers have found that the massive number of structured tables on the Web provide high-quality relational data that is easier to use than unstructured text. The goal of this paper is to enrich a knowledge base using Wikipedia tables. Here, knowledge means binary relations between entities, and we focus on relations in specific domains. Two basic types of information can be used in this task: existing relation instances and the connection between types and relations. We first propose two basic probabilistic models, one based on each type of information. We then propose a lightweight aggregated model that combines the advantages of the two basic models. Experimental results show that our method enriches the knowledge base effectively, with both high precision and high recall.

Keywords

"Knowledge based systems","Encyclopedias","Electronic publishing","Internet","Semantics","Databases"

Publisher

IEEE

Conference_Titel

Data Mining (ICDM), 2015 IEEE International Conference on

ISSN

1550-4786

Type

conf

DOI

10.1109/ICDM.2015.124

Filename

7373339