Regional Information Center for Science and Technology - Inferencing in information extraction: Techniques and applications

DocumentCode :

2721502

Title :

Inferencing in information extraction: Techniques and applications

Author :

Barbosa, Denilson ; Haixun Wang ; Cong Yu

Author_Institution :

Univ. of Alberta, Edmonton, AB, Canada

fYear :

2015

fDate :

13-17 April 2015

Firstpage :

1534

Lastpage :

1537

Abstract :

Information extraction at Web scale has become one of the most important research topics in data management since major commercial search engines started incorporating knowledge in their search results a couple of years ago [1]. Users increasingly expect structured knowledge as answers to their search needs. Using Bing as an example, the result page for “Lionel Messi” is full of structured knowledge facts, such as his birthday and awards. The research efforts towards improving the accuracy and coverage of such knowledge bases have led to significant advances in Information Extraction techniques [2], [3]. As the initial challenge of accurately extracting facts for popular entities is being addressed, more difficult challenges have emerged, such as extending knowledge coverage to long-tail entities and domains, understanding the interestingness and usefulness of facts within a given context, and addressing information-seeking needs more directly and accurately. In this tutorial, we will survey the recent research efforts and provide an introduction to the techniques that address those challenges, and the applications that benefit from the adoption of those techniques. In particular, this tutorial will focus on a variety of techniques that can be broadly viewed as knowledge inferencing, i.e., combining multiple data sources and extraction techniques to verify existing knowledge and derive new knowledge. More specifically, we focus on four main categories of inferencing techniques: 1) deep natural language processing using machine learning techniques, 2) data cleaning using integrity constraints, 3) large-scale probabilistic reasoning, and 4) leveraging human expertise for domain knowledge extraction.

Keywords :

Internet; data integrity; inference mechanisms; information retrieval; learning (artificial intelligence); natural language processing; search engines; Bing; Lionel Messi; Web scale; commercial search engines; data cleaning; data management; deep natural language processing; domain knowledge extraction; fact extraction; human expertise leverage; information extraction techniques; information-seeking needs; integrity constraints; knowledge coverage; knowledge inferencing; large-scale probabilistic reasoning; machine learning techniques; multiple data sources; Cleaning; Data mining; Google; Information retrieval; Knowledge based systems; Knowledge engineering; Tutorials;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Data Engineering (ICDE), 2015 IEEE 31st International Conference on

Conference_Location :

Seoul

Type :

conf

DOI :

10.1109/ICDE.2015.7113420

Filename :

7113420

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2721502