• DocumentCode
    2721502
  • Title

    Inferencing in information extraction: Techniques and applications

  • Author

    Barbosa, Denilson ; Haixun Wang ; Cong Yu

  • Author_Institution
    Univ. of Alberta, Edmonton, AB, Canada
  • fYear
    2015
  • fDate
    13-17 April 2015
  • Firstpage
    1534
  • Lastpage
    1537
  • Abstract
    Information extraction at Web scale has become one of the most important research topics in data management since major commercial search engines started incorporating knowledge in their search results a couple of years ago [1]. Users increasingly expect structured knowledge as answers to their search needs. Using Bing as an example, the result page for “Lionel Messi” is full of structured knowledge facts, such as his birthday and awards. The research efforts towards improving the accuracy and coverage of such knowledge bases have led to significant advances in Information Extraction techniques [2], [3]. As the initial challenge of accurately extracting facts for popular entities is being addressed, more difficult challenges have emerged, such as extending knowledge coverage to long-tail entities and domains, understanding the interestingness and usefulness of facts within a given context, and addressing information-seeking needs more directly and accurately. In this tutorial, we will survey the recent research efforts and provide an introduction to the techniques that address those challenges, and the applications that benefit from the adoption of those techniques. In particular, this tutorial will focus on a variety of techniques that can be broadly viewed as knowledge inferencing, i.e., combining multiple data sources and extraction techniques to verify existing knowledge and derive new knowledge. More specifically, we focus on four main categories of inferencing techniques: 1) deep natural language processing using machine learning techniques, 2) data cleaning using integrity constraints, 3) large-scale probabilistic reasoning, and 4) leveraging human expertise for domain knowledge extraction.
  • Keywords
    Internet; data integrity; inference mechanisms; information retrieval; learning (artificial intelligence); natural language processing; search engines; Bing; Lionel Messi; Web scale; commercial search engines; data cleaning; data management; deep natural language processing; domain knowledge extraction; fact extraction; human expertise leverage; information extraction techniques; information-seeking needs; integrity constraints; knowledge coverage; knowledge inferencing; large-scale probabilistic reasoning; machine learning techniques; multiple data sources; Cleaning; Data mining; Google; Information retrieval; Knowledge based systems; Knowledge engineering; Tutorials;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2015 IEEE 31st International Conference on
  • Conference_Location
    Seoul
  • Type

    conf

  • DOI
    10.1109/ICDE.2015.7113420
  • Filename
    7113420