• DocumentCode
    658357
  • Title
    Domain Specific Facts Extraction Using Weakly Supervised Active Learning Approach
  • Author
    Pande, Vijae ; Mukherjee, Tridib ; Varma, Vasudeva
  • Author_Institution
    IIIT Hyderabad, Hyderabad, India
  • Volume
    1
  • fYear
    2013
  • fDate
    17-20 Nov. 2013
  • Firstpage
    246
  • Lastpage
    251
  • Abstract
    An ontology is defined by concepts and the relationships between them. In this paper, we focus on the second problem: relation extraction from plain text. Generic knowledge bases such as YAGO, Freebase, and DBpedia have made huge collections of facts and their properties from various domains accessible. However, acquiring and maintaining facts and their relations from a domain-specific corpus is an important and challenging task because little annotated data is available. Here, we propose a label-propagation-based semi-supervised approach for relation extraction that chooses the most informative instances for annotation. We also propose a weakly supervised approach to data annotation using generic ontologies such as Freebase, which further reduces the cost of annotating data manually. We checked the efficiency of our approach by performing experiments on various domain-specific corpora.
  • Keywords
    learning (artificial intelligence); ontologies (artificial intelligence); text analysis; Freebase; data annotation; domain specific facts extraction; generic ontologies; label propagation based semisupervised approach; plain text; relation extraction; weakly supervised active learning approach; Data mining; Feature extraction; Knowledge based systems; Labeling; Semantics; Training; Training data; Ontology; Relation Extraction; Weakly Supervised Approach;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2013 IEEE/WIC/ACM International Joint Conferences on
  • Conference_Location
    Atlanta, GA
  • Print_ISBN
    978-1-4799-2902-3
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2013.36
  • Filename
    6690022