• DocumentCode
    3433661
  • Title
    Document knowledge representation using description logics for information extraction and querying
  • Author
    Manjula, D. ; Aghila, G. ; Geetha, T.V.
  • Author_Institution
    Sch. of Comput. Sci. & Eng., Anna Univ., India
  • fYear
    2003
  • fDate
    28-30 April 2003
  • Firstpage
    189
  • Lastpage
    193
  • Abstract
    Document representation is an important aspect of both information retrieval and information extraction. In this paper, the bag-of-words representation of documents is enriched with lexical, conceptual and contextual relationships. To represent this enriched structure effectively and to perform inference over it, a mathematical model is required. In this work, the enhanced, interrelated bag of words, which is semantically a lattice of words, is represented using description logic (DL). The conceptual taxonomy extracted using WordNet can be represented naturally in DL. DL also provides consistency, satisfiability, instance checking and subsumption services, which are used to semantically extend the initial enhanced lattice of words. Contextual knowledge can be extracted using syntactic patterns, dependency relations and heuristics. The tourist and sports domains are used for the implementation of this work. The knowledge extracted and represented in this way can be used in applications such as information retrieval and information extraction.
  • Keywords
    formal logic; information retrieval; knowledge representation; word processing; DL; WordNet; bag of words representation; contextual knowledge; dependency relations; description logics; document knowledge representation; information extraction; instance checking; mathematical model; subsumption services; syntactic patterns; Data mining; Frequency; Information analysis; Lattices; Logic; Statistics; Taxonomy
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology: Coding and Computing [Computers and Communications], 2003. Proceedings. ITCC 2003. International Conference on
  • Print_ISBN
    0-7695-1916-4
  • Type
    conf
  • DOI
    10.1109/ITCC.2003.1197524
  • Filename
    1197524