  • DocumentCode
    2057923
  • Title
    A New Framework for Textual Information Mining over Parse Trees
  • Author
    Mousavi, Hamid; Kerr, Deirdre; Iseli, Markus
  • Author_Institution
    CSD & CRESST, UCLA, Los Angeles, CA, USA
  • fYear
    2011
  • fDate
    18-21 Sept. 2011
  • Firstpage
    185
  • Lastpage
    188
  • Abstract
    This paper introduces a new text mining framework using a tree-based Linguistic Query Language, called LQL. The framework generates more than one parse tree for each sentence using a probabilistic parser, and annotates each node of these parse trees with main-parts information, which is a set of key terms from the node's branch selected according to the branch's linguistic structure. Using main-parts-annotated parse trees, the system can efficiently answer individual queries as well as mine the text for a given set of queries. The framework can also handle grammatical ambiguity through probabilistic rules and linguistic exceptions.
  • Keywords
    computational linguistics; data mining; grammars; probability; query languages; text analysis; trees (mathematics); grammatical ambiguity; linguistic exception; main-parts-annotated parse trees; probabilistic parser; probabilistic rules; textual information mining; tree-based Linguistic Query Language; Data mining; Database languages; Engines; Pattern matching; Pragmatics; Probabilistic logic; Senior citizens;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on
  • Conference_Location
    Palo Alto, CA
  • Print_ISBN
    978-1-4577-1648-5
  • Electronic_ISBN
    978-0-7695-4492-2
  • Type
    conf
  • DOI
    10.1109/ICSC.2011.19
  • Filename
    6061351