• Title of article

    A formal approach to subgrammar extraction for NLP

  • Author/Authors

    Vlado Keselj، نويسنده , , Vlado and Cercone، نويسنده , , Nick، نويسنده ,

  • Issue Information
    روزنامه با شماره پیاپی سال 2007
  • Pages
    10
  • From page
    394
  • To page
    403
  • Abstract
    The problem of subgrammar extraction is precisely defined in the following way: Given a grammar G and a set of words W , find a smallest subgrammar of G that accepts the same set of sentences from W ∗ as G , and for each of them produces the same parse trees. In practical Natural Language Processing applications, the set of words  W is obtained from the text unit. There are practical motivations for doing this operation “just-in-time”, i.e. just before processing the text; hence it is required that this operation be done in an automatic and efficient way. After defining the problem in the general framework, we discuss the problem for context-free grammars (CFG), and give an efficient algorithm for it. We prove that finding the smallest subgrammar for HPSGs is an NP-hard problem, and give an efficient algorithm that solves an easier, approximate version of the problem. We also discuss how the algorithm can be efficiently implemented.
  • Keywords
    Parsing , Subgrammar extraction , Grammars , NLP , HPSG
  • Journal title
    Mathematical and Computer Modelling
  • Serial Year
    2007
  • Journal title
    Mathematical and Computer Modelling
  • Record number

    1594396