• DocumentCode
    2052322
  • Title
    Identifying Patterns in Texts
  • Author
    Huang, Minhua; Haralick, Robert M.
  • Author_Institution
    Dept. of Comput. Sci., City Univ. of New York, New York, NY, USA
  • fYear
    2009
  • fDate
    14-16 Sept. 2009
  • Firstpage
    59
  • Lastpage
    64
  • Abstract
    We discuss a probabilistic graphical model for recognizing patterns in texts. It is derived from the probability function for a sequence of categories given a sequence of symbols under two reasonable conditional independence assumptions and represented by a product of combinations of conditional and marginal probability functions. The novelty of our model is that it has a mathematical representation which is completely different from existing graphical models such as CRFs, HMMs, and MEMMs. Moreover, it can be used for identifying various patterns in texts. Up to now, we have used this model for recognizing NP chunks and senses of a polysemous word in sentences. This model has achieved very promising results on standard data sets. In the future, we will use this model for extracting semantic roles in a sentence.
  • Keywords
    pattern recognition; text analysis; NP chunks; mathematical representation; pattern identification; polysemous word; probabilistic graphical model; probability function; text patterns; Computer science; Data mining; Graphical models; Hidden Markov models; Labeling; Mathematical model; Pattern recognition; Testing; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing, 2009. ICSC '09. IEEE International Conference on
  • Conference_Location
    Berkeley, CA
  • Print_ISBN
    978-1-4244-4962-0
  • Electronic_ISBN
    978-0-7695-3800-6
  • Type
    conf
  • DOI
    10.1109/ICSC.2009.22
  • Filename
    5298562