DocumentCode
2052322
Title
Identifying Patterns in Texts
Author
Huang, Minhua ; Haralick, Robert M.
Author_Institution
Dept. of Comput. Sci., City Univ. of New York, New York, NY, USA
fYear
2009
fDate
14-16 Sept. 2009
Firstpage
59
Lastpage
64
Abstract
We discuss a probabilistic graphical model for recognizing patterns in texts. It is derived from the probability function for a sequence of categories given a sequence of symbols under two reasonable conditional independence assumptions and represented by a product of combinations of conditional and marginal probability functions. The novelty of our model is that it has a mathematical representation which is completely different from existing graphical models such as CRFs, HMMs, and MEMMs. Moreover, it can be used for identifying various patterns in texts. Up to now, we have used this model for recognizing NP chunks and senses of a polysemous word in sentences. This model has achieved very promising results on standard data sets. In the future, we will use this model for extracting semantic roles in a sentence.
Keywords
pattern recognition; text analysis; NP chunks; mathematical representation; pattern identification; polysemous word; probabilistic graphical model; probability function; text patterns; Computer science; Data mining; Graphical models; Hidden Markov models; Labeling; Mathematical model; Pattern recognition; Testing; Text recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Semantic Computing, 2009. ICSC '09. IEEE International Conference on
Conference_Location
Berkeley, CA
Print_ISBN
978-1-4244-4962-0
Electronic_ISBN
978-0-7695-3800-6
Type
conf
DOI
10.1109/ICSC.2009.22
Filename
5298562