DocumentCode
1909404
Title
Measuring Relevance with Named Entity Based Patterns in Topic-Focused Document Summarization
Author
Wei, Furu ; Li, Wenjie ; He, Yanxiang
Author_Institution
Wuhan Univ., Wuhan
fYear
2007
fDate
Aug. 30 2007-Sept. 1 2007
Firstpage
111
Lastpage
118
Abstract
This paper emphasizes the role of named entity based patterns in measuring the relevance between document sentences and the topic for topic-focused extractive summarization. Patterns are defined as informative, semantic-sensitive text bi-grams that contain at least one named entity or the semantic class of a named entity. They are extracted automatically according to eight pre-specified templates. When topic questions are handled, question types are also taken into account if they are available. To alleviate coverage problems, the pattern and uni-gram models are integrated so that they complement each other in the similarity calculation. Automatic ROUGE evaluations indicate that the proposed approach yields a system that outperforms the best-performing system at the Document Understanding Conference (DUC) 2005.
Keywords
information retrieval; text analysis; information extraction; named entity based pattern; semantic class; topic-focused document summarization; Computer science; Current measurement; Data mining; Measurement units; Tree graphs
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-1611-0
Electronic_ISBN
978-1-4244-1611-0
Type
conf
DOI
10.1109/NLPKE.2007.4368020
Filename
4368020