DocumentCode :
2646862
Title :
The Web Information Extraction for Update Summarization Based on Shallow Parsing
Author :
Peng, Min ; Ma, Xiaoxiao ; Tian, Ye ; Yang, Ming ; Long, Hua ; Lin, Quanchen ; Xia, Xiaojun
Author_Institution :
Wuhan Univ., Wuhan, China
fYear :
2011
fDate :
26-28 Oct. 2011
Firstpage :
109
Lastpage :
114
Abstract :
Traditional text information extraction methods mainly operate on static documents and struggle to reflect the dynamic evolution of information updates on the web. To address this challenge, this work proposes a new method based on shallow parsing with rules. The rules are generated according to the syntactic features of English texts, such as verb tense and the usage of modal verbs. The latest novel information in English news texts is extracted correctly, meeting users' need for quick and effective access to updated information on developing events. Performance results show the improvement achieved by the proposed scheme.
Keywords :
Internet; information retrieval; text analysis; English texts; Web information extraction; information update; shallow parsing; static documents; text information extraction; update summarization; Data mining; Feature extraction; Green products; Natural language processing; Real time systems; Syntactics; Tagging; information extraction; shallow parsing; updated information; web texts;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC), 2011 International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-1-4577-1448-1
Type :
conf
DOI :
10.1109/3PGCIC.2011.26
Filename :
6103146