Chinese Automatic Text Summarization Based on Keyword Extraction

Author

Jiang Xiao-Yu

Author_Institution

Bus. Sch., Beijing Inst. of Fashion Technol., Beijing, China

fYear

2009

fDate

25-26 April 2009

Firstpage

225

Lastpage

228

Abstract

To overcome the incomplete coverage of existing summaries, this paper proposes a new lexical-chain-based algorithm for keyword extraction and automatic summarization of Chinese texts, built on unknown-word recognition that uses the co-occurrence of neighboring words. The method includes an algorithm for constructing lexical chains from the Hownet knowledge database: lexical chains are first constructed by calculating the semantic similarity between terms; keywords are then extracted, and the importance of each sentence is calculated according to the lexical chain's intensity, the terms' entropy, and position. Experimental results show that summaries generated by the improved algorithm achieve better recall and precision than other methods.
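
The pipeline outlined in the abstract (group terms into lexical chains by semantic similarity, then score sentences from the chains) can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract gives no formulas, so everything here is an assumption. The function names `build_lexical_chains` and `score_sentences`, the greedy chaining rule, the `threshold` value, the use of chain length as a stand-in for chain intensity, and the position bonus are all hypothetical. The entropy term is omitted. The toy similarity table stands in for a Hownet-based similarity measure, and the input is assumed to be already segmented into words (English tokens are used for readability).

```python
from typing import Callable, Dict, List


def build_lexical_chains(
    terms: List[str],
    similarity: Callable[[str, str], float],
    threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> List[List[str]]:
    """Greedily group terms: a term joins the first chain that contains a
    sufficiently similar member, otherwise it starts a new chain."""
    chains: List[List[str]] = []
    for term in terms:
        for chain in chains:
            if any(similarity(term, member) >= threshold for member in chain):
                chain.append(term)
                break
        else:
            chains.append([term])
    return chains


def score_sentences(
    sentences: List[List[str]], chains: List[List[str]]
) -> List[float]:
    """Score each tokenized sentence by the summed chain length of its terms
    (a crude proxy for chain intensity), times an assumed position bonus
    that favors earlier sentences."""
    weight: Dict[str, int] = {}
    for chain in chains:
        for term in set(chain):
            weight[term] = len(chain)
    n = len(sentences)
    scores: List[float] = []
    for i, sentence in enumerate(sentences):
        chain_score = sum(weight.get(token, 0) for token in sentence)
        position_bonus = 1.0 + (n - i) / n  # assumption: earlier is better
        scores.append(chain_score * position_bonus)
    return scores


# Toy stand-in for a Hownet-based similarity: same category means similar.
TOY_CATEGORY = {
    "cat": "animal", "dog": "animal", "bird": "animal",
    "car": "vehicle", "bus": "vehicle",
}


def toy_similarity(a: str, b: str) -> float:
    if a == b:
        return 1.0
    category_a, category_b = TOY_CATEGORY.get(a), TOY_CATEGORY.get(b)
    return 1.0 if category_a is not None and category_a == category_b else 0.0


sentences = [["cat", "dog", "bird"], ["car", "bus"], ["weather"]]
terms = [token for sentence in sentences for token in sentence]
chains = build_lexical_chains(terms, toy_similarity)
scores = score_sentences(sentences, chains)
```

A summary would then be formed by keeping the highest-scoring sentences in their original order. A real system would replace `toy_similarity` with a measure derived from the knowledge database and add the entropy term that the abstract mentions.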

Keywords

database management systems; text analysis; word processing; Chinese automatic text summarization; Hownet knowledge database; entropy; lexical-chain-based keywords extraction; position; semantic similarity; word recognition; Algorithm design and analysis; Character recognition; Clustering algorithms; Computers; Data mining; Databases; Entropy; Frequency; Statistics; Text recognition; automatic summarization; keyword extraction; lexical chain;

fLanguage

English

Publisher

IEEE

Conference_Titel

Database Technology and Applications, 2009 First International Workshop on

Conference_Location

Wuhan, Hubei

Print_ISBN

978-0-7695-3604-0

Type

conf

DOI

10.1109/DBTA.2009.9

Filename

5207775