Title :
Chinese Automatic Text Summarization Based on Keyword Extraction
Author_Institution :
Bus. Sch., Beijing Inst. of Fashion Technol., Beijing, China
Abstract :
To overcome the incomplete coverage of existing summaries, this paper proposes a new lexical-chain-based algorithm for keyword extraction and automatic summarization of Chinese texts, built on unknown-word recognition that uses the co-occurrence of neighboring words. The method includes an algorithm for constructing lexical chains from the Hownet knowledge database: lexical chains are first constructed by calculating the semantic similarity between terms; keywords are then extracted, and the importance of each sentence is calculated from the lexical chain's intensity and the terms' entropy and position. The experimental results show that summaries generated by the improved algorithm achieve better recall and precision than other methods.
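The pipeline the abstract outlines (group similar terms into lexical chains, then score sentences by chain intensity and position) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `toy_similarity` is a hypothetical character-overlap stand-in for a Hownet-based similarity, chain intensity is assumed to be the summed frequency of a chain's terms, the position factor is reduced to a bonus for the first sentence, and the entropy term and unknown-word recognition are left out because the abstract gives no formula for them.

```python
from collections import Counter


def toy_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in for a Hownet-based similarity: Jaccard overlap of characters."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def build_chains(terms, sim=toy_similarity, threshold=0.5):
    """Greedily put each distinct term into the first chain holding a similar-enough member."""
    chains = []
    for term in dict.fromkeys(terms):  # distinct terms, first-seen order
        for chain in chains:
            if any(sim(term, member) >= threshold for member in chain):
                chain.append(term)
                break
        else:
            chains.append([term])  # no similar chain found: start a new one
    return chains


def chain_intensity(chain, freq):
    """Assumed intensity measure: total frequency of the chain's member terms."""
    return sum(freq[t] for t in chain)


def score_sentences(sentences, threshold=0.5, lead_bonus=1.0):
    """sentences: list of token lists (already segmented). Returns one score per sentence."""
    all_terms = [t for s in sentences for t in s]
    freq = Counter(all_terms)
    chains = build_chains(all_terms, threshold=threshold)
    # Every term inherits the intensity of the chain it belongs to.
    weight = {t: chain_intensity(c, freq) for c in chains for t in c}
    scores = []
    for i, s in enumerate(sentences):
        base = sum(weight[t] for t in set(s))
        # Assumed position factor: only the first sentence gets a bonus.
        scores.append(base + (lead_bonus if i == 0 else 0.0))
    return scores
```

With pre-segmented input such as `[["自动摘要", "算法"], ["摘要", "关键词"], ["天气"]]`, the terms 自动摘要 and 摘要 fall into one chain under the toy similarity, so sentences containing either term score higher than the unrelated third sentence.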
Keywords :
database management systems; text analysis; word processing; Chinese automatic text summarization; Hownet knowledge database; entropy; lexical-chain-based keywords extraction; position; semantic similarity; word recognition; Algorithm design and analysis; Character recognition; Clustering algorithms; Computers; Data mining; Databases; Entropy; Frequency; Statistics; Text recognition; automatic summarization; keyword extraction; lexical chain;
Conference_Title :
Database Technology and Applications, 2009 First International Workshop on
Conference_Location :
Wuhan, Hubei
Print_ISBN :
978-0-7695-3604-0
DOI :
10.1109/DBTA.2009.9