Title :
The Chinese keywords extraction algorithm based on association rule mining
Author :
Cui Cheng-yu ; Ran Xiao-min
Author_Institution :
Dept. of Inf. Syst. Eng., Inf. Eng. Univ., Zhengzhou, China
Abstract :
Classical keyword extraction algorithms struggle to achieve both low computational complexity and high accuracy. This paper proposes an algorithm based on association rule mining. The algorithm uses an improved FP-Growth algorithm to extract word co-occurrence information, applies a similarity algorithm to eliminate synonyms, removes noisy words, and simplifies the features of candidates, thereby reducing storage space and the amount of calculation while maintaining high precision and recall. Experimental results show that the average F value on the corpus reaches 61%, which is higher than that of classical algorithms, and that the support degree is a key influencing factor.
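To illustrate why the support degree matters when mining word co-occurrence, the following is a minimal sketch, not the paper's method: it counts co-occurring word pairs by brute force instead of using the improved FP-Growth algorithm, and it treats each sentence as one transaction. The function name frequent_word_pairs, the input format (lists of already-segmented words), and the min_support parameter are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_word_pairs(sentences, min_support):
    """Return word pairs whose sentence-level support is >= min_support.

    sentences: list of lists of already-segmented words (assumed input format).
    min_support: minimum fraction of sentences in which a pair must co-occur.
    """
    pair_counts = Counter()
    for words in sentences:
        # Count each pair at most once per sentence (one "transaction").
        unique_words = sorted(set(words))
        pair_counts.update(combinations(unique_words, 2))

    total = len(sentences)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


# Toy, hand-made input; real Chinese text would first need word segmentation.
sentences = [
    ["keyword", "extraction", "algorithm"],
    ["keyword", "extraction"],
    ["algorithm", "complexity"],
]
pairs = frequent_word_pairs(sentences, min_support=0.5)
```

Raising min_support prunes more candidate pairs, which lowers storage and computation but can discard genuine keywords; lowering it has the opposite effect.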
Keywords :
computational complexity; data mining; natural language processing; word processing; Chinese keywords extraction algorithm; FP-growth algorithm; association rule mining; word co-occurrence information; Algorithm design and analysis; Association rules; Complex networks; Databases; Feature extraction; Semantics; FP-Growth; keywords extraction; word co-occurrence
Conference_Title :
Software Engineering and Service Science (ICSESS), 2014 5th IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-3278-8
DOI :
10.1109/ICSESS.2014.6933592