Title :
Keyword Indexing System with HowNet and PageRank
Author :
Wang, Jinghua ; Liu, Jianyi ; Wang, Cong ; Zhang, Ping
Author_Institution :
Beijing Univ. of Posts & Telecommun., Beijing
Abstract :
Keyword indexing is widely used in natural language processing. This paper proposes an unsupervised keyword indexing method based on PageRank and HowNet. In this method, a free text is first represented as a sememe graph, with sememes as vertices and the relatedness between sememes, derived from HowNet, as weighted edges. UW-PageRank is then applied to the sememe graph to score the importance of each sememe. The score of each definition of a word is computed from the scores of the sememes it contains, and the highest-scoring definition is assigned to the word. A sememe graph is then rebuilt using only the selected definition of each word, and UW-PageRank is applied again to score all the sememes, from which the importance of each word is deduced. Finally, the highest-scoring words are indexed as keywords. Experimental results show that the method is practical and effective.
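The scoring steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the UW-PageRank formula, so the code assumes a common weighted, undirected PageRank variant in which each vertex shares its score with its neighbours in proportion to edge weight. The function names, the damping factor of 0.85, the use of the mean to combine sememe scores, and the toy sememe and word identifiers are all hypothetical and are not taken from the paper or from HowNet.

```python
from collections import defaultdict


def weighted_pagerank(edges, damping=0.85, iterations=50):
    """Score vertices of an undirected weighted graph.

    edges maps (u, v) pairs to positive weights. This is an assumed
    weighted PageRank variant, not necessarily the paper's UW-PageRank.
    """
    # Build a symmetric adjacency map so each edge counts in both directions.
    neighbors = defaultdict(dict)
    for (u, v), weight in edges.items():
        neighbors[u][v] = weight
        neighbors[v][u] = weight

    nodes = list(neighbors)
    # Total edge weight attached to each vertex, used to normalise its votes.
    strength = {n: sum(neighbors[n].values()) for n in nodes}
    score = {n: 1.0 / len(nodes) for n in nodes}

    for _ in range(iterations):
        updated = {}
        for n in nodes:
            incoming = sum(
                score[m] * w / strength[m] for m, w in neighbors[n].items()
            )
            updated[n] = (1 - damping) / len(nodes) + damping * incoming
        score = updated
    return score


def score_items(item_sememes, sememe_scores):
    """Score each word or definition as the mean score of its sememes.

    Taking the mean is an assumption; the abstract only says the score
    is computed from the sememes the definition contains.
    """
    return {
        item: sum(sememe_scores.get(s, 0.0) for s in sememes) / len(sememes)
        for item, sememes in item_sememes.items()
    }


# Toy sememe graph with hypothetical sememe ids and relatedness weights.
toy_edges = {("s1", "s2"): 1.0, ("s2", "s3"): 1.0}
sememe_scores = weighted_pagerank(toy_edges)

# Hypothetical words, each already reduced to one definition's sememes.
toy_words = {"w1": ["s2"], "w2": ["s1", "s3"]}
word_scores = score_items(toy_words, sememe_scores)
keywords = sorted(word_scores, key=word_scores.get, reverse=True)
```

In this toy graph the central sememe s2 receives the highest score, so the hypothetical word w1 ranks first. The same `score_items` helper could be applied in the first pass to choose the highest-scoring definition of each word before the graph is rebuilt.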
Keywords :
graph theory; indexing; natural language processing; text analysis; HowNet; UW-PageRank; free text representation; keyword indexing system; sememe graph; unsupervised keyword indexing method; Dictionaries; Indexing; Natural language processing; Tagging; Text categorization
Conference_Title :
Networking, Sensing and Control, 2008. ICNSC 2008. IEEE International Conference on
Conference_Location :
Sanya
Print_ISBN :
978-1-4244-1685-1
Electronic_ISBN :
978-1-4244-1686-8
DOI :
10.1109/ICNSC.2008.4525246