DocumentCode :
408367
Title :
KIP: a keyphrase identification program with learning functions
Author :
Wu, Yi-fang Brook ; Li, Quanzhi ; Bot, Razvan Stefan ; Chen, Xin
Author_Institution :
Inf. Syst. Dept., New Jersey Inst. of Technol., Newark, NJ, USA
Volume :
2
fYear :
2004
fDate :
5-7 April 2004
Firstpage :
450
Abstract :
We report a keyphrase identification program (KIP) that starts from sample human keyphrases and then learns to identify additional new keyphrases. KIP first populates its database with manually identified keyphrases; each keyphrase is preprocessed and assigned an initial weight. It then extracts noun phrases from documents. Each noun phrase is assigned a score based on the weights of the words it contains; phrases whose score exceeds a threshold are selected as keyphrases. Newly learned keyphrases are inserted into the database and the weights are updated, which triggers a new keyphrase identification iteration. The process stops when no new keyphrases were identified during the previous iteration. In the evaluation, the base KIP system's average recall was 0.7 and precision was 0.44. The augmented KIP with learning functions produced new keyphrases that the base system had not identified.
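The iterative procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not give KIP's actual scoring formula, preprocessing, or weight-update rule, so everything concrete here is an assumption made for illustration: the score is taken to be the plain sum of word weights, noun-phrase extraction is assumed to have already happened, and the names `score_phrase`, `identify_keyphrases`, `initial_weight`, `learned_weight`, and `threshold` are hypothetical.

```python
from typing import Dict, List, Set


def score_phrase(phrase: str, word_weights: Dict[str, float]) -> float:
    """Score a phrase as the sum of its word weights (assumed rule, not KIP's)."""
    return sum(word_weights.get(word, 0.0) for word in phrase.lower().split())


def identify_keyphrases(
    candidates: List[str],        # noun phrases already extracted from documents
    seed_keyphrases: List[str],   # manually identified keyphrases
    initial_weight: float = 1.0,  # hypothetical weight for words in seed keyphrases
    learned_weight: float = 0.8,  # hypothetical weight for words first seen in learned keyphrases
    threshold: float = 1.5,       # hypothetical selection threshold
) -> List[str]:
    """Return newly learned keyphrases, iterating until a round finds none."""
    word_weights: Dict[str, float] = {}
    known: Set[str] = set()

    # Populate the "database" from the manually identified keyphrases.
    for keyphrase in seed_keyphrases:
        key = keyphrase.lower()
        known.add(key)
        for word in key.split():
            word_weights[word] = initial_weight

    learned: List[str] = []
    while True:
        # Score every unknown candidate against the current weights.
        new_this_round: List[str] = []
        for phrase in candidates:
            key = phrase.lower()
            if key in known:
                continue
            if score_phrase(key, word_weights) > threshold:
                new_this_round.append(phrase)
                known.add(key)

        # Stop when the previous iteration identified nothing new.
        if not new_this_round:
            break

        # Update weights with words from the newly learned keyphrases.
        for phrase in new_this_round:
            for word in phrase.lower().split():
                word_weights.setdefault(word, learned_weight)
        learned.extend(new_this_round)

    return learned


if __name__ == "__main__":
    found = identify_keyphrases(
        candidates=["data mining", "mining text data", "data warehouse", "image processing"],
        seed_keyphrases=["text mining"],
    )
    print(found)
```

In this toy run, "mining text data" passes the threshold in the first round because two of its words come from the seed keyphrase; its remaining word then gains a weight, which lets "data mining" pass in the second round. That is the kind of newly learned keyphrase a single-pass base system would miss.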
Keywords :
data mining; database management systems; feature extraction; information retrieval; text analysis; KIP system; keyphrase identification program; learning functions; manually identified keyphrase; sample human keyphrase; text mining; Data mining; Databases; Humans; Indexing; Information systems; Natural language processing; Text analysis; Text mining; Thesauri; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Technology: Coding and Computing, 2004. Proceedings. ITCC 2004. International Conference on
Print_ISBN :
0-7695-2108-8
Type :
conf
DOI :
10.1109/ITCC.2004.1286694
Filename :
1286694