DocumentCode
408367
Title
KIP: a keyphrase identification program with learning functions
Author
Wu, Yi-fang Brook ; Li, Quanzhi ; Bot, Razvan Stefan ; Chen, Xin
Author_Institution
Inf. Syst. Dept., New Jersey Inst. of Technol., Newark, NJ, USA
Volume
2
fYear
2004
fDate
5-7 April 2004
Firstpage
450
Abstract
We report a keyphrase identification program (KIP) that starts from sample human keyphrases and learns to identify new ones. KIP first populates its database with manually identified keyphrases; each keyphrase is preprocessed and assigned an initial weight. It then extracts noun phrases from documents. Each noun phrase is assigned a score based on the weights of the words it contains, and phrases whose score exceeds a threshold are selected as keyphrases. Newly learned keyphrases are inserted into the database and the weights are updated, which triggers a new identification iteration. The process stops when the previous iteration identifies no new keyphrases. In the evaluation, the base KIP system's average recall was 0.7 and its precision was 0.44. The augmented KIP with learning functions produced new keyphrases that the base system did not identify.
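The iterative loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the scoring formula, the weight-update rule, or the noun phrase extractor, so the sketch assumes the score is the mean word weight, every word of a known keyphrase receives the same fixed weight, and candidate noun phrases are supplied as a ready-made list. The names `score` and `learn_keyphrases` and the parameters `threshold` and `initial_weight` are hypothetical.

```python
def score(phrase, weights):
    """Mean weight of the words in a phrase (assumed scoring rule)."""
    words = phrase.lower().split()
    if not words:
        return 0.0
    return sum(weights.get(w, 0.0) for w in words) / len(words)


def learn_keyphrases(seed_keyphrases, candidates, threshold=0.4, initial_weight=1.0):
    """Iteratively select candidate noun phrases whose score exceeds the threshold.

    seed_keyphrases: manually identified keyphrases (the initial database).
    candidates: noun phrases already extracted from documents (extraction not shown).
    Returns the newly learned keyphrases in the order they were accepted.
    """
    weights = {}   # word -> weight (hypothetical stand-in for the database)
    known = set()  # keyphrases currently in the database

    def add_to_database(phrase):
        # Assumed update rule: each word of a keyphrase gets the initial weight.
        known.add(phrase.lower())
        for word in phrase.lower().split():
            weights[word] = max(weights.get(word, 0.0), initial_weight)

    for phrase in seed_keyphrases:
        add_to_database(phrase)

    learned = []
    while True:
        # One identification iteration over all candidates not yet in the database.
        new = [c for c in candidates
               if c.lower() not in known and score(c, weights) > threshold]
        if not new:
            break  # stop when an iteration identifies no new keyphrases
        for phrase in new:
            add_to_database(phrase)
            learned.append(phrase)
    return learned


seeds = ["text mining"]
noun_phrases = ["data mining", "data retrieval", "neural networks"]
print(learn_keyphrases(seeds, noun_phrases))
```

In this toy run, "data retrieval" is accepted only in the second iteration, after "data mining" has been learned and the word "data" has gained weight. That mirrors the abstract's observation that the learning functions surface keyphrases the base pass misses.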
Keywords
data mining; database management systems; feature extraction; information retrieval; text analysis; KIP system; keyphrase identification program; learning functions; manually identified keyphrase; sample human keyphrase; text mining; Data mining; Databases; Humans; Indexing; Information systems; Natural language processing; Text analysis; Text mining; Thesauri; Vocabulary;
fLanguage
English
Publisher
ieee
Conference_Titel
Information Technology: Coding and Computing, 2004. Proceedings. ITCC 2004. International Conference on
Print_ISBN
0-7695-2108-8
Type
conf
DOI
10.1109/ITCC.2004.1286694
Filename
1286694