• DocumentCode
    408367
  • Title
    KIP: a keyphrase identification program with learning functions
  • Author
    Wu, Yi-fang Brook ; Li, Quanzhi ; Bot, Razvan Stefan ; Chen, Xin
  • Author_Institution
    Inf. Syst. Dept., New Jersey Inst. of Technol., Newark, NJ, USA
  • Volume
    2
  • fYear
    2004
  • fDate
    5-7 April 2004
  • Firstpage
    450
  • Abstract
    We report a keyphrase identification program (KIP), which starts from sample human-assigned keyphrases and learns to identify additional new keyphrases. KIP first populates its database with manually identified keyphrases; each keyphrase is preprocessed and assigned an initial weight. It then extracts noun phrases from documents. Each noun phrase is assigned a score based on the weights of the words it contains; phrases whose score exceeds a threshold are selected as keyphrases. Newly learned keyphrases are inserted into the database and the weights are updated, which triggers a new keyphrase identification iteration. The process stops when no new keyphrases were identified during the previous iteration. In the evaluation, the base KIP system's average recall was 0.7 and its average precision was 0.44. The augmented KIP with learning functions produced new keyphrases that the base system did not identify.
  • Keywords
    data mining; database management systems; feature extraction; information retrieval; text analysis; KIP system; keyphrase identification program; learning functions; manually identified keyphrase; sample human keyphrase; text mining; Data mining; Databases; Humans; Indexing; Information systems; Natural language processing; Text analysis; Text mining; Thesauri; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology: Coding and Computing, 2004. Proceedings. ITCC 2004. International Conference on
  • Print_ISBN
    0-7695-2108-8
  • Type
    conf
  • DOI
    10.1109/ITCC.2004.1286694
  • Filename
    1286694
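
The iterative procedure summarized in the Abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not give the scoring formula, the weight values, or the threshold, so the mean-of-word-weights score, the `initial_weight` and `learned_weight` parameters, and the function name `identify_keyphrases` are all assumptions made for illustration. Noun phrase extraction is also assumed to have been done already, so the candidates are passed in as plain strings.

```python
from collections import defaultdict


def identify_keyphrases(seed_keyphrases, candidate_phrases, threshold=0.25,
                        initial_weight=1.0, learned_weight=0.6):
    """Hypothetical sketch of a seed-and-learn keyphrase loop.

    Assumptions (not taken from the paper): a phrase's score is the mean
    weight of its words, and words of learned phrases receive a lower
    weight than words of manually identified phrases.
    """
    # Populate the word-weight "database" from the manual keyphrases.
    weights = defaultdict(float)
    for phrase in seed_keyphrases:
        for word in phrase.lower().split():
            weights[word] = max(weights[word], initial_weight)

    candidates = [p.lower() for p in candidate_phrases]
    identified = []

    while True:
        # Score every candidate not yet selected, using current weights.
        new_phrases = []
        for phrase in candidates:
            if phrase in identified or phrase in new_phrases:
                continue
            words = phrase.split()
            if not words:
                continue
            score = sum(weights[w] for w in words) / len(words)
            if score > threshold:
                new_phrases.append(phrase)

        # Stop when an iteration identifies no new keyphrases.
        if not new_phrases:
            break

        # Insert the learned keyphrases and update the word weights,
        # which triggers another identification iteration.
        for phrase in new_phrases:
            identified.append(phrase)
            for word in phrase.split():
                weights[word] = max(weights[word], learned_weight)

    return identified


seeds = ["text mining", "information retrieval"]
noun_phrases = ["text mining", "text analysis", "analysis tools", "weather report"]
print(identify_keyphrases(seeds, noun_phrases))
```

In this toy run, "text analysis" is selected in the first iteration because it shares the word "text" with a seed keyphrase; "analysis tools" is selected only in the second iteration, after "analysis" has gained a weight from the learned phrase. "weather report" never exceeds the threshold, so the third iteration finds nothing new and the loop stops.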