• DocumentCode
    588903
  • Title

    Key-Phrase Extraction Based on a Combination of CRF Model with Document Structure

  • Author

    Feng Yu ; Hong-Wei Xuan ; De-quan Zheng

  • Author_Institution
    Sch. of Comput. & Inf. Eng., Harbin Univ. of Commerce, Harbin, China
  • fYear
    2012
  • fDate
    17-18 Nov. 2012
  • Firstpage
    406
  • Lastpage
    410
  • Abstract
    A key-phrase should reflect not only the main content of a document but also what is specific to that document. Key-phrase extraction is an important technique in the field of text information processing. With the advent of the Internet age, the number of online documents has grown geometrically, and information explosion has become the defining characteristic of the age. Searching for and making use of network information is becoming more difficult, so automatic keyword extraction is required. This paper treats key-phrase extraction as a classification task, using an SVM to build the classification model and a CRF to extract key-phrases. Test results show that the proposed approach improves dramatically on previous methods in precision and recall.
  • Keywords
    Internet; information retrieval; pattern classification; random processes; support vector machines; text analysis; CRF model; SVM; automatic keyword extraction; classification model; conditional random fields; document structure; geometry; information explosion; key-phrase extraction; network information search; online file; text information processing; Data models; Educational institutions; Feature extraction; Training; Feature Selection; Information Extraction; Inverse Document Frequency; Key-phrase; Term Frequency
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Security (CIS), 2012 Eighth International Conference on
  • Conference_Location
    Guangzhou
  • Print_ISBN
    978-1-4673-4725-9
  • Type

    conf

  • DOI
    10.1109/CIS.2012.97
  • Filename
    6405955