• DocumentCode
    584448
  • Title
    A Domain-Specific Chinese Term Extraction Method Based on Prefix and Suffix
  • Author
    Li, Dongmei ; Wang, Qinglin ; Li, Yuan ; Peng, Qian
  • Author_Institution
    Sch. of Autom., Beijing Inst. of Technol., Beijing, China
  • fYear
    2012
  • fDate
    11-13 Aug. 2012
  • Firstpage
    1356
  • Lastpage
    1359
  • Abstract
    Term recognition and extraction are the foundation of text information processing. This paper presents a domain-specific Chinese term extraction method based on prefixes and suffixes. First, the commonly used prefixes and suffixes are extracted from a given set of seed terms. Second, the testing corpus is segmented to obtain statistics on the words adjacent to those prefixes and suffixes, and the frequency information of each word is used to judge whether the word combined with a prefix or suffix forms a candidate term. Third, the initial candidate term set is enlarged by frequency judgment. Finally, the candidate terms are filtered by co-occurrence analysis. Experiments show that terms with common prefixes and suffixes can be extracted well.
  • Keywords
    natural language processing; statistical analysis; text analysis; cooccurrence analysis; domain-specific Chinese term extraction method; prefix; suffix; testing corpus; text information processing; word frequency information; word statistics; Algorithm design and analysis; Data mining; Dictionaries; Feature extraction; Frequency domain analysis; Testing; Text recognition; co-occurrence analysis; domain-specific term; term extraction; term recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science & Service System (CSSS), 2012 International Conference on
  • Conference_Location
    Nanjing
  • Print_ISBN
    978-1-4673-0721-5
  • Type
    conf
  • DOI
    10.1109/CSSS.2012.342
  • Filename
    6394580
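
The first steps described in the abstract (collecting common prefixes and suffixes from seed terms, then counting words adjacent to them in a segmented corpus) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `common_affixes` and `candidate_terms`, the thresholds `min_count` and `min_freq`, and the toy data are all assumptions made for illustration. The sketch assumes both the seed terms and the corpus are already segmented into words, and it omits the candidate-set enlargement and co-occurrence filtering steps.

```python
from collections import Counter


def common_affixes(seed_terms, min_count=2):
    """Return (prefixes, suffixes) that occur in at least min_count seed terms.

    seed_terms: list of token lists, i.e. seed terms already segmented
    into words (an assumption; the abstract does not specify this format).
    """
    multi = [t for t in seed_terms if len(t) > 1]
    prefix_counts = Counter(t[0] for t in multi)
    suffix_counts = Counter(t[-1] for t in multi)
    prefixes = {w for w, n in prefix_counts.items() if n >= min_count}
    suffixes = {w for w, n in suffix_counts.items() if n >= min_count}
    return prefixes, suffixes


def candidate_terms(sentences, prefixes, suffixes, min_freq=2):
    """Count adjacent word pairs that start with a known prefix or end with
    a known suffix, and keep pairs seen at least min_freq times.

    sentences: list of token lists (the segmented testing corpus).
    The simple frequency threshold is a stand-in for the paper's
    frequency judgment, whose exact form the abstract does not give.
    """
    pair_counts = Counter()
    for tokens in sentences:
        for left, right in zip(tokens, tokens[1:]):
            if left in prefixes or right in suffixes:
                pair_counts[(left, right)] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_freq}


# Hypothetical toy data, invented for this example.
seeds = [["控制", "系统"], ["导航", "系统"], ["控制", "器"]]
corpus = [
    ["该", "通信", "系统", "稳定"],
    ["通信", "系统", "正常"],
    ["控制", "算法", "收敛"],
]

prefixes, suffixes = common_affixes(seeds)
candidates = candidate_terms(corpus, prefixes, suffixes)
print(candidates)
```

With this toy data, 控制 is kept as a prefix and 系统 as a suffix, and only the pair (通信, 系统) passes the frequency threshold. A fuller version would add the enlargement and co-occurrence filtering stages named in the abstract.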