• DocumentCode
    477748
  • Title
    Automatic Technical Term Extraction Based on Term Association
  • Author
    Wan, Miao ; Liu, Song ; Liu, Jian-Yi ; Wang, Cong
  • Author_Institution
    Center for Intell. Sci. & Technol. Res., Beijing Univ. of Posts & Telecommun., Beijing
  • Volume
    2
  • fYear
    2008
  • fDate
    18-20 Oct. 2008
  • Firstpage
    19
  • Lastpage
    23
  • Abstract
    This paper proposes a new automatic Chinese term extraction algorithm that combines statistics-based and rule-based methods. The algorithm first uses a statistical method to extract two-word candidates from a raw corpus, then extends these candidates forward to obtain multi-word candidate terms. We propose a new metric, term association (TA), that measures how strongly the words in a string combine. A second, rule-based subsystem then filters the candidates against defined rules to obtain domain-specific technical terms. Our aim is for the hybrid method to achieve higher precision on domain-specific Chinese term extraction than previous approaches. The algorithm is implemented as an extractor that takes an unprocessed corpus of technical papers on ethanol fuels as input. The experimental results are analyzed and evaluated; precision and recall are 84.26% and 63.86%, respectively.
  • Keywords
    data mining; statistical analysis; Chinese term extracting algorithm; automatic technical term extraction; rule-based methods; statistics-based methods; term association; Algorithm design and analysis; Dictionaries; Ethanol; Filters; Frequency; Fuels; Fuzzy systems; Natural languages; Statistical analysis; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
  • Conference_Location
    Shandong
  • Print_ISBN
    978-0-7695-3305-6
  • Type
    conf
  • DOI
    10.1109/FSKD.2008.40
  • Filename
    4666072