• DocumentCode
    3169151
  • Title
    Efficient unsupervised extraction of words categories using symmetric patterns and high frequency words
  • Author
    Rong, Liu ; Zhiping, Zhang ; Ning, Pang
  • Author_Institution
    Foreign Language Coll., Taiyuan Univ. of Technol., Taiyuan, China
  • fYear
    2010
  • fDate
    29-30 Oct. 2010
  • Firstpage
    542
  • Lastpage
    545
  • Abstract
    This paper presents a novel approach for discovering and extracting sets of words that share semantic meaning. We use meta-patterns of high-frequency words and content words to discover pattern candidates. Symmetric patterns are then identified using graph-based measures, and word categories are created based on graph clique sets. Our method is a pattern-based method that requires no manually provided seed patterns or words. For Chinese, only POS tagging is carried out in advance. The computation time for large corpora is linear. The results are judged favorably in manual evaluation.
  • Keywords
    graph theory; natural language processing; Chinese; POS; graph based measures; high frequency words; unsupervised words categories extraction; Semantics; sharing semantic meaning; symmetric patterns; unsupervised
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Artificial Intelligence and Education (ICAIE), 2010 International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4244-6935-2
  • Type
    conf
  • DOI
    10.1109/ICAIE.2010.5641103
  • Filename
    5641103