DocumentCode
3169151
Title
Efficient unsupervised extraction of words categories using symmetric patterns and high frequency words
Author
Rong, Liu ; Zhiping, Zhang ; Ning, Pang
Author_Institution
Foreign Language Coll., Taiyuan Univ. of Technol., Taiyuan, China
fYear
2010
fDate
29-30 Oct. 2010
Firstpage
542
Lastpage
545
Abstract
This paper presents a novel approach for discovering and extracting sets of words that share semantic meaning. We use meta-patterns of high-frequency words and content words to discover pattern candidates. Symmetric patterns are then identified using graph-based measures, and word categories are built from graph clique sets. The method is pattern-based and requires no manually provided seed patterns or words. For Chinese, only part-of-speech (POS) tagging is performed in advance. The computation time scales linearly for large corpora. The results are judged favorably in manual evaluation.
Keywords
graph theory; natural language processing; Chinese; POS; graph based measures; high frequency words; unsupervised words categories extraction; Semantics; sharing semantic meaning; symmetric patterns; unsupervised
fLanguage
English
Publisher
IEEE
Conference_Titel
Artificial Intelligence and Education (ICAIE), 2010 International Conference on
Conference_Location
Hangzhou
Print_ISBN
978-1-4244-6935-2
Type
conf
DOI
10.1109/ICAIE.2010.5641103
Filename
5641103