DocumentCode :
584448
Title :
A Domain-Specific Chinese Term Extraction Method Based on Prefix and Suffix
Author :
Li, Dongmei ; Wang, Qinglin ; Li, Yuan ; Peng, Qian
Author_Institution :
Sch. of Autom., Beijing Inst. of Technol., Beijing, China
fYear :
2012
fDate :
11-13 Aug. 2012
Firstpage :
1356
Lastpage :
1359
Abstract :
Term recognition and extraction are the foundation of text information processing. This paper presents a domain-specific Chinese term extraction method based on prefixes and suffixes. First, commonly used prefixes and suffixes are extracted from a given set of seed terms. Second, the testing corpus is segmented to obtain statistics on the words adjacent to those prefixes and suffixes, and each combination of a word with a prefix or suffix is judged to be a candidate term or not according to the word's frequency information. Third, the initial candidate term set is enlarged by frequency judgment. Finally, the candidate terms are filtered by co-occurrence analysis. Experiments show that terms with common prefixes and suffixes can be extracted well.
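The first two steps of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `common_affixes` and `candidate_terms`, the thresholds `min_count` and `min_freq`, the use of pair frequency as the frequency judgment, and the toy data are all assumptions made for illustration. It also assumes that the seed terms and the corpus have already been segmented into words, and it omits the candidate-set enlargement and co-occurrence filtering steps.

```python
from collections import Counter


def common_affixes(seed_terms, min_count=2):
    """Return the leading and trailing words that recur across seed terms.

    seed_terms: iterable of tuples of segmented words (assumed input format).
    min_count: hypothetical threshold for calling an affix "commonly used".
    """
    multiword = [t for t in seed_terms if len(t) > 1]
    prefix_counts = Counter(t[0] for t in multiword)
    suffix_counts = Counter(t[-1] for t in multiword)
    prefixes = {w for w, n in prefix_counts.items() if n >= min_count}
    suffixes = {w for w, n in suffix_counts.items() if n >= min_count}
    return prefixes, suffixes


def candidate_terms(tokens, prefixes, suffixes, min_freq=2):
    """Count adjacent word pairs that start with a known prefix or end with
    a known suffix, and keep the pairs seen at least min_freq times.

    min_freq is a hypothetical stand-in for the paper's frequency judgment.
    """
    pairs = Counter()
    for left, right in zip(tokens, tokens[1:]):
        if left in prefixes or right in suffixes:
            pairs[(left, right)] += 1
    return {pair: n for pair, n in pairs.items() if n >= min_freq}


# Toy data, invented for illustration only.
seeds = [("控制", "系统"), ("导航", "系统"), ("控制", "算法")]
corpus = ["控制", "理论", "和", "通信", "系统", "以及",
          "控制", "理论", "与", "通信", "系统"]

prefixes, suffixes = common_affixes(seeds)
candidates = candidate_terms(corpus, prefixes, suffixes)
```

On this toy input, `控制` is kept as a prefix and `系统` as a suffix, so the pairs `控制 理论` and `通信 系统` become candidates. A fuller version would add the later stages the abstract describes, which need details that the abstract does not give.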
Keywords :
natural language processing; statistical analysis; text analysis; cooccurrence analysis; domain-specific Chinese term extraction method; prefix; suffix; testing corpus; text information processing; word frequency information; word statistics; Algorithm design and analysis; Data mining; Dictionaries; Feature extraction; Frequency domain analysis; Testing; Text recognition; co-occurrence analysis; domain-specific term; term extraction; term recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Science & Service System (CSSS), 2012 International Conference on
Conference_Location :
Nanjing
Print_ISBN :
978-1-4673-0721-5
Type :
conf
DOI :
10.1109/CSSS.2012.342
Filename :
6394580