DocumentCode
584448
Title
A Domain-Specific Chinese Term Extraction Method Based on Prefix and Suffix
Author
Li, Dongmei ; Wang, Qinglin ; Li, Yuan ; Peng, Qian
Author_Institution
Sch. of Autom., Beijing Inst. of Technol., Beijing, China
fYear
2012
fDate
11-13 Aug. 2012
Firstpage
1356
Lastpage
1359
Abstract
Term recognition and extraction is the foundation of text information processing. This paper presents a domain-specific Chinese term extraction method based on prefixes and suffixes. First, commonly used prefixes and suffixes are extracted from a given set of seed terms. Second, the testing corpus is segmented to obtain statistics on the words adjacent to these prefixes and suffixes, and the frequency information of each word is used to judge whether the word combined with a prefix or suffix forms a candidate term. Third, the initial candidate term set is enlarged by frequency judgment. Finally, the candidate terms are filtered by co-occurrence analysis. Experiments show that terms with common prefixes and suffixes can be extracted well.
Keywords
natural language processing; statistical analysis; text analysis; cooccurrence analysis; domain-specific Chinese term extraction method; prefix; suffix; testing corpus; text information processing; word frequency information; word statistics; Algorithm design and analysis; Data mining; Dictionaries; Feature extraction; Frequency domain analysis; Testing; Text recognition; co-occurrence analysis; domain-specific term; term extraction; term recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science & Service System (CSSS), 2012 International Conference on
Conference_Location
Nanjing
Print_ISBN
978-1-4673-0721-5
Type
conf
DOI
10.1109/CSSS.2012.342
Filename
6394580
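Illustrative sketch
The steps named in the abstract can be sketched roughly as follows. This is not the authors' implementation: the abstract gives no formulas, so the sketch assumes that an affix is the single first or last character of a seed term, that the corpus is already segmented into tokens, and that the co-occurrence filter is a simple ratio of pair frequency to word frequency. The function names, the thresholds (min_count, min_pair, min_ratio), and the toy data are all hypothetical. The candidate-set enlargement step is omitted because the abstract does not describe it in enough detail.

```python
from collections import Counter


def frequent(counter, min_count):
    """Return the keys whose count reaches min_count."""
    return {key for key, n in counter.items() if n >= min_count}


def extract_affixes(seed_terms, min_count=2):
    """Collect first/last characters that recur across the seed terms.

    Assumption: an affix is a single character; the paper may define it differently.
    """
    prefixes = Counter(t[0] for t in seed_terms if len(t) > 1)
    suffixes = Counter(t[-1] for t in seed_terms if len(t) > 1)
    return frequent(prefixes, min_count), frequent(suffixes, min_count)


def candidate_terms(sentences, prefixes, suffixes, min_pair=2, min_ratio=0.5):
    """Join a word with an adjacent affix when the pair is frequent enough.

    sentences: list of token lists (output of some word segmenter, assumed).
    min_pair and min_ratio are hypothetical thresholds, not values from the paper.
    """
    word_freq = Counter()
    pair_freq = Counter()  # adjacent (left, right) tokens where one side is an affix
    for tokens in sentences:
        word_freq.update(tokens)
        for left, right in zip(tokens, tokens[1:]):
            if left in prefixes or right in suffixes:
                pair_freq[(left, right)] += 1

    result = set()
    for (left, right), n in pair_freq.items():
        # The non-affix side of the pair is the word whose frequency we compare against.
        word = right if left in prefixes else left
        # Simple co-occurrence ratio: how often the word appears together with the affix.
        if n >= min_pair and n / word_freq[word] >= min_ratio:
            result.add(left + right)
    return result


# Toy data (invented for illustration only).
seeds = ["传感器", "控制器", "变送器", "自适应", "自校正"]
corpus = [
    ["滤波", "器", "用于", "信号", "处理"],
    ["设计", "滤波", "器"],
    ["自", "整定", "方法"],
    ["自", "整定", "算法"],
    ["信号", "很", "弱"],
]

prefixes, suffixes = extract_affixes(seeds)
terms = candidate_terms(corpus, prefixes, suffixes)
```

On the toy data, the recurring suffix from the seeds is 器 and the recurring prefix is 自; the segmenter is assumed to have split unknown terms such as 滤波器 into a word plus an affix, which is what lets the adjacency statistics recover them.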