DocumentCode :
3058110
Title :
Threshold Determining Method for Feature Selection
Author :
Li, Yanling ; Song, Li
Author_Institution :
Xi'an Res. Inst. of Hi-Technol., Northwestern Polytech. Univ., Xi'an, China
Volume :
2
fYear :
2009
fDate :
22-24 May 2009
Firstpage :
273
Lastpage :
277
Abstract :
Feature selection is a key step in text categorization, and its results directly influence classification accuracy. A feature selection method usually applies an evaluation function to score each feature word, and the words whose scores exceed a set threshold are retained as the final feature subset. The threshold is therefore an important factor in feature selection. However, the threshold is very difficult to determine. In theory, there is no good solution. In practice, people often set an initial value from experience and then adjust the threshold repeatedly according to the classification results. In that case, the range to search is often too large for the threshold to be determined easily. To address these difficulties, this paper studies threshold determining methods for feature selection. First, based on an analysis of several common feature selection methods, the key questions of threshold determining are defined and an idea for threshold determining is put forward. Then, following this idea, four methods for threshold determining are designed according to the characteristics of the different feature selection methods. Experimental results show that the proposed methods are effective in improving classification performance. From an analysis of the results, the paper draws some useful conclusions.
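The general scheme the abstract describes (score each feature word with an evaluation function, then keep the words above a threshold) can be sketched as follows. This is only an illustration under stated assumptions, not one of the paper's four methods: document frequency stands in for the evaluation function, and the toy corpus, function names, and candidate thresholds are all hypothetical.

```python
from collections import Counter


def document_frequency(docs):
    """Score each word by the number of documents it appears in.

    Document frequency is an assumed stand-in for the evaluation
    function; the abstract does not say which functions the paper uses.
    """
    df = Counter()
    for doc in docs:
        # Count each word once per document.
        df.update(set(doc.lower().split()))
    return df


def select_features(scores, threshold):
    """Keep the words whose score is strictly higher than the threshold."""
    return sorted(word for word, score in scores.items() if score > threshold)


# Hypothetical toy corpus, for illustration only.
docs = [
    "cheap loans apply now",
    "apply for cheap flights",
    "meeting agenda for monday",
    "monday meeting moved",
]

scores = document_frequency(docs)

# Trying several thresholds by hand mirrors the trial-and-error
# tuning that the abstract describes as common practice.
for threshold in (0, 1, 2):
    print(threshold, select_features(scores, threshold))
```

Even on this tiny corpus the size of the retained subset changes sharply with the threshold, which illustrates why the choice of threshold matters.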
Keywords :
feature extraction; pattern classification; text analysis; debugging scope; evaluation function; feature selection; text categorization; text classification; threshold determining method; Automatic control; Debugging; Decision making; Design methodology; Electronic commerce; Frequency; Information filtering; Iterative methods; Security; iterative process; threshold; threshold interval
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Electronic Commerce and Security, 2009. ISECS '09. Second International Symposium on
Conference_Location :
Nanchang
Print_ISBN :
978-0-7695-3643-9
Type :
conf
DOI :
10.1109/ISECS.2009.41
Filename :
5209842