Title :
Feature selection for building cost-effective data stream classifiers
Author :
Gao, Like ; Wang, X. Sean
Author_Institution :
Dept. of Comput. Sci., Vermont Univ., USA
Abstract :
A stream classifier is a decision model that assigns a class label to a data stream, based on its arriving data. Various features of the stream can be used in the classifier, and each feature may have a different relevance to the classification task and a different cost of obtaining its value. As time passes, some less costly features may become more relevant, but the time needed to reach a decision may itself be considered a cost. A challenge is how to balance these different costs when building a cost-effective classifier. This paper proposes a new feature selection strategy that extends the traditional Relief algorithm in two respects: (1) it estimates the classification cost associated with each feature, and (2) it orders all the features by a score that combines both cost estimation and classification relevance. A classifier is then built from the selected features using a traditional classification method. Experimental results show that classifiers constructed with this strategy are indeed cost-effective.
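The idea of ranking features by a score that mixes relevance and cost can be illustrated with a minimal Python sketch. It is not the paper's method: the abstract does not state how cost is estimated or how the two quantities are combined. The sketch assumes a basic two-class Relief with Manhattan distance, a user-supplied per-feature cost vector, and a hypothetical score equal to relevance divided by cost. The names `relief_weights` and `cost_aware_ranking` are invented for illustration, and decision time as a cost is not modeled.

```python
import numpy as np


def relief_weights(X, y, n_iter=100, seed=0):
    """Basic two-class Relief relevance weights.

    Assumes numeric features and at least two samples per class.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape

    # Scale every feature to [0, 1] so differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span

    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        dist = np.abs(Xs - Xs[i]).sum(axis=1)  # Manhattan distance
        dist[i] = np.inf                       # never pick the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class
        # A relevant feature differs on the miss and agrees on the hit.
        w += (np.abs(Xs[i] - Xs[miss]) - np.abs(Xs[i] - Xs[hit])) / n_iter
    return w


def cost_aware_ranking(relevance, cost):
    """Hypothetical combination (an assumption): relevance per unit cost."""
    relevance = np.clip(np.asarray(relevance, dtype=float), 0.0, None)
    score = relevance / np.asarray(cost, dtype=float)
    order = np.argsort(-score)  # best feature first
    return order, score


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.1, 1.0], [0.9, 0.0], [1.0, 1.0]]
    y = [0, 0, 1, 1]
    w = relief_weights(X, y)
    order, score = cost_aware_ranking(w, cost=[1.0, 1.0])
    print(w, order, score)
```

Under this assumed score, a highly relevant but expensive feature can rank below a moderately relevant cheap one. A classifier would then be trained on the top-ranked features with any standard method.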
Keywords :
data handling; pattern classification; classification cost estimation; classification relevance; cost-effective data stream classifier; feature selection; Buildings; Computer science; Costs; Data mining; Temperature; Training data;
Conference_Title :
Data Mining, Fifth IEEE International Conference on
Print_ISBN :
0-7695-2278-5
DOI :
10.1109/ICDM.2005.63