DocumentCode :
888517
Title :
Test-cost sensitive classification on data with missing values
Author :
Yang, Qiang ; Ling, Charles ; Chai, Xiaoyong ; Pan, Rong
Author_Institution :
Dept. of Comput. Sci., Hong Kong Univ. of Sci. & Technol., Kowloon, China
Volume :
18
Issue :
5
fYear :
2006
fDate :
5/1/2006
Firstpage :
626
Lastpage :
638
Abstract :
In cost-sensitive learning, inductive learning algorithms have been extended to handle different types of costs in order to better represent misclassification errors. Most previous work has focused only on misclassification costs. In this paper, we address the equally important issue of how to handle the test costs associated with querying the missing values in a test case. When an attribute has a missing value in a test case, it may or may not be worthwhile to make the extra effort to obtain a value for that attribute, depending on how much the new value improves accuracy. We consider how to integrate test-cost-sensitive learning with the handling of missing values in a unified framework that includes model building and a testing strategy. The testing strategy determines which attributes to test in order to minimize the sum of the classification costs and test costs. We show how to instantiate this framework in two popular machine learning algorithms: decision trees and the naive Bayesian method. We empirically evaluate the test-cost-sensitive methods for handling missing values on several data sets.
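The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single missing discrete attribute, a known class prior, and known per-class likelihoods for that attribute. The function names, the cost-matrix layout `cost[predicted][actual]`, and all numbers are hypothetical and chosen only for illustration. The sketch compares the expected misclassification cost of classifying immediately with the expected cost of paying for the test first.

```python
def min_expected_cost(posterior, cost):
    """Lowest expected misclassification cost over all predicted labels.

    cost[predicted][actual] is an assumed layout for the cost matrix.
    """
    return min(
        sum(posterior[actual] * cost[pred][actual] for actual in posterior)
        for pred in cost
    )


def expected_cost_with_test(prior, likelihood, cost, test_cost):
    """Expected total cost if the missing attribute is tested first.

    likelihood[c][v] is the assumed P(attribute = v | class = c).
    """
    values = next(iter(likelihood.values())).keys()
    total = test_cost
    for v in values:
        # Joint P(class = c, attribute = v), then normalise (Bayes' rule).
        joint = {c: prior[c] * likelihood[c][v] for c in prior}
        p_v = sum(joint.values())
        if p_v == 0:
            continue  # this value can never be observed
        posterior = {c: p / p_v for c, p in joint.items()}
        total += p_v * min_expected_cost(posterior, cost)
    return total


def worth_testing(prior, likelihood, cost, test_cost):
    """True if paying for the test lowers the expected total cost."""
    without_test = min_expected_cost(prior, cost)
    with_test = expected_cost_with_test(prior, likelihood, cost, test_cost)
    return with_test < without_test


if __name__ == "__main__":
    # Hypothetical two-class example with one binary test.
    prior = {"sick": 0.3, "healthy": 0.7}
    likelihood = {
        "sick": {"pos": 0.9, "neg": 0.1},
        "healthy": {"pos": 0.2, "neg": 0.8},
    }
    cost = {
        "sick": {"sick": 0, "healthy": 100},
        "healthy": {"sick": 500, "healthy": 0},
    }
    print(worth_testing(prior, likelihood, cost, test_cost=20))
    print(worth_testing(prior, likelihood, cost, test_cost=50))
```

Under these assumed numbers the cheaper test pays for itself and the more expensive one does not, which is the kind of decision a testing strategy has to make for each missing attribute.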
Keywords :
belief networks; decision trees; learning (artificial intelligence); pattern classification; data classification; decision tree; inductive learning; machine learning; misclassification error; naive Bayesian method; test-cost-sensitive learning; Bayesian methods; Classification tree analysis; Costs; Decision trees; Error correction; Learning systems; Machine learning algorithms; Medical diagnostic imaging; Performance evaluation; Testing; Cost-sensitive learning; decision trees; naive Bayes
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2006.84
Filename :
1613866