DocumentCode :
3166818
Title :
Understanding Discrete Classifiers with a Case Study in Gene Prediction
Author :
Subianto, Muhammad ; Siebes, Arno
Author_Institution :
Univ. Utrecht, Utrecht
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
661
Lastpage :
666
Abstract :
That models resulting from data mining should be understandable is an uncontroversial requirement. In the data mining literature, however, it plays hardly any role, if any at all. In practice, though, understandability is often even more important than, e.g., accuracy. Understandability does not mean that models should be simple; it means that one should be able to understand the predictions of models. In this paper we introduce tools to understand arbitrary classifiers defined on discrete data. In particular, we introduce Explanations, which provide insight at a local level: they explain why a classifier classifies a data point as it does. For global insight, we introduce attribute weights: the higher the weight of an attribute, the more often it is decisive in the classification of a data point. To illustrate our tools, we describe a case study in the prediction of small genes, a notoriously hard problem in bioinformatics.
Keywords :
biology computing; data mining; arbitrary classifiers; bioinformatics; discrete classifiers; gene prediction; Area measurement; Biological system modeling; Classification tree analysis; Data analysis; Databases; Laboratories; Predictive models; Testing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3018-5
Type :
conf
DOI :
10.1109/ICDM.2007.40
Filename :
4470307