DocumentCode :
2772777
Title :
Permutation Tests for Studying Classifier Performance
Author :
Ojala, Markus ; Garriga, Gemma C.
Author_Institution :
Dept. of Inf. & Comput. Sci., Helsinki Univ. of Technol., Helsinki, Finland
fYear :
2009
fDate :
6-9 Dec. 2009
Firstpage :
908
Lastpage :
913
Abstract :
We explore the framework of permutation-based p-values for assessing the behavior of the classification error. In this paper we study two simple permutation tests. The first test estimates the null distribution by permuting the labels in the data; this has been used extensively in classification problems in computational biology. The second test produces permutations of the features within classes, inspired by restricted randomization techniques traditionally used in statistics. We study the properties of these tests and present an extensive empirical evaluation on real and synthetic data. Our analysis shows that studying the classification error via permutation tests is effective; in particular, the restricted permutation test clearly reveals whether the classifier exploits the interdependency between the features in the data.
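The two tests described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the nearest-centroid classifier, the use of training error instead of a cross-validated error, the function names, and the empirical p-value formula (count of permuted errors no larger than the observed error, plus one, divided by the number of permutations plus one) are all assumptions made for the example.

```python
import numpy as np


def nearest_centroid_error(X, y):
    """Training error of a nearest-centroid classifier (stand-in classifier, an assumption)."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance from every sample to every class centroid.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float(np.mean(pred != y))


def permute_labels(X, y, rng):
    """Test 1: shuffle the class labels, breaking the link between data and labels."""
    return X, rng.permutation(y)


def permute_features_within_classes(X, y, rng):
    """Test 2: shuffle each feature column independently inside each class.

    Per-class marginal distributions are kept, while the dependency
    between features within a class is destroyed.
    """
    Xp = X.copy()
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        for j in range(X.shape[1]):
            Xp[idx, j] = X[rng.permutation(idx), j]
    return Xp, y


def permutation_p_value(X, y, permute, error_fn=nearest_centroid_error,
                        n_perm=200, seed=0):
    """Empirical p-value: share of permuted datasets with error <= the observed error."""
    rng = np.random.default_rng(seed)
    observed = error_fn(X, y)
    null_errors = np.array(
        [error_fn(*permute(X, y, rng)) for _ in range(n_perm)]
    )
    p = (np.sum(null_errors <= observed) + 1) / (n_perm + 1)
    return observed, float(p)
```

Under these assumptions, a small p-value from `permute_labels` suggests the classifier has found a real connection between data and labels, while a small p-value from `permute_features_within_classes` suggests it relies on dependencies between features rather than on each feature separately.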
Keywords :
data mining; learning (artificial intelligence); pattern classification; probability; computational biology; null distribution estimation; permutation tests; Computer errors; Computer science; Data analysis; Information technology; Machine learning; Statistical analysis; Statistical distributions; System testing; classification; labeled data; restricted randomization; significance testing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location :
Miami, FL
ISSN :
1550-4786
Print_ISBN :
978-1-4244-5242-2
Electronic_ISBN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2009.108
Filename :
5360332