Title :
Statistical bias and variance of gene selection and cross validation methods: A case study on hypertension prediction
Author :
Gormez, Zeliha ; Kursun, Olcay ; Sertbas, Ahmet ; Aydin, Nizamettin ; Seker, Huseyin
Author_Institution :
Comput. Eng. Dept., Univ. of Istanbul, Istanbul, Turkey
Abstract :
In exploratory association studies of genes with certain diseases, a single gene or a small number of genes (features) related to the disease are selected from among the many thousands investigated. We investigate the statistical bias and variance of simple yet common feature selection algorithms (correlation-based and mutual-information-based) under well-known cross-validation methods (leave-one-out and k-fold) in a gene-finding study for hypertension prediction. Our findings show that the selected genes differ across methods and across cross-validation runs, for both single-gene selection and gene-subset selection.
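The kind of instability the abstract describes can be illustrated with a minimal sketch. It is not the authors' procedure or data: it assumes synthetic Gaussian "expression" values, random binary labels, and absolute Pearson correlation as the selection criterion. The function names `top_gene_by_correlation` and `selection_counts` are hypothetical. The sketch counts how often each gene ranks first across the training folds of k-fold and leave-one-out cross-validation.

```python
import numpy as np
from collections import Counter


def top_gene_by_correlation(X, y):
    """Index of the feature with the largest absolute Pearson correlation with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant columns (or constant y) get correlation 0
    corr = (Xc * yc[:, None]).sum(axis=0) / denom
    return int(np.argmax(np.abs(corr)))


def selection_counts(X, y, k, seed=0):
    """Count how often each gene is ranked first across the k training folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)  # k == len(y) gives leave-one-out
    counts = Counter()
    for held_out in folds:
        train = np.setdiff1d(idx, held_out)
        counts[top_gene_by_correlation(X[train], y[train])] += 1
    return counts


# Synthetic stand-in for a gene expression matrix (assumption: no real signal).
rng = np.random.default_rng(42)
n_samples, n_genes = 40, 500
X = rng.normal(size=(n_samples, n_genes))
y = (rng.random(n_samples) < 0.5).astype(float)

kfold_counts = selection_counts(X, y, k=5)
loo_counts = selection_counts(X, y, k=n_samples)
print("5-fold selections:", dict(kfold_counts))
print("leave-one-out selections:", dict(loo_counts))
```

When more than one gene index appears in a counter, the selected gene depends on which samples were held out. Leave-one-out training sets overlap almost completely, so they tend to agree more often than 5-fold training sets do.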
Keywords :
learning (artificial intelligence); medical computing; statistical analysis; cross validation methods; feature selection algorithms; gene subset selection; hypertension prediction; single gene selection; statistical bias; statistical variance; Prediction algorithms
Conference_Title :
Biomedical and Health Informatics (BHI), 2012 IEEE-EMBS International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4577-2176-2
Electronic_ISBN :
978-1-4577-2175-5
DOI :
10.1109/BHI.2012.6211658