Title :
Finding the best ridge regression subset by genetic algorithms: applications to multilocus quantitative trait mapping
Author :
Zhang, Bin ; Horvath, Steve
Author_Institution :
Dept. of Human Genetics, California Univ., Los Angeles, CA, USA
Abstract :
Genetic algorithms (GAs) are increasingly used in large and complex optimization problems. Here we use GAs to optimize fitness functions related to ridge regression, which is a classical statistical procedure for dealing with a large number of features in a multivariable, linear regression setting. The algorithm avoids overfitting, gracefully handles collinearity, and leads to easily interpretable results. We use the method to model the relationship between a quantitative trait and genetic markers in a mouse cross involving 69 F2 mice. The approach will be useful in the context of many genomic data sets where the number of features far exceeds the number of observations and where features can be highly correlated.
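The idea in the abstract, a genetic algorithm searching over feature subsets with each subset scored by a ridge regression fit, can be illustrated with a minimal sketch. The abstract does not specify the fitness function, the GA operators, or the ridge penalty, so everything below is an assumption chosen for illustration: k-fold cross-validated squared error as the fitness, truncation selection, uniform crossover, bit-flip mutation, elitism, and a fixed penalty `lam`. The names `ridge_fit`, `cv_error`, and `ga_ridge_subset` are hypothetical and are not taken from the paper.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients: solve (X'X + lam*I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def cv_error(X, y, mask, lam=1.0, k=5):
    """Assumed fitness: mean squared k-fold prediction error of ridge
    regression on the columns selected by the boolean `mask`.
    Assumes X and y are roughly centered (no intercept is fitted)."""
    if not mask.any():
        return np.inf  # an empty subset cannot be fitted
    Xs = X[:, mask]
    n = len(y)
    sq_err = 0.0
    for test in np.array_split(np.arange(n), k):
        train = np.setdiff1d(np.arange(n), test)
        beta = ridge_fit(Xs[train], y[train], lam)
        sq_err += np.sum((y[test] - Xs[test] @ beta) ** 2)
    return sq_err / n

def ga_ridge_subset(X, y, pop_size=30, generations=40,
                    p_mut=0.02, lam=1.0, seed=0):
    """Search feature subsets with a simple GA (illustrative operators)."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    pop = rng.random((pop_size, p)) < 0.1      # sparse random initial subsets
    best_mask, best_err = pop[0].copy(), np.inf
    for _ in range(generations):
        errs = np.array([cv_error(X, y, m, lam) for m in pop])
        order = np.argsort(errs)
        if errs[order[0]] < best_err:
            best_err, best_mask = errs[order[0]], pop[order[0]].copy()
        parents = pop[order[: pop_size // 2]]  # truncation selection
        children = []
        while len(children) < pop_size - 1:
            a, b = parents[rng.integers(len(parents), size=2)]
            cross = rng.random(p) < 0.5        # uniform crossover
            child = np.where(cross, a, b)
            child ^= rng.random(p) < p_mut     # bit-flip mutation
            children.append(child)
        pop = np.vstack([best_mask] + children)  # elitism: keep best so far
    return best_mask, best_err
```

Calling `ga_ridge_subset(X, y)` on an n-by-p marker matrix `X` and a trait vector `y` returns a boolean mask of selected columns together with its cross-validated error. Because each candidate subset is scored on held-out folds and shrunk by the ridge penalty, the search can be run even when p exceeds n, which is the setting the abstract describes.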
Keywords :
biology computing; genetic algorithms; genetics; regression analysis; F2 mice; fitness functions; genetic markers; genomic data sets; linear regression setting; multilocus quantitative trait mapping; multivariable setting; optimization; ridge regression subset; statistical procedure; Accuracy; Bioinformatics; Genomics; Humans; Least squares approximation; Linear regression; Mice; Predictive models; Stochastic processes
Conference_Title :
Engineering in Medicine and Biology Society, 2004. IEMBS '04. 26th Annual International Conference of the IEEE
Conference_Location :
San Francisco, CA
Print_ISBN :
0-7803-8439-3
DOI :
10.1109/IEMBS.2004.1403798