Title :
Feature selection and ensemble of regression models for predicting the protein macromolecule dissolution profile
Author :
Ojha, Varun Kumar ; Jackowski, Konrad ; Abraham, Ajith ; Snasel, Vaclav
Author_Institution :
IT4Innovations, VŠB Tech. Univ. of Ostrava, Ostrava, Czech Republic
Date :
July 30 2014-Aug. 1 2014
Abstract :
Predicting the dissolution rate of proteins plays a significant role in pharmaceutical and medical applications. The rate of dissolution of Poly Lactic-co-Glycolic Acid (PLGA) micro- and nanoparticles is influenced by several factors. Considering all factors leads to a dataset with three hundred features, making the prediction difficult and inaccurate. Our present study consists of three phases. First, dimensionality reduction techniques are applied to simplify the task and eliminate irrelevant and redundant attributes. Next, a heterogeneous pool of several classical regression algorithms is created and evaluated; each algorithm in the pool is trained independently on the problem at hand. Finally, we test several ensemble methods to improve the accuracy of the prediction. The Evolutionary Weighted Ensemble method proposed in this paper offered the lowest RMSE and significantly outperformed competing classical algorithms and other ensemble techniques.
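The third phase described in the abstract, combining a pool of regressors with weights found by an evolutionary search, can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' Evolutionary Weighted Ensemble: this record does not specify the algorithm, so the (mu + lambda) selection scheme, the Gaussian mutation, the constraint that weights are non-negative and sum to one, and all function names (`rmse`, `combine`, `normalize`, `evolve_weights`) are hypothetical. The data are toy values invented for illustration, not results from the paper.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def combine(weights, member_preds):
    """Weighted sum of member predictions; member_preds[m][i] is model m on sample i."""
    n = len(member_preds[0])
    return [sum(w * preds[i] for w, preds in zip(weights, member_preds)) for i in range(n)]


def normalize(weights):
    """Clip negative weights to zero and rescale so the weights sum to one (assumed constraint)."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in clipped]


def evolve_weights(member_preds, y_true, generations=200, pop_size=20, sigma=0.1, seed=0):
    """Search for ensemble weights that minimise RMSE with a simple (mu + lambda) strategy."""
    rng = random.Random(seed)
    k = len(member_preds)

    def fitness(w):
        return rmse(y_true, combine(w, member_preds))

    # Seed with uniform averaging and each single model, so the elitist search
    # can never end up worse than any of those baselines on the fitting data.
    population = [[1.0 / k] * k]
    population += [[1.0 if j == m else 0.0 for j in range(k)] for m in range(k)]
    population += [normalize([rng.random() for _ in range(k)])
                   for _ in range(max(0, pop_size - k - 1))]

    for _ in range(generations):
        # Each parent yields one child by Gaussian mutation of its weights.
        children = [normalize([w + rng.gauss(0.0, sigma) for w in parent])
                    for parent in population]
        # Keep the best pop_size individuals among parents and children.
        population = sorted(population + children, key=fitness)[:pop_size]
    return population[0]


# Toy data (invented): three hypothetical regressors scored on five samples.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
member_preds = [
    [1.2, 2.1, 2.7, 4.3, 5.4],
    [0.7, 1.8, 3.2, 3.6, 4.9],
    [1.5, 2.5, 3.5, 4.5, 5.5],
]
weights = evolve_weights(member_preds, y_true)
ensemble_rmse = rmse(y_true, combine(weights, member_preds))
print("weights:", [round(w, 3) for w in weights])
print("ensemble RMSE:", round(ensemble_rmse, 4))
```

In a real study the weights would be fitted on a validation split and the RMSE reported on held-out data; the sketch fits and scores on the same toy samples only to keep it short.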
Keywords :
dissolving; feature selection; mean square error methods; medical computing; polymer blends; proteins; regression analysis; PLGA nanoparticles; RMSE; dissolution rate; evolutionary weighted ensemble method; medical application; pharmaceutical application; polylactic-co-glycolic acid; protein macromolecule dissolution profile prediction; regression models; ensemble; protein dissolution
Conference_Title :
Nature and Biologically Inspired Computing (NaBIC), 2014 Sixth World Congress on
Conference_Location :
Porto
Print_ISBN :
978-1-4799-5936-5
DOI :
10.1109/NaBIC.2014.6921864