Title of article :
Application of SIMPLISMA purity function for variable selection in multivariate regression analysis: A case study of protein secondary structure determination from infrared spectra
Author/Authors :
Bogomolov, Andrey and Hachey, Michel
Issue Information :
Biannual, serial issue number, year 2007
Pages :
11
From page :
132
To page :
142
Abstract :
A novel approach for the pre-selection of wavelengths, to be used in combination with Partial Least Squares (PLS) or other multivariate regression techniques, is presented. This variable selection method uses the purity function, originally suggested in the SIMPLe-to-use Interactive Self-modeling Mixture Analysis (SIMPLISMA) algorithm, to map out the regions of potentially influential variables. The selected intervals are then individually tested in practical modeling and prediction, and an optimal subset of variables is obtained. The algorithm is simple and intuitive and does not rely on iterative variable searches. The method was tested on a set of infrared protein spectra in order to improve the quantitative determination of the fractions of two secondary structure elements, α-helices and β-strands (β-sheets), in the protein polypeptide chain. Results comparable to those obtained through interval PLS (iPLS), an exhaustive search-based algorithm, were achieved in this study. The method was shown to be particularly beneficial when the variables are weighted by their inverse standard deviation.
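The abstract does not state the purity formula, so the following is only a minimal sketch of the pre-selection idea. It assumes the first-pass purity form commonly quoted for SIMPLISMA (standard deviation of a variable divided by its mean plus an offset) and non-negative spectra. The names `purity` and `candidate_intervals`, the `offset_frac` default, and the threshold value are hypothetical choices made for illustration, not taken from the article. Testing each interval in a PLS model, as the abstract describes, is not shown.

```python
from statistics import mean, pstdev


def purity(spectra, offset_frac=0.05):
    """Per-variable purity: std / (mean + offset).

    spectra: list of rows (one spectrum per row, one variable per column).
    offset_frac: assumed offset, as a fraction of the largest mean intensity;
    it damps the purity of low-intensity (noisy) variables.
    """
    columns = list(zip(*spectra))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    offset = offset_frac * max(means)
    return [s / (m + offset) for s, m in zip(stds, means)]


def candidate_intervals(p, threshold):
    """Contiguous index ranges (start, stop), stop exclusive, where p >= threshold."""
    intervals, start = [], None
    for j, value in enumerate(p):
        if value >= threshold and start is None:
            start = j  # a run of high-purity variables begins
        elif value < threshold and start is not None:
            intervals.append((start, j))  # the run ended at the previous index
            start = None
    if start is not None:
        intervals.append((start, len(p)))  # the run reaches the last variable
    return intervals


# Toy data: two "spectra" of four variables; only the second variable varies.
toy = [[1.0, 1.0, 1.0, 1.0],
       [1.0, 3.0, 1.0, 1.0]]
print(candidate_intervals(purity(toy), threshold=0.1))
```

Each returned interval would then be a candidate wavelength region to evaluate individually in a regression model.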
Keywords :
variable selection , PLS , SIMPLISMA , Purity function , protein secondary structure
Journal title :
Chemometrics and Intelligent Laboratory Systems
Serial Year :
2007
Record number :
1461995