DocumentCode :
29036
Title :
Dimension Reduction for p53 Protein Recognition by Using Incremental Partial Least Squares
Author :
Xue-Qiang Zeng ; Guo-Zheng Li
Author_Institution :
Key Lab. of Embedded Syst. & Service Comput., Tongji Univ., Shanghai, China
Volume :
13
Issue :
2
fYear :
2014
fDate :
Jun-14
Firstpage :
73
Lastpage :
79
Abstract :
The tumor suppressor protein p53 is found in mutated form in many kinds of human cancers, and restoring active p53 would lead to tumor regression. In recent years, more and more data have been extracted from biophysical simulations, so the modelling of mutant p53 transcriptional activity suffers from a huge number of instances and a high feature dimension. Incremental feature extraction is effective in facilitating the analysis of large-scale data. However, most current incremental feature extraction methods are not suitable for processing big data with a high feature dimension. Partial Least Squares (PLS) has been demonstrated to be an effective dimension reduction technique for classification. In this paper, we design a highly efficient and powerful algorithm named Incremental Partial Least Squares (IPLS), which conducts a two-stage extraction process. In the first stage, the PLS target function is adapted to be incremental by updating the historical mean in order to extract the leading projection direction. In the second stage, the other projection directions are calculated through the equivalence between the PLS vectors and the Krylov sequence. We compare IPLS with state-of-the-art incremental feature extraction methods, namely Incremental Principal Component Analysis, Incremental Maximum Margin Criterion and Incremental Inter-class Scatter, on real p53 protein data. Empirical results show that IPLS performs better than the other methods in terms of balanced classification accuracy.
Keywords :
bioinformatics; cancer; feature extraction; least squares approximations; molecular biophysics; principal component analysis; proteins; tumours; IPLS; Krylov sequence; PLS target function; balanced classification accuracy; dimension reduction; human cancers; incremental feature extraction; incremental interclass scatter; incremental maximum margin criterion; incremental partial least squares; incremental principal component analysis; large-scale data analysis; mutant p53 transcriptional activity; mutated p53; p53 protein recognition; tumor regression; tumor suppressor protein; Algorithm design and analysis; Cancer; Covariance matrices; Feature extraction; Proteins; Tumors; Vectors; Big data; feature extraction; incremental learning; p53 protein; partial least squares;
fLanguage :
English
Journal_Title :
NanoBioscience, IEEE Transactions on
Publisher :
IEEE
ISSN :
1536-1241
Type :
jour
DOI :
10.1109/TNB.2014.2319234
Filename :
6823766