DocumentCode :
3167044
Title :
Transfer Learning across Cancers on DNA Copy Number Variation Analysis
Author :
Huanan Zhang ; Ze Tian ; Rui Kuang
Author_Institution :
Dept. of Comput. Sci. & Eng., Univ. of Minnesota Twin Cities, Minneapolis, MN, USA
fYear :
2013
fDate :
7-10 Dec. 2013
Firstpage :
1283
Lastpage :
1288
Abstract :
DNA copy number variations (CNVs) are prevalent in all types of tumors. It remains a challenge to determine how CNVs drive tumorigenic mechanisms that are either universal across cancer types or specific to particular ones. To address this problem, we introduce a transfer learning framework that discovers CNVs shared across different tumor types, as well as CNVs specific to each tumor type, from genome-wide CNV data measured by array CGH and SNP genotyping arrays. The proposed model, Transfer Learning with Fused LASSO (TLFL), detects latent CNV components from multiple CNV datasets of different tumor types and distinguishes the CNVs that are common across the datasets from those that are specific to each dataset. Both the common and the type-specific CNVs are detected as latent components in a matrix factorization coupled with fused LASSO on adjacent CNV probe features. TLFL uses the common latent components underlying the multiple datasets to transfer knowledge across tumor types. In simulations and in experiments on real cancer CNV datasets, the latent components detected by TLFL, when used as features, improved the classification of patient samples in each individual dataset compared with the model without knowledge transfer. In a cross-dataset analysis of bladder cancer and a cross-domain analysis of breast cancer and ovarian cancer, TLFL also learned latent CNV components that are both predictive of tumor stage and correlated with known cancer genes.
Keywords :
DNA; cancer; learning (artificial intelligence); matrix decomposition; medical computing; pattern classification; tumours; CNVs; DNA copy number variation analysis; SNP genotyping array; TLFL; adjacent CNV probe features; array CGH; bladder cancer; breast cancer; cancer genes; common latent components; cross-dataset analysis; cross-domain analysis; genome-wide CNV data; latent components; matrix factorization; ovarian cancer; patient sample classification; transfer learning with fused LASSO; tumorgenic mechanisms; Arrays; Biological cells; Cancer; Genomics; Hidden Markov models; Probes; Tumors; Cancer Genomics; DNA Copy Number; Fused LASSO Components; Transfer Learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining (ICDM), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
ISSN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2013.58
Filename :
6729635