• DocumentCode
    58116
  • Title

    Supervised Multi-View Canonical Correlation Analysis (sMVCCA): Integrating Histologic and Proteomic Features for Predicting Recurrent Prostate Cancer

  • Author

    Lee, Gene ; Singanamalli, Asha ; Haibo Wang ; Feldman, Michael D. ; Master, Stephen R. ; Shih, Natalie N. C. ; Spangler, Elaine ; Rebbeck, Timothy ; Tomaszewski, John E. ; Madabhushi, Anant

  • Author_Institution
    Dept. of Biomed. Eng., Case Western Reserve Univ., Cleveland, OH, USA
  • Volume
    34
  • Issue
    1
  • fYear
    2015
  • fDate
    Jan. 2015
  • Firstpage
    284
  • Lastpage
    297
  • Abstract
    In this work, we present a new methodology to facilitate prediction of recurrent prostate cancer (CaP) following radical prostatectomy (RP) via the integration of quantitative image features and protein expression in the excised prostate. Creating a fused predictor from high-dimensional data streams is challenging because the classifier must 1) account for the “curse of dimensionality” problem, which hinders classifier performance when the number of features exceeds the number of patient studies and 2) balance potential mismatches in the number of features across different channels to avoid classifier bias towards channels with more features. Our new data integration methodology, supervised Multi-view Canonical Correlation Analysis (sMVCCA), aims to integrate infinite views of high-dimensional data to provide more amenable data representations for disease classification. Additionally, we demonstrate sMVCCA using Spearman's rank correlation which, unlike Pearson's correlation, can account for nonlinear correlations and outliers. Forty CaP patients with pathological Gleason scores 6-8 were considered for this study. 21 of these men revealed biochemical recurrence (BCR) following RP, while 19 did not. For each patient, 189 quantitative histomorphometric attributes and 650 protein expression levels were extracted from the primary tumor nodule. The fused histomorphometric/proteomic representation via sMVCCA combined with a random forest classifier predicted BCR with a mean AUC of 0.74 and a maximum AUC of 0.9286. We found sMVCCA to perform statistically significantly (p < 0.05) better than comparative state-of-the-art data fusion strategies for predicting BCR. Furthermore, Kaplan-Meier analysis demonstrated improved BCR-free survival prediction for the sMVCCA-fused classifier as compared to histology or proteomic features alone.
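    Illustrative note (not part of the published abstract): the contrast between Spearman's and Pearson's correlation mentioned above can be sketched with a small standard-library example. This is only a toy illustration under assumed data, not the authors' sMVCCA implementation; the helper names `pearson`, `ranks`, and `spearman` and the exponential test signal are hypothetical choices made for this sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Assumed toy data: a monotonic but strongly nonlinear relationship.
x = list(range(1, 11))
y = [math.exp(v) for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

    Because Spearman's coefficient depends only on rank order, it reports a perfect association for any strictly increasing relationship, whereas Pearson's coefficient falls well below 1 here because the relationship is not linear and the largest values dominate the sums.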
  • Keywords
    biomedical MRI; cancer; correlation methods; image representation; medical image processing; proteins; proteomics; tumours; BCR-free survival prediction; Kaplan-Meier analysis; Pearsons correlation; Spearmans rank correlation; biochemical recurrence; data fusion strategy; data representation; dimensionality curse problem; disease classification; high-dimensional data stream; histomorphometric representation; nonlinear correlation; primary tumor nodule; protein expression level; proteomic feature; proteomic representation; quantitative image feature; radical prostatectomy; random forest classifier; recurrent prostate cancer prediction; sMVCCA; supervised multiview canonical correlation analysis; Correlation; Feature extraction; Optimization; Prostate cancer; Proteins; Proteomics; Vectors; Data fusion; digital pathology; dimensionality reduction; mass spectrometry; prostate cancer; proteomics
  • fLanguage
    English
  • Journal_Title
    Medical Imaging, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0278-0062
  • Type

    jour

  • DOI
    10.1109/TMI.2014.2355175
  • Filename
    6893009