• DocumentCode
    47362
  • Title
    Kernelized Bayesian Matrix Factorization
  • Author
    Gonen, Mehmet ; Kaski, Samuel
  • Author_Institution
    Sage Bionetworks, Seattle, WA, USA
  • Volume
    36
  • Issue
    10
  • fYear
    2014
  • fDate
    Oct. 2014
  • Firstpage
    2047
  • Lastpage
    2060
  • Abstract
    We extend kernelized matrix factorization with a full-Bayesian treatment and with an ability to work with multiple side information sources expressed as different kernels. Kernels have been introduced to integrate side information about the rows and columns, which is necessary for making out-of-matrix predictions. We discuss specifically binary output matrices but extensions to real-valued matrices are straightforward. We extend the state of the art in two key aspects: (i) A full-conjugate probabilistic formulation of the kernelized matrix factorization enables an efficient variational approximation, whereas full-Bayesian treatments are not computationally feasible in the earlier approaches. (ii) Multiple side information sources are included, treated as different kernels in multiple kernel learning which additionally reveals which side sources are informative. We then show that the framework can also be used for supervised and semi-supervised multilabel classification and multi-output regression, by considering samples and outputs as the domains where matrix factorization operates. Our method outperforms alternatives in predicting drug-protein interactions on two data sets. On multilabel classification, our algorithm obtains the lowest Hamming losses on 10 out of 14 data sets compared to five state-of-the-art multilabel classification algorithms. We finally show that the proposed approach outperforms alternatives in multi-output regression experiments on a yeast cell cycle data set.
  • Keywords
    approximation theory; biology computing; learning (artificial intelligence); matrix decomposition; pattern classification; probability; Hamming losses; drug-protein interactions; full-Bayesian treatment; full-conjugate probabilistic formulation; kernelized Bayesian matrix factorization; multioutput regression; out-of-matrix predictions; real-valued matrix; semisupervised multilabel classification; side information sources; supervised multilabel classification; variational approximation; yeast cell cycle data set; Approximation methods; Bayes methods; Computational modeling; Covariance matrices; Kernel; Prediction algorithms; Probabilistic logic; Automatic relevance determination; biological interaction networks; large margin learning; matrix factorization; multilabel classification; multiple kernel learning; multiple output regression; variational approximation;
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2014.2313125
  • Filename
    6777351