• DocumentCode
    2031
  • Title
    Data Fusion by Matrix Factorization
  • Author
    Zitnik, Marinka ; Zupan, Blaz
  • Author_Institution
    Fac. of Comput. & Inf. Sci., Univ. of Ljubljana, Ljubljana, Slovenia
  • Volume
    37
  • Issue
    1
  • fYear
    2015
  • fDate
    Jan. 1 2015
  • Firstpage
    41
  • Lastpage
    53
  • Abstract
    For most problems in science and engineering we can obtain data sets that describe the observed system from various perspectives and record the behavior of its individual components. Heterogeneous data sets can be collectively mined by data fusion. Fusion can focus on a specific target relation and exploit directly associated data together with contextual data and data about the system's constraints. In the paper we describe a data fusion approach with penalized matrix tri-factorization (DFMF) that simultaneously factorizes data matrices to reveal hidden associations. The approach can directly consider any data that can be expressed in a matrix, including those from feature-based representations, ontologies, associations and networks. We demonstrate the utility of DFMF for a gene function prediction task with eleven different data sources and for prediction of pharmacologic actions by fusing six data sources. Our data fusion algorithm compares favorably to alternative data integration approaches and achieves higher accuracy than can be obtained from any single data source alone.
  • Keywords
    biology computing; data integration; genetics; matrix decomposition; ontologies (artificial intelligence); sensor fusion; DFMF; contextual data; data fusion; data integration approaches; feature-based representations; gene function prediction task; heterogeneous data sets; matrix factorization; ontologies; penalized matrix trifactorization; pharmacologic actions; target relation; Approximation methods; Convergence; Data integration; Data models; Diseases; Linear programming; Predictive models; Data fusion; bioinformatics; cheminformatics; data mining; intermediate data integration; matrix factorization
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2014.2343973
  • Filename
    6867358