• DocumentCode
    46149
  • Title
    Structured Learning from Heterogeneous Behavior for Social Identity Linkage
  • Author
    Siyuan Liu ; Shuhui Wang ; Feida Zhu
  • Author_Institution
    Heinz Coll., Carnegie Mellon Univ., Pittsburgh, PA, USA
  • Volume
    27
  • Issue
    7
  • fYear
    2015
  • fDate
    July 1 2015
  • Firstpage
    2005
  • Lastpage
    2019
  • Abstract
    Social identity linkage across different social media platforms is of critical importance to business intelligence, because it yields from social data a deeper understanding and more accurate profiling of users. In this paper, we propose a solution framework, HYDRA, which consists of three key steps: (I) we model heterogeneous behavior by long-term topical distribution analysis and multi-resolution temporal behavior matching to cope with high noise and missing information, and describe the behavior similarity of each user pair by a multi-dimensional similarity vector; (II) we build structure consistency models to maximize the structure and behavior consistency on users' core social structures across different platforms, so that identity linkage can be performed on groups of users, going beyond the individual-level linkage of previous studies; and (III) we propose a normalized-margin-based linkage function formulation, and learn the linkage function by multi-objective optimization in which both supervised pair-wise linkage function learning and structure consistency maximization are conducted towards a unified Pareto optimal solution. The model can deal with drastically missing information and avoids the curse of dimensionality when handling high-dimensional sparse representations. Extensive experiments on 10 million users across seven popular social network platforms demonstrate that HYDRA correctly identifies real user linkages across different platforms from massive, noisy user behavior records, outperforming existing state-of-the-art approaches by at least 20 percent under different settings and performing four times better in most settings.
  • Keywords
    competitive intelligence; learning (artificial intelligence); social networking (online); vectors; HYDRA; behavior similarity; business intelligence; heterogeneous behavior; long-term topical distribution analysis; multidimensional similarity vector; multiobjective optimization; multiresolution temporal behavior matching; normalized-margin-based linkage function formulation; social data; social identity linkage; social media platforms; structure consistency maximization; structure consistency models; structured learning; supervised pair-wise linkage function learning; unified Pareto optimal solution; user core social structure; Couplings; Face; Media; Optimization; Social network services; Trajectory; Vectors; Social identity linkage; heterogeneous behavior; multi-resolution temporal information matching; structured learning
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2015.2397434
  • Filename
    7029102