• DocumentCode
    1823000
  • Title
    A statistical framework for streaming graph analysis
  • Author
    Fairbanks, James ; Ediger, David ; McColl, R. ; Bader, David A. ; Gilbert, Eric
  • Author_Institution
    Georgia Inst. of Technol., Atlanta, GA, USA
  • fYear
    2013
  • fDate
    25-28 Aug. 2013
  • Firstpage
    341
  • Lastpage
    347
  • Abstract
    In this paper we propose a new methodology for gaining insight into the temporal aspects of social networks. In order to develop higher-level, large-scale data analysis methods for classification, prediction, and anomaly detection, a solid foundation of analytical techniques is required. We present a novel approach to the analysis of these networks that leverages time series and statistical techniques to quantitatively describe the temporal nature of a social network. We report on the application of our approach toward a real data set and successfully visualize high-level changes to the network as well as discover outlying vertices. The real-time prediction of new connections given the previous connections in a graph is a notoriously difficult task. The proposed technique avoids this difficulty by modeling statistics computed from the graph over time. Vertex statistics summarize topological information as real numbers, which allows us to leverage the existing fields of computational statistics and machine learning. This creates a modular approach to analysis in which methods can be developed that are agnostic to the metrics and algorithms used to process the graph. We demonstrate these techniques using a collection of Twitter posts related to Hurricane Sandy. We study the temporal nature of betweenness centrality and clustering coefficients while producing multiple visualizations of a social network dataset with 1.2 million edges. We successfully detect vertices whose triangle-forming behavior is anomalous.
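    The abstract's central idea, reducing a changing graph to per-vertex real-valued statistics that can then be treated as ordinary time series, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the cumulative batching scheme, and the z-score outlier rule (a simple stand-in for whatever statistical model the paper actually uses) are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean, pstdev


def clustering_coefficients(adj):
    """Local clustering coefficient of every vertex in an undirected graph."""
    coeffs = {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs[v] = 0.0  # no pair of neighbors, so no possible triangle
            continue
        # Count neighbor pairs that are themselves connected (triangles at v).
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs[v] = 2.0 * links / (k * (k - 1))
    return coeffs


def vertex_time_series(batches):
    """Apply edge batches cumulatively; record one statistic per vertex per batch.

    Hypothetical helper: `batches` is an iterable of lists of (u, v) edges.
    """
    adj = defaultdict(set)
    series = defaultdict(list)
    for batch in batches:
        for u, v in batch:
            if u != v:  # ignore self-loops
                adj[u].add(v)
                adj[v].add(u)
        for v, c in clustering_coefficients(adj).items():
            series[v].append(c)
    return series


def outliers(values, threshold=2.0):
    """Vertices whose value lies more than `threshold` std devs from the mean.

    Assumed rule for illustration only; the threshold is arbitrary.
    """
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return []
    return sorted(v for v, x in values.items() if abs(x - mu) / sigma > threshold)
```

    Because the outlier step sees only numbers, the clustering coefficient could be swapped for another vertex statistic such as betweenness centrality without changing the downstream analysis, which is the modularity the abstract describes.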
  • Keywords
    data analysis; learning (artificial intelligence); network theory (graphs); social networking (online); time series; Twitter posts; anomaly detection; betweenness centrality; clustering coefficients; computational statistics; Hurricane Sandy; large-scale data analysis; machine learning; network analysis; real-time connection prediction; social network dataset visualizations; social networks; streaming graph analysis; triangle-forming behavior; vertex statistics; Correlation; Hurricanes; Kernel; Measurement; Time series analysis; Twitter
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advances in Social Networks Analysis and Mining (ASONAM), 2013 IEEE/ACM International Conference on
  • Conference_Location
    Niagara Falls, ON
  • Type
    conf
  • Filename
    6785729