• DocumentCode
    140786
  • Title

    Keyword-based correlated network computation over large social media

  • Author

    Jianxin Li ; Chengfei Liu ; Md Saiful Islam

  • Author_Institution
    Swinburne Univ. of Technol., Melbourne, VIC, Australia
  • fYear
    2014
  • fDate
    March 31 2014-April 4 2014
  • Firstpage
    268
  • Lastpage
    279
  • Abstract
    Recent years have witnessed an unprecedented proliferation of social media, e.g., millions of blog posts, micro-blog posts, and social networks on the Internet. Such social media data can be modeled as a large graph in which nodes represent entities and edges represent the relationships between them. Discovering keyword-based correlated networks in these large graphs is an important primitive in data analysis, because it lets users focus on the information in the graph that concerns them. In this paper, we propose and define the problem of keyword-based correlated network computation over a massive graph. To address it, we first present a novel tree data structure that maintains only the shortest path between any two graph nodes, by which the massive graph can be equivalently transformed into a tree for solving the proposed problem. We then design efficient algorithms to build the transformed tree data structure from a graph offline and to compute the γ-bounded keyword matched subgraphs on the fly from the pre-built tree. To further improve efficiency, we propose weighted shingle-based approximation approaches to measure the correlation among a large number of γ-bounded keyword matched subgraphs. Finally, we develop a merge-sort based approach to efficiently generate the correlated networks. Our extensive experiments demonstrate the efficiency of our algorithms in reducing time and space costs. The experimental results also confirm the effectiveness of our method in discovering correlated networks from three real datasets.
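    To illustrate the general idea of comparing subgraphs by weighted shingles, here is a minimal sketch. It is not the paper's algorithm: the abstract does not define its shingles, weights, or approximation scheme, so the sketch assumes that a shingle is an s-element subset of a subgraph's node set, that a shingle's weight is the sum of hypothetical per-node weights, and that correlation is an exact weighted Jaccard ratio rather than an approximation. The names `shingles`, `weighted_jaccard`, and `node_weight` are illustrative only.

```python
from itertools import combinations


def shingles(nodes, s=2):
    """Return all s-element subsets (shingles) of a node set as sorted tuples."""
    return {tuple(c) for c in combinations(sorted(nodes), s)}


def weighted_jaccard(a, b, weight):
    """Weighted Jaccard similarity of two shingle sets.

    `weight` maps a shingle to a non-negative number (an assumed scheme).
    """
    total = sum(weight(x) for x in a | b)
    if total == 0:
        return 0.0  # empty or zero-weight union: treat as uncorrelated
    return sum(weight(x) for x in a & b) / total


# Hypothetical node weights, e.g. higher for nodes that match a query keyword.
node_weight = {"a": 1.0, "b": 2.0, "c": 2.0, "d": 1.0}


def shingle_weight(shingle):
    """Assumed weighting: a shingle weighs as much as its nodes combined."""
    return sum(node_weight[n] for n in shingle)


g1 = {"a", "b", "c"}  # node set of one keyword matched subgraph
g2 = {"b", "c", "d"}  # node set of another
similarity = weighted_jaccard(shingles(g1), shingles(g2), shingle_weight)
print(similarity)
```

    With these toy inputs the two subgraphs share the single shingle (b, c), so the score is that shingle's weight divided by the total weight of all five distinct shingles. A practical system would likely estimate this ratio by sampling or hashing shingles instead of enumerating them all.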
  • Keywords
    Internet; data analysis; graph theory; merging; social networking (online); sorting; tree data structures; γ-bounded keyword matched subgraphs; graph nodes shortest path; keyword-based correlated network computation; large social media; massive graph; merge-sort based approach; social networks; tree data structure; weighted shingle-based approximation approach; Blogs; Correlation; Keyword search; Measurement; Media; Social network services; Tree data structures; Correlated Networks; Keyword Query; Large Graph; Social Media
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Data Engineering (ICDE), 2014 IEEE 30th International Conference on
  • Conference_Location
    Chicago, IL
  • Type

    conf

  • DOI
    10.1109/ICDE.2014.6816657
  • Filename
    6816657