Title :
A probabilistic approach towards modeling email network with realistic features
Author :
Quangang Li ; Jinqiao Shi ; Tingwen Liu ; Li Guo ; Zhiguang Qin
Author_Institution :
Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China
Abstract :
Email plays a very important role in our daily life. Much work has been done on email networks. These studies mostly require real email network datasets and reliable models to analyze user information and understand the mechanisms of network evolution. However, much of this research is constrained by the absence of real large-scale email datasets. Although email communication is ubiquitous, very few large-scale email datasets are available that satisfy different research purposes. Due to privacy policies and restricted permissions, it is arduous to collect a real large-scale email dataset in a short time. Various social network models are usually used to create synthetic email networks. However, these models focus on modeling several structural properties of the network without considering user behaviour patterns, so they are not appropriate for generating large-scale realistic synthetic email network datasets. To this end, we propose a probabilistic model that constructs large-scale synthetic email datasets from a small captured email log. More importantly, the generated synthetic dataset matches real email network properties and individual communication patterns. Moreover, the model has linear complexity and can be parallelized easily. Experimental results on the Enron dataset demonstrate these benefits of our model.
Keywords :
computational complexity; data privacy; electronic mail; probability; social networking (online); Enron dataset; captured email log; email communication; email network property; individual communication pattern; large-scale email dataset; large-scale realistic synthetic email network dataset; linear complexity; network evolution; privacy policy; probabilistic approach; probabilistic model; realistic features; social network model; structural property; user behaviour pattern; user information; Analytical models; Communities; Complexity theory; Computational modeling; Electronic mail; Social network services; Training; Dirichlet; Email network; generative model; simulation; snapshot;
Conference_Title :
Computer Communication and Networks (ICCCN), 2014 23rd International Conference on
Conference_Location :
Shanghai
DOI :
10.1109/ICCCN.2014.6911760