DocumentCode :
116478
Title :
On the endogenesis of Twitter's Spritzer and Gardenhose sample streams
Author :
Kergl, Dennis ; Roedler, Robert ; Seeber, Sebastian
fYear :
2014
fDate :
17-20 Aug. 2014
Firstpage :
357
Lastpage :
364
Abstract :
Many recent publications deal with trend analysis, event detection, or opinion mining on social media data. Twitter, as the most important microblogging service, is often the focus of these works because it offers free access to large volumes of data. This free access, on which many publications rely, consists of a random subset of the complete public status stream. Publications rely in particular on the uniform distribution of tweets in this sample stream, and so, to date, one has had to trust Twitter's statement that the sample data is indeed uniformly distributed. In our research on the technical properties of Twitter's streaming data, we found evidence that reveals the method Twitter uses to decide which tweets show up in the random sample streams. A deeper insight into this process points to possible reasons why Twitter chose the presented sampling method. For this purpose, we provide an overview of how Twitter's unique tweet IDs are generated and explain the regularities of each part of a tweet ID. This also yields some information about Twitter's tweet ID generating infrastructure and about what kind of knowledge can be derived from small features such as the tweet ID.
Keywords :
data mining; sampling methods; social networking (online); Gardenhose sample streams; Twitter spritzer endogenesis; Twitter streaming data; event detection; microblogging service; opinion mining; public status stream; random sample streams; sampling method; social media data; trend analysis; tweet ID; Conferences; Data mining; Instruction sets; Market research; Media; Twitter;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Advances in Social Networks Analysis and Mining (ASONAM), 2014 IEEE/ACM International Conference on
Conference_Location :
Beijing
Type :
conf
DOI :
10.1109/ASONAM.2014.6921610
Filename :
6921610