• DocumentCode
    3157176
  • Title
    Twitter vs. printed English: An information-theoretic comparison
  • Author
    Glennon, Emma; Sankar, Lalitha; Poor, H. Vincent
  • Author_Institution
    Dept. of Electr. Eng., Princeton Univ., Princeton, NJ, USA
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    3069
  • Lastpage
    3072
  • Abstract
    The popular social networking and microblogging service Twitter contains language that is very different from what is considered proper. This paper quantifies those linguistic differences between printed English and Tweetspeak using information-theoretic concepts. Letter-based n-gram entropies are calculated and compared to analogous data from two corpora of printed English to demonstrate that 1) Twitter's entropy is overall higher than that of printed English, and 2) individual users' entropies are on average higher the less conventional their language use is. The implications for digitally-mediated communication in general are also discussed.
  • Keywords
    computer mediated communication; entropy; linguistics; social networking (online); Tweetspeak; Twitter; digitally-mediated communication; information-theoretic comparison; letter-based n-gram entropies; linguistic differences; microblogging service; printed English; social networking service; Educational institutions; Handicapped aids; Radio access networks; Redundancy; Standards; information entropy; information theory
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2012.6288563
  • Filename
    6288563