• DocumentCode
    32404
  • Title
    Characterizing Web Page Complexity and Its Impact
  • Author
    Butkiewicz, Michael ; Madhyastha, Harsha V. ; Sekar, Vyas
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of California, Riverside, Riverside, CA, USA
  • Volume
    22
  • Issue
    3
  • fYear
    2014
  • fDate
    Jun-14
  • Firstpage
    943
  • Lastpage
    956
  • Abstract
    Over the years, the Web has evolved from simple text content from one server to a complex ecosystem with different types of content from servers spread across several administrative domains. There is anecdotal evidence of users being frustrated with high page load times. Because page load times are known to directly impact user satisfaction, providers would like to understand if and how the complexity of their Web sites affects the user experience. While there is an extensive literature on measuring Web graphs, Web site popularity, and the nature of Web traffic, there has been little work in understanding how complex individual Web sites are, and how this complexity impacts the clients' experience. This paper is a first step to address this gap. To this end, we identify a set of metrics to characterize the complexity of Web sites both at a content level (e.g., number and size of images) and service level (e.g., number of servers/origins). We find that the distributions of these metrics are largely independent of a Web site's popularity rank. However, some categories (e.g., News) are more complex than others. More than 60% of Web sites have content from at least five non-origin sources, and these contribute more than 35% of the bytes downloaded. In addition, we analyze which metrics are most critical for predicting page render and load times and find that the number of objects requested is the most important factor. With respect to variability in load times, however, we find that the number of servers is the best indicator.
  • Keywords
    Internet; Web sites; human computer interaction; software metrics; Web load time variability; Web load times prediction; Web page rendering prediction; Web server number; Web site content level; Web site popularity rank; Web site service level; Web page complexity characterization; Browsers; Complexity theory; Loading; Measurement; Servers; Web pages; World Wide Web; performance evaluation
  • fLanguage
    English
  • Journal_Title
    Networking, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6692
  • Type
    jour
  • DOI
    10.1109/TNET.2013.2269999
  • Filename
    6557094