• DocumentCode
    814630
  • Title

    Web User-Session Inference by Means of Clustering Techniques

  • Author

    Bianco, Andrea ; Mardente, Gianluca ; Mellia, Marco ; Munafò, Maurizio ; Muscariello, Luca

  • Author_Institution
    Dipt. di Elettron., Politec. di Torino, Torino
  • Volume
    17
  • Issue
    2
  • fYear
    2009
  • fDate
    4/1/2009 12:00:00 AM
  • Firstpage
    405
  • Lastpage
    416
  • Abstract
    This paper focuses on the definition and identification of "Web user-sessions", aggregations of several TCP connections generated by the same source host. The identification of a user-session is nontrivial. Traditional approaches rely on threshold-based mechanisms. However, these techniques are very sensitive to the value chosen for the threshold, which may be difficult to set correctly. By applying clustering techniques, we define a novel methodology to identify Web user-sessions without requiring an a priori definition of threshold values. We define a clustering-based approach, we discuss pros and cons of this approach, and we apply it to real traffic traces. The proposed methodology is applied to artificially generated traces to evaluate its benefits against traditional threshold-based approaches. We also analyze the characteristics of user-sessions extracted by the clustering methodology from real traces and study their statistical properties. Web user-sessions tend to be Poisson, but correlation may arise during periods of anomalous network or host behavior.
  • Keywords
    Internet; pattern clustering; statistical analysis; transport protocols; TCP connections; Web user-session inference; clustering techniques; statistical properties; threshold-based mechanisms; user-session identification; clustering methods; traffic measurement; Web traffic characterization
  • fLanguage
    English
  • Journal_Title
    Networking, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6692
  • Type

    jour

  • DOI
    10.1109/TNET.2008.927009
  • Filename
    4578709