• DocumentCode
    2157264
  • Title
    Session Identification Algorithm for Web Log Mining
  • Author
    Peng Zhu; Ming-sheng Zhao
  • Author_Institution
    Dept. of Inf. Manage., Nanjing Univ., Nanjing, China
  • fYear
    2010
  • fDate
    24-26 Aug. 2010
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This paper studies session identification in web log mining and proposes an improved algorithm based on an average time threshold. The algorithm dynamically calculates the average interval between request records within a session and adjusts the time threshold individually. Compared with the traditional algorithm, which defines a uniform threshold for all users' web pages, it identifies long sessions more accurately. Finally, the algorithm re-identifies the generated candidate session sets, which makes the identified sessions more reasonable and effective. Experimental results show that the quality of session identification is improved.
  • Keywords
    Internet; data mining; Web log mining; Web pages; average time threshold value; research object; session identification; uniform threshold value; Algorithm design and analysis; Cleaning; Data mining; Heuristic algorithms; IP networks; Indexes; Knowledge engineering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Management and Service Science (MASS), 2010 International Conference on
  • Conference_Location
    Wuhan
  • Print_ISBN
    978-1-4244-5325-2
  • Electronic_ISBN
    978-1-4244-5326-9
  • Type
    conf
  • DOI
    10.1109/ICMSS.2010.5576547
  • Filename
    5576547
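
The abstract describes adjusting the session time threshold from the average interval between requests in the current session. The record does not give the exact rule, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm: the function name `split_sessions`, the parameters `initial_threshold` and `factor`, and the rule "threshold = max(initial_threshold, factor * average gap)" are all hypothetical. The re-identification pass over candidate sessions mentioned in the abstract is not shown.

```python
from typing import List


def split_sessions(
    timestamps: List[float],
    initial_threshold: float = 600.0,  # assumed fallback threshold, in seconds
    factor: float = 3.0,               # assumed multiplier on the average gap
) -> List[List[float]]:
    """Split one user's sorted request timestamps (seconds) into sessions.

    Hypothetical rule: a request starts a new session when its gap to the
    previous request exceeds a threshold derived from the average gap
    observed so far in the current session.
    """
    sessions: List[List[float]] = []
    current: List[float] = []
    for t in timestamps:
        if not current:
            current = [t]
            continue
        gap = t - current[-1]
        if len(current) < 2:
            # No interval observed yet in this session: use the fallback.
            threshold = initial_threshold
        else:
            # Average interval between consecutive requests in this session.
            avg_gap = (current[-1] - current[0]) / (len(current) - 1)
            threshold = max(initial_threshold, factor * avg_gap)
        if gap > threshold:
            sessions.append(current)
            current = [t]
        else:
            current.append(t)
    if current:
        sessions.append(current)
    return sessions


if __name__ == "__main__":
    # A slow reader: a fixed 600 s threshold would cut before the last request.
    print(split_sessions([0, 500, 1000, 1700]))
    # A fast reader followed by a long pause: two sessions.
    print(split_sessions([0, 10, 20, 30, 2000, 2010]))
```

In this sketch, the slow reader's requests stay in one session because the threshold grows with that session's own pace, which is one way to read the abstract's claim about long sessions.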