Title :
Active User-Based and Ontology-Based Web Log Data Preprocessing for Web Usage Mining
Author :
Khasawneh, Natheer ; Chan, Chien-Chung
Author_Institution :
Dept. of Comput. Eng., Jordan Univ. of Sci. & Technol., Irbid
Abstract :
User identification and session identification are two major steps in preprocessing Web log data for Web usage mining. This paper introduces a fast active user-based user identification algorithm with time complexity O(n). The algorithm uses both an IP address and a finite users' inactive time to identify different users in the Web log. Web site ontology is useful for identifying Web site structure and break points for browsing behavior. For session identification, we present an ontology-based method that utilizes the Web site structure and functionalities to identify different sessions.
Keywords :
Internet; computational complexity; data mining; ontologies (artificial intelligence); Web usage mining; active user-based user identification algorithm; ontology-based Web log data preprocessing; session identification; time complexity; Application software; Cleaning; Computer science; Data engineering; Data preprocessing; Data security; HTML; Navigation; Ontologies; Web mining;
Conference_Title :
Web Intelligence, 2006. WI 2006. IEEE/WIC/ACM International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2747-7
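
Illustrative_Sketch :
The abstract gives only the outline of the user identification step: entries are grouped by IP address, and a user is treated as inactive after a finite idle time. The following is a minimal single-pass sketch of that general idea, not the authors' algorithm. The function name `identify_users`, the `(ip, timestamp)` entry format, the 30-minute threshold, and the assumption that entries arrive sorted by time are all hypothetical choices made for illustration. With a dictionary of active users keyed by IP, each entry is handled in expected constant time, so the pass is linear in the number of entries.

```python
# Hypothetical threshold; the abstract does not state a value.
INACTIVE_TIMEOUT_SECONDS = 30 * 60


def identify_users(entries, timeout=INACTIVE_TIMEOUT_SECONDS):
    """Label each (ip, timestamp) log entry with a user id in one pass.

    Assumes `entries` is sorted by timestamp (in seconds).
    """
    active = {}   # ip -> (user_id, last_seen) for currently known users
    labels = []   # user id assigned to each entry, in input order
    next_id = 0

    for ip, ts in entries:
        record = active.get(ip)
        if record is not None and ts - record[1] <= timeout:
            # Same IP, still within the inactivity window: same user.
            user_id = record[0]
        else:
            # Unknown IP, or the previous user on this IP went inactive.
            user_id = next_id
            next_id += 1
        active[ip] = (user_id, ts)
        labels.append(user_id)

    return labels


# Made-up log entries, for illustration only.
log = [
    ("10.0.0.1", 0),
    ("10.0.0.2", 60),
    ("10.0.0.1", 120),
    ("10.0.0.1", 1921),  # 1801 s after the previous hit from this IP
]
print(identify_users(log))  # [0, 1, 0, 2]
```

The last entry falls just outside the assumed 30-minute window, so it is counted as a new user even though the IP address is unchanged.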