DocumentCode :
597729
Title :
A novel Preprocessing Mixed Ancestral Graph technique for session construction
Author :
Chitra, S. ; Kalpana, B.
Author_Institution :
Comput. Sci. Gov. Arts Coll. (Autonomous), Coimbatore, India
fYear :
2013
fDate :
4-6 Jan. 2013
Firstpage :
1
Lastpage :
7
Abstract :
Web usage mining applies data mining techniques to discover trends and patterns in the browsing behavior of a website's visitors. Extracting navigation patterns makes it possible to trace how visitors browse and to design the structure of the website accordingly. This information is extracted from the log files, which are the only record of session information about the web pages. Raw log data is very noisy and unclear, so it must be preprocessed for web usage mining to be efficient. Preprocessing comprises three phases: data cleaning, user identification and session construction. Data cleaning removes the log entries created by users' unwanted views of images, graphics, multimedia and similar content. Session identification uses the time stamp details of the web pages, that is, the total time each user spends on each web page. It can also be done by noting the user IDs of those who visited the web page and traversed its links. A session is the time duration spent on the web page. Session construction is vital: numerous real-world problems can be modeled as traversals on a graph, and mining these traversals provides what the preprocessing phase requires. Existing works, however, have considered traversals on unweighted graphs. The proposed method constructs sessions as a Mixed Ancestral Graph (MAG) whose pages carry weights calculated with the BIC score. This will help site administrators find the pages that interest users and redesign their web pages. After each page is weighted according to browsing time, a MAG structure is constructed for each user session.
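The three preprocessing phases named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout (dicts with `user`, `time` and `url` keys), the 30-minute `SESSION_TIMEOUT`, the `NOISE_SUFFIXES` list and the `SAMPLE_LOG` entries are all assumptions made for illustration, and the sketch stops at per-page browsing-time weights without attempting the MAG construction or the BIC score.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed values for illustration; the abstract does not specify them.
SESSION_TIMEOUT = timedelta(minutes=30)
NOISE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico", ".mp4")


def clean(entries):
    """Data cleaning: drop requests for images, graphics and multimedia."""
    return [e for e in entries if not e["url"].lower().endswith(NOISE_SUFFIXES)]


def build_sessions(entries):
    """Group requests by user ID, then split each user's requests
    wherever the gap between two requests exceeds the timeout."""
    by_user = defaultdict(list)
    for e in sorted(entries, key=lambda e: e["time"]):
        by_user[e["user"]].append(e)

    sessions = []
    for user, reqs in by_user.items():
        current = [reqs[0]]
        for prev, nxt in zip(reqs, reqs[1:]):
            if nxt["time"] - prev["time"] > SESSION_TIMEOUT:
                sessions.append((user, current))
                current = []
            current.append(nxt)
        sessions.append((user, current))
    return sessions


def page_weights(session):
    """Weight each page by browsing time in seconds. The time spent on the
    last page of a session cannot be observed, so that page gets no weight."""
    weights = defaultdict(float)
    for cur, nxt in zip(session, session[1:]):
        weights[cur["url"]] += (nxt["time"] - cur["time"]).total_seconds()
    return dict(weights)


def _t(hhmm):
    return datetime.strptime(hhmm, "%H:%M")


# Made-up sample log, not data from the paper.
SAMPLE_LOG = [
    {"user": "u1", "time": _t("10:00"), "url": "/index.html"},
    {"user": "u1", "time": _t("10:00"), "url": "/logo.png"},
    {"user": "u1", "time": _t("10:02"), "url": "/products.html"},
    {"user": "u1", "time": _t("10:07"), "url": "/contact.html"},
    {"user": "u1", "time": _t("11:30"), "url": "/index.html"},
    {"user": "u2", "time": _t("10:01"), "url": "/index.html"},
]

for user, session in build_sessions(clean(SAMPLE_LOG)):
    print(user, [e["url"] for e in session], page_weights(session))
```

The weights produced this way would be the input to a weighted graph per session; how they enter the MAG and the BIC score is described only in the paper itself.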
Keywords :
Internet; Web sites; data mining; feature extraction; graph theory; system monitoring; BIC score; MAG structure; Web page redesign; Web usage mining process; Website; browsing nature patterns; browsing nature trends; data cleaning; data mining techniques; log file information extraction; mixed ancestral graph; navigation pattern extraction; normal log data; preprocessing mixed ancestral graph technique; preprocessing process; session construction; session information; time stamp details; unweighted graph; user id; user identification; Cleaning; Computers; Data mining; Web pages; Web servers; Mixed Ancestral Graph (MAG); Preprocessing; Robots Cleaning; Session Construction; Web Usage Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Communication and Informatics (ICCCI), 2013 International Conference on
Conference_Location :
Coimbatore
Print_ISBN :
978-1-4673-2906-4
Type :
conf
DOI :
10.1109/ICCCI.2013.6466161
Filename :
6466161