DocumentCode :
3419803
Title :
Optimized data preprocessing technology for web log mining
Author :
Zheng, Ling ; Gui, Hui ; Li, Feng
Author_Institution :
Sch. of Control & Comput. Eng., North China Electr. Power Univ., Beijing, China
Volume :
1
fYear :
2010
fDate :
25-27 June 2010
Abstract :
To address shortcomings of traditional data preprocessing for web log mining, this article applies an improved preprocessing technique. At the user identification stage, it adopts an identification strategy based on the referring web page, which is more effective than the traditional strategy based on web site topology. At the session identification stage, it introduces a strategy that combines a fixed a priori threshold with session reconstruction: an initial session set is first built using the fixed a priori threshold, and this set is then optimized through session reconstruction. Experiments show that the improved preprocessing technique enhances the quality of the preprocessing results.
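The fixed-threshold step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the threshold is applied, so several things here are assumptions: that the threshold bounds the gap between consecutive requests of one user, the 30-minute default, and the hypothetical function name `split_sessions`. The session reconstruction step is not shown because the abstract gives no details of it.

```python
from datetime import datetime, timedelta


def split_sessions(timestamps, max_gap=timedelta(minutes=30)):
    """Group one user's request times into sessions.

    Assumption: a new session starts whenever the gap to the previous
    request exceeds max_gap. The 30-minute default is illustrative and
    is not taken from the paper.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= max_gap:
            # Gap is within the threshold: extend the current session.
            sessions[-1].append(ts)
        else:
            # First request, or gap too large: open a new session.
            sessions.append([ts])
    return sessions


if __name__ == "__main__":
    start = datetime(2010, 6, 25, 10, 0)
    requests = [start, start + timedelta(minutes=10), start + timedelta(minutes=60)]
    for number, session in enumerate(split_sessions(requests), start=1):
        print(number, [t.strftime("%H:%M") for t in session])
```

In this sketch the initial session set is just a list of lists of timestamps; a reconstruction pass of the kind the abstract describes would take that set as its input.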
Keywords :
Internet; data mining; Web log mining; fixed priori threshold; optimized data preprocessing technology; session identification; session reconstruction; user identification; Cleaning; Control engineering computing; Data engineering; Data preprocessing; Design engineering; Power engineering and energy; Power engineering computing; Topology; Web server; session; threshold
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Design and Applications (ICCDA), 2010 International Conference on
Conference_Location :
Qinhuangdao
Print_ISBN :
978-1-4244-7164-5
Electronic_ISBN :
978-1-4244-7164-5
Type :
conf
DOI :
10.1109/ICCDA.2010.5540924
Filename :
5540924