Title :
Study of data mining based on Apriori algorithm
Author_Institution :
Zhejiang Sci-Tech Univ., Hangzhou, China
Abstract :
Based on how Internet users currently search for information, this article presents preprocessing steps and log mining algorithms for both non-registered and registered users. It focuses on the user access patterns produced by log preprocessing and on the frequent access sets generated from them. It mainly studies algorithms for personalized information recommendation to web site users and compares the existing personalized recommendation algorithms. Because Internet user logs are very large and contain much useless information, this article improves the existing Apriori algorithm: it applies a divide-and-conquer strategy together with a hash technique to compress the candidate sets and reduce the number of database scans. This addresses two problems of the original association rule mining algorithm, namely that it generates relatively large frequent sets and must scan the database repeatedly.
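As an illustration of how a hash technique can compress Apriori candidate sets, the following is a minimal Python sketch of the well-known hash-bucket pruning idea for candidate 2-itemsets. It is not the article's own algorithm, which the abstract does not spell out. The function name, the `num_buckets` parameter, the CRC32 bucket function, and the sample page names are all assumptions made for this example, and items are assumed to be strings.

```python
import zlib
from collections import Counter
from itertools import combinations


def frequent_pairs_with_hash_pruning(transactions, min_support, num_buckets=7):
    """Return {pair: support_count} for the frequent 2-itemsets.

    Hypothetical helper for illustration only. Pass 1 counts single items
    and hashes every pair of each transaction into a small bucket table.
    Pass 2 counts only pairs whose items are both frequent and whose
    bucket total reached min_support.
    """
    def bucket(pair):
        # Deterministic bucket index (assumed hash function: CRC32).
        return zlib.crc32("|".join(pair).encode("utf-8")) % num_buckets

    item_counts = Counter()
    bucket_counts = [0] * num_buckets

    # Pass 1: count items and accumulate pair counts per hash bucket.
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket(pair)] += 1

    frequent_items = {i for i, c in item_counts.items() if c >= min_support}

    # Pass 2: a bucket total is an upper bound on the support of every pair
    # hashed into it, so pairs in a light bucket can be skipped safely.
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction) & frequent_items)
        for pair in combinations(items, 2):
            if bucket_counts[bucket(pair)] >= min_support:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}


if __name__ == "__main__":
    # Each transaction stands for the pages visited in one user session.
    sessions = [
        ["home", "news", "sports"],
        ["home", "news"],
        ["home", "sports"],
        ["news", "weather"],
        ["home", "news", "weather"],
    ]
    print(frequent_pairs_with_hash_pruning(sessions, min_support=3))
```

The pruning never discards a frequent pair, because a bucket total can only overestimate the support of the pairs that fall into it. A larger bucket table prunes more candidates at the cost of more memory.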
Keywords :
Internet; data mining; Apriori Algorithm; Hash technology; association rules; log mining; personalized information recommendation; personalized recommendation; Algorithm design and analysis; Software algorithms; Transaction databases; Web Data Mining; Web Personalized Information Recommendation;
Conference_Title :
Software Technology and Engineering (ICSTE), 2010 2nd International Conference on
Conference_Location :
San Juan, PR
Print_ISBN :
978-1-4244-8667-0
Electronic_ISBN :
978-1-4244-8666-3
DOI :
10.1109/ICSTE.2010.5608780