Regional Information Center for Science and Technology - Mining and Predicting Duplication over Peer-to-Peer Query Streams

DocumentCode :

3261050

Title :

Mining and Predicting Duplication over Peer-to-Peer Query Streams

Author :

Meng, Shicong ; Shao, Yifeng ; Shi, Cong ; Han, Dingyi ; Yu, Yong

Author_Institution :

Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ.

fYear :

2006

fDate :

Dec. 2006

Firstpage :

648

Lastpage :

652

Abstract :

Many previous works on mining user queries in peer-to-peer systems focused on the distribution of query contents. However, little has been done toward a better understanding of the time series distribution of these queries, which is vital for system performance. To remedy this situation, this paper mines query streams by using automatic time series analysis to evaluate different linear models (Box-Jenkins models and some simple windowed-mean models) for predicting the number of duplicated queries from 10 minutes to 2 hours into the future. Both the predictive power and the computational costs of these models are evaluated over 318,942,450 real-world Gnutella queries collected over 3 months. We find that the number of duplicated queries is consistently predictable. Simple, practical models like AR perform well on prediction.
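
The two model families named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the AR order of 1, the least-squares fit, and the sample counts are all assumptions made here for illustration, and the abstract does not state which model orders or window sizes the paper used.

```python
from statistics import mean


def windowed_mean_forecast(series, window):
    """Predict the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return mean(series[-window:])


def fit_ar1(series):
    """Least-squares estimate of (mu, phi) in x_t - mu = phi * (x_{t-1} - mu) + noise."""
    mu = mean(series)
    centered = [x - mu for x in series]
    num = sum(a * b for a, b in zip(centered[1:], centered[:-1]))
    den = sum(a * a for a in centered[:-1])
    phi = num / den if den else 0.0  # a constant series carries no AR signal
    return mu, phi


def ar1_forecast(series, steps=1):
    """Forecast `steps` intervals ahead with the fitted AR(1) model."""
    mu, phi = fit_ar1(series)
    return mu + (phi ** steps) * (series[-1] - mu)


# Hypothetical duplicated-query counts per 10-minute interval (not data from the paper).
counts = [120, 130, 125, 140, 135, 150, 145, 160]
print(windowed_mean_forecast(counts, window=4))
print(ar1_forecast(counts, steps=1))    # 10 minutes ahead
print(ar1_forecast(counts, steps=12))   # 2 hours ahead at 10-minute resolution
```

With 10-minute intervals, `steps=1` through `steps=12` spans the 10-minute to 2-hour horizon mentioned in the abstract. As the horizon grows, the AR(1) forecast decays toward the series mean whenever the fitted coefficient has magnitude below 1.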

Keywords :

data mining; peer-to-peer computing; query processing; replicated databases; time series; automatic time series analysis; duplicated queries; peer-to-peer query streams; predicting duplication; Computational efficiency; Crawlers; Knowledge management; Load management; Power system modeling; Predictive models; System performance

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Data Mining Workshops, 2006. ICDM Workshops 2006. Sixth IEEE International Conference on

Conference_Location :

Hong Kong

Print_ISBN :

0-7695-2702-7

Type :

conf

DOI :

10.1109/ICDMW.2006.109

Filename :

4063705

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3261050