DocumentCode :
2506884
Title :
SWAT: hierarchical stream summarization in large networks
Author :
Bulut, Ahmet ; Singh, Ambuj K.
Author_Institution :
Dept. of Comput. Sci., California Univ., Santa Barbara, CA, USA
fYear :
2003
fDate :
5-8 March 2003
Firstpage :
303
Lastpage :
314
Abstract :
The problem of statistics and aggregate maintenance over data streams has gained popularity in recent years, especially in telecommunications network monitoring, trend-related analysis, Web-click streams, stock tickers, and other time-variant data. The amount of data generated in such applications can become too large to store or, if stored, too large to scan multiple times. We consider queries over data streams that are biased towards the more recent values. We develop a technique that summarizes a dynamic stream incrementally at multiple resolutions. This approximation can be used to answer point queries, range queries, and inner product queries. Moreover, the precision of answers can be changed adaptively by a client. Later, we extend the above technique to work in a distributed setting, specifically in a large network where a central site summarizes the stream and clients ask queries. We minimize the message overhead by deciding what and where to replicate using an adaptive replication scheme. We maintain a hierarchy of approximations that change adaptively based on the query and update rates. We show experimentally that our technique performs better than existing techniques: up to 50 times better in terms of approximation quality, up to four orders of magnitude better in response time, and up to five times better in terms of message complexity.
Keywords :
approximation theory; communication complexity; query processing; telecommunication networks; very large databases; SWAT; Web-click stream; adaptive replication scheme; approximation quality; hierarchical data stream summarization; inner product query; message complexity; point query; range query; telecommunications network monitoring; time-variant data; trend-related analysis; Aggregates; Computer science; Computerized monitoring; Data processing; Delay; Intelligent networks; Statistical analysis; Switches; Telecommunication switching; Time factors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Engineering, 2003. Proceedings. 19th International Conference on
Print_ISBN :
0-7803-7665-X
Type :
conf
DOI :
10.1109/ICDE.2003.1260801
Filename :
1260801