Title :
Weighted random sampling based hierarchical amnesic synopses for data streams
Author :
Chen Hua-Hui ; Liao Kang-Li
Author_Institution :
Coll. of Inf. Sci. & Eng., Ningbo Univ., Ningbo, China
Abstract :
Dynamically maintaining a synopsis structure over a data stream is vital for a variety of streaming applications, such as approximate query processing and data mining. In many cases, the significance of a data item in a stream decays with age: the item may convey critical information at first but becomes less and less important as time goes by, until it is eventually useless. This characteristic is termed amnesic. Random sampling is often used to construct synopses for streaming data. This paper proposes Weighted Random Sampling based Hierarchical Amnesic Synopses, which incorporate the amnesic characteristic of data streams into synopsis generation. Construction methods for weighted random sampling with and without replacement are discussed. We experimentally evaluate the proposed synopsis structure.
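The abstract does not give the construction itself, so the following is only a rough illustration of the general idea, not the paper's method. It sketches the well-known key-based weighted reservoir sampling without replacement (Efraimidis and Spirakis), assuming an exponential weight exp(decay * t) on arrival time t so that older items become relatively less likely to stay in the sample. The function name `amnesic_reservoir`, the `decay` parameter, and the exponential weighting are all assumptions made for this example.

```python
import heapq
import math
import random


def amnesic_reservoir(stream, k, decay=0.05, rng=None):
    """Keep k items from `stream`, an iterable of (timestamp, item) pairs.

    Assumption for this sketch: an item arriving at time t has weight
    w = exp(decay * t), so its weight relative to the newest item
    shrinks exponentially with age.
    """
    rng = rng or random.Random()
    heap = []  # min-heap of (key, timestamp, item); smallest key is evicted first
    for t, item in stream:
        u = rng.random()
        while u == 0.0:  # random() may return 0.0, which log() cannot take
            u = rng.random()
        # The classic key is u ** (1 / w). Ranking by log(w) - log(-log(u))
        # gives the same order and avoids overflow for large timestamps.
        key = decay * t - math.log(-math.log(u))
        entry = (key, t, item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]


# Usage: sample 10 of 1000 items; recent items should dominate the sample.
sample = amnesic_reservoir(((t, t) for t in range(1000)), k=10,
                           rng=random.Random(0))
```

A with-replacement variant could, under the same assumptions, run k independent reservoirs of size 1 over the same stream.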
Keywords :
data handling; data mining; random processes; sampling methods; data streams; hierarchical amnesic synopses; weighted random sampling; Additives; Complexity theory; Heuristic algorithms; Maintenance engineering; Presses; Reservoirs; amnesic; sampling; synopses
Conference_Title :
Computer Science and Education (ICCSE), 2010 5th International Conference on
Conference_Location :
Hefei
Print_ISBN :
978-1-4244-6002-1
DOI :
10.1109/ICCSE.2010.5593801