DocumentCode :
2551809
Title :
Study of sampling techniques and algorithms in data stream environments
Author :
Hu, Wenyu ; Zhang, Baili
Author_Institution :
Dept. of Comput. & Inf. Sci., Fujian Univ. of Technol., Fuzhou, China
fYear :
2012
fDate :
29-31 May 2012
Firstpage :
1028
Lastpage :
1034
Abstract :
Sampling is the most versatile approximation technique available and remains one of the most powerful methods for building a one-pass synopsis of a data set in a streaming environment. This detailed review presents a taxonomic framework for sampling algorithms, and discusses and compares representative sampling algorithms. Because uniform sampling has limitations in some applications, the importance of using biased sampling methods in those scenarios is discussed at length. We then survey the application and development of sampling techniques, especially traditional sampling techniques, in the data stream model. Finally, we discuss the research challenges and future directions of the sampling problem in the context of data streams.
Keywords :
data mining; data structures; sampling methods; approximation technique; biased sampling method; data stream mining; data structure; one-pass synopsis; representative sampling algorithm; taxonomic frame; Algorithm design and analysis; Approximation algorithms; Approximation methods; Classification algorithms; Data mining; Heuristic algorithms; Reservoirs; biased sampling; data stream mining; data-stream sampling; synopsis data structure; uniform sampling;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
Conference_Location :
Sichuan
Print_ISBN :
978-1-4673-0025-4
Type :
conf
DOI :
10.1109/FSKD.2012.6234278
Filename :
6234278