DocumentCode
2551809
Title
Study of sampling techniques and algorithms in data stream environments
Author
Hu, Wenyu ; Zhang, Baili
Author_Institution
Dept. of Comput. & Inf. Sci., Fujian Univ. of Technol., Fuzhou, China
fYear
2012
fDate
29-31 May 2012
Firstpage
1028
Lastpage
1034
Abstract
Sampling is the most versatile approximation technique available and remains one of the most powerful methods for building a one-pass synopsis of a data set in a streaming environment. This detailed review presents a taxonomic framework for sampling algorithms, and discusses and compares representative algorithms. Because uniform sampling has limitations in some applications, the importance of biased sampling methods in those scenarios is discussed in depth. We then survey the application and development of sampling techniques, especially traditional sampling techniques, in the data stream model. Finally, we discuss the research challenges and future directions of the sampling problem in the context of data streams.
Keywords
data mining; data structures; sampling methods; approximation technique; biased sampling method; data stream mining; data structure; one-pass synopsis; representative sampling algorithm; taxonomic frame; Algorithm design and analysis; Approximation algorithms; Approximation methods; Classification algorithms; Data mining; Heuristic algorithms; Reservoirs; biased sampling; data stream mining; data-stream sampling; synopsis data structure; uniform sampling;
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
Conference_Location
Sichuan
Print_ISBN
978-1-4673-0025-4
Type
conf
DOI
10.1109/FSKD.2012.6234278
Filename
6234278