DocumentCode :
3717389
Title :
Finding banded patterns in big data using sampling
Author :
Fatimah B Abdullahi;Frans Coenen;Russell Martin
Author_Institution :
Department of Computer Science, University of Liverpool, Ashton Street, Liverpool L69 3BX, United Kingdom
fYear :
2015
Firstpage :
2233
Lastpage :
2242
Abstract :
A mechanism for identifying bandings in large "zero-one" N-dimensional data sets, using a sampling technique, is presented. The challenge in identifying bandings in data is the large number of potential permutations that need to be considered. To circumvent this, a banding score mechanism is proposed that avoids the need to consider large numbers of permutations. This has been incorporated into a proposed banded pattern mining algorithm, the Exact ND Banded Pattern Mining (END BPM) algorithm. Although this operates well on reasonably sized data sets, there is still a challenge with respect to large N-dimensional data sets that cannot be held in primary storage. To this end, a sampling technique is also proposed. The approach is fully described and evaluated using the GB cattle movement database, a "real life" database that records all movements of cattle in Great Britain.
Keywords :
"Indexes","Sparse matrices","Big data","Data mining","Cows","Context"
Publisher :
IEEE
Conference_Title :
Big Data (Big Data), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/BigData.2015.7364012
Filename :
7364012