DocumentCode :
249374
Title :
Reducing the Search Space for Big Data Mining for Interesting Patterns from Uncertain Data
Author :
Leung, Carson Kai-Sang ; MacKinnon, Richard Kyle ; Jiang, Fan
Author_Institution :
Dept. of Comput. Sci., Univ. of Manitoba, Winnipeg, MB, Canada
fYear :
2014
fDate :
June 27 2014-July 2 2014
Firstpage :
315
Lastpage :
322
Abstract :
Many existing data mining algorithms search for interesting patterns in transactional databases of precise data. However, there are situations in which data are uncertain. Items in each transaction of these probabilistic databases of uncertain data are usually associated with existential probabilities, which express the likelihood that these items are present in the transaction. Compared with mining from precise data, the search space for mining from uncertain data is much larger due to the presence of the existential probabilities. This problem worsens as we move into the era of Big data. Furthermore, in many real-life applications, users may be interested in only a tiny portion of this large search space. Without opportunities for users to express which patterns are interesting, many existing data mining algorithms return numerous patterns, of which only some are interesting. In this paper, we propose an algorithm that (i) allows users to express their interest in terms of constraints and (ii) uses the MapReduce model to mine uncertain Big data for frequent patterns that satisfy the user-specified constraints. By exploiting properties of the constraints, our algorithm greatly reduces the search space for Big data mining of uncertain data, and returns only those patterns that are interesting to the users for Big data analytics.
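The idea in the abstract can be illustrated with a small in-memory sketch. It is not the authors' algorithm: it assumes the common expected-support definition for uncertain data (the sum, over transactions, of the product of the existential probabilities of the items in a pattern), and it assumes a simple item-level constraint given as a set of allowed items. The names map_transaction, reduce_counts, mine, allowed, minsup and max_len are hypothetical, and the map and reduce steps are simulated with plain Python functions rather than run on a MapReduce cluster.

```python
from collections import defaultdict
from itertools import combinations
from math import prod


def map_transaction(transaction, allowed, max_len):
    """Map step (simulated): emit (itemset, existence probability) pairs.

    Items that violate the constraint are dropped before any itemset is
    enumerated, which is where the search space shrinks.
    """
    items = sorted(i for i in transaction if i in allowed)
    for k in range(1, max_len + 1):
        for itemset in combinations(items, k):
            # Assumes items occur independently within a transaction.
            yield itemset, prod(transaction[i] for i in itemset)


def reduce_counts(pairs, minsup):
    """Reduce step (simulated): sum probabilities into expected support."""
    totals = defaultdict(float)
    for itemset, p in pairs:
        totals[itemset] += p
    return {s: v for s, v in totals.items() if v >= minsup}


def mine(db, allowed, minsup, max_len=2):
    """Run the simulated map and reduce steps over the whole database."""
    pairs = (pair for t in db for pair in map_transaction(t, allowed, max_len))
    return reduce_counts(pairs, minsup)


# Toy uncertain database: each transaction maps item -> existential probability.
db = [
    {"a": 0.9, "b": 0.8, "c": 0.1},
    {"a": 0.5, "c": 0.7},
    {"a": 0.6, "b": 0.5, "d": 0.9},
]

result = mine(db, allowed={"a", "b"}, minsup=1.0)
strict = mine(db, allowed={"a", "b"}, minsup=1.5)
```

In this toy run, items c and d never reach the enumeration step because they fail the constraint, so only itemsets over a and b are counted. Raising minsup from 1.0 to 1.5 leaves only the single item a.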
Keywords :
Big Data; data analysis; data mining; Big Data analytics; MapReduce model; frequent patterns; search space reduction; uncertain Big Data mining; uncertain data; user-specified constraints; Big data; Computational modeling; Data mining; Data models; Databases; Partitioning algorithms; Program processors; Big data analytics; Big data models and algorithms; Big data search and mining; algorithms and programming techniques for Big data processing; constraints; frequent patterns; uncertain data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Big Data (BigData Congress), 2014 IEEE International Congress on
Conference_Location :
Anchorage, AK
Print_ISBN :
978-1-4799-5056-0
Type :
conf
DOI :
10.1109/BigData.Congress.2014.53
Filename :
6906796