DocumentCode
245801
Title
IOMRA - A High Efficiency Frequent Itemset Mining Algorithm Based on the MapReduce Computation Model
Author
Sheng-Hui Liu ; Shi-Jia Liu ; Shi-Xuan Chen ; Kun-Ming Yu
Author_Institution
Sch. of Software, Harbin Univ. of Sci. & Technol., Harbin, China
fYear
2014
fDate
19-21 Dec. 2014
Firstpage
1290
Lastpage
1295
Abstract
The goal of Frequent Itemset Mining (FIM) is to find the largest number of frequently occurring subsets in a large transaction database. In previous studies, multicore computing sharply decreased the execution time of the Apriori algorithm. When a data set exceeds the terabyte scale and a single host can no longer handle the number of operations required, the obvious solution is to connect many computers into a supercomputer to speed up execution. Several parallel Apriori algorithms based on the MapReduce framework have been proposed. However, with these algorithms memory is quickly exhausted and communication cost rises sharply, which greatly reduces execution efficiency. In this paper, we present an improved Apriori algorithm that uses the length of each transaction to determine the size of the maximum merge candidate itemsets. By reducing the production of low-frequency itemsets in the Map function, the algorithm alleviates memory exhaustion and greatly improves execution efficiency.
Keywords
data mining; parallel algorithms; FIM; IOMRA algorithm; Map function; MapReduce computation model; high efficiency frequent itemset mining algorithm; memory exhaustion; multicore computing; parallel Apriori algorithm; transaction database; Algorithm design and analysis; Computers; Data mining; Educational institutions; Itemsets; Memory management; Parallel processing; Apriori; Frequent Itemset Mining; Hadoop; MapReduce
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Science and Engineering (CSE), 2014 IEEE 17th International Conference on
Conference_Location
Chengdu
Print_ISBN
978-1-4799-7980-6
Type
conf
DOI
10.1109/CSE.2014.247
Filename
7023757