Abstract:
Data mining is an active research field that has been studied by many scientists and engineers for years. Nevertheless, mining huge data sets efficiently remains a very difficult problem, and many researchers are working on fast data mining technologies and methods for processing such data sets. The basic idea of quick sort is the divide and conquer method, which also reflects the idea of granular computing (GrC). The average time complexity of quick sort for an m-dimensional table containing n records was usually considered to be m×n×log n, since the average time complexity of quick sort for a one-dimensional array with n records is n×log n. However, we find that it is only n×(m+log n), not m×n×log n. Based on this finding, we assume that the divide and conquer method can be used to improve existing knowledge reduction algorithms in rough set theory and granular computing, which may be a good way to address the problem of huge data mining. In this paper, we present our research plan for huge data mining based on rough set theory and granular computing, together with our recent achievements.
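As a rough, back-of-the-envelope illustration of why the gap between the two bounds matters for huge data sets (the figures below are our own assumed values, not results reported in the paper): take a table with m = 10 attributes and n = 10^6 records, so that log n ≈ 20 (base 2). Then m×n×log n ≈ 10 × 10^6 × 20 = 2×10^8, while n×(m+log n) ≈ 10^6 × (10+20) = 3×10^7; that is, the refined estimate is smaller by a factor of about m×log n / (m+log n) ≈ 6.7, and the advantage grows with both m and n.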
Keywords:
computational complexity; data mining; rough set theory; sorting; divide and conquer method; granular computing; huge data mining; knowledge reduction algorithm; quick sort; time complexity; Algebra; Algorithm design and analysis; Computer science; Decision trees; Information entropy; Information science; Intelligent agent; Set theory; Telecommunication computing