DocumentCode
3579243
Title
Issues in data mining: A comprehensive survey
Author
Purwar, Archana ; Singh, Sandeep Kumar
Author_Institution
Department of Computer Science/Information Technology, Jaypee Institute of Information Technology, Noida, India
fYear
2014
Firstpage
1
Lastpage
6
Abstract
Data mining has achieved remarkable success in almost every domain, such as health care, wireless sensor networks, and social networks, with the development of its various algorithms. Every data mining algorithm, however, has inherent limitations. The application domain and the actual data together heavily influence both the choice and the performance of any data mining, machine learning, or statistical algorithm. This paper's contribution is to bring together a number of data mining issues, along with the metrics used to measure data quality and algorithm performance, in a single place. It elaborates the seven most vital issues in data mining: missing value imputation, feature selection, outlier detection, cluster analysis of high-dimensional data, imbalanced classes in classification, privacy of data, and mining from complex/distributed data. It not only presents these issues but also discusses their existing solutions. The survey also sheds light on limitations and research gaps for prospective researchers. These issues were identified after an extensive study of 50 papers, carefully selected to investigate core issues in data mining that still need to be addressed. This work also groups thirty data quality and algorithm performance metrics used in the literature into three categories. This comprehensive understanding of the issues and metrics can be valuable for beginners in data mining research. The survey shows that the most frequently used algorithm performance measures are accuracy and time complexity.
Keywords
Data mining; Imbalance classification; Missing value imputation (MVI); clustering; privacy data mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Computing Research (ICCIC), 2014 IEEE International Conference on
Print_ISBN
978-1-4799-3974-9
Type
conf
DOI
10.1109/ICCIC.2014.7238447
Filename
7238447