Title :
Data quality improvement using fuzzy association rules
Author :
Alizamini, Fatemeh Ghorbanpour ; Pedram, Mir Mohsen ; Alishahi, Mohammad ; Badie, Kambiz
Author_Institution :
Comput. Eng. Dept., Islamic Azad Univ., Tehran, Iran
Abstract :
Organizations and companies base their activities and decisions on data and on the information obtained from data analysis. Data quality plays a crucial role in data analysis, because incorrect data leads to wrong decisions. Improving data quality manually is very difficult and in many cases impossible: data quality is a complicated, non-structured concept, the data refinement process cannot be done without the help of professional domain experts, and detecting and correcting errors requires thorough knowledge of the data's domain. This motivates the use of (semi-)automatic methods to find and resolve data defects and errors. Because data mining methods are designed to discover interesting patterns in datasets, they can be used efficiently to improve different dimensions of data quality. In this paper, a new method is presented to measure the accuracy dimension of data quality using fuzzy association rules. Experimental results show that the proposed method is effective at finding incorrect values in datasets.
Keywords :
data analysis; data mining; error correction; error detection; fuzzy set theory; quality management; data quality improvement; data refinement process; fuzzy association rule; professional domain expert; Accuracy; Association rules; Classification algorithms; Databases; Fuzzy sets; Runtime; data quality; fuzzy association rules
Conference_Title :
Electronics and Information Engineering (ICEIE), 2010 International Conference On
Conference_Location :
Kyoto
Print_ISBN :
978-1-4244-7679-4
Electronic_ISBN :
978-1-4244-7681-7
DOI :
10.1109/ICEIE.2010.5559676