DocumentCode
424335
Title
The equivalence theory based on fuzzy theory
Author
Li, Hua-Yang ; Liu, Yu-Bao ; Li, You-Kui ; Gui, Hao
Author_Institution
Sch. of Software, Jiangxi Univ. of Finance & Econ., China
Volume
2
fYear
2004
fDate
26-29 Aug. 2004
Firstpage
1272
Abstract
Data cleaning is an important task in building data warehouses and in data mining. Equivalence theory concerns how to decide whether two records are equivalent, that is, duplicates; it is a central problem in data cleaning. The paper proposes a new equivalence theory and a concept of equivalence degree based on fuzzy theory, and puts forward a corresponding method for calculating equivalence degrees. On the basis of this theory, the keyword "report" is introduced and a method for clustering and handling duplicated records is presented. Compared with traditional equivalence theory, the new one makes it more convenient to generate rules and to cluster and handle duplicated records, and it reduces the time users spend dealing with single LOG files. In addition, the paper puts forward an interactive method based on clustering, which saves users much labor.
Keywords
data handling; data mining; data warehouses; fuzzy set theory; pattern clustering; data cleaning; data clustering; data warehouse; equivalence theory; fuzzy theory; Cleaning; Containers; Data mining; Data warehouses; Educational institutions; Electronic mail; Finance; Forward contracts; Graphical user interfaces; Tiles
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
Print_ISBN
0-7803-8403-2
Type
conf
DOI
10.1109/ICMLC.2004.1382388
Filename
1382388