• DocumentCode
    424335
  • Title
    The equivalence theory based on fuzzy theory
  • Author
    Li, Hua-Yang ; Liu, Yu-Bao ; Li, You-Kui ; Gui, Hao
  • Author_Institution
    Sch. of Software, Jiangxi Univ. of Finance & Econ., China
  • Volume
    2
  • fYear
    2004
  • fDate
    26-29 Aug. 2004
  • Firstpage
    1272
  • Abstract
    Data cleaning is an important task in building data warehouses and in data mining. Equivalence theory concerns how to define when two records are equivalent, or duplicated, which is an important problem in data cleaning. The paper presents a new equivalence theory and an equivalence degree concept based on fuzzy theory, and puts forward a corresponding method for calculating equivalence degrees. Moreover, on the basis of the equivalence theory, the key word "report" is introduced and a method of clustering and handling duplicated records is presented. Compared with traditional equivalence theory, the new one is more convenient for generating rules and for clustering and handling duplicated records, and it reduces the user's time spent dealing with single LOG files. In addition, the paper puts forward an interactive method based on clustering, which saves much of the users' labor.
  • Keywords
    data handling; data mining; data warehouses; fuzzy set theory; pattern clustering; data cleaning; data clustering; data warehouse; equivalence theory; fuzzy theory; Cleaning; Containers; Data mining; Data warehouses; Educational institutions; Electronic mail; Finance; Forward contracts; Graphical user interfaces; Tiles
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2004. Proceedings of 2004 International Conference on
  • Print_ISBN
    0-7803-8403-2
  • Type
    conf
  • DOI
    10.1109/ICMLC.2004.1382388
  • Filename
    1382388