DocumentCode :
1659290
Title :
Granulating data on non-scalar attribute values
Author :
Mazlack, Lawrence ; Coppock, Sarah
Author_Institution :
Dept. of Comput. Sci., Cincinnati Univ., OH, USA
Volume :
2
fYear :
2002
fDate :
2002
Firstpage :
944
Lastpage :
949
Abstract :
Data mining discovers interesting information in a data set. Mining incorporates different methods and considers different kinds of information. Granulation is an important aspect of mining. Data sets can be extremely large, with multiple kinds of data in high dimensionality. Without granulation, mining large data sets is often computationally infeasible, and the generated results may be overly fine-grained. Most available algorithms work with quantitative data. However, many data sets contain a mixture of quantitative and qualitative data. Our goal is to group records containing multiple data varieties: quantitative (discrete, continuous) and qualitative (ordinal, nominal). Grouping based on different quantitative metrics can be difficult. Incorporating various qualitative elements is not simple. There are partially successful strategies as well as several differential geometries. We expect to use a mixture of scalar methods and soft computing methods (rough sets, fuzzy sets), as well as methods using other metrics. To cluster whole records in a data set, it would be useful to have a general similarity metric or a set of integrated similarity metrics that would allow record-to-record similarity comparisons. There are methods to granulate data items belonging to a single attribute. Few methods exist that might meaningfully handle a combination of many data varieties in a single metric. This paper is an initial consideration of strategies for integrating multiple metrics in the task of granulating records.
Keywords :
computational complexity; data mining; pattern clustering; computational infeasibility; data granulation; data mining; differential geometries; fuzzy sets; nonscalar attribute values; quantitative metrics; rough sets; scalar methods; similarity metric; soft computing methods; Association rules; Clustering algorithms; Computer science; Data mining; Fuzzy sets; Geometry; Partitioning algorithms; Purification; Rough sets
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems, 2002. FUZZ-IEEE'02. Proceedings of the 2002 IEEE International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
0-7803-7280-8
Type :
conf
DOI :
10.1109/FUZZ.2002.1006631
Filename :
1006631