DocumentCode
1659290
Title
Granulating data on non-scalar attribute values
Author
Mazlack, Lawrence ; Coppock, Sarah
Author_Institution
Dept. of Comput. Sci., Cincinnati Univ., OH, USA
Volume
2
fYear
2002
fDate
2002
Firstpage
944
Lastpage
949
Abstract
Data mining discovers interesting information from a data set. Mining incorporates different methods and considers different kinds of information. Granulation is an important aspect of mining. The data sets can be extremely large with multiple kinds of data in high dimensionality. Without granulation, mining large data sets is often computationally infeasible, and the generated results may be overly fine-grained. Most available algorithms work with quantitative data. However, many data sets contain a mixture of quantitative and qualitative data. Our goal is to group records containing multiple data varieties: quantitative (discrete, continuous) and qualitative (ordinal, nominal). Grouping based on different quantitative metrics can be difficult. Incorporating various qualitative elements is not simple. There are partially successful strategies as well as several differential geometries. We expect to use a mixture of scalar methods and soft computing methods (rough sets, fuzzy sets), as well as methods using other metrics. To cluster whole records in a data set, it would be useful to have a general similarity metric or a set of integrated similarity metrics that would allow record-to-record similarity comparisons. There are methods to granulate data items belonging to a single attribute. Few methods exist that might meaningfully handle a combination of many data varieties in a single metric. This paper is an initial consideration of strategies for integrating multiple metrics in the task of granulating records.
Keywords
computational complexity; data mining; pattern clustering; computational infeasibility; data granulation; data mining; differential geometries; fuzzy sets; nonscalar attribute values; quantitative metrics; rough sets; scalar methods; similarity metric; soft computing methods; Association rules; Clustering algorithms; Computer science; Data mining; Fuzzy sets; Geometry; Partitioning algorithms; Purification; Rough sets
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems, 2002. FUZZ-IEEE'02. Proceedings of the 2002 IEEE International Conference on
Conference_Location
Honolulu, HI
Print_ISBN
0-7803-7280-8
Type
conf
DOI
10.1109/FUZZ.2002.1006631
Filename
1006631