DocumentCode :
3605476
Title :
Clustering Data of Mixed Categorical and Numerical Type With Unsupervised Feature Learning
Author :
Dao Lam ; Mingzhen Wei ; Donald Wunsch
Author_Institution :
Dept. of Electr. & Comput. Eng., Missouri Univ. of Sci. & Technol., Rolla, MO, USA
Volume :
3
fYear :
2015
fDate :
2015
Firstpage :
1605
Lastpage :
1613
Abstract :
Mixed-type categorical and numerical data are a challenge in many applications. This general area of mixed-type data is among the frontier areas, where computational intelligence approaches are often brittle compared with the capabilities of living creatures. In this paper, unsupervised feature learning (UFL) is applied to the mixed-type data to achieve a sparse representation, which makes it easier for clustering algorithms to separate the data. Unlike other UFL methods that work with homogeneous data, such as image and video data, the presented UFL works with the mixed-type data using fuzzy adaptive resonance theory (ART). UFL with fuzzy ART (UFLA) obtains a better clustering result by removing the differences in treating categorical and numeric features. The advantages of doing this are demonstrated with several real-world data sets with ground truth, including heart disease, teaching assistant evaluation, and credit approval. The approach is also demonstrated on noisy, mixed-type petroleum industry data. UFLA is compared with several alternative methods. To the best of our knowledge, this is the first time UFL has been extended to accomplish the fusion of mixed data types.
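The abstract names fuzzy ART as the mechanism that lets categorical and numeric features be treated alike. As a rough illustration only, the sketch below runs a textbook fuzzy ART pass (complement coding, choice function, vigilance test, weight update) over a tiny mixed-type table. It is not the authors' UFLA pipeline. The helper names, the one-hot plus min-max encoding, the toy rows, and the parameter values (`rho`, `alpha`, `beta`) are all assumptions made for this example.

```python
def encode_mixed(rows, numeric_idx, categorical_idx):
    """Scale numeric columns to [0, 1] and one-hot encode categorical ones.

    The encoding is an assumption for this sketch, not taken from the paper.
    """
    lo = {i: min(r[i] for r in rows) for i in numeric_idx}
    hi = {i: max(r[i] for r in rows) for i in numeric_idx}
    levels = {i: sorted({r[i] for r in rows}) for i in categorical_idx}
    encoded = []
    for r in rows:
        vec = []
        for i in numeric_idx:
            span = hi[i] - lo[i]
            vec.append((r[i] - lo[i]) / span if span else 0.0)
        for i in categorical_idx:
            vec.extend(1.0 if r[i] == lv else 0.0 for lv in levels[i])
        encoded.append(vec)
    return encoded


def complement_code(vec):
    """Append 1 - v for every component, as fuzzy ART conventionally does."""
    return vec + [1.0 - v for v in vec]


def fuzzy_and(a, b):
    """Component-wise minimum (the fuzzy AND operator)."""
    return [min(x, y) for x, y in zip(a, b)]


def fuzzy_art(inputs, rho=0.6, alpha=0.001, beta=1.0):
    """Single pass of fuzzy ART; returns (labels, weights)."""
    weights, labels = [], []
    for vec in inputs:
        I = complement_code(vec)
        norm_I = sum(I)
        # Rank existing categories by the choice function |I ^ w| / (alpha + |w|).
        order = sorted(
            range(len(weights)),
            key=lambda j: sum(fuzzy_and(I, weights[j])) / (alpha + sum(weights[j])),
            reverse=True,
        )
        chosen = None
        for j in order:
            # Vigilance test: |I ^ w| / |I| must reach rho.
            if sum(fuzzy_and(I, weights[j])) / norm_I >= rho:
                chosen = j
                break
        if chosen is None:
            # No category resonates, so commit a new one initialised to I.
            weights.append(list(I))
            chosen = len(weights) - 1
        else:
            m = fuzzy_and(I, weights[chosen])
            weights[chosen] = [
                beta * x + (1.0 - beta) * w for x, w in zip(m, weights[chosen])
            ]
        labels.append(chosen)
    return labels, weights


# Hypothetical toy data: one numeric column and one categorical column.
rows = [(0.10, "a"), (0.15, "a"), (0.90, "b"), (0.95, "b")]
encoded = encode_mixed(rows, numeric_idx=[0], categorical_idx=[1])
labels, weights = fuzzy_art(encoded)
print(labels)
```

Because both column types end up as values in [0, 1] before complement coding, the same match and update rules apply to every feature, which is the property the abstract highlights. A higher `rho` produces more, tighter categories.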
Keywords :
ART neural nets; fuzzy neural nets; pattern clustering; unsupervised learning; UFL methods; UFLA; credit approval; fuzzy ART; fuzzy adaptive resonance theory; ground truth; heart disease; homogeneous data; mixed data type fusion; mixed-type categorical data clustering; mixed-type numerical data clustering; noisy-mixed-type petroleum industry data; real-world data sets; sparse representation; teaching assistant evaluation; unsupervised feature learning; Clustering algorithms; Diseases; Education; Heart; Neurons; Subspace constraints; clustering; mixed-type data
fLanguage :
English
Journal_Title :
IEEE Access
Publisher :
IEEE
ISSN :
2169-3536
Type :
jour
DOI :
10.1109/ACCESS.2015.2477216
Filename :
7244165