DocumentCode :
1798313
Title :
A new distance metric for unsupervised learning of categorical data
Author :
Hong Jia ; Yiu-ming Cheung
Author_Institution :
Dept. of Comput. Sci., Hong Kong Baptist Univ., Hong Kong, China
fYear :
2014
fDate :
6-11 July 2014
Firstpage :
1893
Lastpage :
1899
Abstract :
A distance metric is the basis of many learning algorithms, and its effectiveness usually has a significant influence on the learning results. Measuring distance for numerical data is generally a tractable task, but for categorical data sets it can be a nontrivial problem. This paper therefore presents a new distance metric for categorical data based on the characteristics of categorical values. Specifically, the distance between two values of one attribute is determined both by the frequency probabilities of these two values and by the values of other attributes that have high interdependency with that attribute. Promising experimental results on different real data sets show the effectiveness of the proposed distance metric.
Keywords :
data analysis; unsupervised learning; categorical data; categorical values; distance metric; frequency probabilities; Frequency measurement; Hamming distance; Indexes; Joints; Probability; Redundancy
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks (IJCNN), 2014 International Joint Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-6627-1
Type :
conf
DOI :
10.1109/IJCNN.2014.6889890
Filename :
6889890