Title :
Unsupervised classification via decision trees: an information-theoretic perspective
Author :
Karakos, Damianos ; Khudanpur, Sanjeev ; Eisner, Jason ; Priebe, Carey E.
Author_Institution :
Center for Language & Speech Processing, Johns Hopkins University, MD, USA
Abstract :
Integrated sensing and processing decision trees (ISPDT; Priebe et al., 2004) were introduced as a tool for supervised classification of high-dimensional data. In this paper, we consider the problem of unsupervised classification through a recursive construction of ISPDT, where at each internal node the data (i) are split into clusters and (ii) are transformed independently of other clusters, guided by some optimization objective. We show that maximizing information-theoretic quantities such as mutual information and α-divergences is theoretically justified for growing ISPDT, assuming that each data point is generated by a finite-memory random process given the class label. Furthermore, we present heuristics that perform the maximization greedily, and we demonstrate their effectiveness with empirical results from multispectral imaging.
Keywords :
decision trees; geophysics computing; greedy algorithms; image classification; image denoising; optimisation; recursive estimation; remote sensing; unsupervised learning; α-divergences; data clusters; finite-memory random process; greedy heuristics; information theory; integrated sensing and processing decision trees; maximization; multispectral imaging; mutual information; optimization; recursive construction; unsupervised classification; Classification tree analysis; Clustering algorithms; Decision trees; Mathematics; Multispectral imaging; Mutual information; Natural languages; Random processes; Speech processing; Statistics;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
Print_ISBN :
0-7803-8874-7
DOI :
10.1109/ICASSP.2005.1416495