Title :
A novel feature extraction algorithm
Author :
Ding, Shi-fei ; Shi, Zhong-zhi ; Wang, Yun-Cheng ; Li, Shu-Shan
Author_Institution :
Coll. of Inf. Sci. & Eng., Shandong Agric. Univ., Taian, China
Abstract :
Feature extraction or selection is one of the most important steps in pattern recognition, pattern classification, data mining, and machine learning. In this paper, we draw on information theory to propose a new concept, the probability information distance (PID), and prove that the PID satisfies the four axioms of a distance. The PID is therefore a distance measure and can be used to quantify the degree of variation between two random variables. We use the PID as the class separability criterion for information feature extraction and call it the PID criterion (PIDC). Based on the PIDC, we design a novel algorithm for information feature extraction. Unlike principal components analysis (PCA), correlation analysis, and similar methods, the proposed algorithm takes class information into account, and so it is a supervised feature extraction algorithm. The experimental results demonstrate that the algorithm is valid and reliable, and that it provides a new research approach for feature extraction, data mining, and pattern recognition.
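The abstract does not state the PID formula, so the following is only an illustrative sketch of the general idea of ranking features by an information-theoretic separability score between two classes. As a stand-in it uses the symmetrized cross entropy (Jeffreys divergence), which is non-negative, symmetric, and zero only for identical distributions, but which does not in general satisfy the triangle inequality and is therefore not the PID itself. The function names, the two-class setup, and the per-feature discrete distributions are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats for discrete distributions.

    Assumes p and q have equal length and that q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_divergence(p, q):
    """Symmetrized cross entropy (Jeffreys divergence): D(p||q) + D(q||p).

    Stand-in for a separability score; NOT the paper's PID, whose formula
    is not given in the abstract.
    """
    return kl_divergence(p, q) + kl_divergence(q, p)


def rank_features(dists_a, dists_b):
    """Return feature indices sorted from most to least class-separating.

    dists_a[i] and dists_b[i] are hypothetical class-conditional discrete
    distributions of feature i for class A and class B.
    """
    scores = [symmetric_divergence(p, q) for p, q in zip(dists_a, dists_b)]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Feature 0 looks the same in both classes; feature 1 differs strongly.
class_a = [[0.5, 0.5], [0.9, 0.1]]
class_b = [[0.5, 0.5], [0.1, 0.9]]
ranking = rank_features(class_a, class_b)
```

Here `ranking` places feature 1 ahead of feature 0, since only feature 1 has class-conditional distributions that differ. A supervised criterion of this kind uses the class labels directly, which is the contrast with PCA that the abstract draws.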
Keywords :
data mining; feature extraction; information theory; pattern classification; pattern recognition; PID criterion (PIDC); correlation analysis; cross entropy; information feature extraction; machine learning; new research approach; principal components analysis; probability information distance (PID); separability criterion; algorithm design and analysis; machine learning algorithms; random variables; symmetry
Conference_Title :
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location :
Guangzhou, China
Print_ISBN :
0-7803-9091-1
DOI :
10.1109/ICMLC.2005.1527230