Title :
Feature selection and gene clustering from gene expression data
Author :
Mitra, Pabitra ; Majumder, Dwijesh Dutta
Author_Institution :
Machine Intelligence Unit, Indian Stat. Inst., Kolkata, India
Abstract :
In this work we describe an algorithm for feature selection and gene clustering from high-dimensional gene expression data. The method measures the similarity between features (genes) and removes the redundancy among them. Because it requires no search, it is fast. A novel feature similarity measure, called the maximum information compression index, is used. The feature selection algorithm also obtains gene clusters in a multiscale fashion. The superiority of the algorithm, in terms of speed and performance, is established on a real-life molecular cancer classification dataset.
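The abstract names the similarity measure but does not define it. As a minimal sketch, assuming the definition commonly given for this index (the smallest eigenvalue of the 2x2 covariance matrix of two features, which is zero when they are linearly dependent), the pairwise measure could be computed as below. The function name and the use of NumPy are illustrative choices, not taken from the paper.

```python
import numpy as np


def max_info_compression_index(x, y):
    """Smallest eigenvalue of the 2x2 sample covariance matrix of x and y.

    Assumed definition (not stated in the abstract): a value of 0 means the
    two features are linearly dependent, and larger values mean less redundancy.
    """
    cov = np.cov(x, y)  # 2x2 sample covariance matrix (ddof=1)
    smallest = np.linalg.eigvalsh(cov)[0]  # eigvalsh returns eigenvalues in ascending order
    return max(0.0, float(smallest))  # clamp tiny negative round-off to zero
```

Under this assumed definition the measure is symmetric in its two arguments, so a redundancy-removal step would only need to evaluate each pair of genes once.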
Keywords :
biology computing; feature extraction; genetics; optimisation; pattern clustering; feature selection; feature similarity measure; gene clustering; gene expression data; maximum information compression index; molecular cancer classification dataset; Cancer; Clustering algorithms; Data mining; Entropy; Gene expression; Inference algorithms; Machine intelligence; Partitioning algorithms; Random variables; Reactive power;
Conference_Title :
Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on
Print_ISBN :
0-7695-2128-2
DOI :
10.1109/ICPR.2004.1334213