DocumentCode :
2369039
Title :
Frequent sub-structure-based approaches for classifying chemical compounds
Author :
Deshpande, Mukund ; Kuramochi, Michihiro ; Karypis, George
Author_Institution :
Dept. of Comput. Sci., Minnesota Univ., Minneapolis, MN, USA
fYear :
2003
fDate :
19-22 Nov. 2003
Firstpage :
35
Lastpage :
42
Abstract :
We study the problem of classifying chemical compound datasets. We present a substructure-based classification algorithm that decouples the substructure discovery process from the classification model construction and uses frequent subgraph discovery algorithms to find all topological and geometric substructures present in the dataset. The advantage of our approach is that, during classification model construction, all relevant substructures are available, allowing the classifier to intelligently select the most discriminating ones. Computational scalability is ensured by the use of highly efficient frequent subgraph discovery algorithms coupled with aggressive feature selection. Our experimental evaluation on eight different classification problems shows that our approach is computationally scalable and, on average, outperforms existing schemes by 10% to 35%.
Keywords :
chemical structure; graph theory; pattern classification; support vector machines; chemical compound dataset classification; geometric substructure; subgraph discovery algorithm; substructure discovery process; Biology computing; Chemical compounds; Classification algorithms; Computational intelligence; Computer displays; Computer science; Drugs; High temperature superconductors; Scalability; Solid modeling;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2003. ICDM 2003. Third IEEE International Conference on
Print_ISBN :
0-7695-1978-4
Type :
conf
DOI :
10.1109/ICDM.2003.1250900
Filename :
1250900