DocumentCode :
1921959
Title :
Data mining for building neural protein sequence classification systems with improved performance
Author :
Wang, Dianhui ; Lee, Nung Kion ; Dillon, Tharam S.
Author_Institution :
Dept. of Comput. Sci. & Comput. Eng., La Trobe Univ., Melbourne, VIC, Australia
Volume :
3
fYear :
2003
fDate :
20-24 July 2003
Firstpage :
1746
Abstract :
Traditionally, two protein sequences are classified into the same class if their feature patterns have high homology. These feature patterns were originally extracted by sequence alignment algorithms, which measure similarity between an unseen protein sequence and identified protein sequences. Neural network approaches, while reasonably accurate at classification, give biologists no useful information about the relationship between an unseen case and the classified items. In contrast, this paper uses a generalized radial basis function (GRBF) neural network architecture that generates fuzzy classification rules, which can be used for further knowledge discovery. The proposed techniques were evaluated on protein sequences from ten super-family classes downloaded from a public domain database, and the results compared favorably with other standard machine learning techniques.
Keywords :
data mining; learning (artificial intelligence); neural net architecture; pattern classification; proteins; radial basis function networks; GRBF; biologist; data mining; feature patterns; fuzzy classification rules; generalized radial basis function; identified protein sequence; machine learning; neural network architecture; neural protein sequence; sequence alignment algorithms; unseen protein sequence; Computer science; Data engineering; Data mining; Feature extraction; Fuzzy neural networks; High performance computing; Neural networks; Protein engineering; Protein sequence; Spatial databases;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Neural Networks, 2003. Proceedings of the International Joint Conference on
ISSN :
1098-7576
Print_ISBN :
0-7803-7898-9
Type :
conf
DOI :
10.1109/IJCNN.2003.1223671
Filename :
1223671