DocumentCode :
464283
Title :
Operon Prediction in Microbial Genomes Using Decision Tree Approach
Author :
Che, Dongsheng ; Zhao, Jizhen ; Cai, Liming ; Xu, Ying
Author_Institution :
Dept. of Comput. Sci., Georgia Univ., Athens, GA
fYear :
2007
fDate :
1-5 April 2007
Firstpage :
135
Lastpage :
142
Abstract :
Identifying operons at the whole genome scale of microbial organisms can facilitate deciphering of transcriptional regulation, biological networks and pathways. A number of computational methods, such as naive Bayesian and neural network approaches, have been applied to operon prediction on the whole genome sequences of a number of prokaryotic organisms, based on features known to be associated with operons, such as intergenic distance, microarray expression data, phylogenetic profiles and clusters of orthologous groups (COG). In this paper, we introduce a decision tree approach to predict operon structures using three effective types of genomic data: intergenic distance, gene order conservation and COG. We calculated and analyzed frequency distributions of each attribute of known operons and non-operons of Escherichia coli (E. coli) K12 and Bacillus subtilis (B. subtilis) 168, and constructed decision trees based on training examples to predict operons. The overall prediction accuracy is 94.1% for E. coli K12 and 91.0% for B. subtilis 168. We also applied four other classifiers (logistic regression, naive Bayesian, neural network and support vector machines) to both organisms. The results indicate that the decision tree approach is the best of these classifiers for operon prediction. The software package operonDT is freely available at http://www.cs.uga.edu/~che/OperonT
Keywords :
biology computing; decision trees; genetics; microorganisms; Operon prediction; biological networks; biological pathways; decision tree; gene order conservation; intergenic distance; microbial genomes; operon structures; operonDT; orthologous groups clusters; transcriptional regulation; Bayesian methods; Bioinformatics; Biology computing; Computer networks; Decision trees; Frequency; Genomics; Neural networks; Organisms; Phylogeny;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence and Bioinformatics and Computational Biology, 2007. CIBCB '07. IEEE Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0710-9
Type :
conf
DOI :
10.1109/CIBCB.2007.4221215
Filename :
4221215