Title : 
Benchmarking of gene prediction programs for metagenomic data
         
        
            Author : 
Yok, Non ; Rosen, Gail
         
        
            Author_Institution : 
Electr. & Comput. Eng. Dept., Drexel Univ., Philadelphia, PA, USA
         
        
        
            fDate : 
Aug. 31 2010-Sept. 4 2010
         
        
        
        
            Abstract : 
This manuscript presents the most rigorous benchmarking of gene annotation algorithms for metagenomic datasets to date. We compare three programs: GeneMark, MetaGeneAnnotator (MGA), and Orphelia. The comparison is based on their performance on simulated fragments from one hundred species of diverse lineages. We define four types of fragments: two come from the inter- and intra-coding regions, and the other two come from the gene edges. Hoff et al. used only 12 species in their comparison, which is too small to represent an environmental sample. Also, no previous study has separately examined fragments that contain gene edges as opposed to intra-coding regions. In general, the performance of all three programs improves as fragment length increases. In addition, intra-coding fragments show lower annotation error than gene-edge fragments in all of the programs. Overall, we found an upper-bound performance by combining all the methods.
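
The four fragment types can be illustrated with a small sketch. The abstract does not give the exact definitions, so everything here is an assumption made for illustration: the function name `classify_fragment`, the half-open coordinate convention, the label strings, the split of gene-edge fragments into a start-edge and an end-edge type, and the choice to ignore strand. It is not the authors' procedure.

```python
def classify_fragment(frag_start, frag_end, genes):
    """Label a fragment against annotated genes (hypothetical helper).

    Assumptions (not from the paper): coordinates are half-open
    intervals [start, end), strand is ignored, and `genes` is a list
    of (gene_start, gene_end) tuples on the same sequence.
    """
    # Fragment lies entirely inside one gene.
    for g_start, g_end in genes:
        if g_start <= frag_start and frag_end <= g_end:
            return "intra-coding"

    # Fragment spans a gene boundary; the first boundary found wins.
    for g_start, g_end in genes:
        if frag_start < g_start < frag_end:
            return "gene-start-edge"
        if frag_start < g_end < frag_end:
            return "gene-end-edge"

    # Fragment overlaps no gene at all.
    return "inter-coding"


# Toy annotation with two genes (made-up coordinates).
genes = [(100, 400), (600, 900)]
labels = [
    classify_fragment(150, 300, genes),
    classify_fragment(50, 150, genes),
    classify_fragment(350, 450, genes),
    classify_fragment(420, 580, genes),
]
print(labels)
```

A fragment that contains both a start and an end boundary gets a single label here; a real benchmark would need an explicit rule for that case.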
         
        
            Keywords : 
bioinformatics; genetics; genomics; GeneMark; MetaGeneAnnotator; Orphelia; benchmarking; gene annotation algorithms; gene edge fragments; gene prediction programs; intra-coding fragments; intracoding regions; metagenomic data; Benchmark testing; Bioinformatics; Encoding; Genomics; Hidden Markov models; Measurement uncertainty; Sensitivity; Algorithms; Benchmarking; Databases, Genetic; Metagenomics; Molecular Sequence Annotation; ROC Curve;
         
        
        
        
            Conference_Title : 
Engineering in Medicine and Biology Society (EMBC), 2010 Annual International Conference of the IEEE
         
        
            Conference_Location : 
Buenos Aires
         
        
        
            Print_ISBN : 
978-1-4244-4123-5
         
        
        
            DOI : 
10.1109/IEMBS.2010.5627744