DocumentCode :
653959
Title :
Automatic Outlier Detection for Genome Assembly Quality Assessment
Author :
Samak, Taghrid ; Egan, Renate ; Bushnell, Brian ; Gunter, Dan ; Copeland, Alex ; Wang, Zhong
Author_Institution :
Lawrence Berkeley Nat. Lab., Berkeley, CA, USA
fYear :
2013
fDate :
22-25 Oct. 2013
Firstpage :
45
Lastpage :
52
Abstract :
In this work we describe a method to automatically detect errors in de novo assembled genomes. The method extends a Bayesian assembly quality evaluation framework, ALE, which computes the likelihood of an assembly given a set of unassembled data. Starting from ALE output, this method applies outlier detection algorithms to identify the precise locations of assembly errors. We show results from a microbial genome with manually curated assembly errors. Our method detects all deletions, 82.3% of insertions, and 88.8% of single base substitutions. It was also able to detect an inversion error that spans more than 400 bases.
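The core idea in the abstract, flagging positions whose per-base scores are outliers, can be illustrated with a minimal sketch. This is not the authors' algorithm and does not read real ALE output: the input is assumed to be a plain list of per-position log-likelihood scores, and the function names, the median/MAD modified z-score, and the 3.5 threshold are all illustrative assumptions.

```python
from statistics import median


def flag_outlier_positions(scores, threshold=3.5):
    """Return indices whose score is unusually low.

    Uses a modified z-score based on the median and the median absolute
    deviation (MAD). This is a generic robust outlier test, chosen here
    for illustration only (assumption, not the paper's method).
    """
    med = median(scores)
    mad = median(abs(s - med) for s in scores)
    if mad == 0:
        # No spread in the data, so nothing can be called an outlier.
        return []
    flagged = []
    for i, s in enumerate(scores):
        z = 0.6745 * (s - med) / mad
        if z < -threshold:  # only unusually LOW likelihoods are suspicious
            flagged.append(i)
    return flagged


def merge_adjacent(positions, max_gap=1):
    """Group sorted flagged positions into (start, end) candidate regions."""
    regions = []
    for p in positions:
        if regions and p - regions[-1][1] <= max_gap:
            regions[-1][1] = p
        else:
            regions.append([p, p])
    return [tuple(r) for r in regions]


# Hypothetical per-position scores with a dip at positions 4 and 5.
scores = [-1.0, -1.1, -0.9, -1.0, -9.0, -8.5, -1.0, -1.2, -0.8, -1.0]
candidate_regions = merge_adjacent(flag_outlier_positions(scores))
```

Merging adjacent flagged positions into regions reflects that an error such as an inversion spans many bases, whereas a substitution would appear as a single-position region.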
Keywords :
Bayes methods; biology computing; genetics; genomics; security of data; ALE output; Bayesian assembly quality evaluation framework; assembly error location; automatic error detection; automatic outlier detection; de novo assembled genomes; genome assembly quality assessment; microbial genome; Assembly; Bioinformatics; Diseases; Genomics; Libraries; Lungs; Sensitivity; Genome Assembly Evaluation;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
eScience (eScience), 2013 IEEE 9th International Conference on
Conference_Location :
Beijing
Type :
conf
DOI :
10.1109/eScience.2013.49
Filename :
6683890