DocumentCode
3393939
Title
A novel LDA and PCA-based hierarchical scheme for metagenomic fragment binning
Author
Zheng, Hao ; Wu, Hongwei
Author_Institution
Dept. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA
fYear
2009
fDate
March 30 2009-April 2 2009
Firstpage
53
Lastpage
59
Abstract
Metagenomics studies microorganisms by directly extracting and cloning their DNA from the environment, without lab cultivation or isolation of individual genomes. Assembling metagenomic DNA fragments closely resembles the overlap-layout-consensus procedure used for isolated genomes, but adds a binning step that sorts scaffolds, contigs and unassembled reads into taxonomic groups. In this paper, we use oligonucleotide frequencies as features and develop a hierarchical scheme for the challenging task of binning short metagenomic fragments: principal component analysis (PCA) reduces the high dimensionality of the feature space, and linear discriminant analysis (LDA) is used for the local classifier design. In silico simulation results and comparisons with a non-hierarchical classifier are presented to demonstrate the effectiveness and performance of the proposed PCA- and LDA-based hierarchical scheme. The HIER package for this study is available upon request.
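As an illustration of the feature step named in the abstract, the following is a minimal sketch of how oligonucleotide (k-mer) frequencies might be computed for one fragment. It is not taken from the HIER package: the function name `kmer_frequencies`, the default `k=4`, and the choice to skip windows containing ambiguous bases are all assumptions made for this example, since the abstract does not state the oligonucleotide length or the preprocessing used.

```python
from itertools import product


def kmer_frequencies(seq, k=4):
    """Return normalized k-mer frequencies for a DNA fragment.

    Hypothetical helper for illustration only. The output is ordered
    lexicographically over the alphabet ACGT and has 4**k entries.
    """
    seq = seq.upper()
    # Fixed feature order: AAAA, AAAC, ..., TTTT (for k=4).
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)

    # Slide a window of length k over the fragment.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if window in counts:
            counts[window] += 1

    total = sum(counts.values())
    # An all-zero vector is returned when no valid window exists.
    return [counts[m] / total if total else 0.0 for m in kmers]


features = kmer_frequencies("ACGTACGTTTGACCA", k=4)
print(len(features))  # number of features is 4**k
```

The feature vector has 4^k entries, so its length grows quickly with k; this is the kind of high-dimensional feature space for which the abstract says PCA is applied before the LDA classifiers.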
Keywords
DNA; biology computing; genomics; microorganisms; principal component analysis; DNA cloning; HIER package; LDA based hierarchical scheme; PCA based hierarchical scheme; genomes; lab cultivation; linear discriminant analysis; metagenomic fragment binning; metagenomics; microorganisms; principal component analysis; Assembly; Bioinformatics; Cloning; DNA; Frequency; Genomics; Linear discriminant analysis; Microorganisms; Packaging; Principal component analysis;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence in Bioinformatics and Computational Biology, 2009. CIBCB '09. IEEE Symposium on
Conference_Location
Nashville, TN
Print_ISBN
978-1-4244-2756-7
Type
conf
DOI
10.1109/CIBCB.2009.4925707
Filename
4925707