Title :
A novel LDA and PCA-based hierarchical scheme for metagenomic fragment binning
Author :
Zheng, Hao ; Wu, Hongwei
Author_Institution :
Dept. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA
fDate :
March 30 2009-April 2 2009
Abstract :
Metagenomics is the study of microorganisms by directly extracting and cloning their DNA from the environment, without lab cultivation or isolation of individual genomes. Assembling metagenomic DNA fragments closely resembles the overlap-layout-consensus procedure for assembling isolated genomes, but adds a binning step that sorts scaffolds, contigs and unassembled reads into taxonomic groups. In this paper, we use oligonucleotide frequencies as features and develop a hierarchical scheme for the challenging task of binning short metagenomic fragments, in which principal component analysis (PCA) reduces the high dimensionality of the feature space and linear discriminant analysis (LDA) is used to design the local classifiers. In silico simulation results and comparisons with a non-hierarchical classifier are presented to demonstrate the effectiveness and performance of the proposed PCA- and LDA-based hierarchical scheme. The HIER package for this study is available upon request.
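As an illustration of the feature step mentioned in the abstract, the following is a minimal sketch of how oligonucleotide (k-mer) frequencies might be computed for a single fragment. It is not the HIER package's code: the function name `oligo_frequencies`, the default `k=4`, and the choice to skip windows containing ambiguous bases are all assumptions made for this example, and the abstract does not state which oligonucleotide lengths the authors used. The PCA and LDA stages are not shown.

```python
from itertools import product


def oligo_frequencies(fragment, k=4):
    """Return relative frequencies of all 4**k DNA k-mers in `fragment`.

    Hypothetical helper for illustration; k=4 is an assumed default.
    The output order is lexicographic over the alphabet "ACGT".
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = fragment.upper()
    total = 0
    # Slide a window of length k over the fragment, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # assumption: skip windows with ambiguous bases (e.g. N)
            counts[window] += 1
            total += 1
    if total == 0:
        # Fragment shorter than k, or no unambiguous window was found.
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


# Example: dinucleotide frequencies of a short fragment containing one N.
features = oligo_frequencies("ACGTACGTNACGT", k=2)
```

Each fragment then becomes a fixed-length vector (256 entries for k=4), which is the kind of high-dimensional feature space that a dimensionality reduction step such as PCA would operate on before classification.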
Keywords :
DNA; biology computing; genomics; microorganisms; principal component analysis; DNA cloning; HIER package; LDA based hierarchical scheme; PCA based hierarchical scheme; genomes; lab cultivation; linear discriminant analysis; metagenomic fragment binning; metagenomics; Assembly; Bioinformatics; Cloning; Frequency; Packaging
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology, 2009. CIBCB '09. IEEE Symposium on
Conference_Location :
Nashville, TN
Print_ISBN :
978-1-4244-2756-7
DOI :
10.1109/CIBCB.2009.4925707