• DocumentCode
    3393939
  • Title
    A novel LDA and PCA-based hierarchical scheme for metagenomic fragment binning
  • Author
    Zheng, Hao ; Wu, Hongwei
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Georgia Inst. of Technol., Atlanta, GA
  • fYear
    2009
  • fDate
    March 30 - April 2, 2009
  • Firstpage
    53
  • Lastpage
    59
  • Abstract
    Metagenomics studies microorganisms by extracting and cloning their DNA directly from the environment, without lab cultivation or isolation of individual genomes. Assembling metagenomic DNA fragments closely resembles the overlap-layout-consensus procedure for assembling isolated genomes, but it is augmented by an additional binning step that sorts scaffolds, contigs, and unassembled reads into taxonomic groups. In this paper, we use oligonucleotide frequencies as features and develop a hierarchical scheme for the challenging task of binning short metagenome fragments, in which principal component analysis (PCA) reduces the high dimensionality of the feature space and linear discriminant analysis (LDA) is used for the local classifier design. Simulation results and in silico comparisons with a non-hierarchical classifier are presented to demonstrate the effectiveness and performance of the proposed PCA- and LDA-based hierarchical scheme. The HIER package for this study is available upon request.
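    The feature-extraction step named in the abstract can be illustrated with a minimal sketch. It is not the HIER package's code: the function name `oligo_frequencies`, the default word length `k=4`, and the choice to skip windows containing ambiguous bases are all assumptions made here for illustration, since the abstract does not state them. The PCA and LDA stages are not shown.

```python
from itertools import product


def oligo_frequencies(seq, k=4):
    """Return the normalized k-mer (oligonucleotide) frequency vector of a DNA fragment.

    Hypothetical helper: the word length k and the handling of ambiguous
    bases are assumptions, not details taken from the paper.
    """
    # Enumerate all 4**k words over the DNA alphabet in a fixed order,
    # so every fragment maps to a vector with the same layout.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)

    seq = seq.upper()
    total = 0
    # Slide a window of width k across the fragment.
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other non-ACGT symbols
            counts[word] += 1
            total += 1

    if total == 0:
        return [0.0] * len(kmers)
    # Normalize counts to frequencies so fragments of different lengths are comparable.
    return [counts[w] / total for w in kmers]
```

    With `k=4` each fragment becomes a 256-dimensional vector, which is the kind of high-dimensional feature space that a dimensionality-reduction step such as PCA would then compress before classification.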
  • Keywords
    DNA; biology computing; genomics; microorganisms; principal component analysis; DNA cloning; HIER package; LDA based hierarchical scheme; PCA based hierarchical scheme; genomes; lab cultivation; linear discriminant analysis; metagenomic fragment binning; metagenomics; assembly; bioinformatics; cloning; frequency; packaging
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence in Bioinformatics and Computational Biology, 2009. CIBCB '09. IEEE Symposium on
  • Conference_Location
    Nashville, TN
  • Print_ISBN
    978-1-4244-2756-7
  • Type
    conf
  • DOI
    10.1109/CIBCB.2009.4925707
  • Filename
    4925707