Title :
Inferring Functional Groups from Microbial Gene Catalogue with Probabilistic Topic Models
Author :
Chen, Xin ; He, Tingting ; Hu, Xiaohua ; An, Yuan ; Wu, Xindong
Author_Institution :
Coll. of Inf. Sci. & Technol., Drexel Univ., Philadelphia, PA, USA
Abstract :
In this paper, we show that the configuration of functional groups in metagenome samples can be inferred by probabilistic topic modeling, based on the functional elements derived from a non-redundant CDs catalogue. Probabilistic topic modeling is a Bayesian method that can extract useful topical information from unlabeled data. When used to study microbial samples (assuming that the relative abundance of functional elements has already been obtained by a homology-based approach), each sample can be considered a 'document' that contains a mixture of functional groups, while each functional group (also known as a 'latent topic') is a weighted mixture of functional elements (including taxonomic levels and indicators of gene orthologous groups and KEGG pathway mappings). The functional elements are analogous to 'words'. Estimating the probabilistic topic model uncovers the configuration of functional groups (the latent topics) in each sample. The experimental results demonstrate the effectiveness of our proposed method.
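The document/word analogy in the abstract can be illustrated with a minimal sketch. It assumes the topic model is plain latent Dirichlet allocation fitted by collapsed Gibbs sampling, which the abstract does not confirm; the function name fit_lda, the hyperparameters alpha and beta, the element labels, and the toy counts are all hypothetical and are not taken from the paper.

```python
import random

def fit_lda(counts, n_groups, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA on a samples x elements count matrix.

    Returns (theta, phi): theta[s][k] is the share of functional group k in
    sample s; phi[k][e] is the weight of functional element e in group k.
    """
    rng = random.Random(seed)
    n_samples, n_elements = len(counts), len(counts[0])
    # Expand counts into tokens: one (sample, element) pair per observed copy.
    tokens = [(s, e) for s in range(n_samples)
              for e in range(n_elements) for _ in range(counts[s][e])]
    # Random initial group assignment for every token.
    z = [rng.randrange(n_groups) for _ in tokens]
    sample_group = [[0] * n_groups for _ in range(n_samples)]
    group_element = [[0] * n_elements for _ in range(n_groups)]
    group_total = [0] * n_groups
    for (s, e), k in zip(tokens, z):
        sample_group[s][k] += 1
        group_element[k][e] += 1
        group_total[k] += 1
    for _ in range(n_iter):
        for i, (s, e) in enumerate(tokens):
            # Remove the token's current assignment from the counts.
            k = z[i]
            sample_group[s][k] -= 1
            group_element[k][e] -= 1
            group_total[k] -= 1
            # Conditional probability of each group given all other tokens.
            weights = [(sample_group[s][j] + alpha)
                       * (group_element[j][e] + beta)
                       / (group_total[j] + n_elements * beta)
                       for j in range(n_groups)]
            k = rng.choices(range(n_groups), weights=weights)[0]
            z[i] = k
            sample_group[s][k] += 1
            group_element[k][e] += 1
            group_total[k] += 1
    # Smoothed point estimates from the final assignment.
    theta = [[(sample_group[s][k] + alpha)
              / (sum(sample_group[s]) + n_groups * alpha)
              for k in range(n_groups)] for s in range(n_samples)]
    phi = [[(group_element[k][e] + beta)
            / (group_total[k] + n_elements * beta)
            for e in range(n_elements)] for k in range(n_groups)]
    return theta, phi

# Hypothetical functional elements (the 'words'); labels are illustrative only.
elements = ["taxon_A", "taxon_B", "orth_1", "orth_2", "path_1", "path_2"]
# Toy abundance counts: one row per sample (the 'documents'). Invented data.
counts = [
    [9, 0, 8, 1, 7, 0],
    [8, 1, 9, 0, 6, 1],
    [0, 9, 1, 8, 0, 7],
    [1, 8, 0, 9, 1, 8],
]
theta, phi = fit_lda(counts, n_groups=2)
for s, row in enumerate(theta):
    print("sample", s, [round(p, 2) for p in row])
```

Each row of theta is one sample's mixture over functional groups, and each row of phi is one group's weighting over functional elements. A real analysis would start from relative abundances obtained by the homology-based step and would likely use an established topic-modeling library rather than this hand-written sampler.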
Keywords :
Bayes methods; bioinformatics; genetics; information retrieval; Bayesian method; KEGG pathway mappings; functional groups; gene orthologous groups; homology-based approach; information extraction; latent topic; microbial gene catalogue; probabilistic topic models; taxonomic levels; topical information extraction; unlabeled data; Bioinformatics; Biological system modeling; Data models; Databases; Genomics; Probabilistic logic; Vocabulary; Bioinformatics databases; Biological data mining; Metagenomics; Probabilistic topic model;
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2011 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4577-1799-4
DOI :
10.1109/BIBM.2011.12