Title :
Microbial abundance patterns of host obesity inferred by the structural incorporation of association measures into interpretable classifiers
Author :
Huang, Nicole ; Oyang, Yen-Jen
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Univ., Taipei, Taiwan
Abstract :
Obesity is a prevalent disease with severe complications. In recent years, interest has turned toward the relationship between gut microbial factors and a subject's degree of, or propensity to, obesity. With the relative abundance values of phylotypes as features, machine learning algorithms can be applied to identify predictive microbes that distinguish between subjects of different types, and to infer the corresponding rules that describe “how” the differentiation is made. However, relative abundance is influenced by a number of upstream factors, and its inherent information content can often limit the validity of the inferred rules. We addressed this issue by structurally incorporating association measures into interpretable classifiers. The resulting model yields microbial abundance patterns that are both statistically significant and predictively valid. Although we concentrated on obesity in this paper, the proposed approach is applicable to 16S rRNA datasets from other domains as well. The inferred patterns are in line with current knowledge of the microbial world, while providing new insights into the interactions between microbial factors and their effects on the host. As such, they are believed to constitute credible starting points for further research.
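As a rough illustration of the quantities the abstract refers to, the sketch below converts raw read counts to relative abundances and computes one association measure, the risk ratio (listed among the keywords), for a single abundance-threshold rule. It is not the authors' method: the taxon names, the toy counts, the labels, and the 0.5 threshold are all hypothetical, and how the paper incorporates such measures into its classifiers is not described in this record.

```python
def relative_abundance(counts):
    """Convert raw read counts per phylotype into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def risk_ratio(samples, labels, taxon, threshold):
    """Risk ratio of the positive label (1) for samples whose relative
    abundance of `taxon` exceeds `threshold`, versus samples at or below it.
    Returns None when the ratio is undefined."""
    exposed = exposed_pos = unexposed = unexposed_pos = 0
    for counts, label in zip(samples, labels):
        abundance = relative_abundance(counts).get(taxon, 0.0)
        if abundance > threshold:
            exposed += 1
            exposed_pos += label
        else:
            unexposed += 1
            unexposed_pos += label
    if exposed == 0 or unexposed == 0 or unexposed_pos == 0:
        return None
    return (exposed_pos / exposed) / (unexposed_pos / unexposed)


# Hypothetical toy data: read counts per sample, label 1 = obese, 0 = lean.
samples = [
    {"taxon_A": 80, "taxon_B": 20},
    {"taxon_A": 70, "taxon_B": 30},
    {"taxon_A": 60, "taxon_B": 40},
    {"taxon_A": 30, "taxon_B": 70},
    {"taxon_A": 20, "taxon_B": 80},
    {"taxon_A": 10, "taxon_B": 90},
]
labels = [1, 1, 0, 1, 0, 0]

# Association between the rule "taxon_A abundance > 0.5" and the label.
rr = risk_ratio(samples, labels, "taxon_A", 0.5)
print(rr)
```

A risk ratio above 1 would suggest that samples satisfying the rule carry the positive label more often than those that do not; on real data it would normally be reported with a confidence interval or significance test, which this sketch omits.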
Keywords :
RNA; bioinformatics; diseases; learning (artificial intelligence); microorganisms; molecular biophysics; molecular configurations; pattern classification; 16S rRNA datasets; association measures; disease; gut microbial factors; host obesity; inherent information content; interpretable classifiers; machine learning algorithms; microbial abundance patterns; microbial interaction factors; phylotypes; predictive microbes; structural incorporation; structurally incorporating association measures; upstream factors; Communities; Decision trees; Machine learning algorithms; Microorganisms; Obesity; Radio frequency; Vegetation; decision tree; microbiota; random forest; risk ratio;
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2014 IEEE International Conference on
Conference_Location :
Belfast
DOI :
10.1109/BIBM.2014.6999176