DocumentCode :
2767769
Title :
Literature based Bayesian analysis of gene expression data
Author :
Xu, Lijing ; Homayouni, Ramin ; George, E. Olusegun
Author_Institution :
Bioinf. Program, Univ. of Memphis, Memphis, TN, USA
fYear :
2011
fDate :
12-15 Nov. 2011
Firstpage :
1032
Lastpage :
1032
Abstract :
Recent research has focused on incorporating biological function and pathway information into the analysis of gene expression data, partly as a means of compensating for insufficient experimental replication, low signal-to-noise ratios, lack of reproducibility, and/or multiple-testing confounds. A Bayesian approach seems ideal for incorporating functional information into gene expression data analysis. In this study, we tested the feasibility of using literature-derived gene relationships in a Bayesian model to analyze gene expression data. Prior distributions were constructed from gene associations derived from the biomedical literature using Latent Semantic Indexing (LSI). The LSI model was built using more than 1 million Medline abstracts corresponding to 22,000 human and mouse genes. A key advantage of LSI is that both explicit and implicit gene relationships can be derived from the literature. Gene neighborhoods were determined using latent Gaussian Markov random fields and a logistic transformation of the latent variables. We tested the procedure on a microarray dataset for interferon-stimulated genes in mouse embryonic fibroblasts. By integrating functional information from the literature, the Bayesian approach identified relevant genes that previously did not meet the 0.05 significance level. Compared with a standard mixture model, the spatial mixture model has more power for identifying directly and indirectly interferon-regulated genes. The spatial model improved the ranks of some genes known to be affected by interferon treatment, such as Nmi (N-myc and STAT interactor) and ifi35 (interferon-induced protein 35). It also identified some genes that were previously ignored because of their marginal p-values, such as dpysl2, map2k1, msn, Psck5, and Il6st. Interestingly, these genes appear to be indirectly related to interferon treatment. In summary, we show that our procedure increases statistical power and produces more biologically meaningful gene lists.
These results suggest that Bayesian methods which incorporate functional information from the literature may improve analysis of gene expression data.
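As a rough illustration of how literature-derived gene associations of the kind described in the abstract might be computed, the sketch below applies a truncated SVD to a toy term-by-gene matrix and compares genes by cosine similarity in the reduced space. It is not the authors' implementation: the function name `lsi_gene_similarity`, the number of factors `k`, and the toy counts are all assumptions made for the example, and the record does not state which term weighting or rank was used.

```python
import numpy as np

def lsi_gene_similarity(term_gene, k):
    """Cosine similarity between genes in a rank-k LSI space.

    term_gene: (n_terms, n_genes) array of term weights, one column per
               gene "document" (hypothetical input layout).
    k:         number of latent factors to keep (assumed, not from the record).
    """
    # Thin SVD: term_gene = U @ diag(s) @ Vt
    _, s, vt = np.linalg.svd(term_gene, full_matrices=False)
    # Gene coordinates in the k-dimensional latent space, one row per gene.
    gene_vecs = (s[:k, None] * vt[:k]).T
    # Normalize rows, guarding against all-zero vectors.
    norms = np.linalg.norm(gene_vecs, axis=1, keepdims=True)
    unit = gene_vecs / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T

# Toy counts (made up): rows are terms, columns are genes.
toy = np.array([
    [2.0, 1.0, 0.0],
    [1.0, 2.0, 0.0],
    [0.0, 0.0, 4.0],
])
sim = lsi_gene_similarity(toy, k=2)
```

In this toy case the first two genes share vocabulary and end up close together in the latent space even though their raw term profiles differ, while the third gene stays apart. A similarity matrix of this kind could then be thresholded to define gene neighborhoods, although the threshold and the way neighborhoods enter the prior are not specified in this record.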
Keywords :
Bayes methods; Markov processes; bioinformatics; genetics; Il6st; Latent Semantic Indexing; Medline; Psck5; biological function; dpysl2; gene expression data; interferon stimulated genes; latent Gaussian Markov random fields; literature based Bayesian analysis; logistic transformation; map2k1; microarray dataset; mouse embryonic fibroblasts; msn; pathway information; reproducibility; spatial model; Bayesian methods; Bioinformatics; Educational institutions; Gene expression; Indexing; Large scale integration; Bayesian Modeling; Latent Semantic Indexing; Microarray; Text-mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4577-1612-6
Type :
conf
DOI :
10.1109/BIBMW.2011.6112549
Filename :
6112549