Title :
TIGERA: A New Tool for Illumina Gene Expression Reads Analysis
Author :
Bai, Xiaodong ; Grewal, Parwinder S.
Author_Institution :
Dept. of Entomology, Ohio State Univ., Wooster, OH, USA
Abstract :
Next-generation sequencing platforms, including Illumina, 454, and SOLiD, are emerging as easier, faster, and cheaper alternatives to traditional sequencing platforms. Illumina digital gene expression (DGE) tag profiling allows comprehensive analysis of differentially expressed genes in organisms. Computer programs are necessary to handle the overwhelming amount of data generated by the Illumina genome analyzer. Here we report the design and implementation of a program for the analysis of differential gene expression based on Illumina data. The program TIGERA (tool for Illumina gene expression reads analysis) was written in Perl, using newly implemented and preexisting algorithms, and has a simple graphical user interface. Once the required inputs are provided, the program performs the following tasks automatically. The expression levels of high-quality Illumina tags for each of the two groups of libraries are determined and normalized as transcripts per million (TPM). The Illumina tags are mapped to the annotated reference sequences to identify uniquely mapped tags. The mapping results are validated using information generated by digital restriction enzyme digestion of the reference sequences. Based on whether the tags match unique or multiple reference sequences after validation, the tags are grouped into three categories: one tag-one reference, one tag-one gene, and one tag-multiple genes. The tags within the first two categories are analyzed further to determine the reference sequences that have unique expression levels or potential alternative transcript splicing products. A Poisson mixture model is applied to analyze the differential expression of reference sequences with unique expression levels and of the tags that do not match the reference sequences. The progress of the analysis is monitored and reported.
The analysis results are presented as text files and also deposited in a MySQL database that can be visualized and searched in Internet browsers. To demonstrate the performance of the program, two biological replicates of the DGE tag libraries of the infective juveniles of the entomopathogenic nematode Heterorhabditis bacteriophora TT01 and GPS11 strains were sequenced using the Illumina platform.
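The TPM normalization and the three-way tag grouping described in the abstract can be illustrated with a minimal Python sketch. This is not the TIGERA code, which was written in Perl. The function names, the input layouts, and the reading of "one tag-one gene" as a tag that matches several reference sequences all belonging to a single gene are assumptions made for illustration only.

```python
def normalize_tpm(tag_counts):
    """Scale raw tag counts so that they sum to one million (TPM).

    tag_counts: dict mapping a tag sequence to its raw count in one library
    (hypothetical input layout).
    """
    total = sum(tag_counts.values())
    if total == 0:
        return {tag: 0.0 for tag in tag_counts}
    return {tag: count * 1_000_000 / total for tag, count in tag_counts.items()}


def categorize_tag(hits):
    """Assign a tag to one of the categories named in the abstract.

    hits: iterable of (reference_id, gene_id) pairs that remain after
    validation (hypothetical input layout). The rule used for
    "one tag-one gene" is an assumption, not taken from the paper.
    """
    hits = list(hits)
    references = {ref for ref, _ in hits}
    genes = {gene for _, gene in hits}
    if not references:
        return "unmatched"
    if len(references) == 1:
        return "one tag-one reference"
    if len(genes) == 1:
        return "one tag-one gene"
    return "one tag-multiple genes"


if __name__ == "__main__":
    # Made-up counts and mappings, for demonstration only.
    counts = {"CATGAAAA": 30, "CATGCCCC": 10, "CATGGGGG": 60}
    print(normalize_tpm(counts))
    print(categorize_tag([("ref1", "geneA"), ("ref2", "geneA")]))
```

Tags that fall into "unmatched" here correspond to the tags the abstract describes as not being matched to the reference sequences; the abstract does not give them a category name.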
Keywords :
Internet; SQL; bioinformatics; data analysis; enzymes; genetics; graphical user interfaces; information retrieval; microorganisms; molecular biophysics; online front-ends; DGE tag profiling; GPS11 strains; Heterorhabditis bacteriophora TT01; Illumina digital gene expression; Illumina genome analyzer; Internet browser; MySQL database; Poisson mixture model; TIGERA; computer program; data generation; digital restriction enzyme digestion; entomopathogenic nematode; graphical user interface; one tag-multiple gene category; one tag-one gene category; one tag-one reference category; tool for Illumina gene expression reads analysis; transcript splicing product; Algorithm design and analysis; Biochemistry; Bioinformatics; Gene expression; Genomics; Graphical user interfaces; Organisms; Software libraries; Solids; Splicing; Illumina DGE tag profiling; TIGERA; differential gene expression; perl program;
Conference_Title :
Bioinformatics, 2009. OCCBIO '09. Ohio Collaborative Conference on
Conference_Location :
Cleveland, OH
Print_ISBN :
978-0-7695-3685-9
DOI :
10.1109/OCCBIO.2009.14