Title :
Reconstruction of ancestral gene order after segmental duplication and gene loss
Author :
Huan, Jun ; Prins, Jan ; Wang, Wei ; Vision, Todd
Author_Institution :
Dept. of Comput. Sci., North Carolina Univ., Chapel Hill, NC, USA
Abstract :
As gene order evolves through a variety of chromosomal rearrangements, conserved segments provide important insight into evolutionary relationships and functional roles of genes. However, gene loss within otherwise conserved segments, as typically occurs following large-scale genome duplication, has received limited algorithmic study. This has been a major impediment to comparative genomics in certain taxa, such as plants and fish. We propose a heuristic algorithm for the inference of ancestral gene order in a set of related genomes that have undergone large-scale duplication and gene loss. First, approximately conserved (i.e., homologous) segments are identified using pairwise local genome alignment. Second, homologous segments are iteratively clustered under the control of two parameters: (1) the minimal required number of shared genes between two clusters and (2) the maximal allowed number of rearrangement breakpoints along the lineage leading to each descendant segment. Finally, we compute an estimated ancestral gene order for each cluster that is optimal in some sense. We evaluate the performance of this algorithm on simulated data that models a genome evolving by large-scale duplication, duplicate gene loss, transposition, translocation, and inversion. The results suggest that long segments of ancestral gene order may be reconstructed following moderate levels of rearrangement with only minor loss of accuracy.
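The two clustering parameters named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define how shared genes or breakpoints are counted, so the function names (`shared_genes`, `breakpoints`, `may_merge`), the unsigned gene-order representation, and the pairwise comparison of two segments (rather than of an ancestor with each descendant) are all assumptions made for illustration. Each gene is assumed to appear at most once per segment.

```python
def shared_genes(seg_a, seg_b):
    """Number of genes present in both segments (hypothetical helper)."""
    return len(set(seg_a) & set(seg_b))


def breakpoints(reference, descendant):
    """Count adjacencies in `descendant` that are absent from `reference`.

    Both orders are first restricted to their shared genes, so pure gene
    loss is not counted as a rearrangement (an assumption of this sketch).
    Gene orientation is ignored.
    """
    common = set(reference) & set(descendant)
    ref = [g for g in reference if g in common]
    des = [g for g in descendant if g in common]
    # Unordered pairs of neighbouring genes in the reference order.
    ref_adj = {frozenset(pair) for pair in zip(ref, ref[1:])}
    return sum(1 for pair in zip(des, des[1:]) if frozenset(pair) not in ref_adj)


def may_merge(seg_a, seg_b, min_shared, max_breakpoints):
    """Hypothetical merge test combining the two thresholds."""
    return (shared_genes(seg_a, seg_b) >= min_shared
            and breakpoints(seg_a, seg_b) <= max_breakpoints)


if __name__ == "__main__":
    original = [1, 2, 3, 4, 5, 6]
    lossy_copy = [1, 3, 4, 6]      # genes 2 and 5 lost, order preserved
    inverted_copy = [1, 4, 3, 6]   # same losses plus an inversion of 3..4
    print(breakpoints(original, lossy_copy))
    print(breakpoints(original, inverted_copy))
    print(may_merge(original, inverted_copy, min_shared=3, max_breakpoints=2))
```

In this sketch, a duplicate that has only lost genes scores zero breakpoints against its source, while a duplicate carrying an inversion introduces two, so tightening `max_breakpoints` keeps more heavily rearranged segments out of a cluster.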
Keywords :
biology computing; cellular biophysics; evolution (biological); genetic engineering; genetics; heuristic programming; pattern clustering; program assemblers; ancestral gene reconstruction; chromosomal rearrangement; evolution; gene loss; genomics; heuristic algorithm; homologous segment; pairwise local genome alignment; segmental duplication; Bioinformatics; Biological cells; Clustering algorithms; Genomics; Heuristic algorithms; Impedance; Inference algorithms; Iterative algorithms; Large-scale systems; Marine animals;
Conference_Title :
Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE
Print_ISBN :
0-7695-2000-6
DOI :
10.1109/CSB.2003.1227382