• DocumentCode
    105153
  • Title

    Study of the Paired Change Points in Bacterial Genes

  • Author

    Suvorova, Yulia M. ; Korotkova, Maria A. ; Korotkov, Eugene V.

  • Author_Institution
    Bioinf. Lab., Centre of Bioeng., Moscow, Russia
  • Volume
    11
  • Issue
    5
  • fYear
    2014
  • fDate
    Sept.-Oct. 1 2014
  • Firstpage
    955
  • Lastpage
    964
  • Abstract
    It is known that nucleotide sequences are not totally homogeneous and that this heterogeneity cannot be explained by random fluctuations alone. Such heterogeneity poses the problem of segmenting a sequence into a set of homogeneous parts separated by points called “change points”. In this work we investigated a special case of change points: paired change points (PCP). We used a well-known property of coding sequences: triplet periodicity (TP). The sequences that we are especially interested in consist of three successive parts: the first and the last parts have similar TP, while the middle part has a different TP type. We aimed to find the genes with PCP and to provide an explanation for this phenomenon. We developed a mathematical method for PCP detection based on a new measure of similarity between TP matrices. We investigated 66,936 bacterial genes from 17 bacterial genomes and revealed 2,700 genes with PCP and 6,459 genes with a single change point (SCP). We developed a mathematical approach to visualize the PCP cases. We suppose that PCP could be associated with double fusion or insertion events. The results of investigating sequences with artificial insertions/fusions and the distribution of TP inside the genome support the idea that the real number of genes formed by insertion/fusion events could be 5-7 times greater than the number of genes revealed in the present work.
  • Keywords
    DNA; RNA; fluctuations; genetics; genomics; microorganisms; molecular biophysics; molecular configurations; PCP detection; artificial insertions-fusions; bacterial genes; bacterial genomes; coding sequences; double fusion; insertion events; mathematical method; nucleotide sequences; paired change points; random fluctuations; sequence segmentation; single change point; triplet periodicity; Bioinformatics; DNA; Encoding; Equations; Genomics; Mathematical model; Microorganisms; Biology and genetics; Triplet periodicity; change points; genes; sequence analysis;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2014.2321154
  • Filename
    6810010