• DocumentCode
    2379362
  • Title

    A re-sequencing tool for high mismatch-tolerant short read alignment based on Burrows-Wheeler Transform

  • Author

    Lu, Chen Hua ; Lin, Chun Yuan ; Tang, Chuan Yi

  • Author_Institution
    Dept. of Comput. Sci., Nat. Tsing Hua Univ., Hsinchu, Taiwan
  • fYear
    2010
  • fDate
    18-18 Dec. 2010
  • Firstpage
    549
  • Lastpage
    554
  • Abstract
    Now that the reference genomes of many organisms have been sequenced in this post-genetic era, re-sequencing and assembling individual genomes from very large numbers of reads has become an extremely important problem. In this paper, we present a re-sequencing tool designed for Next Generation Sequencing (NGS) data, which consist of a huge number of short reads to be aligned onto a reference genome. We modified and implemented the Burrows-Wheeler Transform and FM-index algorithms to build an index of the human genome, and we propose segmenting each short read into multiple non-overlapping seeds, which lets us align short reads with a large Hamming distance. Finally, we used simulated datasets and real datasets from the 1000 Genomes Project to demonstrate the performance of our tool on a personal computer, and compared the results with the widely used tools Bowtie and SOAPv2.
  • Keywords
    DNA; bioinformatics; genomics; macromolecules; microcomputers; molecular biophysics; Burrows-Wheeler transform; SOAPv2; align short reads; bowtie; high mismatch-tolerant short read alignment; large Hamming distance; multiple nonoverlapping seeds; next generation sequencing data; organisms; personal computer; post-genetic era; reference genomes; resequencing tool; simulated datasets; Alignment; BWT; NGS; Resequence; Short Reads;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine Workshops (BIBMW), 2010 IEEE International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    978-1-4244-8303-7
  • Electronic_ISBN
    978-1-4244-8304-4
  • Type

    conf

  • DOI
    10.1109/BIBMW.2010.5703860
  • Filename
    5703860
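
The seed idea mentioned in the abstract can be illustrated with the pigeonhole principle: if a read is cut into k+1 non-overlapping seeds and it aligns somewhere with at most k mismatches, at least one seed must match the reference exactly at that location. The Python sketch below is only an illustration under that assumption and is not the authors' implementation. The function names (`split_into_seeds`, `find_all`, `align`) are hypothetical, and a plain substring search stands in for the BWT/FM-index lookup the paper describes.

```python
def split_into_seeds(read, n_seeds):
    """Cut a read into n_seeds non-overlapping pieces, returned as (offset, seed) pairs."""
    if n_seeds < 1 or n_seeds > len(read):
        raise ValueError("n_seeds must be between 1 and len(read)")
    base, extra = divmod(len(read), n_seeds)
    seeds, start = [], 0
    for i in range(n_seeds):
        # The first `extra` seeds get one additional base so all bases are covered.
        length = base + (1 if i < extra else 0)
        seeds.append((start, read[start:start + length]))
        start += length
    return seeds


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_all(text, pattern):
    """Yield every exact occurrence of pattern in text.

    Assumption: this naive scan is a placeholder for an FM-index query.
    """
    pos = text.find(pattern)
    while pos != -1:
        yield pos
        pos = text.find(pattern, pos + 1)


def align(read, reference, max_mismatches):
    """Return sorted start positions where read aligns with <= max_mismatches."""
    hits = set()
    # With max_mismatches + 1 seeds, at least one seed is mismatch-free.
    for offset, seed in split_into_seeds(read, max_mismatches + 1):
        for pos in find_all(reference, seed):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue  # candidate would run off the reference
            window = reference[start:start + len(read)]
            if hamming(read, window) <= max_mismatches:
                hits.add(start)
    return sorted(hits)
```

Each exact seed hit only proposes a candidate position; the full read is then verified there by Hamming distance, so false seed hits are filtered out. Tolerating more mismatches means more and shorter seeds, which in turn produce more candidates to verify.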