• DocumentCode
    2396625
  • Title

    Prediction of the optimum combination of solexa sequencing libraries in genome projects

  • Author

    Zhao, You-Jie ; Jiao, Jun-ying ; Hu, Kun-Rong ; Cao, Yong ; Zhou, Kai-Lai

  • Author_Institution
    Dept. of Comput. & Inf. Sci., Southwest Forestry Univ., Kunming, China
  • fYear
    2012
  • fDate
    19-20 May 2012
  • Firstpage
    2264
  • Lastpage
    2267
  • Abstract
    DNA sequencing technology, especially Illumina's Solexa sequencer, has played an important role in the life sciences and is being used in a growing number of genome projects. In such projects, Solexa libraries are usually constructed with insert sizes of 200 bp, 500 bp, 2 kb, 5 kb and 10 kb. The open problem is how to find the optimum combination of insert sizes and sequencing depths for these libraries. In this paper, we take the wild rice genome sequencing project as an example. A tool, SRSD, was used to simulate random Solexa libraries based on the cultivated rice genome sequence, producing 200 bp, 500 bp, 2 kb, 5 kb and 10 kb libraries at different depths. After assembling the simulated libraries and calculating their contig N50 and scaffold N50, we predicted the optimum combination of Solexa libraries. It mainly comprises 24X-depth 500 bp, 6X-depth 2 kb, 4X-depth 5 kb and 4X-depth 10 kb libraries. With SOAPdenovo, these sequences would assemble a 320 Mbp rice genome with a contig N50 of 7.8 kb and a scaffold N50 of 185.3 kb. The result also suggests that the 500 bp library is more useful than the 200 bp library for sequence assembly. The approach provides an effective guide for genome projects that use the Solexa sequencer, and it could greatly reduce cost and improve the quality of genome assembly.
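    The abstract compares library combinations by contig N50 and scaffold N50. As a minimal sketch of how that statistic is commonly computed (this is not the authors' SRSD or SOAPdenovo code; the function name `n50` and the example lengths are assumptions made for illustration), N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length until
    # at least half of the total assembly length is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Illustrative lengths only, not data from the paper.
example_contigs = [2, 3, 4, 5, 6]
print(n50(example_contigs))
```

    Under this definition, a larger N50 means the assembly is concentrated in longer contigs or scaffolds, which is why it serves as the comparison metric across the simulated library combinations.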
  • Keywords
    DNA; agriculture; genomics; DNA sequencing technology; Illumina solexa sequencer; SOAPdenovo; SRSD; cultivated rice genome sequence; genome assembly; genome projects; optimum combination; random solexa libraries; solexa sequencing libraries; wild rice genome sequencing project; Assembly; Bioinformatics; DNA; Error analysis; Gaussian distribution; Genomics; Libraries; Genome project; Next-generation sequencing; Optimum combination; Solexa library;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems and Informatics (ICSAI), 2012 International Conference on
  • Conference_Location
    Yantai
  • Print_ISBN
    978-1-4673-0198-5
  • Type

    conf

  • DOI
    10.1109/ICSAI.2012.6223504
  • Filename
    6223504