• DocumentCode
    419340
  • Title

    The SSAHA trace server

  • Author

    Ning, Zemin ; Spooner, William ; Spargo, Adam ; Leonard, Steven ; Rae, Mark ; Cox, Antony

  • Author_Institution
    Wellcome Trust Sanger Inst., Cambridge, UK
  • fYear
    2004
  • fDate
    16-19 Aug. 2004
  • Firstpage
    544
  • Lastpage
    545
  • Abstract
    We present a client/server database system with the potential to make all DNA sequences searchable. The database, estimated at approximately 200 Gbp, contains various types of sequences, including WGS and clone reads, draft assemblies, finished sequences, RefSeq, etc. The search engine will be SSAHA2, a package that combines SSAHA (Sequence Search and Alignment by Hashing Algorithm) with cross_match. The SSAHA algorithm detects matching seeds, each made up of a few k-mer words. Both query and subject sequences are then trimmed according to the locations of the matching seeds and passed to cross_match for full alignment. We aim to develop a platform-independent client/server system that can provide a near real-time (under 10 seconds) search service for a clustered database.
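    The seed-detection step described in the abstract can be illustrated with a minimal sketch. It is not the SSAHA2 implementation: the function names, the non-overlapping sampling of subject k-mers, and the fixed flank used to trim the subject are all assumptions made for illustration, and the cross_match alignment stage is not shown.

```python
from collections import defaultdict


def build_kmer_index(subjects, k):
    """Hash every non-overlapping k-mer of each subject to (subject_id, offset).

    Sampling subject k-mers without overlap is an assumption of this sketch.
    """
    index = defaultdict(list)
    for sid, seq in subjects.items():
        for pos in range(0, len(seq) - k + 1, k):
            index[seq[pos:pos + k]].append((sid, pos))
    return index


def find_seeds(query, index, k):
    """Return (subject_id, subject_pos, query_pos) for every exact k-mer hit."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for sid, spos in index.get(query[qpos:qpos + k], ()):
            hits.append((sid, spos, qpos))
    return hits


def seed_window(hits, sid, k, flank, subject_len):
    """Subject interval spanning all seeds on one subject, padded by `flank`.

    This stands in for the trimming step; the real cut-off rule is not
    specified in the abstract, so the padding here is hypothetical.
    """
    positions = [spos for s, spos, _ in hits if s == sid]
    if not positions:
        return None
    start = max(0, min(positions) - flank)
    end = min(subject_len, max(positions) + k + flank)
    return start, end


subjects = {"s1": "ACGTACGTTTGACCA"}
index = build_kmer_index(subjects, k=4)
hits = find_seeds("GTTTGAC", index, k=4)
window = seed_window(hits, "s1", k=4, flank=2, subject_len=len(subjects["s1"]))
```

    In this toy example the only seed is the word TTGA, and the trimmed subject region `subjects["s1"][window[0]:window[1]]` is what would be handed to a full aligner.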
  • Keywords
    DNA; biology computing; client-server systems; file organisation; molecular biophysics; search engines; DNA sequences; SSAHA trace server; SSAHA2; WGS sequences; client/server database system; clone reads; clustered database; cross_match; draft assemblies; finished sequences; hashing algorithm; matching seeds; query sequences; refSeq sequences; search engine; sequence alignment; sequence search; subject sequences; Assembly; Bioinformatics; Cloning; DNA; Database systems; Encoding; Genomics; Packaging; Search engines; Sequences;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
  • Print_ISBN
    0-7695-2194-0
  • Type

    conf

  • DOI
    10.1109/CSB.2004.1332490
  • Filename
    1332490