DocumentCode :
3316884
Title :
A method to verify originality of sequences secretly on distributed computing environment
Author :
Kurata, Ken-Ichi ; Nakamura, Hiroshi ; Breton, Vincent
Author_Institution :
Res. Center for Adv. Sci. & Technol., Tokyo Univ., Japan
fYear :
2004
fDate :
20-22 July 2004
Firstpage :
310
Lastpage :
319
Abstract :
In molecular biology, it is important to find gene sequences related to phenomena such as diseases and chemical reactions. Once a target gene has been sequenced, it must be confirmed whether the sequence is already known. If the sequence has not yet appeared in any database, it is novel and valuable. In general, this check is done by comparing the exact sequence data using a homology search program. In that case, the exact sequences of both the genomic databases and the newly sequenced genes must be made public. Therefore, anyone who does not want to expose the databases and/or the new sequences on public networks must purchase the databases and search them locally. We propose a method to verify the originality of gene sequences secretly on public networks. First, the target raw sequences are manipulated to prevent them from being reconstructed. Next, the method hashes all the genomic sequences. Only the processed data are exposed on public networks. Finally, the hashed files are compared with each other in parallel using the sorting method that we proposed previously (Kurata et al., 2003). The hashed files are stored on genomic databases in a distributed form. We describe how to implement this method on a grid computing environment and show calculation results on a worldwide grid environment spanning Japan, Switzerland, and France. The method successfully verified the originality of the sequence SSB against E. coli K-12 and B. subtilis.
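The hash-then-compare idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the manipulation step, the hash function, or the window scheme, so the fixed-length windows, SHA-256, the parameter k, and the function names hashed_windows and shared_fraction are all assumptions chosen for illustration. The sketch only shows how two parties could compare sorted digest lists without exchanging raw sequences.

```python
import hashlib


def hashed_windows(sequence: str, k: int = 8) -> list[str]:
    """Return the sorted, de-duplicated SHA-256 digests of every length-k window.

    Assumption: fixed-length windows and SHA-256 stand in for the paper's
    unspecified manipulation and hashing steps.
    """
    seq = sequence.upper()
    digests = {
        hashlib.sha256(seq[i:i + k].encode("ascii")).hexdigest()
        for i in range(len(seq) - k + 1)
    }
    return sorted(digests)


def shared_fraction(query: list[str], database: list[str]) -> float:
    """Merge two sorted digest lists; return the fraction of query digests
    that also occur in the database list."""
    i = j = hits = 0
    while i < len(query) and j < len(database):
        if query[i] == database[j]:
            hits += 1
            i += 1
            j += 1
        elif query[i] < database[j]:
            i += 1
        else:
            j += 1
    return hits / len(query) if query else 0.0


if __name__ == "__main__":
    # Hypothetical toy sequences, not real genomic data.
    database_digests = hashed_windows("ATGGCGTACGTTAGCCGATAGGCTA")
    query_digests = hashed_windows("CCCCCCCCCCCCCCCC")
    print(shared_fraction(query_digests, database_digests))
```

A fraction near 0 suggests the query shares no windows with the database, while a fraction near 1 suggests the sequence is already known. Because both lists are sorted, the comparison is a single linear merge. Hashing short windows on its own does not prevent a dictionary attack, which is presumably why the method manipulates the raw sequences first; that step is not reproduced here.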
Keywords :
biology computing; data privacy; distributed databases; genetics; grid computing; pattern matching; public information systems; sorting; B subtilis; E coli K-12; European Data Grid; SSB; distributed computing; gene sequences; genomic databases; grid computing; hashed files; homology search; molecular biology; public networks; secret gene sequence originality verification; sorting; Amplitude modulation; Bioinformatics; Chemicals; Diseases; Distributed computing; Distributed databases; Genomics; Grid computing; Sequences; Sorting;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High Performance Computing and Grid in Asia Pacific Region, 2004. Proceedings. Seventh International Conference on
Print_ISBN :
0-7695-2138-X
Type :
conf
DOI :
10.1109/HPCASIA.2004.1324051
Filename :
1324051