DocumentCode :
3321860
Title :
BioSeek: exploiting source-capability information for integrated access to multiple bioinformatics data sources
Author :
Liu, Ling ; Buttler, David ; Critchlow, Terence ; Han, Wei ; Paques, Henrique ; Pu, Calton ; Rocco, Dan
Author_Institution :
Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA, USA
fYear :
2003
fDate :
10-12 March 2003
Firstpage :
263
Lastpage :
271
Abstract :
Modern bioinformatics data sources are widely used by molecular biologists for homology searching and new drug discovery. User-friendly yet responsive access is one of the most desirable properties for integrated access to the rapidly growing, heterogeneous, and distributed collection of data sources. The increasing volume and diversity of digital information related to bioinformatics (such as genomes, protein sequences, and protein structures) have led to a growing problem that conventional data management systems do not face: finding which information sources, out of many candidates, are the most relevant and most accessible for answering a given user query. We refer to this as the query routing problem. In this paper we introduce the notation and issues of query routing and present a practical solution for designing a scalable query routing system based on multi-level progressive pruning strategies. The key idea is to create and maintain source capability profiles independently, and to provide algorithms that dynamically discover relevant information sources for a given query by using those profiles. Compared to the keyword-based indexing techniques adopted by most search engines and software, our approach offers fine-grained interest matching and is therefore more powerful and effective for handling queries with complex conditions.
Keywords :
medical computing; molecular biophysics; patient treatment; query processing; BioSeek; Modern Bioinformatics data sources; complex conditions; conventional data management systems; digital information; genomes; homology searching; keyword-based indexing techniques; multilevel progressive pruning strategies; multiple bioinformatics data sources; new drug discovery; protein sequences; protein structures; queries handling; scalable query routing system; source profiles; source-capability information; user-friendly responsive access; Bioinformatics; Biomedical engineering
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics and Bioengineering, 2003. Proceedings. Third IEEE Symposium on
Print_ISBN :
0-7695-1907-5
Type :
conf
DOI :
10.1109/BIBE.2003.1188961
Filename :
1188961