Title :
GeneWebEx: gene annotation Web extraction, aggregation, and updating from Web-based biomolecular databanks
Author :
Masseroli, Marco ; Stella, Andrea ; Meani, Natalia ; Alcalay, Myriam ; Pinciroli, Francesco
Author_Institution :
Bioeng. Dept., Politecnico di Milano, Milan, Italy
Abstract :
Numerous genomic annotations are currently stored in different Web-accessible databanks, which scientists need to mine with user-defined queries and in batch mode so that the diverse mined data can be integrated in an orderly way in suitable user-customizable working environments. Unfortunately, to date, most accessible databanks can be interrogated only for a single gene or protein at a time, and the retrieved data are generally available in HTML page format only. We developed GeneWebEx to effectively mine data of interest from different HTML pages of Web-based databanks and to organize the extracted data for further analyses. GeneWebEx uses user-defined templates to identify the data to extract, then aggregates and structures them in a database designed to hold the various extractions from distinct biomolecular databanks. Moreover, a template-based module enables automatic updating of the extracted data. In validation tests, GeneWebEx allowed us to efficiently gather relevant annotations from various sources and to query them comprehensively to highlight significant biological characteristics.
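The template-driven extraction and aggregation described in the abstract can be illustrated with a minimal sketch. This is not the GeneWebEx implementation: the template format (one regular expression per field), the table layout, the sample HTML page, and the names TEMPLATE, extract, store, and "example_databank" are all assumptions made for illustration. Updating is approximated by replacing a stored value whenever the same field is extracted again.

```python
import re
import sqlite3

# Hypothetical template: field name -> regex with one capture group.
# The real GeneWebEx template format is not given in the abstract.
TEMPLATE = {
    "symbol": r"<td>Symbol</td>\s*<td>(.*?)</td>",
    "chromosome": r"<td>Chromosome</td>\s*<td>(.*?)</td>",
}


def extract(html, template):
    """Return {field: value} for every template pattern found in the page."""
    record = {}
    for field, pattern in template.items():
        match = re.search(pattern, html, re.IGNORECASE | re.DOTALL)
        if match:
            record[field] = match.group(1).strip()
    return record


def store(conn, source, gene_id, record):
    """Aggregate one extraction; re-running replaces older values (update)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS annotation ("
        "source TEXT, gene_id TEXT, field TEXT, value TEXT, "
        "PRIMARY KEY (source, gene_id, field))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO annotation VALUES (?, ?, ?, ?)",
        [(source, gene_id, field, value) for field, value in record.items()],
    )
    conn.commit()


# Made-up HTML fragment standing in for a databank result page.
PAGE = (
    "<table><tr><td>Symbol</td><td>TP53</td></tr>"
    "<tr><td>Chromosome</td> <td>17</td></tr></table>"
)

conn = sqlite3.connect(":memory:")
record = extract(PAGE, TEMPLATE)
store(conn, "example_databank", "TP53", record)
rows = conn.execute(
    "SELECT field, value FROM annotation ORDER BY field"
).fetchall()
```

Keeping one row per (source, gene, field) means extractions from several databanks can sit in the same table and be queried together, and a repeated extraction overwrites only the fields it finds.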
Keywords :
Internet; data mining; deductive databases; genetics; medical computing; molecular biophysics; GeneWebEx; Web extraction; Web-based biomolecular databanks; aggregation; data retrieval; databank mining; gene annotation; user-defined queries; Bioinformatics; Biological information theory; Biomedical engineering; Data mining; Genomics; HTML; Information retrieval; Oncology; Proteins; Relational databases;
Conference_Title :
Bioinformatics and Bioengineering, 2004. BIBE 2004. Proceedings. Fourth IEEE Symposium on
Print_ISBN :
0-7695-2173-8
DOI :
10.1109/BIBE.2004.1317343