Title :
A data mining approach to PubMed query refinement
Author :
Berardi, Margherita ; Lapi, Michele ; Leo, Pietro ; Malerba, Donato ; Marinelli, Caterina ; Scioscia, Gaetano
Author_Institution :
Dipt. di Informatica, Universita degli Studi di Bari, Italy
Date :
30 Aug.-3 Sept. 2004
Abstract :
Finding disease relationships requires laborious examination of hundreds of candidate heterogeneous factors. Much of the relevant information is contained in biological and medical journals, which makes biomedical text mining a central bioinformatics problem. More than 14 million abstracts of such papers are contained in the Medline collection and are available online. In this paper we present a data mining engine, the MeSH Terms Associator (MTA), that is deployed in a distributed architecture to refine a generic PubMed query by discovering concept relations in the form of association rules. However, the number of discovered association rules is usually high, and most of them are not interesting enough to meet user expectations. In addition, presenting thousands of rules can discourage users from interpreting them. To overcome this problem, we investigate the application of some filtering techniques. Experimental results on datasets corresponding to real-world biomedical queries are discussed, and future directions are outlined.
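The idea of mining association rules over MeSH terms and then filtering them can be illustrated with a minimal sketch. The abstract does not specify the MTA algorithm or which filters it applies, so everything below is an assumption made for illustration: the function name `mine_rules`, the restriction to single-term rules, the support and confidence thresholds, and the toy records (invented term sets, not data from the paper).

```python
from itertools import combinations


def mine_rules(records, min_support=0.5, min_confidence=0.8):
    """Mine one-to-one association rules (A -> B) from sets of terms.

    Hypothetical helper, not the MTA implementation. Rules below either
    threshold are filtered out, which is one common way to cut down a
    large rule set.
    """
    n = len(records)
    item_count = {}
    pair_count = {}
    for terms in records:
        unique = sorted(set(terms))
        for term in unique:
            item_count[term] = item_count.get(term, 0) + 1
        for a, b in combinations(unique, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # fraction of records containing both terms
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Strongest rules first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


# Invented toy data: each set stands for the MeSH terms of one abstract.
records = [
    {"Hypertension", "Diabetes Mellitus", "Obesity"},
    {"Hypertension", "Obesity"},
    {"Diabetes Mellitus", "Obesity"},
    {"Hypertension", "Diabetes Mellitus", "Obesity"},
]

rules = mine_rules(records, min_support=0.5, min_confidence=0.8)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

In a query refinement setting, the right-hand side of a surviving rule could be offered as a candidate term for narrowing the original query. Raising either threshold shrinks the rule set further, at the cost of possibly hiding rare but relevant associations.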
Keywords :
data mining; medical information systems; query processing; MeSH Terms Associator; PubMed query refinement; association rules; biomedical queries; data mining engine; distributed architecture; filtering technique; Abstracts; Association rules; Bioinformatics; Data mining; Diseases; Drugs; Filtering; Java; Marine technology; Text mining;
Conference_Title :
Database and Expert Systems Applications, 2004. Proceedings. 15th International Workshop on
Print_ISBN :
0-7695-2195-9
DOI :
10.1109/DEXA.2004.1333507