Title :
Distributed processing of queries for XML documents in an agent based information retrieval system
Author :
Czejdo, Bogdan ; Miller, Ruth ; Taylor, Malcolm ; Rusinkiewicz, Marek
Author_Institution :
Microelectronics and Computer Technology Corporation, Austin, TX, USA
Abstract :
This paper addresses the problem of efficiently querying large numbers of text documents using parallel processing methods. The optimization criteria differ somewhat from those used in querying heterogeneous databases, largely because extracting ontological information from documents is the dominant component of query execution time. We assume that each document has been previously annotated using XML. We describe the architecture of a system that processes ontology-based queries over XML-annotated documents. We introduce two basic query processing strategies, a simple strategy and a semi-join strategy, together with possible extensions that use pipelining and longer lists for keyword search. Different levels of parallelism for these strategies are discussed. An evaluation model is developed and used to derive the optimal replication of resource agents. The theoretical and experimental results are compared.
Keywords :
hypermedia markup languages; information retrieval systems; parallel processing; query processing; XML annotated documents; XML documents; agent based information retrieval system; distributed query processing; evaluation model; heterogeneous databases; keyword search; ontological information; ontology based queries; optimal replication; optimization criteria; parallel processing methods; parallelism; pipelining; query execution time; resource agents; semi-join strategy; simple strategy; text documents; Data mining; Databases; Distributed processing; Information resources; Information retrieval; Keyword search; Ontologies; Parallel processing; Query processing; XML
Conference_Title :
2000 Kyoto International Conference on Digital Libraries: Research and Practice
Conference_Location :
Kyoto
Print_ISBN :
0-7695-1022-1
DOI :
10.1109/DLRP.2000.942181