Title :
Caching techniques for XML message filtering
Author :
Cao, Yang ; Majumdar, Shikharesh ; Lung, Chung-Horng
Author_Institution :
Sch. of Comput. Sci., Carleton Univ., Ottawa, ON, Canada
Abstract :
An XML publish/subscribe system filters streams of XML messages against a large number of subscriptions expressed in XPath. Performance is a major issue in such a system: as the number of XML documents and XPath-based subscriptions grows, filtering them efficiently becomes a challenging problem, and optimization techniques are needed to meet it. Many approaches to designing efficient XML filtering engines exist. Most of this research focuses on filtering algorithms that achieve high system performance or support more complex XPath syntax, and each proposed scheme has its advantages and limitations. Little research, however, has considered caching in the context of XML filtering. In this paper, we propose two caching schemes to be used in conjunction with an XML filtering engine. First, we present a complete message caching algorithm, a strict caching policy that reduces the computation cost of filtering the same message multiple times by reusing the results of previously processed messages. Second, we investigate a structure-based caching method, an approximate caching policy for messages that share the same structure. Performance evaluations on both synthetic and real data show that the complete message caching and structure-based caching schemes achieve significantly better filtering performance (up to 80% for both caching schemes on the message streams used in the experiments).
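The two policies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how messages are keyed or how "same structure" is defined, so the names below (`toy_filter`, `structure_key`, `CachingFilter`) are hypothetical. The sketch assumes a complete-message cache keyed on a hash of the raw message text and a structure cache keyed on the sequence of element tags with their depths. The filtering engine is replaced by a stand-in that uses the limited XPath subset of Python's `xml.etree.ElementTree`.

```python
import hashlib
import xml.etree.ElementTree as ET


def toy_filter(xml_text, subscriptions):
    """Stand-in for a real filtering engine (assumption, not the paper's engine).

    Returns the ids of all subscriptions whose path matches the message.
    """
    root = ET.fromstring(xml_text)
    matched = set()
    for sub_id, path in subscriptions.items():
        if root.findall(path):
            matched.add(sub_id)
    return frozenset(matched)


def structure_key(xml_text):
    """Hypothetical structure signature: element tags with depth, no text values."""
    root = ET.fromstring(xml_text)
    parts = []

    def walk(node, depth):
        parts.append(f"{depth}:{node.tag}")
        for child in node:
            walk(child, depth + 1)

    walk(root, 0)
    return "|".join(parts)


class CachingFilter:
    """Wraps the filter with either a 'complete' or a 'structure' cache."""

    def __init__(self, subscriptions, mode="complete"):
        self.subscriptions = subscriptions
        self.mode = mode
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def _key(self, xml_text):
        if self.mode == "complete":
            # Strict policy: only byte-identical messages share a cache entry.
            return hashlib.sha256(xml_text.encode("utf-8")).hexdigest()
        # Approximate policy: messages with the same tag layout share an entry,
        # so subscriptions that test text values may get a stale answer.
        return structure_key(xml_text)

    def match(self, xml_text):
        key = self._key(xml_text)
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        result = toy_filter(xml_text, self.subscriptions)
        self.cache[key] = result
        return result


if __name__ == "__main__":
    subs = {"s1": ".//price", "s2": ".//author"}
    m1 = "<book><title>A</title><price>10</price></book>"
    m2 = "<book><title>B</title><price>20</price></book>"

    for mode in ("complete", "structure"):
        f = CachingFilter(subs, mode)
        for msg in (m1, m1, m2):
            f.match(msg)
        print(mode, "hits:", f.hits, "misses:", f.misses)
```

In this sketch the structure cache serves the second message from the entry created by the first, because the two differ only in text values, while the complete-message cache filters it again. That reuse is also why the structure-based policy is only approximate when subscriptions depend on values rather than on element layout.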
Keywords :
XML; cache storage; document handling; information filtering; message passing; middleware; optimisation; performance evaluation; XML documents; XML filtering engine; XML message filtering; XML message streams; XML publish/subscribe system; XML-based publish/subscribe system; XPath syntax; XPath-based subscriptions; caching policy; caching techniques; filtering algorithms; message caching algorithm; messages sharing; optimization techniques; structure-based caching method; Automata; Computer science; Doped fiber amplifiers; Engines; Lungs; Matched filters; Subscriptions; Systems engineering and theory; caching; publish/subscribe
Conference_Title :
Performance Computing and Communications Conference (IPCCC), 2009 IEEE 28th International
Conference_Location :
Scottsdale, AZ
Print_ISBN :
978-1-4244-5737-3
DOI :
10.1109/PCCC.2009.5403839