Title :
A Parallel Approach to XML Parsing
Author :
Lu, Wei ; Chiu, Kenneth ; Pan, Yinfei
Author_Institution :
Comput. Sci. Dept., Indiana Univ., Bloomington, IN
Abstract :
A language for semi-structured documents, XML has emerged as the core of the Web services architecture and plays crucial roles in messaging systems, databases, and document processing. However, the processing of XML documents has a reputation for poor performance, and a number of optimizations have been developed to address this performance problem from different perspectives, none of which has been entirely satisfactory. In this paper, we present a seemingly quixotic but novel approach: parallel XML parsing. Parallel XML parsing leverages the growing prevalence of multicore architectures in all sectors of the computer market and yields significant performance improvements. This paper presents our design and implementation of parallel XML parsing. Our design consists of an initial preparsing phase to determine the structure of the XML document, followed by a full, parallel parse. The results of the preparsing phase are used to help partition the XML document for data-parallel processing. Our parallel parsing phase is a modification of the libxml2 XML parser (Veillard, 2004), which shows that our approach applies to real-world, production-quality parsers. Our empirical study shows that our parallel XML parsing algorithm can improve XML parsing performance significantly and scales well.
Keywords :
Web services; XML; document handling; grammars; parallel processing; Web services architecture; XML document; XML documents; XML parsing algorithm; data parallel processing; document processing; libxml2; messaging systems; multicore architectures; parallel XML parsing; parallel parsing phase; preparsing phase; production quality parsers; semistructured documents; Computer architecture; Concurrent computing; Databases; Multicore processing; Partitioning algorithms; Production; Service oriented architecture
Conference_Title :
Grid Computing, 7th IEEE/ACM International Conference on
Conference_Location :
Barcelona
Print_ISBN :
1-4244-0343-X
Electronic_ISBN :
1-4244-0344-8
DOI :
10.1109/ICGRID.2006.311019