Title :
XML document compression based on BWT algorithm
Author_Institution :
Sch. of Math. & Comput., Jianghan Univ., Wuhan, China
Abstract :
XML documents are widely used for data exchange between applications, but because tags are added to every semantic content unit, they carry a large amount of redundant information that consumes considerable transmission and storage resources. XML data compression is an effective means of reducing this space overhead. This paper implements a compression algorithm for XML documents. First, the structure information and the data information of the XML document are separated, and a data dictionary is built to replace and simplify the XML document. Second, the Burrows-Wheeler transform (BWT) and run-length coding are used to compress the XML document. Finally, the compressed file is obtained by using the Gzip tool. The algorithm achieves a better compression ratio with relatively low time and space costs.
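The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`split_xml`, `bwt`, `rle_encode`, `compress_stream`), the regex tokenizer, the one-byte tag codes, and the (count, byte) run-length format are all assumptions made for illustration, and the BWT here is the naive sorted-rotations version rather than an efficient suffix-array one.

```python
import gzip
import re


def split_xml(doc):
    """Separate tags (structure) from text (data); hypothetical tokenizer."""
    tokens = re.findall(r"<[^>]+>|[^<]+", doc)
    dictionary = {}   # tag text -> small integer code (assumes < 256 distinct tags)
    structure = []    # sequence of tag codes; 0 marks a text slot
    data = []         # text content, in document order
    for tok in tokens:
        if tok.startswith("<"):
            structure.append(dictionary.setdefault(tok, len(dictionary) + 1))
        else:
            structure.append(0)
            data.append(tok)
    return dictionary, structure, data


def bwt(s):
    """Naive BWT: sort all rotations, return (last column, row of the original)."""
    n = len(s)
    if n == 0:
        return b"", 0
    doubled = s + s
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    last = bytes(doubled[i + n - 1] for i in order)
    return last, order.index(0)


def inverse_bwt(last, primary):
    """Invert the BWT by following the first-column-to-last-column mapping."""
    order = sorted(range(len(last)), key=lambda i: (last[i], i))
    out = bytearray()
    row = primary
    for _ in range(len(last)):
        row = order[row]
        out.append(last[row])
    return bytes(out)


def rle_encode(data):
    """Run-length code as (count, byte) pairs, with runs capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(data):
    """Expand (count, byte) pairs back into the original bytes."""
    out = bytearray()
    for k in range(0, len(data), 2):
        out += bytes([data[k + 1]]) * data[k]
    return bytes(out)


def compress_stream(raw):
    """BWT, then run-length coding, then gzip; returns (primary index, payload)."""
    last, primary = bwt(raw)
    return primary, gzip.compress(rle_encode(last))


def decompress_stream(primary, payload):
    """Reverse of compress_stream."""
    return inverse_bwt(rle_decode(gzip.decompress(payload)), primary)


doc = "<books><book><title>A</title></book><book><title>B</title></book></books>"
dictionary, structure, data = split_xml(doc)
primary, payload = compress_stream(bytes(structure))
restored = decompress_stream(primary, payload)
```

On an input this small the gzip header outweighs any savings; the sketch only shows how the stages fit together and that the structure stream round-trips.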
Keywords :
XML; data compression; dictionaries; document handling; electronic data interchange; BWT algorithm; Gzip tool; XML data compression; XML document compression; data dictionary; data exchange; run-length coding; Analytical models; Books; Ear; BWT; XML document; compression
Conference_Title :
Circuits, Communications and System (PACCS), 2010 Second Pacific-Asia Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-7969-6
DOI :
10.1109/PACCS.2010.5627004