Title :
Evaluating the role of context in syntax directed compression of XML documents
Author :
Hariharan, S. ; Shankar, Priti
Author_Institution :
Dept. of Comput. Sci. & Autom., Indian Inst. of Sci., Bangalore
Abstract :
Summary form only given. This paper proposes a new technique for tracking context to be used in a statistical code compression scheme for XML documents. Based on recursive finite state machines, the technique employs an arithmetic coding scheme. The tradeoff between space and compression ratio is studied by observing the effects of either using or ignoring root-to-leaf contexts for textual content in the associated tree structures. The scheme is syntax-aware, and the compressor and decompressor can be generated automatically from the document type definition (DTD) without interactive input from the user. A comparison of the path-sensitive and path-agnostic schemes for storing context for PCDATA was performed. Experimental results show that path-sensitive schemes are less effective in the fixed memory model.
Keywords :
XML; arithmetic codes; computational linguistics; data compression; finite state machines; statistical analysis; tree codes; XML documents; arithmetic coding scheme; document type definition; path sensitive schemes; recursive finite state machines; statistical code compression scheme; syntax directed compression; tree structures; Arithmetic; Automata; Automation; Computer science; Data compression; Decoding; Mirrors; Size measurement; Tree data structures; XML;
Conference_Title :
Data Compression Conference, 2006. DCC 2006. Proceedings
Conference_Location :
Snowbird, UT
Print_ISBN :
0-7695-2545-8
DOI :
10.1109/DCC.2006.34