DocumentCode :
2534211
Title :
Differencing data streams
Author :
Chawathe, Sudarshan S.
Author_Institution :
Dept. of Comput. Sci., Maryland Univ., College Park, MD, USA
fYear :
2005
fDate :
25-27 July 2005
Firstpage :
273
Lastpage :
284
Abstract :
We present external-memory algorithms for differencing large hierarchical datasets. Our methods are especially suited to streaming data with bounded differences. For input sizes m and n and maximum output (difference) size e, the I/O, RAM, and CPU costs of our algorithm rdiff are, respectively, m + n, 4e + 8, and O(MN). That is, given 4e + 8 blocks of RAM, our algorithm performs no I/O operations other than those required to read both inputs. We also present a variant of the algorithm that uses only four blocks of RAM, with I/O cost 8me + 18m + n + 6e + 5 and CPU cost O(MN).
Keywords :
computational complexity; data handling; random-access storage; storage management; very large databases; RAM; data streams; external-memory algorithms; large hierarchical dataset differencing; Change detection algorithms; Computer science; Costs; Database systems; Educational institutions; Random access memory; Read-write memory; Spatial databases; Testing; Warehousing;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Database Engineering and Application Symposium, 2005. IDEAS 2005. 9th International
ISSN :
1098-8068
Print_ISBN :
0-7695-2404-4
Type :
conf
DOI :
10.1109/IDEAS.2005.21
Filename :
1540917
Link To Document :
بازگشت