Title : 
Discovering Conservation Rules
         
        
            Author : 
Golab, Lukasz ; Karloff, Howard ; Korn, Flip ; Saha, Barna ; Srivastava, Divesh
         
        
            Author_Institution : 
Univ. of Waterloo, Waterloo, ON, Canada
         
        
        
        
        
        
            Abstract : 
Many applications process data in which there exists a ``conservation law´´ between related quantities. For example, in traffic monitoring, every incoming event, such as a packet´s entering a router or a car´s entering an intersection, should ideally have an immediate outgoing counterpart. We propose a new class of constraints -- Conservation Rules -- that express the semantics and characterize the data quality of such applications. We give confidence metrics that quantify how strongly a conservation rule holds and present approximation algorithms (with error guarantees) for the problem of discovering a concise summary of subsets of the data that satisfy a given conservation rule. Using real data, we demonstrate the utility of conservation rules and we show order-of-magnitude performance improvements of our discovery algorithms over naive approaches.
         
        
            Keywords : 
approximation theory; data handling; approximation algorithm; confidence metrics; conservation law; conservation rules; data quality; discovery algorithm; error guarantees; naive approach; Approximation algorithms; Delay; Electricity; IP networks; Monitoring; Roads;
         
        
        
        
            Conference_Titel : 
Data Engineering (ICDE), 2012 IEEE 28th International Conference on
         
        
            Conference_Location : 
Washington, DC
         
        
        
            Print_ISBN : 
978-1-4673-0042-1
         
        
        
            DOI : 
10.1109/ICDE.2012.105