Title of article :
Repairing XML functional dependency violations
Author/Authors :
Zijing Tan، نويسنده , , Liyong Zhang، نويسنده ,
Issue Information :
روزنامه با شماره پیاپی سال 2011
Pages :
17
From page :
5304
To page :
5320
Abstract :
We study the problem of repairing XML functional dependency violations by making the smallest modifications in terms of repair cost. Our cost model assigns a weight to each leaf node in the XML document, and the cost of a repair is measured by the total weight of the modified nodes. We define an optimum repair as the repair with the minimum cost among all of the repairs. We prove lower and upper bounds for the optimum XML repair problem. We show that, in practice, it is beyond reach to find the optimum repairs; this problem is already NP-complete for a setting with a fixed DTD, a fixed set of functional dependencies, and equal weights for all of the nodes in the XML document. Instead, we provide an efficient two-step heuristic method to repair XML functional dependency violations. First, the initial violations are captured and fixed by leveraging the conflict hypergraph. Second, the remaining conflicts are resolved by modifying the violating nodes and their related nodes called determinants in a way that guarantees no new violations. We implement our method and evaluate it on synthetic and real-life data. The experimental results demonstrate that our algorithm scales well and is effective at improving data quality.
Keywords :
XML , Functional dependency , Data Quality , Repair
Journal title :
Information Sciences
Serial Year :
2011
Journal title :
Information Sciences
Record number :
1214768
Link To Document :
بازگشت