Title : 
Correlation-based Attribute Outlier Detection in XML
         
        
            Author : 
Koh, Judice L Y ; Lee, Mong Li ; Hsu, Wynne ; Ang, Wee Tiong
         
        
            Author_Institution : 
School of Computing, National University of Singapore, Singapore
         
        
            Abstract : 
Unlike relational data models, semi-structured data such as XML has a hierarchical structure that provides semantically meaningful neighbourhoods, which benefits data cleaning tasks such as outlier detection. In this paper, we introduce the concept of a correlated subspace, which leverages the hierarchical relationships between XML attributes to provide contextually informative neighbourhoods for attribute outlier detection. We also design two correlation-based attribute outlier metrics for XML, namely the xO-Measure and the xQ-Measure. Experimental results support the effectiveness of our XML outlier detection approach.
         
        
            Keywords : 
XML; data structures; correlation-based attribute outlier detection; xO-Measure; xQ-Measure; Cities and towns; Cleaning; Data models; Humans; Object detection; Pattern analysis; Stock markets; Virtual colonoscopy; Watches
         
        
        
        
            Conference_Title : 
Data Engineering, 2008. ICDE 2008. IEEE 24th International Conference on
         
        
            Conference_Location : 
Cancun
         
        
            Print_ISBN : 
978-1-4244-1836-7
         
        
            Electronic_ISBN : 
978-1-4244-1837-4
         
        
        
            DOI : 
10.1109/ICDE.2008.4497610