Title :
A survey analysis on duplicate detection in Hierarchical Data
Author :
Gaikwad, Shital ; Bogiri, Nagaraju
Author_Institution :
Dept. of Comput., Savitribai Phule Pune Univ., Pune, India
Abstract :
Electronic data processing uses automated methods to process commercial data. A large body of work addresses duplicate detection in relational data, but only a few studies focus on duplication in more complex hierarchical structures. Electronic data is a key factor in many business operations, applications, and decisions; as a consequence, guaranteeing its quality is essential. Data quality, however, can be compromised by different kinds of errors arising from heterogeneous domains. Duplicates are multiple representations of the same real-world object that differ from each other. Duplicate detection is a nontrivial task because duplicates are not exactly equal, often due to errors in the data. Consequently, common comparison algorithms that identify only exact duplicates are not sufficient. Instead, all object representations are compared, using a possibly complex matching strategy, to decide whether they refer to the same real-world object. In this paper we give a detailed survey and background on duplicate detection in hierarchical data. We also propose a new idea: the use of a pruning algorithm to detect similarity between objects. This survey should be useful to researchers working on duplicate detection in XML data or other hierarchical data.
Keywords :
data handling; electronic commerce; relational databases; XML data; automated methods; commercial data processing; data processing techniques; data superiority; duplicate detection; electronic data processing; electronic information; hierarchical data; multifaceted hierarchical structures; objective representations; pruning algorithm; relational data; Algorithm design and analysis; Cleaning; Data warehouses; Internet; Object recognition; Standards; XML; XML document; electronic data
Conference_Title :
Pervasive Computing (ICPC), 2015 International Conference on
Conference_Location :
Pune
DOI :
10.1109/PERVASIVE.2015.7087099