DocumentCode
2907505
Title
Grid Enabling Data De-Duplication
Author
Austin, Jim ; Turner, Aaron ; Alwis, Sujeewa
Author_Institution
University of York, UK
fYear
2006
fDate
Dec. 2006
Firstpage
2
Lastpage
2
Abstract
A Grid-based implementation of a system for finding duplicates in large databases is described. The solution is scalable to many nodes and does not suffer the problems found in other implementations that can result in loss of data and/or deadlock. The system may be applied to conventional de-duplication problems, such as those found in address management, as well as more advanced problems such as banned image detection. The system uses the AURA pattern-matching methods implemented within a service-oriented architecture. The approach builds on the PMS and PMC technology developed in the DAME eScience project.
Keywords
Computer architecture; Computer science; Image databases; Merging; Neuroscience; Pattern matching; Search engines; Search problems; Service oriented architecture; System recovery;
fLanguage
English
Publisher
IEEE
Conference_Titel
e-Science and Grid Computing, 2006. e-Science '06. Second IEEE International Conference on
Conference_Location
Amsterdam, The Netherlands
Print_ISBN
0-7695-2734-5
Type
conf
DOI
10.1109/E-SCIENCE.2006.261092
Filename
4030981