DocumentCode
3507770
Title
Approximate Address Matching
Author
Li, Dengyue ; Wang, Shengrui ; Mei, Zhen
Author_Institution
Dept. of Comput. Sci., Univ. of Sherbrooke, Sherbrooke, QC, Canada
fYear
2010
fDate
4-6 Nov. 2010
Firstpage
264
Lastpage
269
Abstract
Address management is a major challenge for many organizations: errors occur frequently during address capture, and address standards and usage vary across databases. Rather than comparing house number, street, city, and province individually, we compare addresses with a string similarity measure. This lets us combine edit distance with the vector space model to search for potentially matching address candidates and assign each a similarity score. After evaluating the strengths and weaknesses of these techniques, we introduce an algorithm for effective address matching, called Term-Weighted Dissimilarity, which combines edit distance similarity with Term Frequency-Inverse Document Frequency (TF-IDF) weighting. We implement this algorithm in software and demonstrate its effectiveness in a real application for address matching and correction based on Canada Post's address standard.
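The abstract names the ingredients (edit distance plus TF-IDF weighting) but not the formula, so the following is only a minimal sketch of how such a combination might look, not the authors' Term-Weighted Dissimilarity. The function names (`edit_distance`, `idf_weights`, `weighted_dissimilarity`, `best_match`), the whitespace tokenization, the smoothed IDF formula, the nearest-token matching rule, and the sample addresses are all assumptions made for illustration.

```python
import math
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def idf_weights(addresses):
    """Smoothed inverse document frequency per token (assumed formula)."""
    n = len(addresses)
    df = Counter(tok for addr in addresses for tok in set(addr.lower().split()))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}


def weighted_dissimilarity(query, candidate, idf):
    """IDF-weighted mean of normalized edit distances, in [0, 1].

    Each query token is paired with its nearest candidate token; the
    pair's distance is weighted by the candidate token's IDF, so rare
    tokens (e.g. house numbers) count more than common ones (e.g. "qc").
    This pairing rule is a hypothetical choice, not taken from the paper.
    """
    q_tokens = query.lower().split()
    c_tokens = candidate.lower().split()
    if not q_tokens or not c_tokens:
        return 1.0
    total = 0.0
    weight_sum = 0.0
    for q in q_tokens:
        dist, best = min(
            (edit_distance(q, c) / max(len(q), len(c)), c) for c in c_tokens
        )
        w = idf.get(best, 1.0)  # fallback weight for unseen tokens
        total += w * dist
        weight_sum += w
    return total / weight_sum


def best_match(query, addresses):
    """Return the reference address with the lowest dissimilarity."""
    idf = idf_weights(addresses)
    return min(addresses, key=lambda a: weighted_dissimilarity(query, a, idf))


if __name__ == "__main__":
    # Made-up reference addresses, for illustration only.
    reference = [
        "123 main street sherbrooke qc",
        "125 main street sherbrooke qc",
        "45 king street montreal qc",
    ]
    print(best_match("123 mian stret sherbrooke qc", reference))
```

In this sketch the misspelled tokens still pair with their intended counterparts at a small edit-distance cost, while a wrong house number is penalized more heavily because it is rare in the reference set. A production matcher would also need address normalization (abbreviations, unit numbers, accents), which is not shown here.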
Keywords
geographic information systems; information retrieval; string matching; text analysis; address capturing process; address comparison; address management; address matching; address standards; edit distance; string similarity measurement; term frequency-inverse document frequency weighting; term-weighted dissimilarity; vector space model; Address matching; TF-IDF weight; address correction; address standardization; edit distance; string similarity; vector space model;
fLanguage
English
Publisher
IEEE
Conference_Titel
P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC), 2010 International Conference on
Conference_Location
Fukuoka
Print_ISBN
978-1-4244-8538-3
Electronic_ISBN
978-0-7695-4237-9
Type
conf
DOI
10.1109/3PGCIC.2010.43
Filename
5662779