DocumentCode :
2480699
Title :
Transfer of Supervision for Improved Address Standardization
Author :
Kothari, G. ; Faruquie, Tanveer A. ; Subramaniam, L. Venkata ; Prasad, K. Hima ; Mohania, Mukesh K.
Author_Institution :
IBM Res. India, New Delhi, India
fYear :
2010
fDate :
23-26 Aug. 2010
Firstpage :
2178
Lastpage :
2181
Abstract :
Address cleansing is very challenging, particularly for geographies where the way addresses are written varies widely. Supervised learners can be easily trained for different data sources. However, training requires labeling a large corpus for each data source, which is time-consuming and labor-intensive. We propose a method to automatically transfer supervision from a labeled source to an unlabeled target source using a hierarchical Dirichlet process. Each Dirichlet process models data from one source. The component distribution shared across these Dirichlet processes captures the semantic relation between data sources. A feature projection on the component distributions from multiple sources is used to transfer supervision.
Keywords :
data handling; geographic information systems; learning (artificial intelligence); statistical distributions; stochastic processes; address cleansing; address standardization; hierarchical Dirichlet process; shared component distribution; supervised learner; Adaptation model; Buildings; Clustering algorithms; Data models; Roads; Semantics; Training; HDP; address cleansing; address standardization; transfer learning;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition (ICPR), 2010 20th International Conference on
Conference_Location :
Istanbul
ISSN :
1051-4651
Print_ISBN :
978-1-4244-7542-1
Type :
conf
DOI :
10.1109/ICPR.2010.533
Filename :
5595945