DocumentCode
3100748
Title
Efficient and accurate approach for approximate string search in spatial dataset
Author
Nikam, Pratiksha Praful
Author_Institution
Dept. of Comput. Eng., GSMCOE, Pune, India
fYear
2015
fDate
12-13 June 2015
Firstpage
315
Lastpage
318
Abstract
This paper proposes a new index and method for approximate string search in spatial databases. Specifically, the task is candidate generation: given a misspelled location name, the system finds the locations in the OSM dataset that are most similar to it. An approximate solution is proposed using a log-linear model, defined as a conditional probability distribution of a corrected word and a rule set for the correction, conditioned on the misspelled location name. Correction rules are stored and applied using an Aho-Corasick tree, referred to as the rule index, together with an Aho-Corasick algorithm that is efficient and guaranteed to find the top-k candidates. Experiments on a large real OSM dataset demonstrate the accuracy of the proposed method relative to existing methods.
Keywords
search problems; statistical distributions; string matching; text analysis; trees (mathematics); visual databases; Aho-Corasick algorithm; Aho-Corasick tree; OSM dataset; approximate string search; candidate generation; conditional probability distribution; correction rules; log linear model; rule index; spatial databases; spatial dataset; Accuracy; Algorithm design and analysis; Approximation algorithms; Data structures; Indexes; Probabilistic logic; Spatial databases;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advance Computing Conference (IACC), 2015 IEEE International
Conference_Location
Bangalore
Print_ISBN
978-1-4799-8046-8
Type
conf
DOI
10.1109/IADCC.2015.7154721
Filename
7154721