  • DocumentCode
    3100748
  • Title
    Efficient and accurate approach for approximate string search in spatial dataset
  • Author
    Nikam, Pratiksha Praful
  • Author_Institution
    Dept. of Comput. Eng., GSMCOE, Pune, India
  • fYear
    2015
  • fDate
    12-13 June 2015
  • Firstpage
    315
  • Lastpage
    318
  • Abstract
    This paper proposes a new index and method for approximate string search in spatial databases. Specifically, the candidate generation task is as follows: given a misspelled location name, the system finds the locations in the OSM dataset whose names are most similar to it. An approximate solution is proposed using a log-linear model, defined as a conditional probability distribution of a corrected word and a set of correction rules, conditioned on the misspelled location name. Correction rules are stored and applied using an Aho-Corasick tree, referred to as the rule index, together with an Aho-Corasick algorithm that is efficient and guaranteed to find the top k candidates. Experiments on a large real OSM dataset demonstrate that the proposed method is more accurate than existing methods.
  • Keywords
    search problems; statistical distributions; string matching; text analysis; trees (mathematics); visual databases; Aho-Corasick algorithm; Aho-Corasick tree; OSM dataset; approximate string search; candidate generation; conditional probability distribution; correction rules; log linear model; rule index; spatial databases; spatial dataset; Accuracy; Algorithm design and analysis; Approximation algorithms; Data structures; Indexes; Probabilistic logic
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advance Computing Conference (IACC), 2015 IEEE International
  • Conference_Location
    Bangalore
  • Print_ISBN
    978-1-4799-8046-8
  • Type
    conf
  • DOI
    10.1109/IADCC.2015.7154721
  • Filename
    7154721
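
The abstract describes a rule index built on an Aho-Corasick tree and a log-linear score over correction rules. The following is a minimal sketch of that idea, not the paper's implementation: the class and function names, the example rules, their weights, and the tiny vocabulary are all hypothetical. It assumes each rule is a non-empty substring rewrite lhs -> rhs with a weight, and it simplifies candidate generation to applying a single rule, in which case ranking by weight is the same as ranking by the unnormalized log-linear score exp(weight).

```python
from collections import deque
import heapq


class RuleIndex:
    """Aho-Corasick automaton over the left-hand sides of rules lhs -> rhs."""

    def __init__(self, rules):
        # rules: iterable of (lhs, rhs, weight); lhs must be non-empty.
        self.goto = [{}]   # goto[state][char] -> next state
        self.fail = [0]    # failure link of each state
        self.out = [[]]    # rules whose lhs ends at this state
        for lhs, rhs, weight in rules:
            state = 0
            for ch in lhs:
                if ch not in self.goto[state]:
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                    self.goto[state][ch] = len(self.goto) - 1
                state = self.goto[state][ch]
            self.out[state].append((lhs, rhs, weight))
        self._build_failure_links()

    def _build_failure_links(self):
        # Breadth-first traversal; depth-1 states keep failure link 0 (root).
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self.goto[state].items():
                queue.append(nxt)
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[nxt] = self.goto[f].get(ch, 0)
                # Inherit rules that end at the failure target (proper suffixes).
                self.out[nxt] = self.out[nxt] + self.out[self.fail[nxt]]

    def matches(self, word):
        """Yield (start, lhs, rhs, weight) for every rule applicable to word."""
        state = 0
        for i, ch in enumerate(word):
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for lhs, rhs, weight in self.out[state]:
                yield i - len(lhs) + 1, lhs, rhs, weight


def top_k_candidates(word, index, vocabulary, k):
    """Apply one rule at a time; keep the k best-scoring in-vocabulary results."""
    best = {}
    for start, lhs, rhs, weight in index.matches(word):
        candidate = word[:start] + rhs + word[start + len(lhs):]
        if candidate in vocabulary and weight > best.get(candidate, float("-inf")):
            best[candidate] = weight
    return heapq.nlargest(k, best.items(), key=lambda item: item[1])


# Hypothetical rules and vocabulary, for illustration only.
rules = [("um", "un", -0.2), ("m", "n", -0.4), ("e", "a", -0.9)]
index = RuleIndex(rules)
vocabulary = {"pune", "puma", "mumbai"}
print(top_k_candidates("pume", index, vocabulary, 2))
```

A single left-to-right scan of the misspelled name reports every applicable rule, including overlapping ones such as "um" and "m", which is the reason for using an automaton rather than testing each rule separately. Applying several rules to one word, learning the weights, and any pruning needed for an exact top-k guarantee are left out of this sketch.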