• DocumentCode
    5707
  • Title

    Spatial Approximate String Search

  • Author

    Feifei Li ; Bin Yao ; Mingwang Tang ; Hadjieleftheriou, MariosMarios

  • Author_Institution
    Sch. of Comput., Univ. of Utah, Salt Lake City, UT, USA
  • Volume
    25
  • Issue
    6
  • fYear
    2013
  • fDate
    Jun-13
  • Firstpage
    1394
  • Lastpage
    1409
  • Abstract
    This work deals with the approximate string search in large spatial databases. Specifically, we investigate range queries augmented with a string similarity search predicate in both euclidean space and road networks. We dub this query the spatial approximate string (SAS) query. In euclidean space, we propose an approximate solution, the MHR-tree, which embeds min-wise signatures into an R-tree. The min-wise signature for an index node u keeps a concise representation of the union of q-grams from strings under the subtree of u. We analyze the pruning functionality of such signatures based on the set resemblance between the query string and the q-grams from the subtrees of index nodes. We also discuss how to estimate the selectivity of a SAS query in euclidean space, for which we present a novel adaptive algorithm to find balanced partitions using both the spatial and string information stored in the tree. For queries on road networks, we propose a novel exact method, RSASSOL, which significantly outperforms the baseline algorithm in practice. The RSASSOL combines the q-gram-based inverted lists and the reference nodes based pruning. Extensive experiments on large real data sets demonstrate the efficiency and effectiveness of our approaches.
  • Keywords
    approximation theory; query processing; trees (mathematics); visual databases; Euclidean space; MHR-tree; RSASSOL; SAS query; balanced partitions; baseline algorithm; concise representation; exact method; index node subtrees; large spatial databases; min-wise signature; min-wise signatures; q-gram-based inverted lists; range queries; reference node based pruning; road networks; spatial approximate string query; spatial approximate string search; spatial information; string information; string similarity search; Equations; Estimation; Indexes; Mathematical model; Roads; Spatial databases; Synthetic aperture sonar; Approximate string search; range query; road network; spatial databases;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2012.48
  • Filename
    6165287