  • DocumentCode
    69669
  • Title
    A Prefix-Filter based Method for Spatio-Textual Similarity Join
  • Author
    Sitong Liu ; Guoliang Li ; Jianhua Feng
  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
  • Volume
    26
  • Issue
    10
  • fYear
    2014
  • fDate
    Oct. 2014
  • Firstpage
    2354
  • Lastpage
    2367
  • Abstract
    Location-based services have attracted significant attention because modern mobile phones are equipped with GPS devices. These services generate large amounts of spatio-textual data that contain both spatial locations and textual descriptions. Since a spatio-textual object may have different representations, possibly because of GPS deviations or differing user descriptions, efficient methods are needed to integrate spatio-textual data from different sources. In this paper we study a new research problem called spatio-textual similarity join: given two sets of spatio-textual objects, find the similar object pairs. We make the following contributions: (1) We develop a filter-and-refine framework and devise several efficient algorithms. We extend the prefix filter technique to generate spatial and textual signatures for the objects and build an inverted index on top of these signatures. We then generate candidate pairs using the inverted lists of signatures. Finally, we refine the candidates and generate the final result. (2) We study how to generate high-quality signatures for spatial information. We develop an MBR-prefix based signature to prune large numbers of dissimilar object pairs. (3) We propose a hybrid signature scheme to support both textual pruning and spatial pruning simultaneously. (4) Experimental results on real and synthetic datasets show that our algorithms achieve high performance and scale well.
  • Keywords
    Global Positioning System; data integration; filtering theory; mobile computing; mobile radio; visual databases; GPS devices; Global Position Systems; MBR-prefix based signature; candidate pair generation; filter-and-refine framework; hybrid signature scheme; inverted index; location-based services; mobile phones; prefix-filter based method; spatial location; spatial pruning; spatial signature generation; spatio-textual data generation; spatio-textual data integration; spatio-textual similarity join; textual descriptions; textual pruning; textual signature generation; user descriptions; Complexity theory; Filtering algorithms; Indexes; Partitioning algorithms; Probes; Sorting; Database Applications; Database Management; Information Search and Retrieval; Information Storage and Retrieval; Information Technology and Systems; MBR prefix; Spatial databases and GIS; Spatio-textual objects; hybrid signature; similarity join
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2013.83
  • Filename
    6517849
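
The filter-and-refine framework described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes Jaccard similarity over token sets as the textual measure, applies the prefix filter to the textual side only, and replaces the MBR-prefix and hybrid signatures with a plain Euclidean distance check during refinement. The names `prefix`, `similarity_join`, `text_threshold`, and `max_distance`, and the `(id, (x, y), tokens)` object layout, are hypothetical.

```python
import math
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def prefix(tokens, order, threshold):
    """Textual signature: the rarest tokens of the set under a global order.

    Two sets whose Jaccard similarity is at least `threshold` must share
    at least one token among their first n - ceil(threshold * n) + 1 tokens.
    """
    ranked = sorted(tokens, key=lambda tok: (order[tok], tok))
    # The small epsilon guards against float error such as 0.7 * 10 > 7.
    keep = len(ranked) - math.ceil(threshold * len(ranked) - 1e-9) + 1
    return ranked[:keep]


def similarity_join(left, right, text_threshold, max_distance):
    """Join two lists of (id, (x, y), token_set) objects.

    Assumption: a pair is "similar" when its Jaccard similarity is at least
    text_threshold and its Euclidean distance is at most max_distance.
    """
    # Global token order: ascending frequency, so prefixes hold rare tokens.
    freq = defaultdict(int)
    for _, _, tokens in left + right:
        for tok in tokens:
            freq[tok] += 1

    # Filter step, part 1: inverted index over the signatures of `right`.
    index = defaultdict(list)
    for pos, (_, _, tokens) in enumerate(right):
        for tok in prefix(tokens, freq, text_threshold):
            index[tok].append(pos)

    results = []
    for lid, lloc, ltokens in left:
        # Filter step, part 2: probe the inverted lists to get candidates.
        candidates = set()
        for tok in prefix(ltokens, freq, text_threshold):
            candidates.update(index.get(tok, ()))

        # Refine step: verify each candidate with the exact predicates.
        for pos in sorted(candidates):
            rid, rloc, rtokens = right[pos]
            if (math.dist(lloc, rloc) <= max_distance
                    and jaccard(ltokens, rtokens) >= text_threshold):
                results.append((lid, rid))
    return results
```

Because only the textual side is filtered here, objects that are textually close but far apart still reach the refine step. Pruning such pairs before verification is the role the abstract assigns to the spatial and hybrid signatures.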