Title :
Detection of POI boundaries through geographical topics
Author :
Thanh-Hieu Bui ; Yong-Jin Han ; Seong-Bae Park ; Se-Young Park
Author_Institution :
Kyungpook Nat. Univ., Daegu, South Korea
Abstract :
Characteristics of a point-of-interest (POI) can be discovered by mining geo-tagged textual data from social media around the POI. However, the relevance of the data to the POI varies with the distance between the POI and the positions at which the data are generated. That is, textual data generated far from the POI usually have nothing to do with it. It is therefore important to know a POI boundary that sufficiently covers the geo-tagged textual data relevant to the POI. This paper proposes Boundary-dependent Explicit Semantic Analysis (BESA) to detect the boundary of a POI. BESA represents the POI boundary as a circle centered at the POI whose radius is unknown. Once the radius is fixed at a certain distance from the POI, the textual data generated within the circular boundary are reduced to a topic vector in which each topic is a Wikipedia concept. The number of Wikipedia concepts is fixed, but their enormous quantity allows BESA to explore how topics vary across different circular boundaries. To approximate the best-fit boundary, we regard a POI as a series of circles with increasing radii. The boundary of the POI is determined by examining the topical similarities between the vector of the POI center and those of more distant positions. The similarities reveal how geographical topics change from the POI center as the radius increases. Thus, the boundary of the POI is set at the radius at which the similarity declines drastically. In experiments with five different POIs, the top 20 highest-weighted topics are more relevant to the POIs within the boundaries detected by the proposed method than within the POI centers or within the next larger boundaries. The results demonstrate the plausibility of BESA for detecting POI boundaries.
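The boundary selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a topic vector has already been computed for each candidate radius (the first one standing for the POI center), that similarity is measured with cosine similarity, and that the detected boundary is the last radius before the steepest drop in similarity. The names `cosine` and `detect_boundary` and the sample numbers are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def detect_boundary(radii, topic_vectors):
    """Pick a boundary radius from topic vectors computed at increasing radii.

    radii[i] is the i-th candidate radius (ascending order) and
    topic_vectors[i] is the topic vector for that radius; index 0 is
    treated as the POI center. Assumption: the boundary is the last
    radius before the largest drop in similarity to the center.
    """
    if len(radii) != len(topic_vectors) or len(radii) < 2:
        raise ValueError("need at least two radii with matching vectors")
    center = topic_vectors[0]
    sims = [cosine(center, v) for v in topic_vectors]
    # Drop in similarity between each pair of consecutive radii.
    drops = [sims[i] - sims[i + 1] for i in range(len(sims) - 1)]
    steepest = max(range(len(drops)), key=drops.__getitem__)
    return radii[steepest], sims


# Made-up example: three topics, four candidate radii in meters.
radii = [50, 100, 150, 200]
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.5],
]
boundary, similarities = detect_boundary(radii, vectors)
print(boundary, [round(s, 3) for s in similarities])
```

In this made-up data the similarity to the center stays high up to 150 m and collapses at 200 m, so the sketch selects 150 m. A threshold on the drop could replace the "largest drop" rule; the abstract does not say which criterion is used.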
Keywords :
Web sites; geographic information systems; natural language processing; text analysis; BESA; POI boundary detection; Wikipedia concept; boundary-dependent explicit semantic analysis; geo-tagged textual data; geographical topics; point-of-interest; topic vector; Electronic publishing; Encyclopedias; Equations; Internet; Mathematical model; Vectors;
Conference_Title :
Big Data and Smart Computing (BigComp), 2015 International Conference on
Conference_Location :
Jeju
DOI :
10.1109/35021BIGCOMP.2015.7072827