Title :
A tool to analyse spatial distribution of science research activities based on toponym resolution in text
Author :
Ma, Jianxia ; Ma, Hanqing ; Liu, Shaoxiong ; Zhao, Yingguang ; Li, Na
Author_Institution :
Cold & Arid Regions Environ. & Eng. Res. Inst., Chinese Acad. of Sci., Lanzhou, China
Abstract :
The spatial distribution of research activities in geo-sciences is related not only to the distribution of authors but also to the distribution of research areas. In the present work we developed a tool to analyze the spatial distribution of science research activities based on toponym resolution in the text of scientific papers. The idea is to automatically identify and annotate the toponyms referred to in research papers, then to geo-code them, and finally to analyze the distribution of research areas and authors for related subjects in geo-sciences. We carried out experiments with the tool. First, we extracted geographical names from research articles in a pre-built Chinese document database covering several scientific subjects, using word segmentation software and a gazetteer. Second, we obtained the coordinates of the places through the Google Maps geo-coding API and exported the place names and coordinates into ArcGIS. Finally, we analyzed the spatial distribution of authors and research areas for ecological footprint research in China. The results show that toponym resolution over a large-scale article collection makes it possible to identify hot areas and blank areas of scientific research on a given subject, as well as the distribution of the researchers working on it. We also present a redesign of the system framework. The improvements focus on leveraging a geographical knowledge database and rules for disambiguation, and on integrating Conditional Random Fields into toponym resolution in Chinese text.
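The extraction and counting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the gazetteer entries, the approximate coordinates, and the function names (`find_toponyms`, `count_places`, `to_rows`) are assumptions made for illustration, forward maximum matching stands in for the unnamed word segmentation software, and a local lookup table stands in for the geo-coding API call.

```python
from collections import Counter

# Hypothetical toy gazetteer: place name -> approximate (latitude, longitude).
# A real system would use a full gazetteer and a geo-coding service instead.
GAZETTEER = {
    "兰州": (36.06, 103.83),
    "上海": (31.23, 121.47),
    "北京": (39.90, 116.40),
    "乌鲁木齐": (43.83, 87.62),
}


def find_toponyms(text, gazetteer):
    """Return gazetteer names found in text, using forward maximum matching."""
    max_len = max(len(name) for name in gazetteer)
    found = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in gazetteer:
                found.append(candidate)
                i += size
                break
        else:
            # No gazetteer entry starts at this position; move on one character.
            i += 1
    return found


def count_places(documents, gazetteer):
    """Count how many documents mention each place (once per document)."""
    counts = Counter()
    for doc in documents:
        counts.update(set(find_toponyms(doc, gazetteer)))
    return counts


def to_rows(counts, gazetteer):
    """Build (name, latitude, longitude, document_count) rows for GIS import."""
    return [(name, *gazetteer[name], n) for name, n in counts.most_common()]


documents = [
    "兰州和上海的生态足迹研究",
    "上海生态足迹分析",
    "北京市生态足迹",
]
rows = to_rows(count_places(documents, GAZETTEER), GAZETTEER)
for row in rows:
    print(row)
```

Rows of this shape could be written to a CSV file and loaded into a GIS as point data. Plain string matching does not resolve ambiguous place names, which is the gap the disambiguation rules and Conditional Random Fields mentioned in the abstract are meant to address.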
Keywords :
geographic information systems; random processes; text analysis; Chinese text; Google map; arcGIS; conditional random field; digital gazetteer; geo-coding API; geo-sciences; geographical knowledge database; geographical name; science research activities; spatial distribution; toponym resolution; words segment software; Biological system modeling; Databases; Entropy; Google; Hidden Markov models; Knowledge based systems; Spatial resolution; Geographic Information System (GIS); digital gazetteer; geo-coding; geo-parsing; information analysis; text-mining; toponym resolution;
Conference_Title :
Geoinformatics, 2011 19th International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-849-5
DOI :
10.1109/GeoInformatics.2011.5981126