DocumentCode :
3082220
Title :
Entity Relation Extraction from geological text using Conditional Random Fields and subsequence kernels
Author :
Sobhana, N.V. ; Ghosh, Soumya K. ; Mitra, Pinaki
Author_Institution :
Indian Institute of Technology Kharagpur, Kharagpur, India
fYear :
2012
fDate :
7-9 Dec. 2012
Firstpage :
832
Lastpage :
840
Abstract :
Entity relation extraction is an important research area in text mining. Extracting relations between geological entities is of great benefit in developing intelligent search tools for geology researchers. In this paper, Conditional Random Fields (CRFs) and subsequence kernels are used to extract relations between entities from a geological corpus. The corpus was developed from a collection of scientific reports and articles on the geology of the Indian subcontinent. The training set, consisting of more than 200K words, has been annotated with a named entity tag set of seventeen tags and with labeled instances of part-of and near-by relations. The system recognizes part-of and near-by relations with F-measure values of 71.57% and 77.27% for T-CRF, and 78.25% and 83.71% for subsequence kernels. The extracted relations were used for query expansion in a retrieval system, achieving gains of 10.86% for T-CRF and 10.58% for subsequence kernels over the baseline Mean Average Precision.
Keywords :
data mining; geographic information systems; query processing; text analysis; F-measure values; Indian subcontinent geology; T-CRF; baseline mean average precision; conditional random fields; entity relation extraction; geological corpus; geological text; intelligent search tools; query expansion; retrieval system; scientific reports collection; sequence kernels; subsequence kernels; text mining; Feature extraction; Geology; Kernel; Labeling; Semantics; Training; Weight measurement; F-measure; Geological corpus; Mean Average Precision; Precision; Recall;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
India Conference (INDICON), 2012 Annual IEEE
Conference_Location :
Kochi
Print_ISBN :
978-1-4673-2270-6
Type :
conf
DOI :
10.1109/INDCON.2012.6420733
Filename :
6420733
Link To Document :