DocumentCode :
695473
Title :
Exploring concept graphs for biomedical literature mining
Author :
Min Song
Author_Institution :
Dept. of Libr. & Inf. Sci., Yonsei Univ., Seoul, South Korea
fYear :
2015
fDate :
9-11 Feb. 2015
Firstpage :
103
Lastpage :
110
Abstract :
Full-text publications in electronic form have become more prevalent than ever before. Extracting concepts from unstructured document collections is difficult because different concepts and their relationships are buried in the text, and abundant term variation compounds the challenge. Extracted concepts are useful instruments for managing and searching large document collections, and they play a pivotal role in indexing electronic documents and building digital libraries. In this paper we explore a biomedical concept extraction technique based on a ranking algorithm over concept graphs. The proposed technique comprises two major steps. The first step is to represent documents as graphs whose nodes and edges are created by Named Entity Recognition and the UMLS Semantic Network. The second step is to rank concepts with relative importance algorithms. We evaluate our technique on a set of biomedical full-text articles and compare it to a range of key-phrase extraction and graph ranking techniques. The experimental results show that our technique outperforms the other algorithms compared. We further take a close look at the properties of the network to examine how concepts are related to each other and which concepts play a dominant role in the network. To this end, we build the network from 526 full-text articles published in PubMed Central and measure the significance of nodes by centrality.
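The two steps named in the abstract (build a concept graph, then rank its nodes) can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not say which relative importance algorithm is used, so plain PageRank stands in for it, and edges here come from simple co-occurrence within a document rather than from Named Entity Recognition and the UMLS Semantic Network. The function names, parameters, and the toy concept lists are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_concept_graph(docs):
    """Link concepts that co-occur in a document (undirected, weighted).

    Assumption: each document is already reduced to a list of concept
    strings; in the paper this would come from NER and UMLS.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for concepts in docs:
        for a, b in combinations(sorted(set(concepts)), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration (stand-in ranking algorithm)."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    # Total edge weight leaving each node, used to normalise contributions.
    out_weight = {node: sum(graph[node].values()) for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            incoming = sum(
                rank[nbr] * graph[nbr][node] / out_weight[nbr]
                for nbr in graph[node]
            )
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


# Hypothetical toy input: three documents, each a list of extracted concepts.
docs = [
    ["BRCA1", "breast cancer", "tamoxifen"],
    ["BRCA1", "breast cancer"],
    ["tamoxifen", "breast cancer", "estrogen receptor"],
]
ranks = pagerank(build_concept_graph(docs))
for concept, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{concept}: {score:.3f}")
```

In this toy graph the concept that co-occurs most heavily with the others receives the highest score, which is the kind of centrality-based significance the abstract refers to.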
Keywords :
data acquisition; data mining; digital libraries; document handling; electronic publishing; indexing; medical information systems; semantic networks; PubMed Central; UMLS semantic network; biomedical concept extraction technique; biomedical full-text; biomedical literature mining; concept graphs; digital libraries; electronic document indexing; electronic publication; full-text publication; graph ranking technique; key-phrase extraction; large document collection managing; large document collection searching; named entity recognition; ranking algorithm; unstructured document collection; Algorithm design and analysis; Data mining; Feature extraction; Markov processes; Semantics; Unified modeling language; Web pages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Big Data and Smart Computing (BigComp), 2015 International Conference on
Conference_Location :
Jeju
Type :
conf
DOI :
10.1109/35021BIGCOMP.2015.7072818
Filename :
7072818