DocumentCode :
3734073
Title :
From text to XML by structural information extraction
Author :
Yong Piao;Tianyu Wang;He Jiang
Author_Institution :
School of Software, Dalian University of Technology, Dalian 116620, China
fYear :
2015
Firstpage :
448
Lastpage :
452
Abstract :
Faced with a tremendous volume of semi-structured XML and unstructured free text, network information retrieval has become a major research hotspot for handling such data more efficiently, precisely, and uniformly. Many traditional IR methods ignore text semantics, and their labeling results usually have only one level and lack contextual expression. This paper therefore studies structure extraction from free text and its conversion to XML format, and proposes a CRF-based algorithm, SIECRF. Experimental results are analyzed; they show that the algorithm is effective at extracting text structure and has promising application prospects.
Keywords :
"Hidden Markov models","Information retrieval","Labeling","Semantics","Data mining","Entropy","XML"
Publisher :
IEEE
Conference_Title :
Computer and Communications (ICCC), 2015 IEEE International Conference on
Print_ISBN :
978-1-4673-8125-3
Type :
conf
DOI :
10.1109/CompComm.2015.7387613
Filename :
7387613