Title :
A Mechanism Generating Semi-Structured RDF Metadata from Web Documents
Author :
Hsueh, Hsiang-Yuan ; Chen, Chun-Nan ; Huang, Kun-Fu
Author_Institution :
Inf. & Commun. Res. Lab., Ind. Technol. Res. Inst., Hsinchu, Taiwan
Abstract :
The Semantic Web still cannot be realized on the Internet, because a large number of the unstructured Web documents available there contain natural-language text that can only be read by human beings. For content providers and developers, it is almost impossible to generate metadata for Web content manually. This paper proposes a mechanism that generates a content-based RDF Semantic Web schema from a set of Web documents to serve as their semantic metadata. By analyzing the structural information and content of Web documents, the mechanism conceptualizes them as resource objects with inter-relationships in an RDF diagram. With the semantic metadata of document sets on the Web translated systematically rather than edited manually, semantic operations on the whole Web, such as semantic query or semantic search, are expected to become possible in the near future.
Keywords :
document handling; meta data; natural language processing; semantic Web; Internet; RDF metadata; Web content metadata; Web documents; structural information; Asia; Conferences; Resource Description Framework; World Wide Web
Conference_Title :
Services Computing Conference (APSCC), 2011 IEEE Asia-Pacific
Conference_Location :
Jeju Island
Print_ISBN :
978-1-4673-0206-7
DOI :
10.1109/APSCC.2011.73