DocumentCode :
2922007
Title :
Modeling semantic correspondence in heterogeneous structured document collection
Author :
Tan, Saravadee Sae ; Tang, Enya Kong ; Ranaivo-Malançon, Bali ; Sodhy, Gian Chand
Author_Institution :
Fac. of Inf. Technol., Multimedia Univ., Cyberjaya, Malaysia
fYear :
2011
fDate :
28-29 June 2011
Firstpage :
189
Lastpage :
196
Abstract :
On the web, most structured document collections consist of documents from different sources and marked up with different types of structures. The diversity of structures has led to the emergence of heterogeneous structured documents. The heterogeneity of structured documents is one of the reasons for query-document mismatch in structured document retrieval. In structured document retrieval, a user is assumed to have intimate knowledge of the document structures and to be able to specify contextual constraints in their queries. However, it is impossible for the user to know all structures in heterogeneous structured document collections. In this paper, we propose to include similar correspondence relations in the representation model for structured document retrieval. The similar correspondences make the relations between similar contents explicit in order to improve structured document retrieval effectiveness. We introduce a generic and flexible structured document model to represent heterogeneous structured documents as well as the similar correspondences in the document collections. We also illustrate how the proposed model can be utilized in structured document retrieval.
Keywords :
document handling; query processing; semantic Web; heterogeneous structured document collection; query-document mismatch; semantic correspondence; structured document retrieval; Context; Context modeling; Marine vehicles; Periodic structures; Semantics; Tagging; XML
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Semantic Technology and Information Retrieval (STAIR), 2011 International Conference on
Conference_Location :
Putrajaya
Print_ISBN :
978-1-61284-354-4
Electronic_ISBN :
978-1-61284-353-7
Type :
conf
DOI :
10.1109/STAIR.2011.5995787
Filename :
5995787