DocumentCode :
1737504
Title :
A semantic similarity approach to electronic document modeling and integration
Author :
Song, William W. ; Cheung, David ; Tan, CJ
Author_Institution :
E-Bus. Technol. Inst., Hong Kong Univ., China
Volume :
1
fYear :
2000
fDate :
2000
Firstpage :
116
Abstract :
The World Wide Web is an enormous collection of information resources serving various purposes. However, the diversity of Web information and of its formats makes it very difficult for users to search for and obtain the information they require efficiently. The difficulty arises because most of the information uploaded to the Web is unstructured or semi-structured. Many meta-data models have been proposed in response to this problem. These models attempt to provide a general description of Web information in order to make it more structured. Although ill-structured Web documents constitute the largest portion of Web information or Web resources, few meta-data models deal with them by analyzing their semantic relations with each other. In this paper, we consider this huge set of Web information, which we call electronic documents. We propose a meta-data model called the EDM (Electronic Document Metadata) model. Using this model, we can extract semantic characteristics from electronic documents and then use these characteristics to form a semantic electronic document model. This model, in turn, provides a basis for analyzing the semantic similarity between electronic documents and for electronic document integration. Document modeling and integration of this kind support further manipulation of electronic documents, such as document exchange, searching and evolution.
Keywords :
data models; document handling; information resources; meta data; EDM model; World Wide Web; document evolution; document exchange; document searching; electronic document integration; electronic document modeling; ill-structured Web documents; information formats; meta-data models; semantic similarity; semi-structured information; unstructured information; Councils; Data mining; Data models; Information resources; Labeling; Partial response channels; Search engines; Software standards; Web sites
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Information Systems Engineering, 2000. Proceedings of the First International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-0577-5
Type :
conf
DOI :
10.1109/WISE.2000.882382
Filename :
882382