Title :
Toward a retrieval of HTML documents using a semantic approach
Author :
Ferri, Fernando ; Ghiselli, Cristina ; Grifoni, Patrizia ; Padula, Marco
Author_Institution :
Ist. di Studi sulla Ricerca e sulla Documentazione Sci., CNR, Rome, Italy
Abstract :
The growth of the Internet has produced a lot of advantages, together with the opportunity to provide different people with access to a large warehouse of information. However, this phenomenon produces some difficulties in the activities of searching and retrieving. A large amount of information is sometimes useless if it does not offer tools to respond to the information needs of the user. This paper introduces an approach devoted to facilitating information access and retrieval using the World Wide Web's syntactic structures and semantic organization. We consider the HTML language syntax structure and the organization of information in a general Web document, and we define some rules that people use for structuring Web information. These rules can be used for managing and retrieving Web information and its semantics. A Web document is treated as a complex informative object formed by images, tables, animations, videos and text organized into chapters, paragraphs, titles, and so on, connected according to semantic links. Knowledge associated with the information structure helps in retrieving relevant information.
Keywords :
Internet; computational linguistics; data structures; data warehouses; hypermedia markup languages; information needs; information resources; information retrieval; multimedia databases; HTML document retrieval; World Wide Web semantic organization; World Wide Web syntactic structures; animations; complex informative object; data warehouse; images; information access; information organization; information searching; information structuring rules; language syntax structure; relevant information retrieval; semantic links; tables; text; user information needs; videos; Animation; Data mining; Electronic mail; Focusing; HTML; Information management; Information retrieval; Organizing; Videos
Conference_Title :
2000 IEEE International Conference on Multimedia and Expo (ICME 2000)
Conference_Location :
New York, NY
Print_ISBN :
0-7803-6536-4
DOI :
10.1109/ICME.2000.871069