Title :
Semi-structured data extraction and schema knowledge mining
Author :
Chen, Enhong ; Wang, Xufa
Author_Institution :
Dept. of Comput. Sci., Univ. of Sci. & Technol. of China, Hefei, China
Abstract :
It is well known that the World Wide Web has become a huge information resource, so it is important to use this information effectively. This paper proposes a semi-structured data extraction method that retrieves the useful information embedded in a group of related web pages and stores it in the Object Exchange Model (OEM). We then apply a data mining method to discover the schema knowledge implicit in the semi-structured data. This knowledge helps users understand the structure of information on the web more deeply and thoroughly, and it also provides an effective schema for querying web information.
Keywords :
data mining; information resources; World Wide Web; data mining method; information resource; object exchange model; schema knowledge mining; semi-structured data extraction; web information; Computer science; Data mining; Documentation; Electrical capacitance tomography; Electronic commerce; HTML; Information resources; Search engines; Web pages; World Wide Web;
Conference_Title :
EUROMICRO Conference, 1999. Proceedings. 25th
Conference_Location :
Milan
Print_ISBN :
0-7695-0321-7
DOI :
10.1109/EURMIC.1999.794795
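
The pipeline summarized in the abstract (store extracted page data as OEM objects, then mine the schema implicit in them) can be illustrated with a minimal sketch. The abstract does not specify the paper's extraction or mining algorithm, so everything here is an assumption for illustration: the `OEMObject` class, the helpers `to_oem`, `label_paths`, and `frequent_paths`, the sample `pages` data, and the choice of "label paths that occur in at least a given fraction of top-level objects" as the notion of schema knowledge.

```python
from collections import Counter
from dataclasses import dataclass, field
from itertools import count
from typing import Union

_oids = count(1)  # hypothetical generator of unique object identifiers


@dataclass
class OEMObject:
    """An OEM-style object: an identifier, a label, and a value that is
    either atomic or a list of child OEMObjects (a complex object)."""
    label: str
    value: Union[str, int, float, list]
    oid: int = field(default_factory=lambda: next(_oids))


def to_oem(label, data):
    """Convert nested dicts (assumed output of page extraction) to OEM objects.
    A list under one key becomes several children sharing that label."""
    if isinstance(data, dict):
        children = []
        for key, val in data.items():
            items = val if isinstance(val, list) else [val]
            children.extend(to_oem(key, item) for item in items)
        return OEMObject(label, children)
    return OEMObject(label, data)  # atomic object


def label_paths(obj, prefix=()):
    """Return the set of distinct label paths found from obj downward."""
    path = prefix + (obj.label,)
    paths = {path}
    if isinstance(obj.value, list):
        for child in obj.value:
            paths |= label_paths(child, path)
    return paths


def frequent_paths(roots, min_support):
    """Keep label paths present in at least min_support of the root objects.
    This simple support count is an assumed stand-in for schema mining."""
    counts = Counter()
    for root in roots:
        counts.update(label_paths(root))
    n = len(roots)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


# Hypothetical data extracted from three related web pages.
pages = [
    {"title": "A", "author": ["X", "Y"], "price": 10},
    {"title": "B", "author": "Z"},
    {"title": "C", "price": 5, "publisher": {"name": "P"}},
]
roots = [to_oem("book", page) for page in pages]
schema = frequent_paths(roots, min_support=0.6)
for path, support in sorted(schema.items()):
    print(".".join(path), round(support, 2))
```

With this sample data, the paths `book`, `book.title`, `book.author`, and `book.price` pass the 0.6 threshold, while `book.publisher` and `book.publisher.name` do not. The surviving paths form an approximate schema of the kind a query interface could offer to users.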