DocumentCode :
2253106
Title :
Autonomic Wrapper Induction using Minimal Type System from Web Data
Author :
Son, Youngju ; Jamil, Hasan ; Fotouhi, Farshad
Author_Institution :
Comput. Sci., Wayne State Univ., Detroit, MI
fYear :
2005
fDate :
5-8 Dec. 2005
Firstpage :
130
Lastpage :
135
Abstract :
Biological and genomic source integration has become a major research field. Most biological data is provided over the Web. This Web data is unstructured and cannot be queried using traditional query languages. Furthermore, integrating biological data is complicated by several factors, such as the variety of data types, presentations, and formats, so it is not easy to find the desired data across diverse data sources. Although humans can easily understand Web data, which is heterogeneous and unstructured, a machine cannot interpret it on its own. To extract data from the Web, a machine requires knowledge of both its structure and its content. We propose a novel architecture for automatic wrapper induction that exploits a user-supplied type system and an ontology to establish schema correspondence precisely and efficiently. In this paper, the type system helps recognize target data and improves the precision of schema matching, which would otherwise require manual intervention.
Keywords :
Internet; biology computing; information retrieval; ontologies (artificial intelligence); Web data extraction; autonomic wrapper induction; biological source integration; genomic source integration; minimal type system; ontology; Computer science; Costs; Data mining; Electronic mail; Engines; Genomics; HTML; Induction generators; Mediation; Ontologies; Information Extraction; Type Hierarchy
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Artificial Intelligence, 2005. EPIA 2005. Portuguese Conference on
Conference_Location :
Covilha
Print_ISBN :
0-7803-9366-X
Electronic_ISBN :
0-7803-9366-X
Type :
conf
DOI :
10.1109/EPIA.2005.341280
Filename :
4145939