DocumentCode :
1436855
Title :
Information integration
Author :
Hearst, M.A. ; Levy, A.Y. ; Knoblock, Craig ; Minton, S. ; Cohen, W.
Author_Institution :
California Univ., Berkeley, CA, USA
Volume :
13
Issue :
5
fYear :
1998
Firstpage :
12
Lastpage :
24
Abstract :
Despite the Web's current disorganized and anarchic state, many AI researchers believe that it will become the world's largest knowledge base. We examine a line of research whose final goal is to make disparate data sources work together to better serve users' information needs. This work is known as information integration. The authors talk about its application to datasets made available over the Web. A. Levy discusses the relationship between information-integration and traditional database systems. He then enumerates important issues in the field and demonstrates how the Information Manifold project has addressed some of these. C. Knoblock and S. Minton describe the Ariadne system. Two of its distinguishing features are its use of wrapper algorithms to extract structured information from semistructured data sources and its use of planning algorithms to determine how to integrate information efficiently and effectively across sources. W. Cohen describes an interesting variation on the theme, focusing on "informal" information integration. The idea is that, as in related fields that deal with uncertain and incomplete information, an information-integration system should be allowed to take chances and make mistakes. His Whirl system uses information-retrieval algorithms to find approximate matches between different databases, and as a consequence knits together data from quite diverse sources.
Keywords :
Internet; distributed databases; information retrieval; knowledge based systems; AI researchers; Ariadne system; Information Manifold project; Web; Whirl system; database systems; disparate data sources; incomplete information; information integration; information needs; information-retrieval algorithms; knowledge base; planning algorithms; semistructured data sources; structured information extraction; uncertain information; wrapper algorithms; Data mining; Database systems; HTML; Internet; Motion pictures; Prefetching; Protocols; Spatial databases; Web pages; XML;
fLanguage :
English
Journal_Title :
Intelligent Systems and their Applications, IEEE
Publisher :
IEEE
ISSN :
1094-7167
Type :
jour
DOI :
10.1109/5254.722342
Filename :
722342