Title :
Federating diverse collections of scientific literature
Author :
Schatz, Bruce ; Mischo, William H. ; Cole, Timothy W. ; Hardin, Joseph B. ; Bishop, Ann P. ; Chen, Hsinchun
Author_Institution :
Grainger Eng. Libr. Inf. Center, Illinois Univ., Urbana, IL, USA
fDate :
5/1/1996
Abstract :
The Digital Library Initiative (DLI) project at the University of Illinois at Urbana-Champaign is developing the information infrastructure to effectively search technical documents on the Internet. The authors are constructing a large testbed of scientific literature, evaluating its effectiveness under significant use, and researching enhanced search technology. They are building repositories (organized collections) of indexed multiple-source collections and federating (merging and mapping) them by searching the material via multiple views of a single virtual collection. Developing widely usable Web technology is also a key goal. Improving Web search beyond full-text retrieval will require using document structure in the short term and document semantics in the long term. Their testbed efforts concentrate on journal articles from the scientific literature, with structure specified by the Standard Generalized Markup Language (SGML). Research efforts extract semantics from documents using the scalable technology of concept spaces based on context frequency. They then merge these efforts with traditional library indexing to provide a single Internet interface to indexes of multiple repositories.
Keywords :
Internet; academic libraries; full-text databases; indexing; information retrieval; library automation; research initiatives; Digital Library Initiative project; SGML; Standard Generalized Markup Language; University of Illinois; World Wide Web; database search; document semantics; document structure; full-text retrieval; information infrastructure; library indexing; repositories; scientific literature collections; searching; technical documents; Buildings; Frequency; Merging; Software libraries; Space technology; Testing; Web search