Abstract :
One of the enabling technologies of the World Wide Web, along with browsers, domain name servers, and hypertext markup language, is the search engine. Although the Web contains over 100 million pages of information, those millions of pages are useless if you cannot find the pages you need. All major Web search engines operate the same way: a gathering program explores the hyperlinked documents of the Web, foraging for Web pages to index. These pages are stockpiled by storing them in some kind of database or repository. Finally, a retrieval program takes a user query and creates a list of links to Web documents matching the words, phrases, or concepts in the query. Although the retrieval program itself is correctly called a search engine, by popular usage the term now means a database combined with a retrieval program. For example, the Lycos search engine comprises the Lycos Catalog of the Internet and the Pursuit retrieval program. This paper describes the Lycos system for collecting, storing, and retrieving information about pages on the Web. After outlining the history and precursors of the Lycos system, the paper discusses some of the design choices made in building this Web indexer and touches briefly on the economic issues involved in working with very large retrieval systems.
Keywords :
Internet; cataloguing; economics; indexing; information retrieval; online front-ends; systems analysis; Internet search service; Lycos; Lycos Catalog; Pursuit retrieval program; World Wide Web; browsers; design choices; domain name servers; economic issues; history; hyperlinked documents; hypertext markup language; retrieval program; search engine; very large retrieval systems; Databases; Markup languages; Search engines; Web and internet services; Web pages; Web search; Web server; Web sites