DocumentCode :
1356692
Title :
Discovering relevant scientific literature on the Web
Author :
Bollacker, Kurt D. ; Lawrence, Steve ; Giles, C. Lee
Author_Institution :
Internet Archive, San Francisco, CA, USA
Volume :
15
Issue :
2
fYear :
2000
Firstpage :
42
Lastpage :
47
Abstract :
Scientific literature on the Web makes up a massive, noisy, disorganized database. Unlike large, single-source databases such as a corporate customer database, the Web database draws from many sources, each with its own organization. Also, owing to its diversity, most records in this database are irrelevant to an individual researcher. Furthermore, the database is constantly growing in content and changing in organization. All these characteristics make the Web a difficult domain for knowledge discovery. To quickly and easily gather useful knowledge from such a database, users need the help of an information filtering system that automatically extracts only relevant records as they appear in a stream of incoming records. To this end, we have developed CiteSeer, an automatic generator of digital libraries of scientific literature. It uses sophisticated acquisition, parsing, and presentation methods to eliminate most of the manual effort of finding useful publications on the Web.
Keywords :
data mining; digital libraries; information resources; information retrieval; natural sciences computing; CiteSeer; Web database; World Wide Web; automatic extraction; automatic generator; digital libraries; disorganized database; incoming records; information filtering system; knowledge discovery; presentation methods; publications; relevant scientific literature discovery; Acceleration; Costs; Feature extraction; Helium; Information filtering; Information filters; National electric code; Page description languages; Software libraries; Spatial databases;
fLanguage :
English
Journal_Title :
Intelligent Systems and their Applications, IEEE
Publisher :
IEEE
ISSN :
1094-7167
Type :
jour
DOI :
10.1109/5254.850826
Filename :
850826