Title :
ACIRD: intelligent Internet document organization and retrieval
Author :
Lin, Shian-Hua ; Chen, Meng Chang ; Ho, Jan-Ming ; Huang, Yueh-Ming
Author_Institution :
Inst. of Inf. Sci., Acad. Sinica, Taipei, Taiwan
Abstract :
This paper presents an intelligent Internet information system, Automatic Classifier for the Internet Resource Discovery (ACIRD), which uses machine learning techniques to organize and retrieve Internet documents. ACIRD consists of a knowledge acquisition process, a document classifier, and a two-phase search engine. The knowledge acquisition process automatically learns classification knowledge from classified Internet documents. The document classifier applies the learned classification knowledge to classify newly collected Internet documents into one or more classes. Experimental results indicate that ACIRD performs as well as or better than human experts in both knowledge acquisition and document classification. Using the learned classification knowledge and the given class lattice, the ACIRD two-phase search engine responds to user queries with hierarchically structured, navigable results (instead of a conventional flat ranked document list), which greatly aids users in locating information among numerous, diverse Internet documents.
Keywords :
Internet; classification; data mining; deductive databases; information resources; information retrieval; knowledge acquisition; learning (artificial intelligence); search engines; ACIRD; Automatic Classifier for the Internet Resource Discovery; document classification; experimental results; intelligent Internet document organization; intelligent Internet information system; intelligent document retrieval; machine learning; two-phase search engine; humans; information systems; intelligent systems; lattices; learning systems
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
DOI :
10.1109/TKDE.2002.1000345