Title :
Multi-source Automatic Annotation for Deep Web
Author :
Xiao-Jun, Cui ; Zhi-Yong, Peng ; Hui, Wang
Author_Institution :
State Key Lab. of Software Eng., Wuhan Univ., Wuhan
Abstract :
A large number of Web pages returned by filling in search forms are not indexed by most search engines today. The set of such Web pages is referred to as the deep Web. Since results returned by Web databases seldom have proper annotations, it is necessary to assign meaningful labels to the results. This paper presents a framework for automatic annotation that uses multiple annotators to annotate results from different aspects. In particular, the search engine-based annotator extends question-answering techniques commonly used in the AI community by constructing validate queries and posing them to a search engine. It finds the most appropriate terms to annotate the data units by calculating the similarities between terms and instances. Information for annotation can be acquired automatically without the support of a domain ontology. Experiments over four real-world domains indicate that the proposed approach is highly effective.
Keywords :
Internet; information analysis; information retrieval; search engines; Web databases; Web pages; deep Web; multisource automatic annotation; question-answering techniques; search engine-based annotator; search forms; validate queries; Artificial intelligence; Contracts; Data mining; Databases; Filling; Laboratories; Ontologies; Software engineering; Interface Schema; Semantic Annotation; Validate Query
Conference_Title :
Computer Science and Software Engineering, 2008 International Conference on
Conference_Location :
Wuhan, Hubei
Print_ISBN :
978-0-7695-3336-0
DOI :
10.1109/CSSE.2008.439