DocumentCode
480195
Title
Multi-source Automatic Annotation for Deep Web
Author
Xiao-Jun, Cui ; Zhi-Yong, Peng ; Hui, Wang
Author_Institution
State Key Lab. of Software Eng., Wuhan Univ., Wuhan
Volume
4
fYear
2008
fDate
12-14 Dec. 2008
Firstpage
659
Lastpage
662
Abstract
A large number of Web pages returned by filling in search forms are not indexed by most search engines today. The set of such Web pages is referred to as the deep Web. Since results returned by Web databases seldom have proper annotations, it is necessary to assign meaningful labels to the results. This paper presents a framework for automatic annotation that uses multiple annotators to annotate results from different aspects. In particular, the search engine-based annotator extends question-answering techniques commonly used in the AI community, constructing validate queries and posing them to the search engine. It finds the most appropriate terms to annotate the data units by calculating the similarities between terms and instances. Information for annotating can be acquired automatically without the support of a domain ontology. Experiments over four real-world domains indicate that the proposed approach is highly effective.
Keywords
Internet; information analysis; information retrieval; search engines; Web databases; Web pages; deep Web; multisource automatic annotation; question-answering techniques; search engine-based annotator; search engines; search forms; validate queries; Artificial intelligence; Contracts; Data mining; Databases; Filling; Laboratories; Ontologies; Search engines; Software engineering; Web pages; Deep Web; Interface Schema; Semantic Annotation; Validate Query;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Software Engineering, 2008 International Conference on
Conference_Location
Wuhan, Hubei
Print_ISBN
978-0-7695-3336-0
Type
conf
DOI
10.1109/CSSE.2008.439
Filename
4722705