Title :
OntoLabel - data labeling for deep web using WordNet
Author_Institution :
Sch. of Comput. & IT, Taylor's Univ., Subang Jaya, Malaysia
Abstract :
Wrappers extract relevant information from the deep web, then align, tabulate, and label the data so that users can recognize the contents simply and quickly. Existing wrappers use the DOM tree and visual cues to label data. These wrappers rely on several sets of heuristic rules to determine the label for a particular data item, and these rules may not apply to groups of data whose words are nearly similar in meaning. In this paper, we propose an ontological wrapper that labels data using WordNet, an existing lexical database for English. Our wrapper can label a wide range of data records without making specific assumptions about the data structure. Instead of examining data structure and layout, our wrapper assigns labels using the contents of the data. Experimental results show that our wrapper can label data records with high accuracy.
Keywords :
Internet; data structures; natural languages; ontologies (artificial intelligence); relevance feedback; search engines; word processing; English database; OntoLabel; WordNet; data align; data labeling; data records; data structure; deep Web; heuristic rules; label determination; lexical database; ontological wrapper; relevant information extraction; search engines; Data mining; Engines; Labeling; Metasearch; Search engines; Semantics; Web pages; Automatic Wrapper; Data Labeling; Information Extraction; Search Engines;
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2012 9th International Conference on
Conference_Location :
Sichuan
Print_ISBN :
978-1-4673-0025-4
DOI :
10.1109/FSKD.2012.6234075