DocumentCode :
3110262
Title :
Semantic Labeling of Data by Using the Web
Author :
Rigutini, Leonardo ; Iorio, Ernesto Di ; Ernandes, Marco ; Maggini, Marco
Author_Institution :
Dipt. di Ingegneria dell'Informazione, Siena Univ.
fYear :
2006
fDate :
18-22 Dec. 2006
Firstpage :
638
Lastpage :
641
Abstract :
This paper proposes a system for automatically categorizing terms or lexical entities into a predefined set of semantic domains. We present an approach that exploits the knowledge available on the Web to create a model of each term or entity (entity context lexicons - ECLs). Each profile is simply a list of terms (similar to the bag-of-words representation in text categorization) and is composed primarily of the words that often appear in the same contexts as the entity. These profiles model the contexts in which the entity usually appears, and they can subsequently be processed by an automatic classifier. Moreover, we propose and validate a profile-based categorization model developed for this particular task, which uses the ECLs of the training entities to build a profile for each class (class context lexicon - CCL). Finally, we propose a technique for dealing with multi-label classification based on a decision module that exploits a neural network. We show the effectiveness of the proposed approach on a term categorization task using a standard benchmark composed of a set of domain-specific lexicons (WordNetDomains).
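The profile-based idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_ecl`, `build_ccl`, `classify`) are hypothetical, the hand-written snippets stand in for text retrieved from the Web, cosine similarity is an assumed scoring function that the abstract does not specify, and the neural-network decision module for multi-label output is omitted.

```python
from collections import Counter
import math


def build_ecl(snippets, entity):
    """Build an entity context lexicon: a bag of words that co-occur with the entity."""
    counts = Counter()
    for text in snippets:
        for token in text.lower().split():
            token = token.strip(".,;:!?()\"'")
            # Skip empty tokens and the entity itself; no stopword filtering in this sketch.
            if token and token != entity.lower():
                counts[token] += 1
    return counts


def build_ccl(ecls):
    """Build a class context lexicon by summing the ECLs of the training entities."""
    total = Counter()
    for ecl in ecls:
        total.update(ecl)
    return total


def cosine(a, b):
    """Cosine similarity between two bag-of-words profiles (assumed scoring function)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(ecl, ccls):
    """Return the single class whose profile is most similar to the entity profile."""
    return max(ccls, key=lambda name: cosine(ecl, ccls[name]))


# Toy training data: hand-written snippets in place of Web search results.
training = {
    "medicine": {
        "aspirin": ["Aspirin is a drug used to treat pain and fever."],
        "vaccine": ["A vaccine protects patients against disease."],
    },
    "music": {
        "violin": ["The violin is a string instrument played in an orchestra."],
    },
}

ccls = {
    label: build_ccl(build_ecl(snips, ent) for ent, snips in entities.items())
    for label, entities in training.items()
}

query_ecl = build_ecl(["Ibuprofen is a drug used to treat pain."], "ibuprofen")
predicted = classify(query_ecl, ccls)
```

In this toy setup the query profile shares far more context words with the "medicine" class profile than with "music", so the single-label decision picks "medicine". A real system would need much larger context samples and some form of term weighting or filtering.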
Keywords :
neural nets; semantic Web; text analysis; bag-of-words representation; class context lexicon; entity context lexicons; multi-label classification; neural network; semantic labeling; text categorization; Cities and towns; Context modeling; Data mining; Intelligent agent; Labeling; Neural networks; Search engines; Text categorization; Thesauri; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence and Intelligent Agent Technology Workshops, 2006. WI-IAT 2006 Workshops. 2006 IEEE/WIC/ACM International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2749-3
Type :
conf
DOI :
10.1109/WI-IATW.2006.118
Filename :
4053332