DocumentCode :
2520405
Title :
Name Disambiguation Using Semantic Association Clustering
Author :
Jin, Hai ; Huang, Li ; Yuan, Pingpeng
Author_Institution :
Services Comput. Technol. & Syst. Lab., Huazhong Univ. of Sci. & Technol., Wuhan, China
fYear :
2009
fDate :
21-23 Oct. 2009
Firstpage :
42
Lastpage :
48
Abstract :
Because of homonyms, abbreviations, and similar causes, name ambiguity is widespread on the Web and in e-documents. For example, when heterogeneous literature databases are integrated, differing name specifications can cause different authors to be treated as the same author, and a single author to be treated as several. Name ambiguity therefore makes data dirty and lowers the precision of information retrieval. In this paper, we present the semantic association based name disambiguation method (SAND) for resolving person name ambiguity. The basic idea of SAND is to explore the semantic associations among name entities and to cluster the name entities according to those associations. Name entities in the same cluster are then considered to be the same entity. We test SAND using data from CiteSeer, DBLP, and Libra. The test results show that SAND is an effective approach to the problem of name ambiguity.
Keywords :
Web sites; document handling; information retrieval; pattern clustering; CiteSeer; DBLP; Libra; SAND; Website; e-document; heterogeneous literature database; information retrieval; name specification; semantic association clustering; semantic association-based name disambiguation method; Bibliographies; Clustering algorithms; Computer science; Couplings; Databases; Frequency; Grid computing; Information retrieval; Robustness; Testing; clustering; name disambiguation; semantic association;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
e-Business Engineering, 2009. ICEBE '09. IEEE International Conference on
Conference_Location :
Macau
Print_ISBN :
978-0-7695-3842-6
Type :
conf
DOI :
10.1109/ICEBE.2009.16
Filename :
5342132