DocumentCode
2068406
Title
A method for person name disambiguation based on Baidu Encyclopedia
Author
Li, Xinfu ; Cao, Wenxue
Author_Institution
Key Lab. of Machine Learning & Comput. Intell., Hebei Univ., Baoding, China
fYear
2011
fDate
16-18 Dec. 2011
Firstpage
423
Lastpage
426
Abstract
Person name ambiguity is widespread on web pages: one name may be shared by different people, so it is important to identify uniquely the person a given page refers to. This paper proposes Baidu-PND, an unsupervised name disambiguation method based on Baidu Encyclopedia. Three features are extracted from the online Baidu Encyclopedia: background knowledge, contextual features, and the Related-Set of the characters. The feature weights are learned with a logistic regression algorithm, and the features are then combined by linear fusion. The candidate with the maximum combined value is selected as the correct person for the web page. Experiments measuring the performance of Baidu-PND show that it is higher than the authors expected, validating the method's feasibility and effectiveness for person name disambiguation on web pages. Baidu-PND is also a new method for knowledge mining based on Baidu Encyclopedia.
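The fusion-and-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names, the candidate labels, the per-feature scores, and the weights are all hypothetical, and the weights are assumed to have been learned already (the abstract says by logistic regression, which is not reproduced here).

```python
from typing import Dict

# Hypothetical names for the three features mentioned in the abstract.
FEATURES = ("background", "context", "related_set")


def fuse(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear fusion: weighted sum of the per-feature similarity scores."""
    return sum(weights[f] * scores[f] for f in FEATURES)


def pick_person(candidates: Dict[str, Dict[str, float]],
                weights: Dict[str, float]) -> str:
    """Return the candidate entry with the maximum combined value."""
    return max(candidates, key=lambda name: fuse(candidates[name], weights))


# Illustrative weights and scores only; none of these numbers come from the paper.
weights = {"background": 0.5, "context": 0.3, "related_set": 0.2}
candidates = {
    "entry_A": {"background": 0.2, "context": 0.1, "related_set": 0.0},
    "entry_B": {"background": 0.6, "context": 0.4, "related_set": 0.5},
}
best = pick_person(candidates, weights)
```

Here each candidate stands for one encyclopedia entry sharing the ambiguous name, and the web page is assigned to the entry whose fused score is largest.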
Keywords
Internet; encyclopaedias; feature extraction; natural language processing; regression analysis; search engines; Baidu Encyclopedia; Baidu-PND method; Web pages; logistic regression algorithm; person name disambiguation; unsupervised name disambiguation method; Accuracy; Context; Educational institutions; Encyclopedias; Physics; unsupervised learning; web mining
fLanguage
English
Publisher
IEEE
Conference_Titel
Transportation, Mechanical, and Electrical Engineering (TMEE), 2011 International Conference on
Conference_Location
Changchun
Print_ISBN
978-1-4577-1700-0
Type
conf
DOI
10.1109/TMEE.2011.6199232
Filename
6199232