Title :
Named Entity Resolution in Chinese News Comments on the Web
Author :
Zong, Liang ; Wan, Xiaojun ; Zhao, Lihong ; Yang, Jianwu ; Wu, Yuqian
Author_Institution :
Key Laboratory of Computational Linguistics, MOE, Peking University, Beijing, China
Abstract :
News comments are a new text genre in which people express their opinions on recent news events. Unlike conventional text corpora, news comments have distinctive properties: the named entities in them are often written with miswritten words, informal abbreviations, or aliases, which makes them difficult for machines to detect and understand. This paper addresses named entity resolution in Chinese news comments on the web, a special case of coreference resolution. Traditional resolution algorithms have limitations on this task. We first define the task and then propose a novel resolution algorithm with new features to improve resolution performance. We manually labeled a benchmark dataset of 60 news articles and their corresponding comments, downloaded from a popular Chinese news portal; experimental results on this dataset show that our algorithm is effective for the task.
Keywords :
Internet; natural language processing; portals; text analysis; Chinese news comments; Chinese news portal; Web portals; benchmark dataset; coreference resolution; informal abbreviation; machine detection; named entity resolution; resolution algorithm; Broadcasting; Computational linguistics; Information resources; Joining processes; Laboratories; Resists; comment focuses; news comment
Conference_Title :
2010 12th International Asia-Pacific Web Conference (APWEB)
Conference_Location :
Busan
Print_ISBN :
978-0-7695-4012-2
Electronic_ISBN :
978-1-4244-6600-9
DOI :
10.1109/APWeb.2010.20