Title :
Cross-Document Coreference Resolution Based on Automatic Text Summary
Author :
Gao, Sanyuan ; Li, Si ; Xu, Weiran ; Guo, Jun
Author_Institution :
Pattern Recognition & Intell. Syst. Lab., Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
Cross-document coreference resolution plays an important part in the field of natural language processing (NLP). It captures the ability to gather documents for information about a certain entity. Most previous algorithms identify the underlying entity of a given document from the original text, which is unreliable if the original text contains multiple parts with different themes. In this paper, we propose a cross-document coreference resolution algorithm based on an automatic text summary instead of the original text. In our approach, we extract a query-specific, informative-indicative summary from the original text using the Hobbs algorithm and measure the similarity between two summaries. This automatic text summary-based cross-document coreference resolution (ATSCDCR) system is effective both in disambiguating different entities that share the same mention name and in identifying the same entity under different mention names. The results of our experiments show that the macro average of the ATSCDCR system is up to 73.16% and the micro average is 67.34%.
Keywords :
natural language processing; text analysis; Hobbs algorithm; automatic text summary; cross-document coreference resolution; informative-indicative summary; query-specific summary; similarity measure; Biomedical informatics; Data mining; Displays; Electronic mail; Intelligent systems; Joining processes; Pattern recognition; Testing; Text recognition; Named Entity Type Recognition
Conference_Title :
Knowledge Discovery and Data Mining, 2010. WKDD '10. Third International Conference on
Conference_Location :
Phuket
Print_ISBN :
978-1-4244-5397-9
Electronic_ISBN :
978-1-4244-5398-6
DOI :
10.1109/WKDD.2010.56