Abstract :
In practice, different people often share the same name. Person cross-document coreference resolution is the task of identifying all and only the textual mentions of a given entity of type Person across a collection of text documents. In this paper, we implement a Chinese person name cross-document coreference resolution system. First, a name identification module recognizes all person names in the texts. Next, the documents that mention the same person name are preliminarily classified by rules. Finally, we compute the similarity between the resulting classes using the vector space model (VSM), and the system derives the final classification from these similarities. We test the system on 30 common Chinese names from the corpus provided by CLP; the average F-measure is 85.9%.
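The last step, merging preliminary classes by VSM similarity, can be illustrated with a minimal sketch. It is not the system described here: the term-frequency weighting, the greedy pairwise merging strategy, the threshold value 0.3, and the names `cosine` and `merge_groups` are all assumptions made for illustration. The sketch also assumes the documents have already been segmented into words and grouped by the rule-based step.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def merge_groups(groups, threshold=0.3):
    """Greedily merge preliminary groups of documents.

    groups: list of (doc_ids, tokens) pairs, where tokens are the
    context words of all documents in that group (hypothetical input
    format). The threshold is an assumed value, not one from the paper.
    """
    clusters = [(list(ids), Counter(tokens)) for ids, tokens in groups]
    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(clusters[i][1], clusters[j][1])
                if s > best:
                    best, pair = s, (i, j)
        # Stop once no pair is similar enough to be the same person.
        if best < threshold:
            break
        i, j = pair
        clusters[i] = (clusters[i][0] + clusters[j][0],
                       clusters[i][1] + clusters[j][1])
        del clusters[j]
    return [ids for ids, _ in clusters]


# English tokens stand in for segmented Chinese context words.
groups = [
    (["d1"], ["professor", "university", "research"]),
    (["d2"], ["university", "research", "paper"]),
    (["d3"], ["football", "match", "goal"]),
]
print(merge_groups(groups))  # [['d1', 'd2'], ['d3']]
```

Here d1 and d2 share two of three context words (cosine 2/3), so they are treated as the same person, while d3 shares none and stays separate.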
Keywords :
natural language processing; text analysis; word processing; CLP; Chinese person name cross document coreference resolution system; VSM; text documents; Computational linguistics; Computers; Data mining; Feature extraction; Information processing; Vectors; cross document; name disambiguation; person name reorganization