Title :
Filtering operation in cross-document linking
Author :
Mitocariu, Elena
Author_Institution :
Fac. of Comput. Sci., Al.I. Cuza Univ. of Iasi, Iasi, Romania
Abstract :
This paper presents a filtering operation for cross-document connections. Different texts can be connected to each other if they refer to the same entity. However, linking documents solely on the basis of overlap produces too many false positives (FPs). The proposed method takes Centering Theory (CT) as its starting point. A list of Principal Centers (Cp) is created for each sentence in the document, and this list is used to filter the results of cross-document linking. A bigraph representation is proposed to highlight the connections between texts. A score for classifying the topics is also presented; it is calculated from the occurrence frequency of entities in the whole document. This approach eliminated some of the false positive results.
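The filtering idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how centers are ranked, how the score is normalized, or how the bigraph is built. The sketch assumes that each sentence arrives as a list of entities already ranked by salience, that the first entity stands in for the sentence's Cp, that a link survives only when the shared entity is a Cp in both documents, that the score is a relative frequency, and that the bigraph is a bipartite document-entity graph. All function names and the toy data are hypothetical.

```python
from collections import Counter

# Toy input (hypothetical): each document is a list of sentences, and each
# sentence is a list of entities ranked by salience. Assumption: the first
# entity of a sentence plays the role of its Cp.
docs = {
    "d1": [["Iasi", "university"], ["Iasi", "river"]],
    "d2": [["Iasi", "university"], ["university", "students"]],
    "d3": [["river", "bridge"], ["bridge", "Iasi"]],
}

def cp_list(sentences):
    """Collect the top-ranked entity of every non-empty sentence."""
    return {s[0] for s in sentences if s}

def entity_scores(sentences):
    """Relative occurrence frequency of each entity in the whole document."""
    counts = Counter(e for s in sentences for e in s)
    total = sum(counts.values())
    return {e: c / total for e, c in counts.items()}

def overlap_links(docs):
    """Baseline: link two documents through every entity they share."""
    ents = {d: {e for s in sents for e in s} for d, sents in docs.items()}
    names = sorted(docs)
    return {
        (a, b, e)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        for e in ents[a] & ents[b]
    }

def filtered_links(docs):
    """Keep a link only if the shared entity is a Cp in both documents."""
    cps = {d: cp_list(sents) for d, sents in docs.items()}
    return {
        (a, b, e)
        for (a, b, e) in overlap_links(docs)
        if e in cps[a] and e in cps[b]
    }

def bipartite_edges(links):
    """Document-entity edges of a bipartite graph built from the links."""
    return sorted({(d, e) for a, b, e in links for d in (a, b)})

if __name__ == "__main__":
    print(len(overlap_links(docs)), "links before filtering")
    print(sorted(filtered_links(docs)))
    print(bipartite_edges(filtered_links(docs)))
    print(entity_scores(docs["d1"]))
```

On this toy data, plain overlap yields five candidate links, while the Cp-based filter keeps only the d1-d2 link through "Iasi", since it is the only shared entity that is a top-ranked center in both documents.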
Keywords :
graph theory; information filtering; text analysis; bigraph representation; centering theory; cross-document connections; cross-document linking; false positives; filtering operation; principal centers; Filtering theory; Information filters; Joining processes; XML; cross-document analysis; topics
Conference_Title :
Communications and Information Technologies (ISCIT), 2014 14th International Symposium on
Conference_Location :
Incheon
DOI :
10.1109/ISCIT.2014.7011894