Title :
An effective approach to entity resolution problem using quasi-clique and its application to digital libraries
Author :
On, B.-W. ; Elmacioglu, E. ; Lee, Dongwon
Author_Institution :
Pennsylvania State Univ., University Park, PA
Abstract :
We study how to resolve entities that each contain a group of related elements (e.g., an author entity with a list of citations, or an intermediate result of a GROUP BY SQL query). Such entities, which we call grouped-entities, occur frequently in many applications. By exploiting contextual information mined from the group of elements per entity in addition to syntactic similarity, we show that our approach, Quasi-Clique, improves precision and recall up to 91% when used together with a variety of existing entity resolution solutions, but never worsens them.
Keywords :
data mining; digital libraries; entity-relationship modelling; QuasiClique; digital library; entity resolution problem; grouped-entities; Collaboration; Data structures; Degradation; Entropy; Erbium; Information retrieval; Information systems; Joining processes; Partitioning algorithms; Software libraries; entity resolution; graph partition; name disambiguation
Conference_Title :
Digital Libraries, 2006. JCDL '06. Proceedings of the 6th ACM/IEEE-CS Joint Conference on
Conference_Location :
Chapel Hill, NC
Print_ISBN :
1-59593-354-9
DOI :
10.1145/1141753.1141761