Title : 
Fusion of News Reports Using Surface-Based Methods
         
        
            Author : 
Azzopardi, Joel ; Staff, Christopher
         
        
            Author_Institution : 
Fac. of ICT, Univ. of Malta, Msida, Malta
         
        
        
        
        
        
            Abstract : 
Events occurring in the real world are covered by news reports from different sources. Each report generally contains information that is found in others, but may also contain unique information. To learn all the information about a particular event, a user will need to read all the different reports. This is a duplication of effort since most information will be repeated in the different reports. In our research, we attempt to fuse news reports about the same event into a single coherent document, eliminating repetition but preserving all the information contained in the source reports, using only surface-based methods. Information in each news report is represented by a set of entity relationship graphs. The graphs representing each report are then merged into a single graph whilst keeping track of the source sentences. The fused report is generated using the maximally expressive set of sentences -- the sentences that carry the most information about the entities and their relationships in the news report -- whilst ensuring that all entities and relationships are expressed in the fused document. Our document fusion system was evaluated using a set of news reports downloaded from MSNBC News that cite their sources, and also using human evaluation. We show that our system is able to capture most of the information found across different source documents whilst maintaining readability.
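
The merge-and-select step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the graphs are built or how the "maximally expressive" sentences are chosen, so the sketch assumes that each sentence has already been reduced to (entity, relation, entity) triples and that selection is a greedy set cover. The names `merge_graphs` and `select_sentences` and the input layout are hypothetical.

```python
from collections import defaultdict


def merge_graphs(reports):
    """Merge per-report graphs into one, remembering where each edge came from.

    `reports` (assumed layout) maps a source name to a list of
    (sentence, [triple, ...]) pairs, where each triple is a hypothetical
    (entity, relation, entity) edge extracted from that sentence.
    Returns {triple: {(source, sentence), ...}}.
    """
    merged = defaultdict(set)
    for source, sentences in reports.items():
        for sentence, triples in sentences:
            for triple in triples:
                merged[triple].add((source, sentence))
    return merged


def select_sentences(merged):
    """Greedy set cover (an assumption, not the paper's stated method).

    Repeatedly picks the sentence that expresses the most still-uncovered
    triples until every triple in the merged graph is expressed.
    """
    # Invert the merged graph: which triples does each sentence express?
    coverage = defaultdict(set)
    for triple, origins in merged.items():
        for origin in origins:
            coverage[origin].add(triple)

    uncovered = set(merged)
    chosen = []
    while uncovered:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda o: len(coverage[o] & uncovered))
        chosen.append(best)
        uncovered -= coverage[best]
    return chosen
```

Under these assumptions, a sentence that repeats an already covered relationship is never selected, while a sentence carrying a relationship found in only one report always is, which mirrors the stated goal of removing repetition without losing information.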
         
        
            Keywords : 
data structures; document handling; entity-relationship modelling; graph theory; information resources; sensor fusion; MSNBC news; document fusion system; entity relationship graphs; graphs representation; human evaluation; information preservation; news reports fusion; readability maintenance; single coherent document elimination; source sentences; surface-based methods; Data mining; Fuses; Humans; Periodic structures; Redundancy; Semantics; Statistical analysis; Document fusion; conceptual graphs; entity-relation graphs; news;
         
        
        
        
            Conference_Title : 
Advanced Information Networking and Applications Workshops (WAINA), 2012 26th International Conference on
         
        
            Conference_Location : 
Fukuoka
         
        
            Print_ISBN : 
978-1-4673-0867-0
         
        
        
            DOI : 
10.1109/WAINA.2012.113