• DocumentCode
    1850323
  • Title

    Fusion of News Reports Using Surface-Based Methods

  • Author

    Azzopardi, Joel ; Staff, Christopher

  • Author_Institution
    Fac. of ICT, Univ. of Malta, Msida, Malta
  • fYear
    2012
  • fDate
    26-29 March 2012
  • Firstpage
    809
  • Lastpage
    814
  • Abstract
    Real-world events are covered by news reports from different sources. Each report generally contains information that is also found in the others, but may also contain unique information. To learn everything about a particular event, a user needs to read all of the reports, which duplicates effort because most information is repeated across them. In our research, we attempt to fuse news reports about the same event into a single coherent document, eliminating repetition but preserving all the information contained in the source reports, using only surface-based methods. Information in each news report is represented by a set of entity-relationship graphs. The graphs representing each report are then merged into a single graph whilst keeping track of the source sentences. The fused report is generated from the maximally expressive set of sentences, that is, the sentences that carry the most information about the entities and their relationships in the news report, while ensuring that all entities and relationships are expressed in the fused document. Our document fusion system was evaluated using a set of news reports downloaded from MSNBC News that cite their sources, and also using human evaluation. We show that our system is able to capture most of the information found across different source documents whilst maintaining readability.
  • Keywords
    data structures; document handling; entity-relationship modelling; graph theory; information resources; sensor fusion; MSNBC news; document fusion system; entity relationship graphs; graphs representation; human evaluation; information preservation; news reports fusion; readability maintenance; single coherent document elimination; source sentences; surface-based methods; Data mining; Fuses; Humans; Periodic structures; Redundancy; Semantics; Statistical analysis; Document fusion; conceptual graphs; entity-relation graphs; news;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Advanced Information Networking and Applications Workshops (WAINA), 2012 26th International Conference on
  • Conference_Location
    Fukuoka
  • Print_ISBN
    978-1-4673-0867-0
  • Type

    conf

  • DOI
    10.1109/WAINA.2012.113
  • Filename
    6185494
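
The abstract describes merging per-report entity-relationship graphs while tracking source sentences, then choosing sentences so that every entity and relationship is expressed. The abstract does not say how that choice is made, so the following is only a minimal sketch under stated assumptions: each sentence is assumed to be already reduced to a set of (entity, relationship, entity) triples, and a greedy set-cover heuristic stands in for the selection step. The names `merge_reports` and `select_sentences` and the sample sentences are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

# Assumption: one graph edge is modelled as (entity, relationship, entity).
Triple = Tuple[str, str, str]


def merge_reports(reports: List[Dict[str, Set[Triple]]]) -> Dict[str, Set[Triple]]:
    """Merge per-report {sentence: triples} maps into a single map.

    Keeping the sentence as the key preserves the link between each
    triple and the source sentences that express it.
    """
    merged: Dict[str, Set[Triple]] = {}
    for report in reports:
        for sentence, triples in report.items():
            merged.setdefault(sentence, set()).update(triples)
    return merged


def select_sentences(merged: Dict[str, Set[Triple]]) -> List[str]:
    """Greedy set cover (an assumed stand-in, not the authors' method).

    Repeatedly pick the sentence that expresses the most triples not yet
    covered, until every triple in the merged graph is expressed.
    """
    uncovered: Set[Triple] = set().union(*merged.values())
    chosen: List[str] = []
    while uncovered:
        # Ties are resolved in favour of the sentence seen first.
        best = max(merged, key=lambda s: len(merged[s] & uncovered))
        chosen.append(best)
        uncovered -= merged[best]
    return chosen


# Hypothetical toy input: two reports about the same event.
report_a = {
    "A quake struck the city on Monday.": {("quake", "struck", "city")},
    "The quake injured 20 people.": {("quake", "injured", "20 people")},
}
report_b = {
    "A quake struck the city and injured 20 people.": {
        ("quake", "struck", "city"),
        ("quake", "injured", "20 people"),
    },
    "Rescue teams reached the city.": {("rescue teams", "reached", "city")},
}

fused = select_sentences(merge_reports([report_a, report_b]))
print(fused)
```

On this toy input the sketch keeps the one sentence from the second report that covers both quake triples, plus the rescue sentence, and drops the two redundant sentences from the first report. Greedy cover does not guarantee the smallest possible set of sentences; it is used here only because it is short and easy to follow.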