Merging news reports that describe events

  • Authors: Anthony Hunter; Rupert Summerton

  • Affiliations: Department of Computer Science, University College London, London, UK (both authors)

  • Venue: Data & Knowledge Engineering
  • Year: 2006

Abstract

Many kinds of news report provide information about events. For example, business news reports in the area of mergers and acquisitions provide information about events such as "company X making a bid for company Y" or "takeover of company Y by company X being rejected by the anti-trust authorities". Furthermore, news reports do not normally exist in isolation: there is an underlying narrative that concerns a number of entities related in some way over a period of time. In many domains, stories follow a stereotypical sequence. For example, a particular takeover may involve a bid being made, a rejection by the target board, a rise in the bid value by the potential buyer, a recommendation of acceptance by the target board, acceptance by the shareholders, and finally successful completion of the takeover. In order to merge heterogeneous news reports that describe events, we need to identify and reason about the events being described prior to merging them. In this paper, we investigate this problem with a focus on structured news reports. Each structured news report (SNR) is an XML document in which the text entries are restricted to individual words or simple phrases, such as names and domain-specific terminology, and to numbers and units. We assume SNRs do not require natural language processing. As each SNR is isomorphic to a term in logic, we use a logic-based approach to extract relevant information about the events being described in the reports to be merged. We then provide a new version of the event calculus to assimilate the information from the various reports, in order to obtain the most up-to-date and complete picture of the events being described. Finally, from this assimilated information, we generate an SNR as the output.
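The abstract's observation that an SNR is isomorphic to a logical term can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the tag names (`report`, `event`, `bidder`, and so on), the sample report, and the helper names `to_term`, `snr_term`, and `latest_status` do not come from the paper. In particular, `latest_status` is only a "latest report wins" stand-in for assimilation, not the authors' version of the event calculus.

```python
import xml.etree.ElementTree as ET

# Hypothetical SNR: the tag names are illustrative, not taken from the paper.
SNR_XML = """
<report>
  <date>2006-03-01</date>
  <event>
    <type>bid</type>
    <bidder>CompanyX</bidder>
    <target>CompanyY</target>
  </event>
</report>
"""


def to_term(elem):
    """Render an XML element as a nested term: tag(child1, child2, ...)."""
    children = list(elem)
    if not children:
        # Leaf: the text entry becomes a quoted constant argument.
        text = (elem.text or "").strip()
        return f"{elem.tag}('{text}')"
    return f"{elem.tag}({', '.join(to_term(child) for child in children)})"


def snr_term(xml_text):
    """Parse an SNR given as XML text and return its term rendering."""
    return to_term(ET.fromstring(xml_text.strip()))


def latest_status(events):
    """Keep, for each fluent, the value asserted at the latest time.

    `events` is a list of (time, fluent, value) tuples with ISO-format
    date strings, which sort chronologically as plain strings.
    This is a simplification, not the event calculus of the paper.
    """
    state = {}
    for _time, fluent, value in sorted(events):
        state[fluent] = value
    return state


if __name__ == "__main__":
    print(snr_term(SNR_XML))
    print(latest_status([
        ("2006-03-01", "bid_status", "made"),
        ("2006-03-05", "bid_status", "rejected"),
        ("2006-03-02", "bid_value", "100"),
    ]))
```

Because the text entries of an SNR are restricted to words, simple phrases, numbers and units, each leaf maps directly to a constant and each tag to a function symbol, so no natural language processing is needed for the conversion.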