Exploring the corporate ecosystem with a semi-supervised entity graph

  • Authors:
  • Hassan H. Malik;Ian MacGillivray;Måns Olof-Ors;Siming Sun;Shailesh Saroha

  • Affiliations:
  • Thomson Reuters, New York, NY, USA;Thomson Reuters, New York, NY, USA;Thomson Reuters, Baar, Switzerland;Thomson Reuters, New York, NY, USA;Thomson Reuters, New York, NY, USA

  • Venue:
  • Proceedings of the 20th ACM International Conference on Information and Knowledge Management
  • Year:
  • 2011

Abstract

Investment decisions in the financial markets require careful analysis of information available from multiple data sources. In this paper, we present Atlas, a novel entity-based information analysis and content aggregation platform that uses heterogeneous data sources to construct and maintain the "ecosystem" around tangible and logical entities such as organizations, products, industries, geographies, commodities and macroeconomic indicators. Entities are represented as vertices in a directed graph, and edges are generated using entity co-occurrences in unstructured documents and supervised information from structured data sources. Significance scores for the edges are computed using a method that combines supervised, unsupervised and temporal factors into a single score. Important entity attributes from the structured content and the entity neighborhood in the graph are automatically summarized as the entity "fingerprint". A highly interactive user interface provides exploratory access to the graph and supports common business use cases. We present results of experiments performed on five years of news and broker research data, and show that Atlas accurately identifies important and interesting connections among real-world entities. We also demonstrate that Atlas entity fingerprints are particularly useful in entity similarity queries, with a quality that rivals existing human-maintained databases.
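
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, not the Atlas method. It builds directed edges from entity co-occurrences in documents and from curated relationships, then combines them into one score per edge. Every name here (`build_entity_graph`, `half_life_days`, `w_cooc`, `w_curated`), the exponential time decay, and the weighted-sum combination are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import permutations


def build_entity_graph(documents, curated_edges, now,
                       half_life_days=365.0, w_cooc=0.5, w_curated=0.5):
    """Score directed entity edges (illustrative only, not the Atlas formula).

    documents:     iterable of (day, entities), where day is a number of days
                   and entities is the set of entities found in one document.
    curated_edges: set of (source, target) pairs from structured data.
    now:           the current day, on the same scale as the document days.
    """
    # Unsupervised + temporal factor: each co-occurrence counts less as it
    # ages (assumed exponential decay with the given half-life).
    decayed = defaultdict(float)
    for day, entities in documents:
        weight = 0.5 ** ((now - day) / half_life_days)
        for src, dst in permutations(sorted(entities), 2):
            decayed[(src, dst)] += weight

    max_cooc = max(decayed.values(), default=0.0)

    scores = {}
    for edge in set(decayed) | set(curated_edges):
        # Normalize the co-occurrence evidence to the range [0, 1].
        cooc = decayed.get(edge, 0.0) / max_cooc if max_cooc else 0.0
        # Supervised factor: 1.0 if the structured data asserts this edge.
        curated = 1.0 if edge in curated_edges else 0.0
        scores[edge] = w_cooc * cooc + w_curated * curated
    return scores


if __name__ == "__main__":
    docs = [(0, {"A", "B"}), (365, {"A", "B", "C"})]
    curated = {("A", "C"), ("A", "D")}
    for edge, score in sorted(build_entity_graph(docs, curated, now=365).items()):
        print(edge, round(score, 3))
```

In this sketch a curated edge with no textual support (such as the hypothetical `("A", "D")`) still receives a score, while co-occurrence alone produces edges in both directions; a real system would likely treat direction and normalization more carefully.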