How to build a WebFountain: An architecture for very large-scale text analytics

  • Authors:
  • D. Gruhl; L. Chavet; D. Gibson; J. Meyer; P. Pattanayak; A. Tomkins; J. Zien

  • Affiliations:
  • IBM Research Division, Almaden Research Center, 650 Harry Road, San Jose, California 95120 (all authors)

  • Venue:
  • IBM Systems Journal
  • Year:
  • 2004

Abstract

WebFountain is a platform for very large-scale text analytics applications. The platform allows uniform access to a wide variety of sources, scalable system-managed deployment of a variety of document-level "augmenters" and corpus-level "miners," and finally creation of an extensible set of hosted Web services containing information that drives end-user applications. Analytical components can be authored remotely by partners using a collection of Web service APIs (application programming interfaces). The system is operational and supports live customers. This paper surveys the high-level decisions made in creating such a system.
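
The abstract's distinction between document-level "augmenters" and corpus-level "miners" can be illustrated with a minimal sketch. This is not WebFountain's actual API, which the abstract does not describe: every name below (Document, token_augmenter, length_augmenter, run_augmenters, document_frequency_miner) is hypothetical, and the sketch assumes only that an augmenter annotates one document at a time while a miner aggregates over the whole collection.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Document:
    """Hypothetical record: raw text plus annotations added by augmenters."""
    doc_id: str
    text: str
    annotations: Dict[str, object] = field(default_factory=dict)


# Hypothetical type: an augmenter sees exactly one document per call.
Augmenter = Callable[[Document], None]


def token_augmenter(doc: Document) -> None:
    """Document-level step: attach lowercase whitespace tokens."""
    doc.annotations["tokens"] = doc.text.lower().split()


def length_augmenter(doc: Document) -> None:
    """Document-level step that builds on the earlier 'tokens' annotation."""
    doc.annotations["length"] = len(doc.annotations["tokens"])


def run_augmenters(docs: Iterable[Document], augmenters: List[Augmenter]) -> None:
    """Apply each augmenter, in order, to each document independently."""
    for doc in docs:
        for augment in augmenters:
            augment(doc)


def document_frequency_miner(docs: Iterable[Document]) -> Counter:
    """Corpus-level step: count how many documents contain each token."""
    frequencies: Counter = Counter()
    for doc in docs:
        frequencies.update(set(doc.annotations["tokens"]))
    return frequencies
```

Because each augmenter call touches a single document, the per-document loop could in principle be spread across many machines, whereas the miner needs a view of the entire corpus. That difference is one plausible reason to treat the two kinds of component separately, though the abstract itself does not state the rationale.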