Billing in the large

  • Authors: Andrew Hume
  • Affiliations: AT&T Labs Research, Florham Park, NJ
  • Venue: Handbook of massive data sets
  • Year: 2002

Abstract

There is a growing need for very large databases that are not practical to implement with conventional relational database technology. These databases are characterized by huge size and frequent large updates. They do not require traditional database transactions; instead, the atomicity of bulk updates can be guaranteed outside the database. Given the I/O and CPU resources available on modern computer systems, it is possible to build these huge databases from simple flat files and to answer queries by scanning all the data. This paper describes Gecko, a system for tracking the state of every call in a very large billing system, which uses sorted flat files to implement a database of about 60G records occupying 2.6TB. We focus on performance issues, particularly job management, I/O management, and data distribution, and on the tools we built to run the system.
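
The general idea of a sorted flat file with bulk updates and scan-based queries can be illustrated with a minimal sketch. This is not Gecko's code: the record layout (one tab-separated `key<TAB>state` line per call), the file name, and the functions `apply_bulk_update` and `scan` are hypothetical and chosen only for illustration. The sketch assumes the update batch is already sorted and obtains atomicity outside any database by writing the merged file to a temporary path and renaming it into place.

```python
import heapq
import os
import tempfile


def apply_bulk_update(db_path, sorted_update_lines):
    """Merge an already-sorted batch of lines into the sorted file at db_path.

    Hypothetical helper: the merged result is written to a temporary file in
    the same directory and then swapped in with os.replace, so a reader sees
    either the old file or the new one, never a partially written mix.
    """
    directory = os.path.dirname(os.path.abspath(db_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as out, open(db_path) as old:
            # heapq.merge streams both sorted inputs line by line.
            for line in heapq.merge(old, sorted_update_lines):
                out.write(line)
            out.flush()
            os.fsync(out.fileno())
        os.replace(tmp_path, db_path)  # the single step that publishes the update
    except BaseException:
        # On failure the old file is untouched; discard the partial output.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def scan(db_path, predicate):
    """Answer a query by reading every record and keeping those that match."""
    with open(db_path) as f:
        for line in f:
            if predicate(line):
                yield line
```

With a file holding `0001<TAB>billed` and `0003<TAB>rated`, calling `apply_bulk_update` with the sorted batch `0002<TAB>rated`, `0004<TAB>billed` leaves four records in key order, and `scan(path, lambda l: l.endswith("\tbilled\n"))` yields the two billed calls. For scale, the figures in the abstract work out to roughly 43 bytes per record (2.6TB divided by 60G records, assuming decimal units).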