Replication between Geographically Separated Clusters - An Asynchronous Scalable Replication Mechanism for Very High Availability

  • Authors:
  • Anders Björnerstedt;Helena Ketoja;Johan Sintorn;Martin Sköld

  • Venue:
  • DBTel '01 Proceedings of the VLDB 2001 International Workshop on Databases in Telecommunications II
  • Year:
  • 2001

Abstract

In telecommunication systems such as Home Location Registers (HLRs) and AAA servers (Authentication, Authorization, and Accounting), requirements on availability, real-time performance, scalability, consistency, and persistence (durability) of the data storage are important. A base for high availability, real-time performance, scalability, and consistency can be achieved by using a distributed real-time main-memory database system implemented on a local cluster of shared-nothing processors. Even higher availability and improved persistence can be achieved through an additional level of redundancy, combined with geographical separation. Two or more clusters are separated geographically to protect against site failure or site unreachability, due to any reason, including externally caused disasters such as earthquakes, bombs, or fires. A wide-area replication mechanism ensures that the database is always consistent and nearly always complete (up-to-date) at all sites. The persistency requirement on telecommunication systems is usually not as severe as for, for example, banking systems. On the other hand, the availability and real-time requirements are usually very high, with millisecond response times and fail-over times of no more than a few seconds when a site fails.

The protocol chosen for replication between the separate sites/clusters can impact both availability and performance. If strict synchronous replication (2PC or 3PC) is imposed on all geographically replicated transactions, then clients will be forced to wait a considerable time for replies from geographically distant sites. A synchronous protocol can also have a tendency to propagate problems "upstream" from one site to others. Finally, if the replication protocol becomes a bottleneck, then this will undermine the throughput and scalability of the local cluster.

This paper presents an asynchronous replication mechanism that preserves the availability, scalability, and consistency requirements while at the same time achieving an acceptable level of persistency/completeness.

The paper also presents the Ericsson TelORB platform, including a distributed soft real-time main-memory database system. TelORB and the replication mechanism described here are already in service in commercial HLRs and other products.