Migration of software partition in UNIX system

  • Authors:
  • Satish Kharat;Rajeev Mishra;Ranadip Das;Srikanth Vishwanathan

  • Affiliations:
  • IBM, Bangalore, Karnataka, India;IBM, Bangalore, Karnataka, India;IBM, Bangalore, Karnataka, India;IBM, Pune, Maharashtra, India

  • Venue:
  • COMPUTE '08 Proceedings of the 1st Bangalore Annual Compute Conference
  • Year:
  • 2008

Abstract

Software partitioning is a technology that divides a machine running a single operating system image into multiple virtual machines called partitions. Each partition emulates an independent machine running a single instance of the operating system on dedicated hardware. All partitions are isolated from each other by the operating system. Software partitioning is very useful in server consolidation: a single powerful machine can host many different servers, each in its own software partition. This increases hardware resource utilization, gives flexibility to the administrator, and can reduce system administration costs. The advantages of software partitioning are greatly enhanced by the capability to checkpoint a running software partition and restart it on a different machine. This capability helps with load balancing over hardware resources, load balancing over time, and fault tolerance. Workload Partition (WPAR) is IBM's implementation of software partitioning on the AIX operating system. WPARs can be migrated live within and across AIX systems, and live migration is achieved through the checkpoint/restart mechanism. It is possible to checkpoint and restart WPARs running most existing AIX applications without any modification to the applications. The checkpoint and restart process is also transparent to the applications running inside the WPAR. This paper discusses the issues faced in implementing software partition checkpoint and restart in the AIX operating system. These issues are likely to be typical of any standard UNIX operating system.
To successfully checkpoint and restart a software partition, it is necessary to checkpoint not only all the user processes in the partition but also the global data pertaining to the partition itself and the data shared between processes of the partition, such as IPC data, Streams, timers, file handles, memory-mapped regions, shared memory, system services, and virtual devices. The WPAR implementation handles both: the checkpoint of individual processes and the checkpoint of partition-wide data.
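
The distinction between per-process state and partition-wide state can be illustrated with a small toy model. The sketch below is not the WPAR implementation and does not use any AIX interface; the names `ProcessState`, `PartitionCheckpoint`, `checkpoint`, and `restart` are hypothetical, and JSON stands in for whatever checkpoint format a real system would use. It only shows that a partition checkpoint has to carry both kinds of data for a restart to reproduce the original state.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ProcessState:
    """Hypothetical per-process state captured at checkpoint time."""
    pid: int
    open_files: list = field(default_factory=list)
    memory_regions: list = field(default_factory=list)


@dataclass
class PartitionCheckpoint:
    """Hypothetical container: per-process state plus partition-wide data."""
    name: str
    processes: list = field(default_factory=list)  # list of ProcessState
    shared: dict = field(default_factory=dict)     # e.g. IPC data, timers


def checkpoint(partition: PartitionCheckpoint) -> str:
    """Serialize both the process list and the shared partition data."""
    return json.dumps(asdict(partition), sort_keys=True)


def restart(blob: str) -> PartitionCheckpoint:
    """Rebuild the partition, including shared data, from a checkpoint."""
    raw = json.loads(blob)
    processes = [ProcessState(**p) for p in raw["processes"]]
    return PartitionCheckpoint(
        name=raw["name"], processes=processes, shared=raw["shared"]
    )
```

In this model, a checkpoint that recorded only the `processes` list would restart processes whose shared memory segments and timers no longer exist, which is the failure the partition-wide data is meant to prevent.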