= FGBI =
As virtualization becomes increasingly prevalent, we can overcome these limitations by introducing the virtual machine (VM). In the virtual world, all applications run inside the VM, so whole-system replication can be implemented in a simple and efficient way: by saving a copy of the entire VM running on the system. Because VMs are hardware-independent, the cost is much lower than the hardware expenses of traditional HA solutions. In addition, virtualization technology facilitates the management of multiple VMs on a single physical machine. With a virtual machine monitor (VMM), service applications are decoupled from the physical machine, which provides increased flexibility and improved performance.

Remus, built on top of the well-known Xen hypervisor [2], provides transparent, comprehensive high availability through a checkpointing method under the primary-backup model (Figure 1). It checkpoints the running VM on the primary host and transfers the latest checkpoint to the backup host as a whole-system migration. Once the primary host fails, the backup host takes over the service from the latest checkpointed state. Remus shows that it is possible to build a general, fully transparent, high-availability solution entirely in software. However, checkpointing at high frequency introduces significant overhead, since the migration consumes substantial CPU and memory resources, and clients therefore endure long network delays. Jiang et al. proposed an integrated live migration mechanism, called LLM, which combines whole-system checkpointing with input replay to reduce the network delay of Remus. The basic idea is that the primary host migrates the guest VM image (including CPU/memory status updates and new writes to the file system) to the backup host at low frequency, while the service requests from network clients are migrated at high frequency. As a result, LLM outperforms Remus in terms of network delay by more than 90%.
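To make the checkpointing mechanism concrete, the following is a minimal sketch (in Python, for illustration only) of an epoch-based, Remus-style checkpoint loop under the primary-backup model. All names here (Primary, Backup, EPOCH_MS, guest_writes, run_epoch) are illustrative assumptions; the real Remus implementation runs inside Xen and operates on guest memory pages and device state, not Python objects.

{{{#!python
# Minimal sketch of an epoch-based, Remus-style checkpoint loop under the
# primary-backup model. All names (Primary, Backup, EPOCH_MS, ...) are
# illustrative; the real Remus implementation lives inside Xen and works on
# guest pages, not Python dicts.
import time
import copy

EPOCH_MS = 25  # checkpoint interval; Remus typically runs tens of epochs per second

class Backup:
    """Holds the most recently acknowledged whole-VM checkpoint."""
    def __init__(self):
        self.checkpoint = {}

    def apply(self, dirty_pages):
        self.checkpoint.update(dirty_pages)   # merge updates into the last state
        return True                           # ack: checkpoint is now durable

class Primary:
    def __init__(self, backup):
        self.memory = {}        # page_number -> contents (stand-in for guest RAM)
        self.dirty = set()      # pages written since the last checkpoint
        self.backup = backup

    def guest_writes(self, page, data):
        self.memory[page] = data
        self.dirty.add(page)    # the hypervisor tracks writes via shadow page tables

    def run_epoch(self):
        time.sleep(EPOCH_MS / 1000.0)             # guest executes during the epoch
        dirty_pages = {p: copy.deepcopy(self.memory[p]) for p in self.dirty}
        self.dirty.clear()                        # the VM is briefly paused here
        acked = self.backup.apply(dirty_pages)    # transfer the checkpoint, wait for ack
        if acked:
            pass  # only now may buffered network output be released to clients

if __name__ == "__main__":
    backup = Backup()
    primary = Primary(backup)
    primary.guest_writes(1, "page contents A")
    primary.guest_writes(2, "page contents B")
    primary.run_epoch()
    print(backup.checkpoint)   # backup now mirrors the primary's dirty pages
}}}

The key point the sketch tries to capture is output commit: the primary releases buffered network output to clients only after the backup acknowledges the checkpoint, which is what makes failover transparent to clients.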
== Downtime Problem ==

Downtime is the primary factor in estimating the high availability of a system, since any long downtime experienced by clients may result in loss of client loyalty and thus loss of revenue. Under the primary-backup model (Figure 1), there are two types of downtime: I) the time from when the primary host crashes until the VM resumes from the last checkpointed state on the backup host and starts handling client requests (D1 = T3 - T1); and II) the time from when the VM pauses on the primary (to save the checkpoint) until it resumes (D2). From Jiang's paper [9] we observe that for memory-intensive workloads running on guest VMs (such as the HighSys workload), LLM endures much longer type I downtime than Remus. This is because such workloads update guest memory at high frequency, while LLM migrates the guest VM image updates (mostly from memory) at low frequency and relies on input replay as an auxiliary. When a failure occurs, a significant number of memory updates must be applied to synchronize the primary and backup hosts, so the input replay process needs considerably more time before the VM can resume on the backup host and begin handling client requests.

Regarding the type II downtime, there are several migration epochs between two checkpoints, and the newly updated memory data is copied to the backup host at each epoch. At the last epoch, the VM running on the primary host is suspended and the remaining memory state is transferred to the backup host. Thus, the type II downtime depends on the amount of memory that remains to be copied and transferred when the VM is paused on the primary host. If we reduce the dirty data that must be transferred at the last epoch, we reduce the type II downtime. Moreover, if we reduce the dirty data transferred at every epoch, keeping the memory state of the primary and backup hosts synchronized throughout execution, then few new memory updates remain to be transferred at the last epoch, so we reduce the type I downtime as well.
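The relationship between the two downtime types and the amount of dirty data can be captured with a simple back-of-the-envelope model. The sketch below is only illustrative: the timestamps, the 1 Gbps link bandwidth, and the dirty-data sizes are assumed example values; only the relations D1 = T3 - T1 and the observation that D2 grows with the dirty data remaining at the last epoch come from the discussion above.

{{{#!python
# Back-of-the-envelope model of the two downtime types described above.
# The timestamps and bandwidth are made-up example values; only the
# relationships (D1 = T3 - T1, D2 proportional to remaining dirty data)
# come from the text.

LINK_BANDWIDTH = 1e9 / 8          # 1 Gbps link, in bytes per second (assumption)

def type1_downtime(t1_primary_crash, t3_backup_resumes):
    """D1: primary crashes at T1, backup resumes service at T3."""
    return t3_backup_resumes - t1_primary_crash

def type2_downtime(remaining_dirty_bytes, bandwidth=LINK_BANDWIDTH):
    """D2: the VM stays paused while the last epoch's dirty data is copied out."""
    return remaining_dirty_bytes / bandwidth

if __name__ == "__main__":
    # Example: the backup needs 2.5 s of input replay before it can serve clients.
    print("D1 =", type1_downtime(t1_primary_crash=10.0, t3_backup_resumes=12.5), "s")

    # Example: 64 MB vs 4 MB of dirty memory left at the final epoch.
    print("D2 (much dirty data left) =", type2_downtime(64 * 2**20), "s")
    print("D2 (little dirty data left) =", type2_downtime(4 * 2**20), "s")
}}}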
Therefore, in order to achieve HA in these virtualized systems, and especially to address the downtime problem under memory-intensive workloads, we propose a memory synchronization technique for tracking memory updates, called Fine-Grained Block Identification (FGBI). Our main contributions include the following: FGBI tracks and transfers memory updates efficiently by reducing the total number of dirty bytes that must be transferred from the primary to the backup host. In addition, we integrate memory block sharing support with FGBI to reduce the newly introduced memory overhead, and we support a hybrid compression mechanism over the dirty memory blocks to further reduce the migration traffic during the transfer period. Our experimental results reveal that FGBI reduces the type I downtime over LLM and Remus by as much as 77% and 45%, respectively, and reduces the type II downtime by more than 90% and 70%, respectively.
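The core of the FGBI idea, i.e., tracking dirty memory at sub-page block granularity so that only the bytes that actually changed are transferred, can be sketched as follows. The 256-byte block size, the use of MD5 hashes, and zlib compression are illustrative assumptions, not the parameters or algorithms used in FGBI itself; the sketch is only meant to show why finer-grained tracking shrinks the per-epoch transfer.

{{{#!python
# Minimal sketch of fine-grained block identification: split each dirty page
# into sub-page blocks, hash each block, and transfer only the blocks whose
# hashes changed since the last epoch. Block size, MD5, and zlib are
# illustrative choices, not FGBI's actual parameters.
import hashlib
import zlib

PAGE_SIZE = 4096
BLOCK_SIZE = 256          # sub-page granularity (assumed value)

def block_hashes(page_bytes):
    """Hash every BLOCK_SIZE chunk of a page."""
    return [hashlib.md5(page_bytes[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(page_bytes), BLOCK_SIZE)]

def dirty_blocks(old_page, new_page):
    """Return (block_index, block_bytes) for the blocks that actually changed."""
    old_h, new_h = block_hashes(old_page), block_hashes(new_page)
    return [(i, new_page[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE])
            for i, (a, b) in enumerate(zip(old_h, new_h)) if a != b]

def encode_for_transfer(blocks):
    """Compress the changed blocks before sending them to the backup."""
    payload = b"".join(data for _, data in blocks)
    compressed = zlib.compress(payload)
    return compressed if len(compressed) < len(payload) else payload

if __name__ == "__main__":
    old = bytearray(PAGE_SIZE)                 # page contents at the last epoch
    new = bytearray(old)
    new[100:104] = b"abcd"                     # the guest touched only 4 bytes
    changed = dirty_blocks(bytes(old), bytes(new))
    print(len(changed), "of", PAGE_SIZE // BLOCK_SIZE, "blocks need transfer")
    print(len(encode_for_transfer(changed)), "bytes on the wire after compression")
}}}

In this toy example a 4-byte guest write causes one 256-byte block to be resent instead of a whole 4 KB page, which is the effect FGBI exploits to cut the data transferred at each epoch and at the final stop-and-copy.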