Changes between Version 66 and Version 67 of FGBI

Timestamp: 10/13/11 16:14:55
Author: lvpeng
Comment: --

FGBI

-                      v66
+                      v67
 == The Downtime Problem in [wiki:LLM LLM] ==
+[[Image(figure1.jpg, center)]]
+            Figure 1. Primary-Backup model and the downtime problem.
+{{{
+#!html
+<center>
+<img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/FGBI/figure1.jpg">
+<h4>Figure 1. Primary-Backup model and the downtime problem.</h4>
+</center>
+}}}
 Downtime is the primary factor in estimating the availability of a system, since any long downtime experienced by clients may result in loss of client loyalty and thus revenue. Under the Primary-Backup model (Figure 1), there are two types of downtime: I) the time from when the primary host crashes until the VM resumes from the last checkpointed state on the backup host and starts to handle client requests (D,,1,, = T,,3,, - T,,1,,); and II) the time from when the VM pauses on the primary (to save the checkpoint) until it resumes (D,,2,,). From the [wiki:Publications SSS'10] paper, we observe that for memory-intensive workloads running on guest VMs (such as the highSys workload), [wiki:LLM LLM] endures much longer type I downtime than [http://nss.cs.ubc.ca/remus/ Remus]. This is because such workloads update the guest memory at high frequency, whereas [wiki:LLM LLM] migrates guest VM image updates (mostly from memory) at low frequency and relies on input replay as an auxiliary mechanism. Thus, when a failure happens, a significant number of memory updates are still needed to synchronize the primary and backup hosts. Therefore, [wiki:LLM LLM] needs significantly more time for the input replay process before it can resume the VM on the backup host and begin handling client requests.
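The two downtime definitions above can be written out as a small calculation. This is only an illustrative sketch: the class and function names, and the pause/resume parameters used for type II downtime, are hypothetical and do not come from the FGBI, LLM, or Remus code. Only the relation D,,1,, = T,,3,, - T,,1,, is taken from the text.

```python
from dataclasses import dataclass


@dataclass
class FailoverTimeline:
    """Timestamps in seconds. Field names are illustrative assumptions."""
    t_crash: float          # T1: the primary host crashes
    t_backup_resume: float  # T3: the VM resumes on the backup and serves requests


def type1_downtime(timeline: FailoverTimeline) -> float:
    """Type I downtime, D1 = T3 - T1."""
    if timeline.t_backup_resume < timeline.t_crash:
        raise ValueError("backup resume time precedes crash time")
    return timeline.t_backup_resume - timeline.t_crash


def type2_downtime(t_pause: float, t_resume: float) -> float:
    """Type II downtime, D2: from the VM pausing on the primary
    (to save the checkpoint) until it resumes."""
    if t_resume < t_pause:
        raise ValueError("resume time precedes pause time")
    return t_resume - t_pause


if __name__ == "__main__":
    # Made-up timestamps, chosen only to exercise the functions.
    print(type1_downtime(FailoverTimeline(t_crash=10.0, t_backup_resume=12.5)))
    print(type2_downtime(t_pause=1.0, t_resume=1.25))
```

The timestamps in the usage lines are arbitrary and are not measurements from any of the benchmarks discussed here.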
 …
 [[Image(2cd.jpg, center)]]
+Figure 2. Type I Downtime comparison under different benchmarks: (a) Apache. (b) NPB-EP. (c) SPECweb. (d) SPECsys.
+{{{
+#!html
+<center>
+<img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/FGBI/2ab.jpg">
+</center>
+}}}
+{{{
+#!html
+<center>
+<img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/FGBI/2cd.jpg">
+<h4>Figure 2. Type I Downtime comparison under different benchmarks: (a) Apache. (b) NPB-EP. (c) SPECweb. (d) SPECsys.</h4>
+</center>
+}}}
 Figures 2(a), 2(b), 2(c), and 2(d) show the type I downtime comparison among [wiki:FGBI FGBI], [wiki:LLM LLM], and [http://nss.cs.ubc.ca/remus/ Remus] mechanisms under [http://httpd.apache.org/ Apache], [http://www.nas.nasa.gov/Resources/Software/npb.html NPB-EP],