Changes between Version 66 and Version 67 of FGBI
- Timestamp: 10/13/11 16:14:55 (13 years ago)
FGBI
v66  v67
  6    6
  7    7  == The Downtime Problem in [wiki:LLM LLM] ==
  8       [[Image(figure1.jpg, center)]]
  9
 10       Figure 1. Primary-Backup model and the downtime problem.
       8  {{{
       9  #!html
      10  <center>
      11  <img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/FGBI/figure1.jpg">
      12  <h4>Figure 1. Primary-Backup model and the downtime problem.</h4>
      13  </center>
      14  }}}
 11   15
 12   16  Downtime is the primary factor for estimating the high availability of a system, since any long downtime experienced by clients may result in loss of client loyalty and thus revenue loss. Under the Primary-Backup model (Figure 1), there are two types of downtime: I) the time from when the primary host crashes until the VM resumes from the last checkpointed state on the backup host and starts to handle client requests (D,,1,, = T,,3,, - T,,1,,); and II) the time from when the VM pauses on the primary (to save the checkpoint) until it resumes (D,,2,,). From the [wiki:Publications SSS'10] paper, we observe that for memory-intensive workloads running on guest VMs (such as the highSys workload), [wiki:LLM LLM] endures much longer type I downtime than [http://nss.cs.ubc.ca/remus/ Remus]. This is because such workloads update the guest memory at high frequency. In contrast, [wiki:LLM LLM] migrates the guest VM image update (mostly from memory) at low frequency, but uses input replay as an auxiliary. Thus, when a failure happens, a significant number of memory updates are needed in order to ensure synchronization between the primary and backup hosts. Therefore, [wiki:LLM LLM] needs significantly more time for the input replay process in order to resume the VM on the backup host and begin handling client requests.
  …    …
 48   52  [[Image(2cd.jpg, center)]]
 49   53
 50       Figure 2. Type I Downtime comparison under different benchmarks: (a) Apache. (b) NPB-EP. (c) SPECweb. (d) SPECsys.
      54  {{{
      55  #!html
      56  <center>
      57  <img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/FGBI/2ab.jpg">
      58  </center>
      59  }}}
      60  {{{
      61  #!html
      62  <center>
      63  <img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/FGBI/2cd.jpg">
      64  <h4>Figure 2. Type I Downtime comparison under different benchmarks: (a) Apache. (b) NPB-EP. (c) SPECweb. (d) SPECsys.</h4>
      65  </center>
      66  }}}
 51   67
 52   68  Figures 2(a), 2(b), 2(c), and 2(d) show the type I downtime comparison among [wiki:FGBI FGBI], [wiki:LLM LLM], and [http://nss.cs.ubc.ca/remus/ Remus] mechanisms under [http://httpd.apache.org/ Apache], [http://www.nas.nasa.gov/Resources/Software/npb.html NPB-EP],