Changes between Version 66 and Version 67 of FGBI


Timestamp:
10/13/11 16:14:55 (13 years ago)
Author:
lvpeng
Comment:

--

  • FGBI

    v66 v67  
    66
    77== The Downtime Problem in [wiki:LLM LLM] ==
    8 [[Image(figure1.jpg, center)]]
    9 
    10             Figure 1. Primary-Backup model and the downtime problem.
     8{{{
     9#!html
     10<center>
     11<img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/FGBI/figure1.jpg">
     12<h4>Figure 1. Primary-Backup model and the downtime problem.</h4> 
     13</center>
     14}}}
    1115
    1216Downtime is the primary factor in estimating the availability of a system, since any long downtime experienced by clients may result in loss of client loyalty and thus loss of revenue. Under the Primary-Backup model (Figure 1), there are two types of downtime: I) the time from when the primary host crashes until the VM resumes from the last checkpointed state on the backup host and starts to handle client requests (D,,1,, = T,,3,, - T,,1,,); and II) the time from when the VM pauses on the primary (to save the checkpoint) until it resumes (D,,2,,). From the [wiki:Publications SSS'10] paper, we observe that for memory-intensive workloads running on guest VMs (such as the highSys workload), [wiki:LLM LLM] endures much longer type I downtime than [http://nss.cs.ubc.ca/remus/ Remus]. This is because such workloads update the guest memory at high frequency, whereas [wiki:LLM LLM] migrates the guest VM image updates (mostly from memory) at low frequency and relies on input replay as an auxiliary mechanism. Thus, when a failure happens, a significant number of memory updates must still be applied in order to synchronize the primary and backup hosts. Therefore, [wiki:LLM LLM] needs significantly more time for the input replay process before it can resume the VM on the backup host and begin handling client requests.
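The two downtime definitions above can be illustrated with a minimal sketch. The timestamps, the `FailoverTimeline` class, and the function names are hypothetical and chosen only for illustration; they are not taken from the FGBI, LLM, or Remus code.

```python
from dataclasses import dataclass


@dataclass
class FailoverTimeline:
    """Hypothetical timestamps (in seconds) for one failover event."""
    crash_time: float   # T1: the primary host crashes
    resume_time: float  # T3: the VM resumes on the backup and serves requests


def type1_downtime(timeline: FailoverTimeline) -> float:
    """Type I downtime: D1 = T3 - T1."""
    return timeline.resume_time - timeline.crash_time


def type2_downtime(pause_time: float, resume_time: float) -> float:
    """Type II downtime: D2, from checkpoint pause on the primary until resume."""
    return resume_time - pause_time


# Assumed example values, not measurements from the paper.
timeline = FailoverTimeline(crash_time=10.0, resume_time=12.5)
d1 = type1_downtime(timeline)
d2 = type2_downtime(pause_time=4.0, resume_time=4.25)
print(f"D1 = {d1} s, D2 = {d2} s")
```

Under this reading, a mechanism that checkpoints less often tends to have a later T,,3,, (more state to recover after the crash) and therefore a larger D,,1,,, which matches the observation about [wiki:LLM LLM] above.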
     
    4852[[Image(2cd.jpg, center)]]
    4953
    50 Figure 2. Type I Downtime comparison under different benchmarks: (a) Apache. (b) NPB-EP. (c) SPECweb. (d) SPECsys.
     54{{{
     55#!html
     56<center>
     57<img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/FGBI/2ab.jpg">
     58</center>
     59}}}
     60{{{
     61#!html
     62<center>
     63<img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/FGBI/2cd.jpg">
     64<h4>Figure 2. Type I Downtime comparison under different benchmarks: (a) Apache. (b) NPB-EP. (c) SPECweb. (d) SPECsys.</h4> 
     65</center>
     66}}}
    5167
    5268Figures 2(a), 2(b), 2(c), and 2(d) show the type I downtime comparison among [wiki:FGBI FGBI], [wiki:LLM LLM], and [http://nss.cs.ubc.ca/remus/ Remus] mechanisms under [http://httpd.apache.org/ Apache], [http://www.nas.nasa.gov/Resources/Software/npb.html NPB-EP],