Changes between Version 19 and Version 20 of LLM


Timestamp: 10/13/11 16:23:47
Author: lvpeng
Comment: --

Similar to the migration of CPU/memory/disk updates, the migration of service requests is also done in an asynchronous manner, i.e., the primary machine resumes its service without waiting for an acknowledgement from the backup machine.
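
The asynchronous handoff can be pictured with a small sketch. The following Python fragment is an illustration of the idea only, not LLM's actual code; the names `BACKUP_ADDR`, `outbox`, and `handle_locally` are assumptions. The primary enqueues each checkpointed request for a background sender and resumes service immediately, never blocking on an acknowledgement from the backup.

{{{
#!python
# Illustrative sketch only (assumed names, not LLM's implementation):
# the primary hands each checkpointed service request to a background sender
# and resumes service at once, without waiting for an ACK from the backup.
import queue
import socket
import threading

BACKUP_ADDR = ("backup.example.org", 9000)   # hypothetical backup endpoint
outbox = queue.Queue()                       # requests waiting to be migrated


def sender_loop():
    """Drain the outbox and stream updates to the backup asynchronously."""
    with socket.create_connection(BACKUP_ADDR) as sock:
        while True:
            sock.sendall(outbox.get())       # fire-and-forget: no ACK is awaited


def start_async_migration():
    threading.Thread(target=sender_loop, daemon=True).start()


def on_service_request(request_bytes, handle_locally):
    outbox.put(request_bytes)                # enqueue the request for migration ...
    return handle_locally(request_bytes)     # ... and resume service immediately
}}}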
{{{
#!html
<center>
<img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/LLM/figure2.jpg">
<h4>Figure 2. Checkpointing Sequence.</h4>
</center>
}}}
Figure 2 shows the time sequence of migrating the checkpointed resources and the incoming service requests at different frequencies on a single network socket. The entire sequence within an epoch is described as follows:
     
== Evaluation Results ==
{{{
#!html
<center>
<img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/LLM/figure3.jpg">
<h4>Figure 3. Downtime under highnet and highsys.</h4>
</center>
}}}
Figure 3 shows the downtime results under highnet and highsys. We observe that under highsys, [wiki:LLM LLM] incurs a downtime that is longer than, yet comparable to, that of [http://nss.cs.ubc.ca/remus/ Remus]. The reason is that [wiki:LLM LLM] runs at a low frequency, so the migration traffic in each period is higher than that of [http://nss.cs.ubc.ca/remus/ Remus]. Under highnet, the relationship is reversed and [wiki:LLM LLM] outperforms [http://nss.cs.ubc.ca/remus/ Remus]. This is because, in [http://nss.cs.ubc.ca/remus/ Remus], the client must retransmit many duplicated packets so that the backup machine can serve them again. In [wiki:LLM LLM], on the contrary, the primary machine migrates the requested packets together with the serving boundaries to the backup machine, i.e., only the packets that have not yet been served are served by the backup. Thus the client does not need to retransmit its requests, and therefore experiences a shorter downtime.
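
A minimal sketch of the boundary idea follows, with invented helper names (`MigratedRequests`, `record_on_primary`, `resume_on_backup`); it is not LLM's actual data structure. The primary migrates the buffered request packets together with the index of the last request it has already served, so after a failover the backup replays only the unserved tail and the client never retransmits.

{{{
#!python
# Sketch of migrating requests together with a "served" boundary
# (assumed structures, for illustration only).
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MigratedRequests:
    packets: List[bytes] = field(default_factory=list)  # all buffered request packets
    served_boundary: int = 0                             # index of the first unserved packet


def record_on_primary(state: MigratedRequests, packet: bytes, served: bool) -> None:
    state.packets.append(packet)
    if served:
        state.served_boundary = len(state.packets)       # advance past the served packet


def resume_on_backup(state: MigratedRequests, serve: Callable[[bytes], None]) -> None:
    # Only packets beyond the boundary are served; already-served packets are
    # skipped, so the client does not have to retransmit anything.
    for packet in state.packets[state.served_boundary:]:
        serve(packet)
}}}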
{{{
#!html
<center>
<img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/LLM/figure4.jpg">
<h4>Figure 4. Network Delay under highnet and highsys.</h4>
</center>
}}}
Figure 4 shows the network delay results under highnet and highsys. In both cases, we observe that [wiki:LLM LLM] significantly reduces the network delay by removing egress queue management and releasing responses immediately. Figure 4 records the average network delay over a migration period; it also shows the detailed network delay within a specific migration period, in which the interval between two adjacent peak values represents one migration period. We observe that the network delay of [http://nss.cs.ubc.ca/remus/ Remus] decreases linearly within a period but remains at a plateau. In [wiki:LLM LLM], on the contrary, the network delay is very high at the beginning of a period and then quickly drops to nearly zero once the system update is over. Therefore, most of the time, [wiki:LLM LLM] exhibits a much shorter network delay than [http://nss.cs.ubc.ca/remus/ Remus].
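
The shape of the two delay curves can be approximated with a toy model; the constants and function names below are illustrative assumptions, not measured data. With an egress queue, a response produced at offset t inside an epoch of length T is held until the end of the epoch, so its extra delay is roughly T - t; with immediate release, only responses produced while the system update is still being migrated at the start of the epoch see extra delay.

{{{
#!python
# Toy model of the per-response delay curves discussed above (assumed values).

EPOCH = 1.0            # seconds between checkpoints (assumed)
MIGRATION_BURST = 0.2  # seconds spent migrating a system update at epoch start (assumed)


def buffered_delay(t: float, epoch: float = EPOCH) -> float:
    """Egress-queue policy: a response at offset t waits for the end of the epoch."""
    return epoch - t


def immediate_delay(t: float, burst: float = MIGRATION_BURST) -> float:
    """Immediate-release policy: only responses issued during the migration
    burst at the start of the epoch are delayed; afterwards the delay is ~0."""
    return max(0.0, burst - t)


if __name__ == "__main__":
    for t in (0.0, 0.1, 0.5, 0.9):
        print(f"t={t:.1f}s  buffered={buffered_delay(t):.2f}s  immediate={immediate_delay(t):.2f}s")
}}}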
{{{
#!html
<center>
<img style="border:0px;" src="/trac/fgbi/raw-attachment/wiki/LLM/figure5.jpg">
<h4>Figure 5. Overhead under Kernel Compilation.</h4>
</center>
}}}
Figure 5 shows the overhead under kernel compilation. As shown in the figure, the overhead changes significantly only when the checkpointing period lies in the interval of [1, 60] seconds. For shorter checkpointing periods, the migration of system updates may last longer than the configured period, so the kernel compilation time in these cases is almost the same, with only minor fluctuations. For longer checkpointing periods, especially when the period is longer than the baseline (i.e., the kernel compilation time without any checkpointing), a VM suspension may or may not occur during one compilation run, so the kernel compilation time is very close to the baseline, meaning nearly zero overhead. Within the [1, 60] second interval, [wiki:LLM LLM]'s overhead due to the suspension of domain U is significantly lower than that of [http://nss.cs.ubc.ca/remus/ Remus], since [wiki:LLM LLM] runs at a much lower frequency than [http://nss.cs.ubc.ca/remus/ Remus].
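
A back-of-the-envelope model of this trend follows; all constants are assumed for illustration, not measurements. Each VM suspension adds a roughly fixed cost, so the relative overhead scales with how many suspensions fall inside one compilation run, saturates for very short periods where the migration cannot finish within the period, and drops towards zero once the period exceeds the baseline compilation time.

{{{
#!python
# Illustrative overhead model for the trend in Figure 5 (assumed constants only).

BASELINE = 600.0        # s: kernel compilation time without checkpointing (assumed)
SUSPEND_COST = 0.5      # s: VM suspension cost per checkpoint (assumed)
MIN_MIGRATION = 5.0     # s: minimum time to migrate one round of updates (assumed)


def overhead_percent(period: float) -> float:
    """Estimated overhead over the baseline for a given checkpointing period (s)."""
    if period >= BASELINE:
        return 0.0                              # at most a stray suspension per run
    effective = max(period, MIN_MIGRATION)      # migration cannot finish any faster
    suspensions = BASELINE / effective          # suspensions during one compilation
    return 100.0 * suspensions * SUSPEND_COST / BASELINE


if __name__ == "__main__":
    for p in (1, 10, 30, 60, 120, 600):
        print(f"period={p:>4}s  overhead ~= {overhead_percent(p):.1f}%")
}}}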