Jabba's Hutt

incr interface

Interface options for incremental checkpointing
 
a) cr_checkpoint -i -pid [pid]

- Any checkpoint taken with the -i flag is treated as incremental. Without the -i flag, a full checkpoint is taken.

b) cr_checkpoint -p period -pid pid

- Here the -p flag indicates the period of incremental checkpointing. The 'period' argument gives the number of incremental checkpoints to take between two full checkpoints. This is probably not the preferred approach. BLCR doesn't care what the optimal frequency of incremental checkpointing is. BLCR will only be told whether the current checkpoint is full or incremental.
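
If BLCR is only told "full" or "incremental", the period could instead live in a small wrapper outside BLCR. The following is a minimal sketch of that idea, not part of BLCR: the function name checkpoint_args is hypothetical, the -i and -pid flags are the ones proposed in option (a), and the command is only built as an argument list, never executed.

```python
def checkpoint_args(seq, period, pid):
    """Build the proposed cr_checkpoint argument list for one checkpoint.

    seq    -- 0-based count of checkpoints already taken for this process
    period -- number of incremental checkpoints between two full ones
    pid    -- target process id
    """
    if period < 0:
        raise ValueError("period must be non-negative")
    # One full checkpoint, then `period` incremental ones, then repeat.
    is_full = seq % (period + 1) == 0
    args = ["cr_checkpoint"]
    if not is_full:
        # Proposed flag from option (a); an assumption, not an existing BLCR flag.
        args.append("-i")
    args += ["-pid", str(pid)]
    return args


if __name__ == "__main__":
    # With period=3: full, incr, incr, incr, full
    for seq in range(5):
        print(" ".join(checkpoint_args(seq, period=3, pid=1234)))
```

With this split, the wrapper owns the frequency decision and BLCR only ever sees the presence or absence of -i.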

c) cr_checkpoint --full-incr [method] -pid pid

This is probably the best way to implement the flags. Here we pass the method type [write bit / dirty bit] along with the '--full-incr' flag at the start of a set of checkpoints. (This also means we can switch methods between two successive sets of checkpoints, even within the same process run.)
As for the successive incremental checkpoints after the above full checkpoint, there are again two ways of going about it.
(i) The '--full-incr' flag can set a flag in BLCR so that subsequent checkpoints given without any flag, like a normal blcr command, are taken as incremental (until the next command with '--full-incr' comes around, indicating the next set). For example:

cr_checkpoint --full-incr wb -pid XXX               // first full checkpointing
cr_checkpoint -pid XXX               // first incremental
cr_checkpoint -pid XXX               // second incremental
cr_checkpoint -pid XXX               // third incremental

cr_checkpoint --full-incr wb -pid XXX               // second set
or
cr_checkpoint --stop-incr -pid XXX            // Going back to default blcr checkpointing.




This has a couple of drawbacks. The first is that we will need a flag to stop the incremental run and go back to normal blcr checkpointing (something like '--stop-incr'). The other is that it introduces a certain amount of ambiguity into the normal checkpoint command: a user might forget that the checkpoints taken with the normal checkpointing command have actually been incremental since the last '--full-incr' flag. This is probably not a good way of doing it, but it was worth exploring since it requires only one special flag to remember (the '--full-incr' flag, plus '--stop-incr', but that is needed only once in a while).
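
The hidden state behind variant (i) can be made explicit with a small model. This is only a sketch of the proposed semantics, not BLCR code: the class StickySession and its return strings are hypothetical, and it assumes that '--stop-incr' itself takes a normal full checkpoint, which the proposal does not spell out.

```python
class StickySession:
    """Per-process state for variant (i): bare commands inherit the mode."""

    METHODS = ("wb", "db")  # write bit / dirty bit

    def __init__(self):
        # None means default (non-incremental) blcr behaviour.
        self.method = None

    def handle(self, flag=None, method=None):
        """Classify one cr_checkpoint invocation and update the sticky state."""
        if flag == "--full-incr":
            if method not in self.METHODS:
                raise ValueError("unknown method: %r" % (method,))
            self.method = method
            return "full (starts %s set)" % method
        if flag == "--stop-incr":
            # Assumption: this also takes an ordinary full checkpoint.
            self.method = None
            return "full (default blcr)"
        if flag is None:
            if self.method is None:
                return "full (default blcr)"
            # The same bare command now means something different.
            return "incremental (%s)" % self.method
        raise ValueError("unknown flag: %r" % (flag,))


if __name__ == "__main__":
    s = StickySession()
    print(s.handle())                     # bare command before any set
    print(s.handle("--full-incr", "wb"))  # start of a set
    print(s.handle())                     # identical bare command, now incremental
    print(s.handle("--stop-incr"))        # back to default
```

The two bare calls in the usage example are textually identical but are classified differently, which is exactly the ambiguity described above.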

(ii) The second way of doing this is to use the above approach for the full checkpoint (i.e. the '--full-incr' flag) and approach (a) (the -i flag) for the subsequent incremental checkpoints. For example:


cr_checkpoint --full-incr wb -pid XXX               // first full checkpointing
cr_checkpoint -i -pid XXX               // first incremental
cr_checkpoint -i -pid XXX               // second incremental
cr_checkpoint -i -pid XXX               // third incremental

cr_checkpoint --full-incr wb -pid XXX               // second set

This method is much clearer in terms of syntax, but it requires the user to remember two flags, one for the full checkpoint and one for the incremental ones. It also leaves room for user mistakes. We need to consider special scenarios like the following.

cr_checkpoint --full-incr wb -pid XXX               // first full checkpointing
cr_checkpoint -i -pid XXX               // first incremental

cr_checkpoint -pid XXX            // Normal blcr checkpoint   

cr_checkpoint -i -pid XXX               // second incremental
cr_checkpoint -i -pid XXX               // third incremental


In this case, should the incremental checkpoints be allowed when a normal checkpoint is taken in the middle of a set of checkpoints? Should the normal checkpoint be considered a full checkpoint and thus the start of the next set? If yes, which method (wb/db) do we fall back to?
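
To compare possible answers, the scenario can be run through a small model. This is a sketch only: the function classify and the three policy names ("ignore", "restart", "end") are hypothetical labels for candidate answers to the questions above, and nothing here has been decided for BLCR.

```python
def classify(commands, policy):
    """Classify a variant (ii) command sequence under one candidate policy.

    commands -- list of (flag, method); flag is "--full-incr", "-i" or None
    policy   -- effect of a normal (flagless) checkpoint on the current set:
                "ignore"  : the set continues, later -i checkpoints chain to it
                "restart" : the normal checkpoint starts a new set, same method
                "end"     : the set ends, later -i checkpoints are rejected
    """
    if policy not in ("ignore", "restart", "end"):
        raise ValueError("unknown policy: %r" % (policy,))
    method = None  # method of the currently open set, if any
    out = []
    for flag, arg in commands:
        if flag == "--full-incr":
            method = arg
            out.append("full/" + arg)
        elif flag == "-i":
            # An incremental checkpoint needs an open set to build on.
            out.append(("incr/" + method) if method else "error")
        elif flag is None:
            if policy == "end":
                method = None
            if policy == "restart" and method:
                out.append("full/" + method)
            else:
                out.append("full")
        else:
            raise ValueError("unknown flag: %r" % (flag,))
    return out


if __name__ == "__main__":
    # The scenario from the text: a normal checkpoint in the middle of a set.
    scenario = [("--full-incr", "wb"), ("-i", None), (None, None),
                ("-i", None), ("-i", None)]
    for policy in ("ignore", "restart", "end"):
        print(policy, classify(scenario, policy))
```

Under "restart" the previous method is reused, which is one possible answer to the wb/db question; under "end" the last two -i commands become errors instead.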