SimFactory should allow steering of parameters between restarts

Issue #487 new
Ian Hinder created an issue

sim create currently takes the --parfile option, and this parfile is used for every restart. Cactus allows parameters to change upon recovery - this is called parameter steering. SimFactory should allow this as well.

The obvious implementation is to allow a --parfile option to "sim submit" which uses the new parameter file in this and all subsequent restarts. This requires moving some of the parameter file substitution code from restart.create to restart.submit. We also need to decide what should happen to the copy of the parameter file stored in the top-level SIMFACTORY directory: should it remain the same, or should it be updated with the new parameter file? Subsequent restarts should inherit the parameter file from the previous restart.

Comments (7)

  1. Erik Schnetter

    The design of simfactory is to hide the details of running a simulation as much as possible. Ideally, one would point to an executable and a parameter file, and receive the results four weeks later. Technically, this is not yet reliably possible.

    In my mind, a simulation is defined by the source code and the parameter file. Changing these should not be possible. If one wants to change a parameter file or an executable (which are valid requests), then this defines a new simulation. For example, what happens if the new parameter file wasn't good? Or, how can one ensure that a simulation is exactly reproducible by someone else?

    I would thus change the request to "allow using checkpoint files from a different simulation". Upon creating the simulation, simfactory would copy (hard link if possible) the checkpoint files.
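    A minimal sketch of the "copy (hard link if possible)" step, not simfactory code. The function name, the directory arguments and the "checkpoint.*" file pattern are assumptions made for illustration only.

```python
import os
import shutil
from pathlib import Path


def link_or_copy_checkpoints(old_dir, new_dir, pattern="checkpoint.*"):
    """Hard-link checkpoint files from old_dir into new_dir, copying on failure.

    Returns a dict mapping each file name to "linked", "copied" or "exists".
    The default pattern is an assumption, not simfactory's actual naming.
    """
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    new_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for src in sorted(old_dir.glob(pattern)):
        if not src.is_file():
            continue
        dst = new_dir / src.name
        if dst.exists():
            # Never overwrite a file that is already in the new simulation.
            results[src.name] = "exists"
            continue
        try:
            os.link(src, dst)  # same filesystem: no extra disk space used
            results[src.name] = "linked"
        except OSError:
            shutil.copy2(src, dst)  # e.g. different filesystem: plain copy
            results[src.name] = "copied"
    return results
```

    Because a hard link shares the data with the original file, the old simulation's checkpoints stay intact even if the new simulation is later deleted.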

  2. Ian Hinder reporter

    I agree - I like this abstraction and it would be good to keep it. Maybe a good interface would be a new option "--continue" to the create command:

    sim create --continue [--parfile <parfile>] [--config <config>] <simname>

    The high-level view would be that there is an existing simulation which needs to be "continued" in a new simulation. The implementation would be that the checkpoint files of the old simulation are hard-linked into the new simulation. At the same time, a new parameter file can be provided to do parameter steering.

    One issue with this is that users will want to treat the resulting "combination" of simulations as a single physics run. Does this mean that analysis tools must now handle not only the case of data chunked into restarts, but also of collections of simulations corresponding to different parameter choices? When steering can be done within a simulation, this extra level of complication in analysis tools is not needed.

    If the new simulation contains a pointer to the simulation it was continued from, analysis tools can be given the last simulation in the sequence and work their way back to get the entire physics run.
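    A rough sketch of how an analysis tool could follow such pointers back from the last simulation. The pointer file name "continued-from.json" and its "parent" key are hypothetical; nothing like this exists in simfactory as far as this thread says.

```python
import json
from pathlib import Path

POINTER_FILE = "continued-from.json"  # hypothetical name, not a simfactory file


def write_pointer(sim_dir, parent_dir):
    """Record in sim_dir which simulation it was continued from."""
    data = {"parent": str(Path(parent_dir).resolve())}
    (Path(sim_dir) / POINTER_FILE).write_text(json.dumps(data))


def simulation_chain(last_sim_dir):
    """Return all simulations of one physics run, oldest first."""
    chain = []
    seen = set()
    current = Path(last_sim_dir).resolve()
    while current is not None:
        if current in seen:
            raise ValueError(f"pointer cycle detected at {current}")
        seen.add(current)
        chain.append(current)
        pointer = current / POINTER_FILE
        if pointer.is_file():
            parent = json.loads(pointer.read_text())["parent"]
            current = Path(parent).resolve()
        else:
            current = None  # reached the original simulation
    chain.reverse()
    return chain
```

    An analysis tool would then loop over the returned directories in order and treat their restarts as one continuous run.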

  3. Bruno Mundim

    I should say that this is the main reason I haven't switched to SimFactory yet. I have probably mentioned this before, but I should repeat here that I commonly output a different number of variables between the initial data and evolution parts of the simulation. I find it useful for initial inspection (to see if everything is all right before spending thousands of SUs). It would be really great if SimFactory allowed me to steer parameters so that I don't need to hack into its internals to force a change of parameters.

    I can also think of other examples where it could be useful: the case when you want to find the horizons with a different frequency in the inspiral phase than in the plunge and merger phases. Or someone may have a very expensive analysis routine that should be turned on only at a particular time of the simulation (which, for a new simulation, we cannot know in advance). Another example: your simulation quits with NaNs and you find out that you can avoid that by decreasing the Courant factor, or maybe by enlarging an AH mask in a hydro simulation; it would be very useful to still be able to do so easily with SimFactory. Currently I don't think it is possible without hacking it. Since the alternative, running my own scripts, gives me more freedom, I find it hard to justify all the time necessary to learn yet another tool (that may not give me such freedom). Erik's ideal of a simulation is unfortunately too unrealistic at present, since all these eventual tweaks to the parameters would need to be dealt with automatically somehow. I don't think that's possible.

    That said, maybe we can still use the same simulation infrastructure, and if we need to change the parameter file for some reason, the new file would be stored in the appropriate checkpoint directory and tagged accordingly. This way the simulation is still easily reproducible by someone else. What do you think?

  4. Roland Haas

    I have similar situations to the ones Bruno mentioned. In particular the AH issue happened during a current run for the CIGR BHNS project (and I "fixed" this by cloning the simulation directory by hand and modifying the parameter file). We have also been turning apparent horizon finding on and off during simulations, which we do by either modifying the parameter file or even using HTTPD (e.g. to turn off the AHFinderDirect after some evolution time has passed and to turn on an AHFinderDirect at the location of the merged BH once they merge).

    Generally I often find myself playing with parameter files to find good settings whenever a simulation fails. Once I have found good ones I can often put them into a single parameter file (e.g. using a simple thorn that lets me steer parameters at given iterations). While testing, though, it would be nice to be able to change parameter files easily. The --continue option might work for me in that respect. Would it be possible to link/mkdir in the output directories from the previous restarts instead of having the pointer mechanism that Ian suggested? That way the continued simulation would look like a normal one with changing parameter files, which would simplify postprocessing scripts.

    I fully agree that for a production run the simulation is the unit of source code and parameter file and that it should not change during the whole simulation. On the other hand, the vast majority of the SUs that I use go to testing runs, where code and parameters usually do change for me (I might find a bug, find NaNs...).

  5. Erik Schnetter

    It seems the Real World (TM) interferes with my idealised vision of it. Well, that happened before.

    The "submit" command is the wrong place to change the parameter file; job submission is only supposed to handle technical details, such as job queues, node counts, thread counts, etc. All the physics is supposed to be handled elsewhere.

    The current "create" command creates a simulation, which is then assumed to remain unmodified. I suggest adding a new command "modify" (or a similar name) which modifies a simulation, i.e. replaces an executable (with submit/run scripts), parameter file, etc. In this way, modifying a simulation is independent of submitting a new job; in practice, one may often want to do them at the same time, but these are really separate steps.

    There should then also be a "modify-submit" command, or (maybe better) the "create" and "modify" commands should submit automatically, unless an option "--no-submit" is given.

    Bruno, Roland, would this work for you?

    A point about "hacking simfactory's internals": Simfactory stores the executable and parameter file when creating a simulation, in subdirectories called "exe" and "par", so that they are easy to find. These are used when a new restart is submitted. You can modify them at any time; simfactory makes copies for every restart, so that there will still be a full, self-consistent description of previous (and currently queued) restarts. In fact, modifying this parameter file may be better than specifying a new one, since specifying a new one would require you to keep (elsewhere, outside simfactory) a copy of the parameter file -- which should not be necessary, since simfactory already reliably keeps such a copy.
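    To illustrate the copy-per-restart behaviour described above, here is a small sketch (not simfactory's actual code). The "par" subdirectory comes from the description above; the "restart-NNNN" directory naming and the function name are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def snapshot_parfile(sim_dir, restart_id):
    """Copy the simulation's current parameter file into a new restart directory.

    Each restart keeps its own copy, so editing the file under "par" later
    only affects restarts created after the edit.
    """
    sim_dir = Path(sim_dir)
    candidates = sorted((sim_dir / "par").glob("*.par"))
    if len(candidates) != 1:
        raise ValueError(f"expected exactly one parameter file, found {len(candidates)}")
    # Hypothetical naming scheme for restart directories.
    restart_dir = sim_dir / f"restart-{restart_id:04d}"
    restart_dir.mkdir(parents=True, exist_ok=False)
    dst = restart_dir / candidates[0].name
    shutil.copy2(candidates[0], dst)
    return dst
```

    With this layout, the history of parameter changes can be recovered by comparing the copies stored in consecutive restart directories.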

  6. Roland Haas

    I would find the modify option useful (more than the continue option, though that might also be useful in case I end up branching a simulation). It is more or less what I currently do. Together with Ian's proposed support for rpar files as well as par files I think it would suit my current workflow.
