Decouple submit and run scripts from configurations

Issue #73 new
Ian Hinder created an issue

Currently, SimFactory embeds the submit script and the run script in a configuration when the configuration is built. Neither of these is an intrinsic property of a configuration from a Cactus point of view. For example, they could be changed at job submission time. I think it would make a lot of sense for these to be decoupled from the configuration and chosen at job submission and job run time respectively. Here are some real-world situations where this would be useful:

1. At the AEI, we have workstations built into the same environment as our cluster, so that the same configuration/executable can run on both. At the moment, since the submission script (which would be different for a workstation and the cluster) is part of the configuration, you need to build two configurations even though it would be logical to just choose a different submit script at submit time.

2. When modifying a submit script or a run script, it is currently necessary to use "sim build". This can take a very long time (several minutes) on some clusters, even when there are no changes. There is no logical reason to recompile the configuration when only the submit or run script has changed.

3. It is very confusing that editing the submit script does not lead to the edited version being used in subsequent submissions. I have often forgotten that you need to run a new "build --submitscript..." command.

4. When SimFactory gains support for running testsuites, it will probably happen by having an alternate run script. In this case, it would be necessary to have a configuration specifically for testsuites, where the only difference is that the run script is a testsuite script rather than a parameter file script.

I propose removing --submitscript and --runscript from the "sim build" command and moving them to "sim submit" and "sim run" respectively.
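
To make the proposal concrete, here is a minimal sketch of how the options could be laid out after the move. It is not SimFactory's actual command-line code: the parser layout, the positional argument names and the example values ("bbh", "workstation.sub") are all hypothetical, and only the option names --submitscript and --runscript come from the proposal above.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Hypothetical layout: script options live on submit/run, not on build."""
    parser = argparse.ArgumentParser(prog="sim")
    sub = parser.add_subparsers(dest="command", required=True)

    # "build" no longer accepts any script option.
    build = sub.add_parser("build")
    build.add_argument("configuration")

    # The submit script is chosen at job submission time.
    submit = sub.add_parser("submit")
    submit.add_argument("simulation")
    submit.add_argument("--submitscript", default=None)

    # The run script is chosen at job run time.
    run = sub.add_parser("run")
    run.add_argument("simulation")
    run.add_argument("--runscript", default=None)
    return parser


# Example: same configuration, different submit script on a workstation.
args = make_parser().parse_args(
    ["submit", "bbh", "--submitscript", "workstation.sub"]
)
print(args.command, args.submitscript)
```

With this layout, passing --submitscript to "build" is rejected by the parser, so a script change can never trigger a rebuild.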

Comments (1)

  1. Erik Schnetter

    1. The run script is logically part of the Cactus configuration, since one has to use the same MPI version to build and to run an executable. The submit script is admittedly different -- at the moment, one can either submit an executable to a queuing system, or run it interactively. It thus makes sense to submit to different queuing systems, or to choose a queuing system only at the time of submission. However, the run script needs to be closely tied to the configuration; it is an omission in Cactus that this is not so. If you have an executable on a particular system, then there is only one correct way of running it.

    2. When a run script is modified, the executable is rebuilt because it contains (via Formaline) a copy of the run script. It is unfortunate that make takes so long to rebuild -- it shouldn't if there are no changes. There is no way to find out whether anything changed except to look at all files, which is just what make does. I'm afraid there is no way to speed things up, except in an unsafe manner (by not checking source files). You can, of course, just manually copy the new run script into the Cactus configuration directory.

    3. It is a feature that editing a run script does not lead to it being used. This prevents problems when you have different configurations with different run scripts or different MPI versions. Each configuration is self-contained once it has been built, so that one can change/update/modify anything one wants, and the configuration remains usable, even after weeks or months. This is important in production situations. It may be necessary to highlight this in the documentation.

    4. No, the Cactus test suite mechanism needs to be updated (or SimFactory needs to be able to translate) such that it uses the information in the mdb and in the Cactus submit/run scripts, depending on whether the test cases are checked interactively or in a submitted job.

    Moving the --submitscript to the submit command may make sense, or at least letting the user override the submit script.

    What kind of changes do you find yourself making to the run script and submit script? Are these corrections for debugging? Or would there be a more elegant way to make these changes via SimFactory, so that these changes are also logged?

    Could SimFactory make these changes automatically? This is the key point of SimFactory: to automate things to relieve people of low-level details. For example, running simulations on Damiana or on attached workstations is very similar to using a different queuing system, and SimFactory should totally support this, either by looking at the machine from where you run/submit, or via a --remote or --queue option.
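
As a rough illustration of the automatic selection described above, the following sketch picks a submit script from the host name of the machine, with an explicit queue taking precedence. Everything in it is an assumption made for illustration: the host-name patterns, the script file names and the function name are hypothetical and do not reflect SimFactory's mdb or its actual options.

```python
import fnmatch
from typing import Optional

# Hypothetical table: host-name pattern -> submit script. First match wins.
SUBMIT_SCRIPTS = [
    ("damiana*", "damiana.sub"),   # cluster login nodes (pattern is assumed)
    ("*", "workstation.sub"),      # anything else is treated as a workstation
]


def submit_script_for(hostname: str, queue: Optional[str] = None) -> str:
    """Choose a submit script from an explicit queue or from the host name."""
    if queue is not None:
        # An explicit choice (e.g. via a --queue option) overrides detection.
        return f"{queue}.sub"
    for pattern, script in SUBMIT_SCRIPTS:
        if fnmatch.fnmatchcase(hostname, pattern):
            return script
    raise LookupError(f"no submit script known for host {hostname!r}")


print(submit_script_for("damiana-login"))            # matched by "damiana*"
print(submit_script_for("my-desktop"))               # falls through to "*"
print(submit_script_for("my-desktop", "damiana"))    # explicit queue wins
```

The configuration itself stays untouched in this scheme; only the submit-time choice depends on where the job is submitted from.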
