Simfactory: it doesn't honor --procs when running testsuites

Issue #1150 resolved
Bruno Mundim created an issue

I have entered the following command:

sim create-submit trestles__2_8 --testsuite --procs 2 --num-threads 8

but trestles__2_8/output-0000/SIMFACTORY/RunScript only sets -np 1:

${MPICHDIR}/bin/mpirun_rsh -tv -np 1 -hostfile ${MPI_NODEFILE} /oasis/scratch/trestles/bcmundim/temp_project/simulations/trestles__2_8/SIMFACTORY/exe/cactus_sim -L 3 ${TESTSUITE_PARFILE}

This may be related to bug #1075. Thanks!

Comments (16)

  1. Roland Haas

    --procs is processors (cores) not MPI processes. The names have become confusing. Simfactory should likely abort in this case since you asked for more threads than cores.

    This is actually explained in the docs but very confusing (and I think simfactory is not consistent with its use of procs for processes and processors).

  2. Ian Hinder

    The documentation Roland is referring to is at http://simfactory.org/info/documentation/userguide/processterminology.html. I don't remember coming across a situation where it was found to be inconsistent with the simfactory code. On the other hand, the submit scripts and run scripts in the machine database entries often don't use these variables and terms correctly. I agree that the terminology is confusing. I have opened #1151.

    In your case, since you have 8 threads, you need --procs 16 to run with 2 processes.

    Note that this is nothing to do with testsuites; you should get the same behaviour if you run a normal job. If this is not the case, then it is indeed a bug.
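The arithmetic behind "--procs 16" can be sketched as below. This is only an illustration of the terminology described in this thread, not Simfactory's actual code: the function name `mpi_processes` is hypothetical, and raising an error when more threads than cores are requested is an assumption based on the suggestion in comment 1.

```python
def mpi_processes(procs: int, num_threads: int) -> int:
    """Return the number of MPI processes implied by --procs and --num-threads.

    Assumption: --procs is the total number of threads (one per core) and
    --num-threads is the number of threads per MPI process.
    """
    if procs < 1 or num_threads < 1:
        raise ValueError("procs and num_threads must be positive")
    if num_threads > procs:
        # Hypothetical policy: abort rather than silently fall back to -np 1.
        raise ValueError("more threads per process than total cores requested")
    if procs % num_threads != 0:
        # Hypothetical policy: refuse layouts that leave part of a process unfilled.
        raise ValueError("procs must be a multiple of num_threads")
    return procs // num_threads


# 16 total threads with 8 threads per process gives 2 MPI processes.
print(mpi_processes(16, 8))  # 2
```

With `procs=2` and `num_threads=8`, as in the original command, this sketch raises an error instead of producing a run script with a single process.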

  3. Bruno Mundim reporter

    ok, shouldn't --procs actually correspond to MPI processes/jobs/tasks? It is what I thought immediately, so I assume it is more intuitive...

  4. Bruno Mundim reporter

    also procs is described in http://simfactory.org/info/documentation/userguide/processterminology.html as the number of threads, which is even more confusing. We end up with the following possible mappings:

    --procs --> total number of threads
    --procs --> number of threads per MPI task (probably not, since --num-threads is described as such)
    --procs --> number of processes, meaning MPI tasks
    --procs --> number of processors, meaning CPUs
    --procs --> number of cores, for those thinking of a core as a computational unit (less probable though)

  5. Roland Haas

    I certainly was confused about procs as well, in particular since many places in the code use Procs to refer to MPI processes (e.g. CCTK_NProcs). This might be historical and based on the assumption that there are as many processes as processors (do we mean socket or core by that as well?), but it is what you are used to when writing MPI code.

    So my preference would be to rename --procs to --cores (if this is what the queuing system wants) and output a warning message if --procs is used. Since I think of procs as MPI processes, my personal favourite would of course be to have --procs be MPI processes, but that seems impossible to change now since all users of simfactory are used to the old meaning.

    We might also want to have a look at how simfactory handles undersubscribing a node, e.g. running only 2 MPI processes with 4 threads each on a node that has, say, 16 cores (since one needs lots of memory). I think this is currently possible, but I would double check that the naming convention there is not "unfortunate". :-)

  6. Bruno Mundim reporter

    Since I think of procs as MPI processes, my personal favourite would of course be to have --procs be MPI processes, but that seems impossible to change now since all users of simfactory are used to the old meaning.

    My feeling is that everyone thinks it is confusing, but no one wants to change it, since people either got used to it or do not have a better alternative in mind. I think we should propose a better alternative and change it, by mutual agreement of course. New users will appreciate more intuitive command options.

  7. Erik Schnetter

    People rarely want to specify the number of MPI processes. They typically want to specify the number of cores that they request from the queueing system, since this is what they pay for. Given that the number of OpenMP threads is also fixed (either at one, or at a small integer number depending on the system hardware), the actual number of MPI processes then needs to be calculated automatically. This is guaranteed to always lead to a "sensible" answer.

    The converse, e.g. specifying MPI processes and threads, can easily lead to underused nodes. Specifying cores and MPI processes can easily lead to strange (non-optimal) numbers of OpenMP threads.

    Under- and over-subscribing should work fine in Simfactory, apart from particular submit scripts that may not be able to handle this.
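The two problem cases mentioned in this comment can be made concrete with a small sketch. The helper names `idle_cores` and `threads_for` are hypothetical, and the sketch assumes that the queueing system hands out whole nodes; it is not taken from Simfactory.

```python
import math


def idle_cores(processes: int, threads: int, cores_per_node: int) -> int:
    """Cores left unused when total MPI processes and threads per process are given.

    Assumption: whole nodes are allocated, so the request is rounded up
    to a multiple of cores_per_node.
    """
    used = processes * threads
    nodes = math.ceil(used / cores_per_node)
    return nodes * cores_per_node - used


def threads_for(cores: int, processes: int) -> tuple:
    """Threads per process, and leftover cores, when cores and MPI processes are given."""
    return divmod(cores, processes)


# 3 processes with 4 threads each on 16-core nodes leave 4 cores idle.
print(idle_cores(3, 4, 16))  # 4
# 16 cores split over 3 processes give 5 threads each, with 1 core left over.
print(threads_for(16, 3))  # (5, 1)
```

Specifying cores and threads instead, and deriving the number of processes, avoids both cases as long as the thread count divides the core count.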

  8. Bruno Mundim reporter

    People rarely want to specify the number of MPI processes. They typically want to specify the number of cores that they request from the queueing system, since this is what they pay for.

    Wouldn't it be more natural for them to submit their jobs through simfactory with a --cores option?

    Given that the number of OpenMP threads is also fixed (either at one, or at a small integer number depending on the system hardware), the actual number of MPI processes then needs to be calculated automatically. This is guaranteed to always lead to a "sensible" answer.

    I agree. Most of the time users won't care about the number of MPI processes and/or the number of threads. That's what the machine database in simfactory is there for and, as far as I understand, you wouldn't need to specify the number of threads per process you want, only the total number of cores or nodes (another interesting option would be --nodes or --num-nodes). However, when we want to experiment with other configurations, the options become unintuitive. Also, when I suggested --tasks or --mpi-tasks or --num-mpitasks, I was referring to tasks per node. Again, that would be important only if the user wants to experiment (or to run testsuites).

    The converse, e.g. specifying MPI processes and threads, can easily lead to underused nodes.

    That's correct if we are specifying the total number of MPI processes. That wouldn't be a problem if we specify the number of MPI processes per node in the same fashion we do specify the number of threads per MPI process.

    Specifying cores and MPI processes can easily lead to strange (non-optimal) numbers of OpenMP threads.

    I agree. We either specify one or the other. Not both at the same time.

    Under- and over-subscribing should work fine in Simfactory, apart from particular submit scripts that may not be able to handle this.

    Ok, so it is only a matter of making the option names more intuitive for the work they are supposed to do.

  9. Ian Hinder

    --procs is ambiguous, so we should deprecate it. It really means "total number of threads which will be launched", but from the user's point of view, it also usually corresponds to "cores". I would like to be able to specify "--cores" (i.e. the amount of hardware I want to use), on the understanding that I will get that many cores, and that there will be one thread per core (the sensible choice). We should keep --procs for compatibility to mean exactly what it means now. We can introduce "--processes" which will determine the number of MPI processes; this is probably only ever used when testing, and when running test suites.

  10. Erik Schnetter

    We should do this, together with updating documentation and examples, but should do it after the release. It wouldn't make sense to deprecate "procs" without having thoroughly tested "cores".

  11. Frank Löffler

    I agree about --cores and the default of one thread per core.

    About options to deviate from that, we should provide what users will most likely want to complement --cores above. I would think that this is most likely:

    --num-processes: number of MPI processes per node
    --num-threads  : number of threads per MPI process

    Obviously specifying only one of the latter two should set the other such that a whole node is filled as much as possible (not oversubscribing). Specifying both would be used for over- or undersubscribing nodes.

    In some sense --cores would be used for how much you request from the queuing system and the other two would be used to change the default layout of how these cores will be used (mpi/openmp).
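The per-node layout rule proposed in this comment could look roughly like the sketch below. Everything here is hypothetical: the function `node_layout`, its parameters, and the `default_threads` fallback (standing in for a machine database default) are assumptions for illustration, not existing Simfactory options or code.

```python
from typing import Optional, Tuple


def node_layout(
    cores_per_node: int,
    num_processes: Optional[int] = None,
    num_threads: Optional[int] = None,
    default_threads: int = 1,
) -> Tuple[int, int]:
    """Return (MPI processes per node, threads per MPI process).

    If only one value is given, the other is chosen to fill the node as far
    as possible without oversubscribing. If both are given, they are used
    unchanged, which allows deliberate over- or undersubscription.
    """
    if num_processes is None and num_threads is None:
        # Assumed fallback: a per-machine default thread count.
        num_threads = default_threads
    if num_processes is None:
        if not 1 <= num_threads <= cores_per_node:
            raise ValueError("num_threads must be between 1 and cores_per_node")
        num_processes = cores_per_node // num_threads
    elif num_threads is None:
        if not 1 <= num_processes <= cores_per_node:
            raise ValueError("num_processes must be between 1 and cores_per_node")
        num_threads = cores_per_node // num_processes
    return num_processes, num_threads


print(node_layout(16, num_threads=8))                   # (2, 8)
print(node_layout(16, num_processes=4))                 # (4, 4)
print(node_layout(16, num_processes=2, num_threads=4))  # (2, 4)
```

The last call is the undersubscription case from comment 5: 2 processes with 4 threads each on a 16-core node, leaving 8 cores free for memory-hungry runs.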

  12. Bruno Mundim reporter

    I like Frank's complementary idea, and surely we should implement this after the release.