Test cases with a specific number of processes

Issue #1075 resolved
Erik Schnetter created an issue

Some test cases require a specific number of processes. Instead of not running, they should be run on that number of processes. This would ensure that all tests are always executed.


Comments (48)

  1. Frank Löffler

    While it might be possible to do that with fewer processes than specified, doing so with more is usually not possible.

  2. Ian Hinder

    I disagree. The test system chooses the number of processes to launch and calls the given mpirun command to do so. There is nothing that says we have to run the "right" number of processes for the number of cores and threads that we have. I agree that this would be a very nice thing to have. I think it is much more user-friendly than just not running some tests.

  3. Frank Löffler

    When I specify to run a testsuite on, let's say, 2 processes, I might only have two. That means MPI only knows about two and will choke on a request for, say, four or even more. Even if it did not, it might actually bring down the machine or at least slow it down more than it should. After all, I requested only 2 processes. Thus, I think it can be problematic to try to run testsuites with more processes than have been specified.

  4. Ian Hinder

    Will mpirun choke on a request for more? That must be system-dependent. If I run on my laptop, I can ask mpirun for as many processes as I like. I think Erik calls this "oversubscribing". I don't think that running a test case on 2 processes rather than 1 is going to "bring a machine down". The performance will of course be lower, but these are not benchmarks.

  5. Frank Löffler

    Some MPI implementations will choke when asked for more processes than are listed in the hostlist they get from the batch system. You are right about 2 processes on a 1-processor system, but it is easy to come up with more problematic scenarios.

    What I think is more important is to fix why we need these specific processor numbers in the first place. Most of the time it is probably the ASCII output. Assuming we finally get someone working on HDF5 testsuites, this could be history.

  6. Roland Haas

    HDF5 test suites will only help if we also either redesign the internal HDF5 format or write a clever comparison program. Right now, even 1D or 2D HDF5 files internally show how many components (i.e. processes) existed. The layout is very similar to CarpetIOASCII's output, just with HDF5 datasets instead of blocks of numbers.

    You can get almost process-number-independent output when using compact_format with no headers and no ghost or symmetry points. The only remaining difference is some extra blank lines, which are currently significant for the test system. One could very likely make the compact format do away with the extra empty lines.

  7. Ian Hinder

    I would stick with ASCII and do what Roland suggests. We could then modify the tests so that they can run on any number of processes.

  8. Roland Haas

    It turns out that removing the "extra" blank line is harder than expected. It is the eol output on line 1450 of ioascii.cc:

    1445 for (int d=0; d<outdim; ++d) {
    1446   if ((*it)[d]!=(*ext.end())[d]) break;
    1447   if (not compact_format or ext.shape()[d] > ext.stride()[d]) {
    1448     // In the compact format, don't separate outputs that
    1449     // consist of a single lines only
    1450     os << eol;
    1451   }
    1452 } 
    

    which is output for all points that touch the outer edge of the output region for any component (i.e. any one of x = xmax, y = ymax, z = zmax holds). Unfortunately this means that it will always trigger for the last point in the output region of a component, and since the components are already process-decomposed, there is one blank line per MPI process. A way around this would be to only output a blank line once we are at the edge of the containing superregion, which unfortunately requires the code to compute whether this component will be (once we are all done with output) the last one output for this superregion. It is also hard to insert the separating blank line at the beginning of each chunk, since no blank line should be output if only a single line of output will be produced.

  9. Frank Löffler

    Might the order of the lines also differ, especially for a large number of processes? Wouldn't it be better to make the Perl comparison tool understand that these differences are not important? One way could be to pre-process the files before diffing: sort by iteration and coordinates (or indices), remove blank lines, and then compare.
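
    As a rough sketch of that idea (not the actual test-system code; it assumes the leading whitespace-separated columns of a CarpetIOASCII data line are iteration and index columns):

    #!/usr/bin/perl
    # Normalize a CarpetIOASCII output file so that the result no longer depends
    # on the process decomposition: drop blank and comment lines, then sort the
    # data lines numerically by their leading columns.
    use strict;
    use warnings;

    sub normalize {
        my ($file) = @_;
        open(my $fh, '<', $file) or die "Cannot open $file: $!";
        my @lines = grep { /\S/ and not /^\s*#/ } <$fh>;  # keep only data lines
        close($fh);
        return sort {
            my @ca = split ' ', $a;
            my @cb = split ' ', $b;
            $ca[0] <=> $cb[0] || $ca[1] <=> $cb[1] || $ca[2] <=> $cb[2];
        } @lines;
    }

    # A comparison would then diff the normalized line lists of the two files.
    my @ref = normalize($ARGV[0]);
    my @new = normalize($ARGV[1]);
    print "files differ\n" unless join('', @ref) eq join('', @new);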

  10. Erik Schnetter reporter
    • changed status to open

    The attached patch changes the number of processes with which test cases run.

  11. Roland Haas

    Until we change either the test comparison algorithm to ignore whitespace or make sure that output from Carpet's IOASCII thorn is processor-independent, I believe the patch in comment:10 would cause test failures when a test is run on a different number of processes than it was designed for (at least for the test suites where I explicitly specified a number of processes, which was always due to the test otherwise failing [e.g. grid too small to be split over two processes] or producing different answers [e.g. CarpetIOASCII]).

  12. Erik Schnetter reporter

    The patch in comment:10 ensures that a test is always run on the number of processes it specified. Currently the test would be skipped; with this patch, it is run, using the correct number of processes.

  13. Frank Löffler

    Does Cactus currently output how many MPI processes have been used for each test separately? We would need that in order to not get confused about how many processes have in fact been used because with this patch it might be different to what has been specified.

    Also, can we have this tested before a commit: Are mpi implementations going to choke on specifying more processes in mpirun than have been requested using the queue?

  14. Roland Haas

    Replying to [comment:13 knarf]:

        Does Cactus currently output how many MPI processes have been used for each test separately? We would need that in order to not get confused about how many processes have in fact been used because with this patch it might be different to what has been specified.

        Also, can we have this tested before a commit: Are mpi implementations going to choke on specifying more processes in mpirun than have been requested using the queue?

    The ones using Infiniband usually do in my experience, at least if you were to actually ask for more processes than there are cores on the node. I tried this occasionally when trying to debug something that needed many cores but not much memory on the clusters where you can get an interactive node (ranger, zwicky, kraken [I believe]).

  15. Ian Hinder

    I wouldn't worry about this. Typical machines have at least 4 cores per node, and I don't think any of the tests ask for more processes than this. We can still ask for as many OpenMP threads as we like, in order to test OpenMP.

    Cactus does not output the number of MPI processes used for the test, but Carpet does, to standard output ("There are XX processes in total"). It might be convenient to modify the test system to output the number of processes to its log file for each test as well. Of course, this is going to break all the ad-hoc parsers because we are not using an extensible output format.

  16. Frank Löffler

    This is a problem regardless of the number of cores. Some MPI implementations look for a file containing a list of the nodes used, together with the number of processes on each, e.g.

    cnode01: 4
    cnode02: 4

    Now consider such a file when requesting only one process:

    cnode01: 1

    When you then try to run with two, the MPI implementation will complain that it doesn't know where to run two processes. After all, the list only contains one, and you are not supposed to run more than one because you only requested one through the queuing system.

  17. Erik Schnetter reporter

    A queueing system does not provide MPI processes and OpenMP threads, it provides nodes and cores. One has to map these manually, in particular when OpenMP is used; the queueing systems' output is usually not good enough for this. The respective logic is present in most Simfactory run scripts. Simfactory supports over- and under-subscribing, and this works fine on most systems (probably on all except BlueGene/P).

    In your case above, most MPI implementations will wrap around, and assign the second process to cnode01 as well.

  18. Frank Löffler

    I agree that simfactory usually handles/parses/changes these lists. However, simfactory wouldn't know here that one particular testsuite should be run with a different number of processes than specified (even to simfactory). The problem isn't really made easier by simfactory.

    If an MPI implementation wraps around, there should be no problem, yes. However, some don't, because especially when asking for 1 process (and, e.g., one OpenMP thread) you might actually get only part of a node, and running two processes there would oversubscribe the system and hurt other users (granted, using OpenMP would as well). In any case: what do you propose to do on systems that do not wrap around? Currently the test would be marked as a failure.

    Could we not avoid this issue by only overriding the user's choice if the number of processes specified by the testsuite is smaller than what the user specified?

  19. Erik Schnetter reporter

    Couldn't we just test it and see what happens? I have not encountered problems.

  20. Ian Hinder

    HPC machines must support the ability to run different numbers of MPI processes due to not all codes supporting OpenMP. At least up to the number of cores, there is always a way to run that many MPI processes. For a given machine, we will be able to come up with simfactory command-line arguments which give the correct number of MPI processes, and then we can choose the number of threads separately from this. Assuming that simfactory requests resources sufficient to run the maximum number of MPI processes required by any test, there is no limit imposed by the queuing system.

    A separate issue (as I think you are referring to) is that if simfactory is configured to run a simulation on N processes, the run script will have this hard-coded after expansion by simfactory; i.e. the run script (which is used in the "mpirun" command by the simfactory testsuite mechanism, http://git.barrywardell.net/simfactory2.git/blob/HEAD:/bin/RunTestSuite) doesn't look at the -np option given by the test system. We should think about how we can modify simfactory to run the correct number of MPI processes in this case, unless Erik has done so already.

    Erik: I believe the only potential problem is the one related to simfactory, and that is definitely a problem unless it has already been fixed.

  21. Frank Löffler

    I agree that we are able to request more processes through the queuing system than needed, and then run most tests with fewer, and a few (that request more) with up to the requested amount. But that is not my point. Applying this patch as is would break the testsuite mechanism for cases that currently work; it would be a regression. Simfactory might be used to circumvent this a bit, but it also has to work without it.

    Let me suggest something else. What I believe is a problem is that with this patch Cactus will try "to be smart", override a user's decision, and might actually fail. What I believe the intention of the patch is, is that Cactus should run as many testsuites as possible, even if that means using different numbers of processes for different tests. So, why don't we add an option to the test system that, instead of specifying a fixed number of processes, specifies 'as many processes as needed', and if nothing is specified then use the default of 1 (which we should be able to change, and simfactory might actually do that). This would introduce one more option for the testsuite mechanism. It wouldn't break anything if this is not the default in Cactus, and it could be the default within simfactory, because there we know whether a machine supports over-subscribing or not.

  22. Roland Haas

    I just tried it using simfactory on lonestar. I used

    sim create-submit bns_all-tests --procs 1 --testsuite --configuration bns_all --walltime 0:20:0

    which unfortunately then ran all tests on 1 process, even the one (ML_BSSN_Test/ML_BSSN_NewRad) that requested 2 processes. I suspect this is due to simfactory not going through the test-suite mechanism and instead using its own RunScript to start the run.

    The test still claimed everything worked, though close inspection showed that it says "Success: 0 files identical" and that in fact the files are not identical (I ran a vimdiff). So Ian's comment points out a real issue.

    I have also tried the reverse, and the test system offered to run a 1-processor test with 2 processes on my workstation:

    Issuing mpirun -np 2 /mnt/data/rhaas/postdoc/gr/Zelmani/exe/cactus_bns_all /mnt/data/rhaas/postdoc/gr/Zelmani/arrangements/EinsteinAnalysis/AHFinderDirect/test/checkpointML.par

    Something is still odd.

  23. Roland Haas

    Ignoring NPROCS for a test was due to the NPROCS information not being passed along in the correct data structure (testdata vs. rundata). I updated the patch and attached a replacement version. This does nothing for simfactory, of course.

  24. Roland Haas

    Cleaned up the patch and added code to output the number of MPI processes used for a test if it is not the user-specified default.

    We should also rename the $nprocs_required and $nprocs_available variables to something more in line with what we (now) do, maybe $nprocs_requested_by_test and $nprocs_requested_by_user.
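
    As a minimal sketch, logging of a non-default process count could look roughly like this (the variable names follow the RunTestUtils.pl excerpt quoted later in this thread; the message text and destination are illustrative, not the actual patch):

    # Hypothetical logging sketch: report when a test runs on a process count
    # that differs from the user-specified default.
    if ($NPROCS != $config_data->{'NPROCS'}) {
      print "  $thorn/$test: run on $NPROCS processes " .
            "(user default is $config_data->{'NPROCS'})\n";
    }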

  25. Ian Hinder

    Did anyone test that this new version works with SimFactory before committing the change? I explained why it would not work (comment:20), and Roland tested that this was the case. As of now, I think the test system tries to run all the tests, and assumes that it can set the number of processes in the mpirun command. This does not work; SimFactory assumes that @NUM_PROCS@ is correct, and this doesn't come from the test system. If you now try to run a set of tests with different process number requirements, some of them are always going to fail, because they are always run on the same number of processes.

    I think that such a drastic change to the test system, one that breaks the ability to use simfactory to run the tests, should not be applied just before a release. I propose reverting this commit, and revisiting this after the release when we can come up with a comprehensive solution that ensures that simfactory can still work correctly.

  26. Roland Haas

    Well I was also the one to commit the flesh change to the test system. So clearly I had forgotten my own comment. I did not test it with simfactory at that time.

    Reverting it would be ok with me unless we can come up with a tested, simple solution.

    I am not familiar with the way simfactory interfaces with the test system (but assume it to be horrible). Would a carefully crafted "mpirun" option to the test system be sufficient, i.e. one that does simfactory run --procs XXX --num-threads YYY?

  27. Erik Schnetter reporter

    Simfactory uses the "make sim-testsuite" command, setting several environment variables to pass options. This part of the interface is quite clean.

    Let's revert the patch.

  28. Erik Schnetter reporter

    I notice that you moved the release milestone, which also happened for the last release. Does this make sense? Either it is release critical (then it needs to be addressed), or the milestone can easily be moved (then it may as well be removed altogether). I don't see any particular reason why this feature is important for the 2013 November release.

  29. Frank Löffler

    Moving the release target again is unfortunate, I agree. It is not critical, however. The release tag should only indicate that this "should be fixed by that release".

    It would be nice if issues with the test system got more attention, though. I would think they should be critical. The past showed that while most probably agree with that statement, reality tells a different story. As for the next release: I don't think whatever solution we might come up with would make it into the release. There isn't enough time for it. I still think this is important enough to keep reminding ourselves.

    On this specific issue: the underlying root issue is that most testsuites that set a specific number of processes run just fine on other numbers, but their output currently cannot be successfully compared by Cactus. This could be fixed either by Cactus being able to compare the files, or by having processor-number-independent output files. The former was started by a student at LSU, but died after the student decided to stop working on it (undergraduate...). The latter would be nice, but hasn't happened so far, and especially the performance implications are unclear. So, in order to pick up one or the other again in the future we should keep this ticket open - and it is important enough to keep reminding us about this embarrassing issue.

  30. Ian Hinder

    I see the milestone as an indication that this is a goal for the next release. This is something that we explicitly want to work on, rather than something which is just "major", with no indication of whether it will ever be addressed. Being a milestone does not mean that it is a release blocker, or is critical in any way. The fact that it keeps slipping means that there have been insufficient resources to implement it so far, but I think the milestones should be allowed to move like this, as they demonstrate intention to work on things.

  31. Ian Hinder

    I believe that this can be made to work with SimFactory, but it will require modifications to SimFactory. When SimFactory needs to run the tests as part of a simulation, it does this by using the file simfactory/bin/RunTestSuite as its "run script", passing the real machine runscript as an argument to that script. RunTestSuite then sets CCTK_TESTSUITE_RUN_COMMAND to call the passed runscript on '$parfile', which is supplied by the test system. The test system will also supply '$nprocs', since this command is usually an mpirun command, but simfactory ignores this at present. The reason is because the runscript has already been expanded at this point, so that @NUM_PROCS@ has been replaced with the number of processes that the whole simulation is running on. One solution would be to modify simfactory so that, in the case of running a testsuite, it expands @NUM_PROCS@ to a reference to an environment variable, e.g. $NUM_PROCS. The CCTK_TESTSUITE_RUN_COMMAND set in RunTestSuite can then set this variable to '$nprocs' before launching the machine's run script. This is similar to what is done for the parameter file; there is special code in simfactory to look at a TESTSUITE_PARFILE environment variable which is passed by the CCTK_TESTSUITE_RUN_COMMAND.

    This assumes that changing only NUM_PROCS in the machine run script is sufficient to do what we want; other variables might have been used, and some mpirun commands might take the number of cores or something similar instead. I expect this to break some systems.
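
    As a rough sketch of the test-system side of that idea (this is not existing SimFactory or flesh code; the environment variable name TESTSUITE_NPROCS is invented here, by analogy with the TESTSUITE_PARFILE variable mentioned above):

    # Hypothetical: in addition to substituting $nprocs into the run command,
    # also export the per-test process count, so that a run script whose
    # @NUM_PROCS@ has been redirected to an environment variable can pick it up.
    $cmd =~ s/\$nprocs/$NPROCS/g;
    $ENV{'TESTSUITE_NPROCS'} = $NPROCS;   # invented name, analogous to TESTSUITE_PARFILE
    system($cmd) == 0
      or warn "test run command failed with status $?\n";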

  32. Frank Löffler

    What we should do (after the ET_2013_11 release) is:

    - When a testsuite doesn't specify a certain number of processes, run it with however many have been specified as the default (and if nothing was specified there, use some reasonable default, which at the moment is 2).
    - When a testsuite does specify a particular number of processes, use this number. In particular (at least for the moment), also use more than had been specified as the default. Since currently we don't have testsuites with a large number of processes, this is hopefully fine on most machines. We might have to change this, though, if we encounter too many problems or turn towards testsuites with so many processes that single-processor systems would be unreasonably overloaded. What we would likely do then is use the specified default as an upper bound; testsuites that specify more resources wouldn't be run in that case.

  33. Roland Haas

    The pull request https://bitbucket.org/cactuscode/cactus/pull-requests/73/rhaas-lowproctests/diff implements the more conservative approach of running all tests requesting up to NPROCS processes. This avoids the issue of oversubscribing, which is indeed an issue on some clusters where asking the queuing system for X cores gets you exactly X cores, even if the node I run on (even exclusively) has Y>X cores. It also avoids issues with simfactory's RunScripts pinning the code to cores, which will fail if more cores are used than simfactory expected.

    OpenMPI can also be configured to disallow oversubscribing a machine, and is by default on Debian/Ubuntu; this is already an issue for the 4-process tests on student laptops with 2 cores.

    Even undersubscribing can fail, however, in cases where mpirun (or its equivalent) is too tightly interwoven with the queuing system and always starts as many MPI ranks as were requested from the queuing system (and e.g. SLURM does know how many MPI ranks are used).

    There is one further thing to consider: the implementation as proposed (and all older ones, as far as I can tell) makes it impossible to, for example, run only the 4-process tests and not the 1- and 2-process tests.

    Please review.

  34. Steven R. Brandt

    @Roland Haas if I understand what you’re saying, you want to allow a test designed for 6 processes to run on 4? In my experience with CaFunwave, if you change the number of processes, the ASCII I/O format sometimes changes, resulting in false negatives. AFAIK, you have to use exactly the same number of processes if you want the test to pass reliably.

  35. Roland Haas

    Not quite. Tests will always run with the number of processes they specify (if they specify any), i.e. if NPROCS is specified in test.ccl for either the thorn or the test parfile, then that many MPI ranks will be used to run the test. However, if the testsuite is run with, say, NPROCS=6 available, then tests that specify NPROCS=4 will be run, but with 4 MPI ranks. Right now those tests would not be run at all.

    The NPROCS value that is used is set in lines 1600ff (https://bitbucket.org/cactuscode/cactus/pull-requests/73/rhaas-lowproctests/diff#Llib/sbin/RunTestUtils.plT1600):

    my $NPROCS;
    if($rundata->{"$thorn NPROCS"}) {
      $NPROCS = $rundata->{"$thorn NPROCS"};
    } elsif($rundata->{"$thorn $test NPROCS"}) {
      $NPROCS = $rundata->{"$thorn $test NPROCS"};
    } else {
      $NPROCS = $config_data->{'NPROCS'};
    }
    $cmd =~ s/\$nprocs/$NPROCS/g;
    

    where the number of MPI ranks used is set to either the NPROCS setting for the thorn or the test, or the default chosen by the user. The only tests marked as runnable are those that request an NPROCS smaller than or equal to the one in config_data, with those that request more being set to UNRUNNABLE. This is done in lines 2130ff (https://bitbucket.org/cactuscode/cactus/pull-requests/73/rhaas-lowproctests/diff#Llib/sbin/RunTestUtils.plT2130):

    elsif ($nprocs_required > $nprocs_available)
    {
      $testdata->{"$thorn UNRUNNABLE"} .= "$testbase ";
      $testdata->{"$thorn $testbase NPROCS"} = $nprocs_required;
    }
    

    where the elsif condition used to be $nprocs_required != $nprocs_available, i.e. all tests requiring a number of MPI ranks different from the default were marked as not runnable.

    Having said this, I notice that the conditions in lines 1600ff are in the wrong order: a test-specific NPROCS should take precedence over a thorn-specific NPROCS, so the if and elsif branches should be swapped.
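
    For illustration, a sketch of the swapped branches (using the same data structures as in the excerpt above) might look like this:

    my $NPROCS;
    if ($rundata->{"$thorn $test NPROCS"}) {        # test-specific setting takes precedence
      $NPROCS = $rundata->{"$thorn $test NPROCS"};
    } elsif ($rundata->{"$thorn NPROCS"}) {         # then the thorn-wide setting
      $NPROCS = $rundata->{"$thorn NPROCS"};
    } else {
      $NPROCS = $config_data->{'NPROCS'};           # finally the user-chosen default
    }
    $cmd =~ s/\$nprocs/$NPROCS/g;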

  36. Roland Haas
    • changed status to open

    Works in plain Cactus, i.e. with make sim-testsuite, but breaks with simfactory --testsuite, since simfactory has to use the RunScripts and those scripts hard-code the number of MPI ranks via @NUM_PROCS@.

    I am testing a fix to simfactory right now.
