comet files in simfactory use one MPI rank per node

Issue #2187 resolved
Roland Haas created an issue

The current comet.ini (https://bitbucket.org/simfactory/simfactory2/src/master/mdb/machines/comet.ini) uses 1 MPI rank per node:

max-num-threads = 24
num-threads     = 24

This is usually not the best way to set things up. I would, e.g., have expected the default choice to be something like 1 MPI rank per NUMA domain. Unless limited by communication overhead, we seem to obtain the fastest per-node performance when using only MPI and no OpenMP (a speedup of about 50% on my 12-core workstation with 2 NUMA domains). So if anyone is using Comet for production work and wants to contribute their machine description file, that would be great.
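To make the "1 MPI rank per NUMA domain" suggestion concrete, here is a minimal sketch of the arithmetic. The helper name is hypothetical, and the number of NUMA domains on a Comet node is an assumption (the issue only states it for the workstation); the resulting value is what one would presumably put into num-threads.

```python
def threads_per_numa_rank(cores_per_node: int, numa_domains: int) -> int:
    """Threads per MPI rank when each NUMA domain hosts exactly one rank."""
    if numa_domains <= 0 or cores_per_node % numa_domains != 0:
        raise ValueError("cores must divide evenly across NUMA domains")
    return cores_per_node // numa_domains


# Workstation from the report: 12 cores, 2 NUMA domains.
workstation_threads = threads_per_numa_rank(12, 2)

# Comet: 24 cores per node; 2 NUMA domains per node is an ASSUMPTION here.
comet_threads = threads_per_numa_rank(24, 2)

print(workstation_threads, comet_threads)  # 6 12
```

The number of ranks per node then equals the number of NUMA domains, as opposed to the single 24-thread rank the current file sets up.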

Keyword: None

Comments (6)

  1. Roland Haas reporter

    Private conversations with users running production simulations on Comet (in 2018, so I am a bit tardy reporting this) indicate that the best performance was achieved with 4 threads per MPI rank and 4 MPI ranks per node, i.e. leaving 8 cores per node empty (Comet has 24 cores per node).

  2. Roland Haas reporter

    Changed to use 6 threads per rank in git hash 5ea0f7b "comet.ini: use 6 threads by default" of simfactory2 as a stopgap measure. This needs to be properly measured with a couple of test runs to find a good setting for runs typical of Comet.
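As a starting point for such test runs, the following sketch lists the rank/thread layouts that fill all 24 cores of a Comet node, and checks how many cores the 4 x 4 layout from the first comment leaves idle. The function names are hypothetical and nothing here is part of simfactory; it only enumerates candidates, it does not measure anything.

```python
CORES_PER_NODE = 24  # Comet, as stated in the first comment


def full_layouts(cores):
    """(threads_per_rank, ranks_per_node) pairs that use every core."""
    return [(t, cores // t) for t in range(1, cores + 1) if cores % t == 0]


def idle_cores(cores, ranks, threads):
    """Cores left unused on one node by ranks * threads."""
    used = ranks * threads
    if used > cores:
        raise ValueError("layout oversubscribes the node")
    return cores - used


# Fully subscribed candidates; includes the 6-thread stopgap (4 ranks x 6).
for threads, ranks in full_layouts(CORES_PER_NODE):
    print(f"{ranks:2d} ranks x {threads:2d} threads")

# Under-subscribed layout reported by Comet users: 4 ranks x 4 threads.
print(idle_cores(CORES_PER_NODE, 4, 4))  # 8
```

Under-subscribed layouts such as 4 x 4 are not produced by full_layouts and would have to be added to the test matrix by hand.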
