Carpet: Disable OpenMP parallelization of transport operators

Issue #2072 closed
Erik Schnetter created an issue

This disables the OpenMP parallelization of Carpet's transport operators. I have observed that this leads to a significant speedup when many threads are used.

The likely reason is that the parallelized regions are typically small. A typical region would be, e.g., the lower x-boundary of one component of one grid variable. The OpenMP thread startup overhead and the cache misses caused by parallelizing such a region are then larger than any benefit.
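To illustrate the size argument, here is a minimal sketch, not Carpet's actual code. It copies a slab of x-planes between two arrays and uses OpenMP's `if` clause so that the loop is only run by multiple threads when the region is large. The names `copy_x_slab` and `min_parallel_points` and the threshold value are assumptions made up for this example. The pull request itself simply disables the parallelization rather than making it conditional.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical threshold (an assumption, not a measured value): regions with
// fewer points than this are executed by a single thread.
constexpr std::ptrdiff_t min_parallel_points = 100000;

// Copy the planes ilo <= i < ihi (all j and k) from src to dst.
// Both arrays hold ni*nj*nk points with i as the fastest-varying index.
// Returns the number of points copied.
std::ptrdiff_t copy_x_slab(const std::vector<double>& src,
                           std::vector<double>& dst,
                           std::ptrdiff_t ni, std::ptrdiff_t nj,
                           std::ptrdiff_t nk,
                           std::ptrdiff_t ilo, std::ptrdiff_t ihi) {
  assert(src.size() == dst.size());
  assert(static_cast<std::ptrdiff_t>(src.size()) == ni * nj * nk);
  assert(0 <= ilo && ilo <= ihi && ihi <= ni);

  const std::ptrdiff_t npoints = (ihi - ilo) * nj * nk;

  // With the if clause false, the region is executed by one thread only.
  // Without OpenMP support the pragma is ignored and the loop runs serially.
#pragma omp parallel for if (npoints >= min_parallel_points)
  for (std::ptrdiff_t k = 0; k < nk; ++k) {
    for (std::ptrdiff_t j = 0; j < nj; ++j) {
      for (std::ptrdiff_t i = ilo; i < ihi; ++i) {
        const std::ptrdiff_t idx = i + ni * (j + nj * k);
        dst[idx] = src[idx];
      }
    }
  }
  return npoints;
}
```

For a lower x-boundary of width 3 on a 50x50x50 component, `npoints` is only 7500, which is the kind of region where threading overhead can dominate.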

See https://bitbucket.org/eschnett/carpet/pull-requests/18/carpet-disable-openmp-parallelization-of/diff.

In a next step (not proposed here), we can parallelize transport operators again, but at a much higher level, e.g. at the level of the loop over all variables that need to be prolongated. However, Carpet currently (and quite unfortunately) uses static variables to hold pointers to timers, and these are not thread-safe. (Neither the static variables nor the timer implementation itself is.) This needs to be either corrected or disabled, which will be the topic of a further pull request.
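As a rough sketch of what parallelizing at the level of the loop over variables could look like, assuming a serial per-variable operator: `prolongate_one` and `prolongate_all` are hypothetical names, and the 1D linear interpolation is only a stand-in for Carpet's real transport operators. Each thread works on a different variable, so the per-variable routine must not touch shared mutable state such as static timer pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical serial operator for one variable: 1D linear interpolation
// from n coarse points onto 2n-1 fine points. It touches only its own
// arguments (no static variables, no timers), so it may be called
// concurrently for different variables.
void prolongate_one(const std::vector<double>& coarse,
                    std::vector<double>& fine) {
  fine.clear();
  if (coarse.empty()) return;
  fine.resize(2 * coarse.size() - 1);
  for (std::size_t i = 0; i < coarse.size(); ++i) {
    fine[2 * i] = coarse[i];
    if (i + 1 < coarse.size())
      fine[2 * i + 1] = 0.5 * (coarse[i] + coarse[i + 1]);
  }
}

// Parallelize over variables instead of over the points of one small region.
void prolongate_all(const std::vector<std::vector<double>>& coarse,
                    std::vector<std::vector<double>>& fine) {
  assert(fine.size() == coarse.size());
  const std::ptrdiff_t nvars = static_cast<std::ptrdiff_t>(coarse.size());

  // One parallel region for the whole set of variables; each iteration
  // writes to a distinct fine[v].
#pragma omp parallel for schedule(dynamic)
  for (std::ptrdiff_t v = 0; v < nvars; ++v)
    prolongate_one(coarse[v], fine[v]);
}
```

The thread startup cost is then paid once per set of variables rather than once per small region.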

At this time, I ask those who are interested in performance to try this pull request on a few iterations of a production simulation that uses many OpenMP threads, and to report back here.

Comments (11)

  1. Roland Haas
    • removed comment

    @sbrandt: I believe this is the ticket Erik mentioned in the call on Monday 2017-08-28.

  2. Ian Hinder
    • removed comment

    I am unable to compile this branch. I am using master of the ET, and the eschnett/no-openmp branch of Carpet. I am getting these errors:

    COMPILING arrangements/CactusBase/Boundary/src/Check.c
    COMPILING arrangements/CactusBase/Fortran/src/cctk_Timers.F90
    COMPILING configs/noopenmp/bindings/build/AEILocalInterp/cctk_ThornBindings.c
    COMPILING arrangements/CactusBase/Fortran/src/cctk_Types.F90
    Creating /home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_IOUtil.a
    COMPILING arrangements/CactusBase/Fortran/src/cctk_Version.F90
    /bin/sh: /home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_IOUtil.a.objectlist: No such file or directory
    make[3]: *** [/home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_IOUtil.a.objectlist] Error 1
    make[2]: *** [/home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_IOUtil.a] Error 2
    make[1]: *** [/home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_IOUtil.a] Error 2
    COMPILING arrangements/CactusBase/Fortran/src/cctk_WarnLevel.F90
    COMPILING arrangements/CactusBase/Fortran/src/util_Table.F90
    COMPILING arrangements/CactusBase/Fortran/src/paramcheck.F90
    Creating /home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_TensorTypes.a
    /bin/sh: /home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_TensorTypes.a.objectlist: No such file or directory
    make[3]: *** [/home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_TensorTypes.a.objectlist] Error 1
    make[2]: *** [/home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_TensorTypes.a] Error 2
    make[1]: *** [/home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_TensorTypes.a] Error 2
    Creating /home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_CycleClock.a
    /bin/sh: /home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_CycleClock.a.objectlist: No such file or directory
    make[3]: *** [/home/ianhin/Cactus/Optimisation/configs/noopenmp/lib/libthorn_CycleClock.a.objectlist] Error 1
    

    Do I need a different version of the flesh/build system? The same Cactus tree builds fine on the master branch of Carpet.

  3. Ian Hinder
    • removed comment

    This error went away when I deleted the configuration and rebuilt it. There must be a bug in the build system, because an interrupted build (I may have cancelled it at one point) should not cause this sort of problem.

    Anyway, I have timing results for the BBH runs with and without the patch. There is no appreciable difference, either in the total evolution time, or the prolongation timer.

  4. Ian Hinder
    • removed comment

    According to Erik (private communication), the branch does not do anything unless certain parameters are set, which explains why the performance is unaffected. However, the parameters he mentioned are not in this version of the code, suggesting that this is not really the branch that should be tested. I suggest taking this ticket out of the "review" state, since it looks like the code is not quite ready yet. (I am trying to pull the discussion back into this ticket and out of private email.)

  5. Roland Haas
    • changed status to resolved
    • removed comment

    The code as is should not be committed, because of the comments in the pull request. The same functionality will be included in a more comprehensive future pull request.
