CactusNumeric/Slab: Improve performance

Issue #374 closed
Erik Schnetter created an issue

The attached patch does the following, prompted by performance problems reported by Christian Ott:

Reorganise some of the internals of thorn Slab:

Use LoopControl to parallelise loops via OpenMP.

Refactor the "work horse" routines that perform the actual copying. These routines are specialised for common cases that need to execute efficiently, in particular the cases encountered in RotatingSymmetry90 and RotatingSymmetry180 when handling CCTK_REAL variables.

Offer an additional API (Slab_MultiTransfer_Init, Slab_MultiTransfer_Apply, Slab_MultiTransfer_Finalize) that calculates the communication schedule only once, and then re-uses it in further calls. This avoids some communication overhead.

Remove old CVS header comments.
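
To illustrate the idea behind the Init/Apply/Finalize split, here is a minimal sketch in plain C. It is not the code from the patch and does not use the Slab or LoopControl APIs: the names transfer_plan, copy_run, plan_init, plan_apply and plan_finalize are hypothetical, the "transformation" is simply a reversal of block order in a 1D array, and a plain OpenMP pragma stands in for LoopControl. The point is only that the schedule is computed once and then re-used by every later call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One contiguous run to copy; all fields are in units of elements. */
typedef struct {
  size_t src_off;
  size_t dst_off;
  size_t len;
} copy_run;

/* Hypothetical precomputed schedule (not the Slab thorn's data structure). */
typedef struct {
  size_t nruns;
  copy_run *runs;
} transfer_plan;

/* Init: work out once where each block goes. In this sketch the
   transformation reverses the order of nblocks blocks of block_len
   elements each. */
static transfer_plan *plan_init(size_t nblocks, size_t block_len) {
  transfer_plan *plan = malloc(sizeof *plan);
  if (!plan) return NULL;
  plan->nruns = nblocks;
  plan->runs = malloc(nblocks * sizeof *plan->runs);
  if (!plan->runs && nblocks > 0) {
    free(plan);
    return NULL;
  }
  for (size_t b = 0; b < nblocks; ++b) {
    plan->runs[b].src_off = b * block_len;
    plan->runs[b].dst_off = (nblocks - 1 - b) * block_len;
    plan->runs[b].len = block_len;
  }
  return plan;
}

/* Apply: re-use the stored schedule; no setup cost is paid here. */
static void plan_apply(const transfer_plan *plan, const double *src,
                       double *dst) {
  assert(plan && src && dst);
  /* The runs write to disjoint destinations, so they can be processed in
     parallel. Without OpenMP enabled the pragma is ignored and the loop
     runs serially. */
#pragma omp parallel for
  for (long r = 0; r < (long)plan->nruns; ++r) {
    const copy_run *run = &plan->runs[r];
    memcpy(dst + run->dst_off, src + run->src_off, run->len * sizeof *src);
  }
}

/* Finalize: release the schedule. */
static void plan_finalize(transfer_plan *plan) {
  if (plan) {
    free(plan->runs);
    free(plan);
  }
}
```

A caller would invoke plan_init once, then plan_apply as often as needed (for example once per time step), and plan_finalize at shutdown. In a distributed setting the saving would come from not re-deriving the communication pattern on every call; this sketch only shows the single-process analogue.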


Comments (4)

  1. Roland Haas
    • removed comment

    I tried this patch with the current set of testsuites for the symmetry thorns. The testsuites still pass. Please apply.
