PITTNull can take a very long time to compile with the Intel compiler

Issue #2178 closed
Roland Haas created an issue

Compiling on stampede2-skx (2018_02 simfactory files, so ifort 18.0.0), the file NullConstr/src/NullConstr_R00.F90 takes a very long time (103 minutes and counting) to compile.

My guess is that one has to translate its array operations to loops (similar to what was done for other source files in PITTNull), otherwise the compiler just takes too long to try and optimize the code.
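
To illustrate what "translating array operations to loops" means, here is a minimal sketch. It is not taken from NullConstr: the program name, the array names, and the expression are all made up for illustration, and the real generated expressions are far longer.

```fortran
program array_to_loops
  implicit none
  integer, parameter :: n = 4
  double precision :: a(n,n), b(n,n), c_array(n,n), c_loop(n,n)
  integer :: i, j

  call random_number(a)
  call random_number(b)

  ! Whole-array form of a (hypothetical) expression.
  c_array = a*b + 2.0d0*a

  ! The same expression written as explicit loops over the elements.
  do j = 1, n
    do i = 1, n
      c_loop(i,j) = a(i,j)*b(i,j) + 2.0d0*a(i,j)
    end do
  end do

  ! Both forms should agree up to floating-point rounding.
  if (maxval(abs(c_array - c_loop)) > 1.0d-12) error stop 'forms disagree'
  print *, 'array and loop forms agree'
end program array_to_loops
```

Whether the loop form actually compiles faster depends on the compiler and on the expressions involved; this sketch only shows the shape of the rewrite.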

Keyword: NullConstr

Comments (14)

  1. Yosef Zlochower

    The source looks like some automatic code generation output with very long expressions. Adding the directive "!DIR$ OPTIMIZE: 2" sped up compilation on my machine to just a few seconds. I didn't check the performance of the resulting code...

  2. Roland Haas reporter

    Thank you. Just to be sure: before adding the directive, compilation was slow?

  3. Yosef Zlochower

    It was, but I didn't let it run to completion. I waited 25 minutes and then gave up.

  4. Roland Haas reporter

    It also seems that we already replaced array constructs with explicit loops in 388c945 - NullConstr: replace array operations by explicit loops (4 years, 9 months ago), and that the Intel compiler is now finally smart enough to try to optimize the code. So Yosef's suggestion (maybe with an #ifdef __INTEL__ around it) seems like the best solution.

  5. Roland Haas reporter

    @Yosef: could I impose on you to check which other files are affected (by adding the optimization directive to the slow ones, one by one) and to prepare a patch, please? Note that the directive does the "right thing" even when a user specifies -O0 on the command line; the docs linked below say: "The procedure is compiled with an optimization level equal to the smaller of n and the optimization level specified by the O compiler option on the command line." The #ifdef construct I mentioned would be this:

    #ifdef __INTEL__
    !DIR$ OPTIMIZE: 2
    #endif
    

    For reference: these comments / pragmas are described here: https://software.intel.com/en-us/node/693375#30475674-0480-4EF2-A02E-68BAD8BBA87F. I also checked, and indeed our use of the cpp preprocessor hides the !DIR$ comment from the Fortran compiler.
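
    As a rough sketch of how the guarded directive might sit in a source file: the subroutine below is hypothetical (its name, arguments, and body are invented and not taken from NullConstr), and placing the directive directly after the subroutine statement is an assumption based on my reading of the Intel documentation linked above. The file would need to go through the C preprocessor for the #ifdef to take effect.

    ```fortran
    ! Hypothetical example; not part of NullConstr.
    subroutine long_expression(n, x, y)
    #ifdef __INTEL__
    !DIR$ OPTIMIZE: 2
    #endif
      ! Assumption: the directive applies to this procedure only.
      implicit none
      integer, intent(in) :: n
      double precision, intent(in) :: x(n)
      double precision, intent(out) :: y(n)
      integer :: i

      ! Stand-in for a long, machine-generated expression.
      do i = 1, n
        y(i) = x(i)**2 + 3.0d0*x(i)
      end do
    end subroutine long_expression
    ```

    When the macro is not defined, the preprocessor drops the directive line and the subroutine is compiled with whatever optimization level is given on the command line.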

  6. Yosef Zlochower

    The code does not seem to be tested by any testsuite. Also, depending on how F90 files are preprocessed, the directive may be hidden. I tried to add Null_Constr output to the existing testsuite, and there are NaNs at t=0 (but not later).

  7. Roland Haas reporter

    I checked that our FPP preprocessor line "cpp --traditional" keeps the directive around if __INTEL__ is set. I do not know whether the NaNs at t=0 are of concern or whether they are a manifestation of the thorn relying on quantities that are only computed in the evolution loop but not during CCTK_INITIAL.

    It not being tested by a testsuite does not terribly surprise me, to tell the truth.

  8. Roland Haas reporter

    The Intel compiler defines this macro (and of course e.g. gfortran does not), so one can use it to check whether the code is being compiled with an Intel compiler and lower the optimization level only in that case, since gfortran does not show the slow compilation.

  9. Yosef Zlochower

    I attached two patches: one to include the compiler directives, the other to include Null_Constr data in the SphericalHarmonicRecon testsuite. I needed to increase the value of RELTOL to 1.e-9 in order for the latter to pass. I generated the testsuite data from the original NullConstr code first committed to the toolkit (the one that used array operations rather than explicit loops). The test works against the latest version of NullConstr (with the above-mentioned patch).

  10. anonymous
    • changed status to resolved

    Applied as commits:

    • 4e86e93 - (HEAD -> master, origin/master, origin/HEAD) SphericalHarmonicRecon: add check for constraints (7 days ago) <Yosef Zlochower>
    • 6076941 - NullConstr: reduce optimization level to speed up compilation (7 days ago) <Yosef Zlochower>
    • 2e366e6 - NullConstr: return 0 in first step when constraints cannot be evaluated (7 days ago) <Yosef Zlochower>

    of pittnullcode.
