PITTNull can take a very long time to compile with the Intel compiler
When compiling on stampede2-skx (2018_02 simfactory files, so ifort 18.0.0), the file NullConstr/src/NullConstr_R00.F90
takes a very long time to compile (103 minutes and counting).
My guess is that one has to translate its array operations to loops (similar to what was done for other source files in PITTNull), otherwise the compiler just takes too long to try and optimize the code.
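As a minimal sketch of what such a translation looks like, the program below computes the same expression once with whole-array syntax and once with explicit loops, then checks that the two agree. The array names, the size n, and the expression are hypothetical and not taken from NullConstr_R00.F90.

```fortran
program array_vs_loop
  implicit none
  integer, parameter :: n = 8          ! hypothetical grid size
  double precision :: a(n,n), b(n,n), c_arr(n,n), c_loop(n,n)
  integer :: i, j

  ! Fill the inputs with simple sample data.
  do j = 1, n
    do i = 1, n
      a(i,j) = dble(i + j)
      b(i,j) = dble(i - j)
    end do
  end do

  ! Whole-array form: one statement operating on all elements.
  c_arr = a*b + 2.0d0*a

  ! Explicit-loop form: inner loop over the first index (column-major order).
  do j = 1, n
    do i = 1, n
      c_loop(i,j) = a(i,j)*b(i,j) + 2.0d0*a(i,j)
    end do
  end do

  ! Both forms should give the same values for this sample data.
  if (maxval(abs(c_arr - c_loop)) > 0.0d0) error stop "results differ"
  print *, "array and loop forms agree"
end program array_vs_loop
```

Whether the loop form compiles faster depends on the compiler and on how long the real expressions are; this sketch only shows the mechanical rewrite.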
Keyword: NullConstr
Comments (14)
-
-
reporter - removed comment
Thank you. Just to be sure: before adding the directive, compilation was slow?
-
- removed comment
It was, but I didn't let it run to completion. I waited 25 minutes and then gave up.
-
reporter - removed comment
It also seems that we already replaced array constructs with explicit loops in 388c945 - NullConstr: replace array operations by explicit loops (4 years, 9 months ago), and it seems that the Intel compiler is now finally smart enough to try and optimize the code, so Yosef's suggestion (maybe with an #ifdef __INTEL__ around it) seems like the best solution.
-
reporter - removed comment
@Yosef: could I impose on you to check which other files are affected (by adding the optimization option to the ones that are slow one by one) and to prepare a patch, please? The #ifdef construct I mentioned would be this (note that this is doing the "right thing" even when a user specified -O0 on the command line [see the docs linked below]: "The procedure is compiled with an optimization level equal to the smaller of n and the optimization level specified by the O compiler option on the command line."):
    #ifdef __INTEL__
    !DIR$ OPTIMIZE: 2
    #endif
For reference: those comments / pragmas are described here: https://software.intel.com/en-us/node/693375#30475674-0480-4EF2-A02E-68BAD8BBA87F and I checked and indeed our use of the cpp preprocessor hides the !DIR$ comment from the Fortran compiler.
-
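A minimal sketch of how the guarded directive might sit in a preprocessed source file. The subroutine and its body are hypothetical. The macro name __INTEL__ is taken from the comment above; Intel's own predefined macro is __INTEL_COMPILER, so it is worth verifying which one your build actually defines. Placing the directive at the top of the procedure follows my reading of the Intel documentation and is an assumption here.

```fortran
! Save with a capital-F extension (e.g. demo.F90) so the file is run
! through the C preprocessor before the Fortran compiler sees it.
subroutine long_expression(x, y, n)
#ifdef __INTEL__
! Assumption: the directive applies to the enclosing procedure and caps
! its optimization level at 2. Other compilers see only a comment.
!DIR$ OPTIMIZE: 2
#endif
  implicit none
  integer, intent(in) :: n
  double precision, intent(in) :: x(n)
  double precision, intent(out) :: y(n)
  integer :: i

  ! Stand-in for the very long generated expressions in the real file.
  do i = 1, n
    y(i) = x(i)*x(i) + 3.0d0*x(i) + 1.0d0
  end do
end subroutine long_expression
```

Because the directive is an ordinary Fortran comment to compilers that do not recognise it, the guard mainly documents intent and keeps the line out of the preprocessed output for other compilers.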
- removed comment
The code does not seem to be tested by any testsuite. Also, depending on how F90 files are preprocessed, the directive may be hidden. I tried to add Null_Constr output to the existing testsuite, and there are NaNs at t=0 (but not later).
-
reporter - removed comment
I checked that our FPP preprocessor line "cpp --traditional" keeps the directive around if __INTEL__ is set. I do not know if NaNs at t=0 are of concern or if this is a manifestation of the thorn relying on quantities that are only computed in the evolution loop but not during CCTK_INITIAL.
It not being tested by a testsuite does not terribly surprise me, to tell the truth.
-
- removed comment
What sets the __INTEL__ macro?
-
reporter - removed comment
The Intel compiler defines this macro (and of course e.g. gfortran does not), so one can use it to check whether the code is being compiled with an Intel compiler and reduce the optimization level only in that case, since gfortran does not show the slow compilation.
-
- removed comment
I attached two patches: one to include the compiler directives, the other to include Null_Constr data in the SphericalHarmonicRecon testsuite. I needed to increase the value of RELTOL to 1.e-9 in order for the latter to pass. I generated the testsuite data from the original NullConstr code first committed to the toolkit (the one that used array operations, rather than explicit loops). The test works against the latest version of NullConstr (with the above-mentioned patch).
-
reporter - changed status to open
- removed comment
Thank you.
-
reporter - changed milestone to ET_2018_08
-
- changed status to resolved
- removed comment
Applied as commits:
- 4e86e93 - (HEAD -> master, origin/master, origin/HEAD) SphericalHarmonicRecon: add check for constraints (7 days ago) <Yosef Zlochower>
- 6076941 - NullConstr: reduce optimization level to speed up compilation (7 days ago) <Yosef Zlochower>
- 2e366e6 - NullConstr: return 0 in first step when constraints cannot be evaluated (7 days ago) <Yosef Zlochower>
of pittnullcode.
-
reporter - edited description
- changed status to closed
The source looks like some automatic code generation output with very long expressions. Adding the directive "!DIR$ OPTIMIZE: 2" sped up compilation on my machine to just a few seconds. I didn't check the performance of the resulting code...