Provide always-working isnan etc.

Issue #1563 closed
Erik Schnetter created an issue

Certain math optimization options (e.g. -ffast-math) tell the compiler that IEEE floating point numbers such as inf and nan do not need to be handled correctly (in the sense specified by the IEEE standard). This greatly improves floating-point speed and is commonly used in numerical HPC applications.

For example, fmax() specifies:

If exactly one argument is a NaN, fmax() returns the other argument. If both arguments are NaNs, fmax() returns a NaN.

Implementing this correctly requires checking whether each argument is a NaN. To improve speed, one can omit this check, which means that fmax() may return NaN even if one of its arguments is not a NaN. This is fine in most cases, and people appreciate the added speed.

However, since compilers then don't need to handle inf and nan correctly, they have begun to optimise isnan(x) to simply return false all the time. This improves speed (since the check does not actually need to occur) and reduces code size (since the nan-handling if branches can be omitted). Of course, this then makes it impossible to actually check for nan by calling isnan.

Currently, e.g. g++ performs this optimisation, whereas gcc does not. Things vary with other compilers. In the future, with link-time optimisations, I expect other compilers to follow g++.

The enclosed patch provides functions CCTK_IEEE_isnan etc. that always check for nan, independent of the chosen optimisation flags.
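One way such an always-checking function can be written is to inspect the bit pattern directly, so that the compiler has no floating-point comparison to fold away. The following is a minimal sketch, not the code from the enclosed patch: the names ieee_isnan_bits and ieee_isinf_bits are hypothetical, and it assumes that double is an IEEE 754 binary64 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double uses the 64-bit IEEE 754 layout
   (1 sign bit, 11 exponent bits, 52 mantissa bits). */
static_assert(sizeof(double) == sizeof(uint64_t), "assumes 64-bit double");

#define EXPONENT_MASK UINT64_C(0x7ff0000000000000)
#define MANTISSA_MASK UINT64_C(0x000fffffffffffff)

/* Hypothetical helper: true if x is a NaN, decided on the raw bits only. */
static int ieee_isnan_bits(double x)
{
  uint64_t bits;
  memcpy(&bits, &x, sizeof bits); /* well-defined way to read the representation */
  /* NaN: exponent field all ones and a non-zero mantissa */
  return (bits & EXPONENT_MASK) == EXPONENT_MASK && (bits & MANTISSA_MASK) != 0;
}

/* Hypothetical helper: true if x is +inf or -inf. */
static int ieee_isinf_bits(double x)
{
  uint64_t bits;
  memcpy(&bits, &x, sizeof bits);
  /* Infinity: exponent field all ones and a zero mantissa */
  return (bits & EXPONENT_MASK) == EXPONENT_MASK && (bits & MANTISSA_MASK) == 0;
}
```

Because the test is pure integer arithmetic, it does not depend on whether the compiler assumes that floating-point values are never nan or inf.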

Comments (7)

  1. Roland Haas
    • changed status to open

    This is becoming ridiculous. I'd consider optimizing out an explicit call to isnan() a compiler bug. Is there a compiler option to turn those back on maybe?

    Otherwise, fine with me, if this is what it takes. I'd even consider having CCTK's isnan function (which we already provide since the C++ math is inconsistent between compilers) map to CCTK_IEEE_isnan. Do we have any idea if, and by how much, the home-brew CCTK_IEEE_isnan is slower than a properly functioning isnan (I expect the latter to be a single machine instruction)?

    The patch as is lacks support for float, which is useful if CCTK_REAL is a 4-byte float rather than the 8-byte double. It also lacks support for long double.

  2. Erik Schnetter reporter

    The meaning of isnan is specified in the C/C++ standard in the same way as e.g. sqrt. If you call sqrt(1.0), then you expect the compiler to evaluate this at compile time in the same way as isnan(1.0); since the compiler knows the result, it doesn't need to emit a run-time call. If you then use -ffast-math, which means that you tell the compiler that it should assume that no calculation ever results in a nan and thus the expensive special cases for nan can be ignored, then the compiler will assume that isnan always returns false, and will omit the call.

    No, there is no option to avoid this, except to not use -ffast-math, which prevents all sorts of very useful optimizations.

    I expect that the speed difference between CCTK_isnan and isnan is mostly that isnan can be inlined, while CCTK_isnan cannot. Apart from this, they will likely lead to identical code on x86 architectures. On other architectures, CCTK_isnan may be significantly more expensive than isnan.

    I will add support for float and long double as well.
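For illustration, a float variant of a bit-level check could look like the sketch below. This is not the code that was added to the patch: the name ieee_isnanf_bits is hypothetical, and the sketch assumes that float is an IEEE 754 binary32 value. A long double variant is left out here because its representation differs between platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float uses the 32-bit IEEE 754 layout
   (1 sign bit, 8 exponent bits, 23 mantissa bits). */
static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");

/* Hypothetical helper: true if x is a NaN, decided on the raw bits only. */
static int ieee_isnanf_bits(float x)
{
  const uint32_t exponent_mask = UINT32_C(0x7f800000);
  const uint32_t mantissa_mask = UINT32_C(0x007fffff);
  uint32_t bits;
  memcpy(&bits, &x, sizeof bits); /* read the representation without aliasing issues */
  /* NaN: exponent field all ones and a non-zero mantissa */
  return (bits & exponent_mask) == exponent_mask && (bits & mantissa_mask) != 0;
}
```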

  3. Frank Löffler

    What we seem to want is that the compiler honors calls to isnan, so fast_math does not seem to be an option. Isn't there a better way to tell the compiler which optimizations it is allowed to do, and which it isn't? fast_math is usually only a shorthand for a whole list of options.

  4. Erik Schnetter reporter

    Compiler optimizations are usually not specified via a set of transformations that are allowed, but rather via a set of properties that the compiler can assume are valid in any program it sees, and that it needs to preserve when it generates machine code. By default, such a property is that the source code adheres to the IEEE floating point standard, and the compiler thus preserves the meaning of the source code. (We all know that this standard is too strict for us.)

    Thus there is -ffast-math, which means that we use a looser floating point standard where we do not care about "proper" behaviour of nans. For example, fmax(nan, 1.0) should return 1.0 ("man fmax"). Implementing this correctly requires checking whether each argument is nan before actually performing the comparison to see which argument is larger, as in:

    inline double fmax(double x, double y)
    {
      if (isnan(x)) return y;
      if (isnan(y)) return x;
      return x>=y ? x : y;
    }
    

    Since we want fast code, we tell the compiler that we don't care about correct behaviour for nans. One of the easiest and most systematic ways to do this is to simply say that isnan and isinf never return true. This will automatically avoid all special cases for inf and nan, leading to faster code.
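To make the effect concrete, the sketch below shows what remains of the function above once both isnan branches are dropped, which is what assuming that isnan never returns true amounts to. The names fmax_no_nan_checks and is_nan_bits are hypothetical, and the bit-level helper assumes that double is an IEEE 754 binary64 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "assumes 64-bit double");

/* Hypothetical helper: NaN test on the raw bits, used only to inspect results. */
static int is_nan_bits(double x)
{
  const uint64_t exponent_mask = UINT64_C(0x7ff0000000000000);
  const uint64_t mantissa_mask = UINT64_C(0x000fffffffffffff);
  uint64_t bits;
  memcpy(&bits, &x, sizeof bits);
  return (bits & exponent_mask) == exponent_mask && (bits & mantissa_mask) != 0;
}

/* The fmax from above with both NaN branches removed: a single comparison. */
static double fmax_no_nan_checks(double x, double y)
{
  return x >= y ? x : y;
}
```

Under IEEE comparison rules, any comparison involving a nan is false, so this version returns y whenever either argument is a nan. Called as fmax_no_nan_checks(1.0, nan), it returns the nan instead of 1.0, whereas fmax_no_nan_checks(nan, 1.0) happens to return 1.0. The result thus depends on argument order, which is exactly the behaviour that the standard fmax rules out.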

    Of course you may say "I only want to optimize out the isnan calls that I didn't write myself", but that is difficult for the compiler to judge. The compiler -- in particular in C++, where almost all of the standard library exists as header files that are compiled as needed, together with the application code -- cannot tell whether a particular isnan call has been inserted into a standard function (and should be omitted), or has been inserted into an error-checking function (and needs to be kept). For the compiler, either nans need to be treated as specified in the source code (keep the isnan calls), or it can optimize out the nan-handling cases (assume isnan is always false).

    One could, conceivably, have two kinds of isnan calls: the "annoying" ones in fmax that should be omitted, and the "important" ones that we write in NaNChecker. Alas, the C and C++ standards don't make that distinction; they only specify one kind of isnan, namely the one that should always be kept. There is not even a proper specification of floating point numbers when -ffast-math is used. Thus, we need to roll our own "never-remove" isnan calls.

    We could try to find compiler options (that's highly compiler specific) that ignore certain IEEE properties (associativity, reciprocal) while keeping others (nan behaviour of fmax etc.). I don't think it's worthwhile.
