Opened 2 years ago

Last modified 8 months ago

#1911 assigned defect

Hydro_InitExcision sphere_pugh_ppm test fails

Reported by: Barry Wardell
Owned by: Roland Haas
Priority: unset
Milestone: ET_2018_02
Component: Other
Version: development version
Keywords:
Cc:

Description

The Hydro_InitExcision sphere_pugh_ppm test fails for me when run on 1 process on an Ubuntu 16.04 machine. The diffs are attached.

Attachments (1)

sphere_pugh_ppm.diffs (3.4 KB) - added by Barry Wardell 2 years ago.

Change History (17)

Changed 2 years ago by Barry Wardell

Attachment: sphere_pugh_ppm.diffs added

comment:1 Changed 2 years ago by Frank Löffler

Do you use the standard Ubuntu configuration from Simfactory? Can you try the Debian configuration? The main difference I can see is that the Ubuntu configuration uses the -ffast-math option, while the Debian one doesn't.

comment:2 Changed 2 years ago by Barry Wardell

Yes, this is with the ubuntu.cfg currently in Simfactory. I will try the Debian configuration or with -ffast-math removed.

comment:3 Changed 2 years ago by Barry Wardell

Resolution: invalid
Status: changed from new to closed

I can confirm that the test passes (and the failures in NaNChecker.nancount also go away) when I remove -ffast-math from ubuntu.cfg.

comment:4 Changed 2 years ago by Erik Schnetter

Resolution: invalid
Status: changed from closed to reopened

It should not be necessary to avoid -ffast-math to make the tests pass. In Cactus, we have always taken the approach that the order in which floating-point expressions are evaluated should not matter, and that we do not want to rely on IEEE semantics when it comes to NaN (or similar exceptional values).

While it would be nice to rely on this part of the standard, (a) doing so slows things down considerably, and (b) the standard doesn't guarantee bitwise identical results anyway.
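To illustrate why evaluation order matters for bitwise comparisons, here is a minimal Python sketch. Python itself does not reorder the expression; the two groupings are written out by hand to mimic what an optimizer is allowed to do under -ffast-math. The values 0.1, 0.2 and 0.3 are illustrative only and are not taken from the failing test.

```python
# Floating-point addition is not associative, so regrouping a sum
# can change the last bits of the result.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # grouping as written in the source
regrouped = a + (b + c)       # grouping an optimizer might choose instead

difference = abs(left_to_right - regrouped)

print(left_to_right == regrouped)  # the two results are not bitwise equal
print(difference)                  # but they differ only at round-off level
```

A test that compares output bitwise would fail here, while a test with a round-off-sized tolerance would pass.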

comment:5 Changed 2 years ago by Frank Löffler

Test suites containing nans have to be avoided then (a good idea anyway), because -ffast-math is not guaranteed to produce them. The only test that I could find using a quick search that contains nans is CT_MultiLevel/test/boostedpuncture.

comment:6 Changed 2 years ago by Frank Löffler

Not concerning the nans: the absolute differences in the velocity here are larger than 10e-7. This is usually considered too large for double-precision hydro simulations. Looking only at the Cactus diff output, I cannot judge why that is, but I suspect these values are supposed to be zero, or close to it, and -ffast-math seems to produce a considerably larger error. This is a problem.

I don't think we are likely to find a workaround for this in time for the release (code that works better even with fast math). I would suggest removing -ffast-math from the Ubuntu option list for the time being.

comment:7 in reply to:  5 Changed 2 years ago by Frank Löffler

Replying to knarf:

The only test that I could find using a quick search that contains nans is CT_MultiLevel/test/boostedpuncture.

#1909

comment:8 Changed 2 years ago by Erik Schnetter

Compiler behaviour is not OS-dependent -- this is probably triggered by a combination of compiler version, compiler flags, and CPU capabilities. I argue against removing -ffast-math from the options; instead, we should either have suitably stable tests, or sufficiently large tolerances.
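As a sketch of what "sufficiently large tolerances" means in practice, the following Python function compares a reference value against a new value using an absolute and a relative tolerance. The function name `within_tolerance`, its default thresholds, and the sample value 3e-7 are hypothetical and chosen for illustration; this is not the comparison code used by the Cactus test suite.

```python
def within_tolerance(reference, value, abstol=1e-12, reltol=0.0):
    """Hypothetical check: accept if the absolute or the relative error is small."""
    abs_err = abs(value - reference)
    if abs_err <= abstol:
        return True
    # Relative error measured against the larger of the two magnitudes.
    scale = max(abs(reference), abs(value))
    return scale > 0.0 and abs_err / scale <= reltol


# A value that should be zero but comes out at round-off-amplified size.
print(within_tolerance(0.0, 3e-7))                # strict absolute tolerance
print(within_tolerance(0.0, 3e-7, abstol=1e-6))   # looser absolute tolerance
print(within_tolerance(0.0, 3e-7, reltol=1e-3))   # relative tolerance only
print(within_tolerance(1.0, 1.0 + 3e-7, reltol=1e-6))
```

Note that a relative tolerance cannot rescue a quantity whose reference value is exactly zero; for such data only the absolute tolerance has any effect.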

comment:9 in reply to:  5 Changed 2 years ago by Barry Wardell

Replying to knarf:

Test suites containing nans have to be avoided then (a good idea anyway), because -ffast-math is not guaranteed to produce them. The only test that I could find using a quick search that contains nans is CT_MultiLevel/test/boostedpuncture.

The NaNChecker.nancount test is also expected to produce NaNs, and fails with ubuntu.cfg when -ffast-math is enabled. This is a somewhat different case, however, as the whole point of the test is to check for NaNs.
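For context, the IEEE 754 behaviour that a NaN-counting test depends on can be sketched in Python as below. With GCC, -ffast-math implies -ffinite-math-only, which lets the compiler assume that NaNs never occur, so an equivalent self-comparison in compiled code may be optimized away. The list `values` is made-up data, and this sketch is not the NaNChecker implementation.

```python
import math

nan = float("nan")
values = [1.0, nan, 2.5, nan]  # illustrative data only

# IEEE 754: a NaN compares unequal to everything, including itself.
self_compare_count = sum(1 for v in values if v != v)

# The library predicate gives the same answer under IEEE semantics.
isnan_count = sum(1 for v in values if math.isnan(v))

print(self_compare_count, isnan_count)
```

Both counts agree here because Python follows IEEE semantics; the point of the discussion above is that compiled code built with -ffast-math offers no such guarantee.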

comment:10 Changed 8 months ago by Ian Hinder

Milestone: ET_2018_02

comment:11 Changed 8 months ago by Ian Hinder

Owner: set to Steven R. Brandt
Status: changed from reopened to assigned

comment:12 Changed 8 months ago by Steven R. Brandt

Owner: changed from Steven R. Brandt to Frank Löffler

comment:13 Changed 8 months ago by Roland Haas

I am not sure the failure is still happening. See https://build.barrywardell.net/job/EinsteinToolkit/lastCompletedBuild/testReport/ which lists only

  • SphericalHarmonicRecon.regression_test/2procs
  • SphericalHarmonicReconGen.SpEC-dat-test/2procs
  • SphericalHarmonicReconGen.SpEC-h5-test/2procs

as failures. Of course, this may be specific to the compiler, processor and option list.

comment:14 Changed 8 months ago by Ian Hinder

This test might fail with -march=native and depend on the CPU being used. We might see failures when we test on multiple machines for the release. Leaving this ticket open for now.

comment:15 Changed 8 months ago by Roland Haas

There are no failures on Jenkins, but there are some on ET-supported clusters: http://einsteintoolkit.org/testsuite_results/index.php lists some where it does fail:

  • bethe
  • comet
  • hydra
  • stampede2-skx

comment:16 Changed 8 months ago by Ian Hinder

Owner: changed from Frank Löffler to Roland Haas
