Suspicious logic in NewRad thorn

Issue #1261 closed
Wolfgang Kastaun created an issue

In NewRad boundary conditions, there is some logic to determine for which faces/edges/corners BC should be applied. This logic is duplicated in two files, extrap.cc and newrad.cc. In the latter, there is the line (324 in the Oersted release)

all_physbnd_or_ghostbnd = all_physbnd_or_ghostbnd and (is_physbnd[2d+0] or not is_ipbnd[2d+0]);

for the left boundary, while the same thing for the right boundary reads

all_physbnd_or_ghostbnd = all_physbnd_or_ghostbnd and (is_physbnd[2d+1] or not is_ipbnd[2d+1]);

in line 337. In extrap.cc, the right boundary is treated the same, for the left one however we have in line 136

all_physbnd_or_ghostbnd = all_physbnd_or_ghostbnd and (is_physbnd[2d+0] or is_ipbnd[2d+0]);

Note the "or" instead of "or not". I have no idea what this conditions are supposed to do, but it looks inconsistent. If it is correct, can someone enlighten me ?

Keyword: backport

Comments (14)

  1. Frank Löffler
    • removed comment

    I agree that this looks very much like a bug to me. Thanks for reporting it. Before fixing it I would however rather have someone else also looking at it.

    I am also unsure about how to actually fix it. "all_physbnd_or_ghostbnd" is commented to be "all boundary faces are either physical boundaries or ghost zones". However, in three cases it is calculated to true "if physical or not inter-processor" and only in one case "if physical or inter-processor". Later, this variable is used to do work only if it is true. This I understand as "only apply the boundary conditions (at a given point) when there are no symmetry boundaries in any of the three directions -- which makes sense.

    However, as I read the code (and apart from the inconsistency mentioned by Wolfgang), currently no boundary conditions are applied at points that happen to be at the outer boundary, but also at a inter-processor boundary.

    If this interpretation is correct, I believe the correct fix would be to remove the 'not' from the three offending lines:

    extrap.cc:153 newrad.cc:329 newrad.cc:342

    This would also mean that most simulations using NewRad on more than on processor didn't have the outer boundary applied at all necessary points, and that runs using symmetries might have had outer boundary conditions applied where instead symmetry should have been applied. Also, other wrong values might have been used because of the inconsistency in extrap.cc (which seems to be correct for lower boundaries). In any case, this should only influence values at the very outer boundary - not at mesh refinement boundaries.

  2. Erik Schnetter
    • removed comment

    The condition (is_physbnd or not is_ipbnd) does not make sense; since physbnd and ipbnd are exclusive, this is equivalent to (not is_ipbnd).

    The extrapolation and radiative boundary conditions have special cases for edges and corners. Thus they give incorrect results when applied on and edge or in a corner that is not a physical edge or corner. Thus I think the correct condition should be to remove the condition all_physbnd_or_ghostbnd, and to use the condition (have_bnd and any_physbnd and all_physbnd) in the end.

    Given the nature of this boundary condition, this means that symmetries need to be applied after this boundary condition, and one also needs to synchronise after applying this boundary condition. If this is done, errors on symmetry or ghost zones do not matter, which may make the noticed inconsistency in the code un-observable in practice.

    For safety, we may want to set all points that are (have_bnd and any_physbnd and not all_physbnd) to nan, because these are the outer boundary points that we did not set, and which thus need to be set by symmetries or synchronisation.

  3. Frank Löffler
    • removed comment

    After talking with Erik: What about this patch? It would need to be applied to the release as well.

  4. Ian Hinder
    • removed comment

    McLachlan has a "native" radiative boundary condition in the McLachlan_BSSN.m script which is not used, as most people use NewRad. This native implementation is wrong in the master branch, and I corrected it in the cakernel branch some time ago. At the same time, I created a test case for it, and cross-checked the test results with those from NewRad, and the results were the same to machine precision (3D gridfunctions checked). I then added the NewRad test to the master branch. My point is that the existing NewRad test in the master branch gives identical results to the (corrected) native implementation, so I trust it. So I agree with comment:3 that this bug might not be observable. Of course, this is only for one particular grid setup, set of parameters, number of processes etc.

    The corrected native implementation should probably be merged into the master branch.

  5. Frank Löffler
    • removed comment

    I agree. Please open a ticket for the native implementation and propose a patch.

    As for the NewRad thorn: it might well be that it is not observable. Does the test use multiple MPI processes (it would only show then, if at all)?

    Concerning two versions: how do these differ? Does it make sense to have two?

  6. Ian Hinder
    • removed comment

    It uses 2 processes; maybe this wouldn't be enough? I haven't followed the details of the discussion above.

    NewRad is more general; it has more features. I've never used them though.

  7. Frank Löffler
    • removed comment

    I still don't see a clear review in one or the other way. The current implementation is clearly wrong, whether observable or not. IMHO it should at least be applied to the development version. The current problem doesn't seem to be the patch, but the unknown effect the bug has. It is clear that if there is an effect at all, it cannot be large, or it would have been noticed much earlier.

    Ian: can you please test the current patch against the version you trust?

  8. Ian Hinder
    • removed comment

    The test case is ML_BSSN_Test/test/ML_BSSN_NewRad.par. It's probably easier for someone who already has the patched version to test this. That person should be running the whole ET testsuite before committing the patch anyway.

    This test produced the same results (within tolerance) as the Kranc radiative boundary condition currently in the cakernel branch. I know this because the test output data was copied from one test to the other without modification.

  9. Frank Löffler
    • removed comment

    Patch 'NewRad' didn't pass the testsuite, which I have to assume is correct, according to Ian. The new patch 'NewRad2' does correct the inconsistency in the code and removes the unused variable; and it passes the testsuite. It also renames 'all_physbnd_or_ghostbnd' to 'all_not_ipbnd', as that is what is actually calculated. The respective NewRad kernels are, using the new variable name, only called

    if (have_bnd and any_physbnd and all_not_ipbnd)

    Please review.

  10. Log in to comment