Simulation domain volume and reduction weight sum differ

Issue #65 closed
Ian Hinder created an issue

When running the newest Mercurial version of Carpet with the Llama multipatch infrastructure, I get the following error:

INFO (CarpetReduce): Simulation domain volume: 1
INFO (CarpetReduce): Reduction weight sum: 3718016
WARNING level 0 in thorn CarpetReduce processor 0 host node024.damiana.admin
  (line 84 of /home/ianhin/Cactus/llama/arrangements/CarpetHG/CarpetReduce/src/mask_test.c):
  -> Simulation domain volume and reduction weight sum differ

I have reduced the parameter file to the essentials, and it is attached, along with standard output and standard error. The whole simulation is on damiana in /lustre/AEI/ianhin/simulations/chgbug_8.

Versions of components:

Carpet: 3190:c24983d83cdd
Llama: 85983412998b4fa7bf2ea7da93db427c471b7c72

Comments (6)

  1. Ian Hinder reporter

    I ran hg bisect, and it reported:

    Due to skipped revisions, the first bad revision could be any of:

    changeset: 3172:cff502f8527c
    user: Erik Schnetter <schnetter@cct.lsu.edu>
    date: Fri Oct 01 15:19:24 2010 -0500
    summary: CarpetLib: Store more details for setting up the weight masks

    changeset: 3173:388fbf30768c
    user: Erik Schnetter <schnetter@cct.lsu.edu>
    date: Fri Oct 01 15:20:07 2010 -0500
    summary: CarpetReduce: Correct errors in calculating the weight function

    The first one didn't compile for me, so I had to skip it. I don't know if 3173 just introduces a more stringent check on something that was silently wrong before, or if there is a mistake in 3173.

  2. Ian Hinder reporter

    It looks like a real problem which was there before, and is now being detected. In 3171, a wave equation test gives an error which is 8 orders of magnitude larger than in the Git version after a few iterations. I suspect the problem is something to do with restriction.

  3. Peter Diener

    I have seen something similar using only mesh refinement (i.e. without Llama). A slightly modified version of the EinsteinToolkit parameter file qc0-mclachlan.par (I only changed the initial data and puncture locations to correspond to D3.0, in order to have a longer evolution) runs for 7.250M (3712 iterations) and then dies after regridding with:

    INFO (CarpetReduce): Simulation domain volume: 432000
    INFO (CarpetReduce): Reduction weight sum: 432000.001953125
    WARNING level 0 in thorn CarpetReduce processor 0 host abe0515
      (line 84 of /u/ac/diener/EinsteinToolkitReleaseHg/Cactus/arrangements/Carpet/CarpetReduce/src/mask_test.c):
      -> Simulation domain volume and reduction weight sum differ
    rank 0 in job 1 abe0515_58625 caused collective abort of all ranks
      exit status of rank 0: killed by signal 9

    The simulation output showing this problem can be found on abe.ncsa.uiuc.edu in:

    /cfs/scratch/users/diener/ET/simulations/d3.0-mclachlan-hg
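
To put the two reported mismatches on a common scale, each can be expressed as a relative difference between the reduction weight sum and the domain volume. The following is only an illustrative sketch: `relative_difference` is a hypothetical helper, not a function from CarpetReduce, and the comparison actually performed in mask_test.c is not shown in this thread.

```c
#include <assert.h>

/* Hypothetical helper (not part of CarpetReduce): relative difference
   between the reduction weight sum and the simulation domain volume. */
double relative_difference(double volume, double weight_sum)
{
  double diff = weight_sum - volume;

  assert(volume > 0.0);  /* the domain volume is the reference scale */
  if (diff < 0.0) {
    diff = -diff;
  }
  return diff / volume;
}
```

With the values from the two logs, `relative_difference(1.0, 3718016.0)` is 3718015, while `relative_difference(432000.0, 432000.001953125)` is roughly 4.5e-9. The absolute difference in the second case is 0.001953125, which is exactly 2^-9. A small relative difference does not by itself indicate that a mismatch is harmless.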

  4. Erik Schnetter
    • changed status to open

    Ian, the problem you report is caused by a routine in CarpetReduce that wasn't aware that there can be multiple patches. Please reduce the severity of the call to CCTK_VWarn as an intermediate solution; I will have a better solution soon.

    Peter, the problem you report is unrelated, and is unfortunately much more severe. I will have to investigate.
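
The interim workaround suggested for the multipatch case (lowering the severity of the warning) might look roughly like the sketch below. It is not the code in mask_test.c: `sketch_warn` is a hypothetical stand-in so that the example compiles without Cactus, `check_weight_sum` and its exact-equality comparison are assumptions, and in a real thorn the call would be to CCTK_VWarn, whose first argument is the warning level.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a leveled warning routine, so that this
   sketch compiles without Cactus. Level 0 is fatal, as in the logs
   quoted in this thread; here, higher levels only print. */
void sketch_warn(int level, const char *format, ...)
{
  va_list args;

  assert(format != NULL);
  fprintf(stderr, "WARNING level %d: ", level);
  va_start(args, format);
  vfprintf(stderr, format, args);
  va_end(args);
  fputc('\n', stderr);
  if (level == 0) {
    abort();
  }
}

/* Hypothetical check (exact equality is an assumption): returns 1 if
   the two values agree and 0 otherwise. warn_level selects how severe
   a mismatch is; passing 1 instead of 0 reports it without aborting. */
int check_weight_sum(double volume, double weight_sum, int warn_level)
{
  if (weight_sum == volume) {
    return 1;
  }
  sketch_warn(warn_level,
              "Simulation domain volume (%.17g) and reduction weight sum (%.17g) differ",
              volume, weight_sum);
  return 0;
}
```

Calling `check_weight_sum(1.0, 3718016.0, 1)` prints the warning and returns 0, so the run continues; with `warn_level` set to 0 the same mismatch would abort. This only hides the symptom until the underlying routine handles multiple patches correctly.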

  5. Erik Schnetter
    • changed status to resolved

    I believe this is now corrected. The corresponding changes to CarpetLib and CarpetReduce were extensive, but the resulting code is cleaner than before.
