Carpet fails when running qc0-mclachlan.par

Issue #751 closed
Roland Haas created an issue

This is due to ggf::transfer_from_all line 613 (dst->transfer_from), where dst is NULL:

603      assert (lc1>=0 or lc2>=0);
604
605      // Source and destination data
606      gdata * const dst =
607        lc1>=0 ? storage.AT(ml1).AT(rl1).AT(lc1).AT(tl1) : NULL;
608      cdata const & srcs = srcstorage.AT(ml2).AT(rl2);
609      for (int i=0; i<(int)gsrcs.size(); ++i) {
610        gsrcs.AT(i) = lc2>=0 ? srcs.AT(lc2).AT(tl2s.AT(i)) : NULL;
611      }
612
613      dst->transfer_from
614        (state, gsrcs, times, recv, send, slabinfo, p1, p2, time, pos, pot);

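which causes a segfault in line 613. lc1 is -1 in this case, which still passes the assert in line 603 because that assert only requires one of lc1 and lc2 to be non-negative. Since dst is dereferenced unconditionally in line 613, it must never be NULL there.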
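
A minimal sketch of why the assert does not protect the pointer, using hypothetical stand-in names (`Block`, `select_block`, `Roles`, `classify`; none of these are Carpet identifiers). It assumes only what the snippet above shows: the pointer is chosen by a ternary on the local component index, and the precondition accepts a process that holds only the source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for gdata; not the Carpet class.
struct Block {
  int id;
};

// Same selection rule as the ternary on line 607: a negative local component
// index means this process does not hold the block, so the result is NULL.
Block* select_block(std::vector<Block>& storage, int lc) {
  return lc >= 0 ? &storage.at(static_cast<std::size_t>(lc)) : nullptr;
}

// Roles a process can play in one transfer.
struct Roles {
  bool is_dst;
  bool is_src;
};

// The precondition from line 603 only rules out "neither role";
// a source-only process (lc1 < 0, lc2 >= 0) passes it.
Roles classify(int lc1, int lc2) {
  assert(lc1 >= 0 || lc2 >= 0);
  return Roles{lc1 >= 0, lc2 >= 0};
}
```

With lc1 == -1 and lc2 >= 0 the process is a pure sender, so any code that touches the destination has to check `is_dst` (or the pointer itself) first.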

It can be reproduced with the qc0-mclachlan.par example from simfactory. Tested on Caltech's bethe workstation with 12 MPI processes and no OpenMP.

hg blame points to commit 5418b354a3ab, which seems to be the original Carpet import into Mercurial, so the failure seems to be triggered by something else.


Comments (5)

  1. Roland Haas reporter

    The segfault goes away if I back out a2fdf2776631, though looking at transfer_from_all it still seems possible to end up with a NULL dst pointer if lc1==-1 (which is what happened).

    Backtrace from where the error occurs (this is with -O0; does anyone know why it still has variables optimized out?):

    #0  0x0000000000dd1c84 in gdata::extent (this=0x0, $m7=<value optimized out>)
    #1  0x0000000003f8ee49 in gdata::transfer_from (this=0x0, state=..., srcs=std::vector of length 3, capacity 3 = {...}, times=std::vector of length 3, capacity 3 = {...}, dstbox=..., srcbox=..., slabinfo=0x0, dstproc=11, srcproc=10, time=0, order_space=5, order_time=2, $Z6=<value optimized out>, $Z7=<value optimized out>, $`5=<value optimized out>, $`6=<value optimized out>, $`7=<value optimized out>, $`8=<value optimized out>, $`9=<value optimized out>, $a0=<value optimized out>, $a1=<value optimized out>, $a2=<value optimized out>, $a3=<value optimized out>, $a4=<value optimized out>) at /data/rhaas/Zelmani/arrangements/Carpet/CarpetLib/src/gdata.cc:315
    #2  0x0000000003f9ac85 in ggf::transfer_from_all (this=0xcb66ff0, state=..., tl1=0, rl1=1, ml1=0, sendrecvs=0x78, tl2s=std::vector of length 3, capacity 3 = {...}, rl2=0, ml2=0, time=@0x7fffffffc840, use_old_storage=false, flip_send_recv=false, slabinfo=0x0, $�2=<value optimized out>, $�3=<value optimized out>, $�4=<value optimized out>, $�5=<value optimized out>, $�6=<value optimized out>, $�7=<value optimized out>, $�8=<value optimized out>, $�9=<value optimized out>, $�0=<value optimized out>, $�1=<value optimized out>, $�2=<value optimized out>, $�3=<value optimized out>, $�4=<value optimized out>) at /data/rhaas/Zelmani/arrangements/Carpet/CarpetLib/src/ggf.cc:613
    #3  0x0000000003f98b72 in ggf::ref_bnd_prolongate_all (this=0xcb66ff0, state=..., tl=0, rl=1, ml=0, time=0, $�6=<value optimized out>, $�7=<value optimized out>, $�8=<value optimized out>, $�9=<value optimized out>, $�0=<value optimized out>, $�1=<value optimized out>) at /data/rhaas/Zelmani/arrangements/Carpet/CarpetLib/src/ggf.cc:384
    #4  0x0000000003e48544 in Carpet::ProlongateGroupBoundaries (cctkGH=0xacaae60, groups=std::vector of length 3, capacity 3 = {...}, $;0=<value optimized out>, $;1=<value optimized out>) at /data/rhaas/Zelmani/arrangements/Carpet/Carpet/src/Comm.cc:200
    #5  0x0000000003e4799d in Carpet::SyncProlongateGroups (cctkGH=0xacaae60, groups=std::vector of length 3, capacity 3 = {...}, $89=<value optimized out>, $90=<value optimized out>) at /data/rhaas/Zelmani/arrangements/Carpet/Carpet/src/Comm.cc:140
    #6  0x0000000003edcb0e in Carpet::SyncGroupsInScheduleBlock (attribute=0xac8fb08, cctkGH=0xacaae60, $q5=<value optimized out>, $q6=<value optimized out>) at /data/rhaas/Zelmani/arrangements/Carpet/Carpet/src/CallFunction.cc:408
    #7  0x0000000003edbed6 in Carpet::CallFunction (function=0x121c3da, attribute=0xac8fb08, data=0xacaae60, $01=<value optimized out>, $04=<value optimized out>, $05=<value optimized out>) at /data/rhaas/Zelmani/arrangements/Carpet/Carpet/src/CallFunction.cc:277
  2. Erik Schnetter

    gdata.cc has three assert statements checking aligned-ness near the segfault. These are not necessary (debugging leftovers) and should be removed.

  3. Roland Haas reporter

    Thank you for the quick help, Erik. Removing the three is_aligned asserts helped. I have surrounded the asserts with an "if (is_dst)", which should prevent any NULL pointer dereferences at this point. What had me confused was that the member function was called with a NULL "this" pointer. Is this strictly valid C++ or just "works on all known compilers"?

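
On the closing question: as far as the C++ standard is concerned, calling a non-static member function through a NULL pointer is undefined behavior, even when the body never touches a member, so it falls in the "works on known compilers" category rather than being strictly valid. One well-defined alternative is to pass the destination as an explicit parameter of a static (or free) function and guard every access to it. The sketch below uses a hypothetical `Data` type and `transfer` function; it is not the Carpet code or the change that was committed, only an illustration of the `if (is_dst)` guard described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for gdata; not the Carpet class.
struct Data {
  alignas(16) double values[4];

  bool is_aligned() const {
    return reinterpret_cast<std::uintptr_t>(values) % 16 == 0;
  }

  // Static, so there is no implicit `this`: the destination is an explicit
  // parameter that may legitimately be NULL on a process that only sends.
  // Returns true if the destination was written on this process.
  static bool transfer(Data* dst, const Data* src) {
    const bool is_dst = dst != nullptr;
    const bool is_src = src != nullptr;
    assert(is_dst || is_src);
    if (is_dst) {
      assert(dst->is_aligned());  // only inspect dst when it exists
    }
    if (is_dst && is_src) {
      for (int i = 0; i < 4; ++i) dst->values[i] = src->values[i];
      return true;
    }
    return false;  // sender-only or receiver-only: nothing to copy locally
  }
};
```

In this sketch a sender-only call such as `Data::transfer(nullptr, &a)` is well-defined and simply skips the destination checks.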