Correct backtrace generation in Carpet

Issue #1100 resolved
Erik Schnetter created an issue

The file backtrace.cc in CarpetLib does not #include <cctk.h>; hence all HAVE_BACKTRACE* macros are undefined, and only basic backtraces are generated.

Correcting this is non-trivial, since the backtrace code is arcane, is written in C, probably expects glibc, contains (I'm fairly certain) memory allocation errors, and doesn't build e.g. on Mac OSX. The code also spends an inordinate amount of time allocating and freeing string buffers, which should be replaced by simply using C++ streams.

The backtrace code also probably requires a few more autoconf tests, so that it can be disabled where it does not work.
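
A minimal sketch of the two points above, not the actual CarpetLib code: the HAVE_* guards only take effect if the configuration header is included before they are tested, and the string handling can be done with a C++ stream instead of manual buffers. Since `cctk.h` is not available outside a Cactus build, the sketch approximates the macros with `__has_include`; that stand-in and the helper name `format_backtrace` are assumptions made for illustration. Like the malloc-based original, this is not async-signal-safe.

```cpp
// Stand-in for '#include <cctk.h>' (assumption): in a Cactus build the
// HAVE_* macros come from the configuration header, which must be included
// BEFORE any '#ifdef HAVE_BACKTRACE' test. Here we approximate them.
#if defined(__has_include)
#  if __has_include(<execinfo.h>)
#    define HAVE_BACKTRACE 1
#    define HAVE_BACKTRACE_SYMBOLS 1
#  endif
#endif

#include <cstdlib>
#include <sstream>
#include <string>

#ifdef HAVE_BACKTRACE
#  include <execinfo.h>
#endif

// Hypothetical helper (not part of CarpetLib): collect the current call
// stack into a string using a C++ stream rather than hand-managed buffers.
std::string format_backtrace() {
  std::ostringstream out;
  out << "Backtrace:\n";
#if defined(HAVE_BACKTRACE) && defined(HAVE_BACKTRACE_SYMBOLS)
  void *frames[64];
  const int nframes = backtrace(frames, 64);
  // backtrace_symbols() returns one malloc'ed array; a single free suffices.
  char **symbols = backtrace_symbols(frames, nframes);
  for (int i = 0; i < nframes; ++i) {
    out << i + 1 << ". ";
    if (symbols)
      out << symbols[i];
    else
      out << frames[i];  // fall back to the raw address
    out << "\n";
  }
  std::free(symbols);
#else
  out << "(backtrace support not available in this build)\n";
#endif
  return out.str();
}
```

If the stand-in block at the top is moved below the `#ifdef HAVE_BACKTRACE` tests, the function silently falls back to the "not available" branch, which is the same kind of failure described in this issue.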


Comments (7)

  1. Ian Hinder

    Given that backtraces are very useful, and used to work on "standard" linux systems, maybe the code could be enabled despite the problems that you mention. The memory allocation errors and buffers shouldn't be a problem since this only happens when the process is about to terminate due to the error anyway, right? Can we detect that we have glibc and Linux, and enable the backtrace code in that case? I know it's not as elegant as correctly autoconfing everything, but it's probably a lot easier. And having a backtrace with meaningful symbols is extremely useful.

  2. Roland Haas

    Replying to [comment:1 hinder]:

    Given that backtraces are very useful, and used to work on "standard" linux systems, maybe the code could be enabled despite the problems that you mention. The memory allocation errors and buffers shouldn't be a problem since this only happens when the process is about to terminate due to the error anyway, right? Can we detect that we have glibc and Linux, and enable the backtrace code in that case? I know it's not as elegant as correctly autoconfing everything, but it's probably a lot easier. And having a backtrace with meaningful symbols is extremely useful.

    I second that :-)
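
The glibc-and-Linux check suggested in comment 1 could be sketched with the preprocessor alone, without new autoconf tests. The macro name `CARPETLIB_ENABLE_BACKTRACE` and the function `backtrace_enabled` are hypothetical and do not exist in Carpet. Note that `__GLIBC__` is only defined after some glibc header has been included.

```cpp
// Any C library header will do; glibc's headers pull in <features.h>,
// which is what defines __GLIBC__.
#include <cstdlib>

// Hypothetical switch (assumption, not an existing Carpet macro):
// enable the full backtrace code only on Linux with glibc.
#if defined(__linux__) && defined(__GLIBC__)
#  define CARPETLIB_ENABLE_BACKTRACE 1
#else
#  define CARPETLIB_ENABLE_BACKTRACE 0
#endif

// Lets other code query the decision at compile time.
constexpr bool backtrace_enabled() {
  return CARPETLIB_ENABLE_BACKTRACE != 0;
}
```

This is coarser than a real configure test: it assumes that every Linux system with glibc provides a working backtrace(), and it disables the code everywhere else.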

  3. Roland Haas

    The current code in Carpet produces backtraces (but no line numbers) on Linux but not on OSX:

    Backtrace from rank 0 pid 33066:
    1. CarpetLib::signal_handler(int)   [/data/rhaas/postdoc/gr/cactus/ET_trunk/exe/cactus_sim(_ZN9CarpetLib14signal_handlerEi+0xe7)            + [0x563a32128657]]
    2. /lib/x86_64-linux-gnu/libc.so.6(+0x3a100) [0x7fea0e0df100]
    3. /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x141) [0x7fea0e0df081]
    4. /lib/x86_64-linux-gnu/libc.so.6(abort+0x121) [0x7fea0e0ca535]
    5. /lib/x86_64-linux-gnu/libc.so.6(+0x2540f) [0x7fea0e0ca40f]
    6. /lib/x86_64-linux-gnu/libc.so.6(+0x32b92) [0x7fea0e0d7b92]
    7. /data/rhaas/postdoc/gr/cactus/ET_trunk/exe/cactus_sim(+0xca22a3) [0x563a320582a3]
    8. /data/rhaas/postdoc/gr/cactus/ET_trunk/exe/cactus_sim(CCTKi_ScheduleGHInit+0x4a) [0x563a340527ba]
    9. Carpet::Initialise(tFleshConfig*)   [/data/rhaas/postdoc/gr/cactus/ET_trunk/exe/cactus_sim(_ZN6Carpet10InitialiseEP12tFleshConfig+0x2cc) + [0x563a3201f6dc]]
    10. /data/rhaas/postdoc/gr/cactus/ET_trunk/exe/cactus_sim(main+0x35) [0x563a31c585f5]
    11. /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7fea0e0cbbbb]
    12. /data/rhaas/postdoc/gr/cactus/ET_trunk/exe/cactus_sim(_start+0x2a) [0x563a31c5e53a]
    
    The hexadecimal addresses in this backtrace can also be interpreted
    with a debugger (e.g. gdb), or with the 'addr2line' (or 'gaddr2line')
    command line tool: 'addr2line -e cactus_sim <address>'.
    

    and

    Backtrace from rank 0 pid 76396:
    
    The hexadecimal addresses in this backtrace can also be interpreted
    with a debugger (e.g. gdb), or with the 'addr2line' (or 'gaddr2line')
    command line tool: 'addr2line -e cactus_sim <address>'.
    

    On OSX, all of HAVE_BACKTRACE, HAVE_DLADDR, HAVE___CXA_DEMANGLE, and HAVE_BACKTRACE_SYMBOLS were set in cctk_Config.h, so the empty stack trace is basically a failure of backtrace() to return anything useful. It is not failing outright: letting it output all stack frames shows that it does produce a backtrace of the backtrace function call itself and one more level up.

    This may be an optimization issue. Compiling the simple backtrace example in https://stackoverflow.com/questions/77005/how-to-automatically-generate-a-stacktrace-when-my-program-crashes with -O0 shows the full backtrace, but with -O3 it is only 2 levels deep (same as in Cactus). So gcc may simply mess up the call stack at high enough optimization settings (as icc does).

    Building just backtrace.cc with -O0 lets me get a backtrace.

    This pull request implements this and also removes dead code from backtrace.cc (that was inside of #ifdef HAVE_BACKTRACE but before #include <cctk.h>): https://bitbucket.org/eschnett/carpet/pull-requests/31/rhaas-deadbacktrace/diff
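
One way to check the optimization hypothesis from comment 3 is to count how many frames backtrace() reports from the bottom of a call chain of known length, then build the same file with -O0 and with -O3 and compare. This is a sketch under assumptions, not the Stack Overflow example and not code from the pull request; the names `backtrace_depth`, `nested`, and `SKETCH_HAVE_BACKTRACE` are made up for illustration.

```cpp
// Use execinfo.h only where it exists (assumption: its presence implies a
// usable backtrace(); a real build would rely on a configure test instead).
#if defined(__has_include)
#  if __has_include(<execinfo.h>)
#    include <execinfo.h>
#    define SKETCH_HAVE_BACKTRACE 1
#  endif
#endif

// Number of stack frames backtrace() reports from this point,
// or -1 if backtrace() is not available on this platform.
int backtrace_depth() {
#ifdef SKETCH_HAVE_BACKTRACE
  void *frames[128];
  return backtrace(frames, 128);
#else
  return -1;
#endif
}

// Builds a call chain 'levels' deep, then measures the stack at the bottom.
// An optimizer is free to inline or flatten this chain, which is exactly
// the effect the comparison is meant to expose.
int nested(int levels) {
  if (levels <= 0)
    return backtrace_depth();
  const int depth = nested(levels - 1);
  return depth;
}
```

Calling `nested(5)` from a small `main` and printing the result for both builds shows whether the reported depth depends on the optimization level on a given compiler and platform.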
