Aliaksey Kandratsenka
2018-02-03 23:49:59 UTC
Hi libunwind experts.
We've had a couple of recent reports of crashes in which gperftools'
stack trace capturing code somehow triggers a libunwind attempt to write
to a bogus memory location. See the original email below (the store in
the write path was highlighted in the original). Full thread at:
https://groups.google.com/d/msg/gperftools/JnxBrEBrlBs/LLfJI2gvDAAJ
The "bogus" part is probably irrelevant; it is not uncommon to deal with
code that has invalid unwind info. But I am really curious what might be
happening with that write. gperftools uses only the basic stack trace
capturing API (unw_init_local/unw_step/unw_get_reg), so we should not
be mutating any global state. This is the code:
https://github.com/gperftools/gperftools/blob/master/src/stacktrace_libunwind-inl.h
Is there something that gperftools might be doing wrong there? Or is
this a known bug (or feature) in libunwind?
---------- Forwarded message ----------
From: krisschumi <***@gmail.com>
Date: Thu, Dec 21, 2017 at 3:26 PM
Subject: Re: Is there any downside to using libgcc's backtrace?
To: gperftools <***@googlegroups.com>
Thanks for the response, Aliaksey. I forgot to mention that I was
indeed using the latest version of libunwind, which is 1.2.1, and I
built it from their git master branch.
Below is the exact line in libunwind where it crashes, in the file Ginit.c.
I will submit a bug report as well. I will try gperftools with the libgcc
backtrace. I already tried frame pointers and they work superbly, with no
crashes. Our software is a large multi-threaded CAD program where
performance is important. It also has a peak memory usage of ~200
GB because we have to store and manipulate circuit layout objects.
static int
access_mem (unw_addr_space_t as, unw_word_t addr, unw_word_t *val, int write,
            void *arg)
{
  if (unlikely (write))
    {
      Debug (16, "mem[%016lx] <- %lx\n", addr, *val);
      *(unw_word_t *) addr = *val;
    }
  else
    {
      /* validate address */
      const struct cursor *c = (const struct cursor *)arg;
      if (likely (c != NULL) && unlikely (c->validate)
          && unlikely (validate_mem (addr))) {
        Debug (16, "mem[%016lx] -> invalid\n", addr);
        return -1;
      }
      *val = *(unw_word_t *) addr;
      Debug (16, "mem[%016lx] -> %lx\n", addr, *val);
    }
  return 0;
}
--
You received this message because you are subscribed to the Google
Groups "gperftools" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to gperftools+***@googlegroups.com.
To post to this group, send email to ***@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/gperftools/3ea0ec9c-c8bc-4218-a6bf-f87f220e4841%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hi Aliaksey,
We're unable to use libunwind (because it crashes all the time) or frame pointers (because they cause a 3% performance degradation when we compile with -fno-omit-frame-pointer). Therefore, I want to configure and compile gperftools with the "--enable-stacktrace-by-backtrace" option. But then I see a note in the configure file saying "No libunwind and no frame pointer, expect crashy profiler". Can you explain why the profiler is more likely to crash with this approach? Just wanted to let you know that there's not a single try/catch block in all of our code.
https://github.com/gperftools/gperftools/wiki/gperftools'-stacktrace-capturing-methods-and-their-issues
Hi. I think you're likely to have more luck with an updated libunwind than with libgcc or the backtrace() facility (glibc's backtrace() uses libgcc as well). Try getting their latest release, or even building it from their git master branch.
If you still hit problems, it looks like the libunwind project isn't dead, so it might be best to report any crashing issues to them. One possible issue (though arguably no excuse for libunwind) is that some asm functions in glibc don't bother with unwind annotations. Are you crashing in something like memset/strlen/etc.? If so, and if the newest libunwind doesn't help, please mention that in the libunwind bug report.
As for the weakness of libgcc, the main risk is indeed getting a CPU profiler "tick" while an exception is being thrown. If you're certain that you don't throw any, it might be worth a try. But note that we have seen crash reports with backtrace()/libgcc as well. It could be, for example, that they don't expect backtracing/unwinding from a signal handler.
Thanks,
Krishna
--
You received this message because you are subscribed to the Google Groups "gperftools" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/gperftools/3975b99c-0a07-4daf-a332-38ddf92e0260%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.