Discussion:
[Libunwind-devel] libunwind write access causes crash while capturing stack trace (was: Fwd: Is there any downside to using libgcc's backtrace?)
Aliaksey Kandratsenka
2018-02-03 23:49:59 UTC
Hi libunwind experts.

We've had a couple of recent reports of crashes in which gperftools'
stack trace capturing code appears to trigger a libunwind attempt to
write into a bogus memory location. See the original email below (the
store in the write path was highlighted in the original). Full thread at:
https://groups.google.com/d/msg/gperftools/JnxBrEBrlBs/LLfJI2gvDAAJ

The "bogus" part is probably irrelevant: it is not uncommon to deal with
code that has invalid unwind info. But I am really curious about what
might be happening with that write. gperftools uses only the basic
stack trace capturing API (unw_init_local/unw_step/unw_get_reg), so we
should not be mutating any global state. This is the code:
https://github.com/gperftools/gperftools/blob/master/src/stacktrace_libunwind-inl.h

Is there something that gperftools might be doing wrong there? Or is
this a known bug (or feature) in libunwind?

---------- Forwarded message ----------
From: krisschumi <***@gmail.com>
Date: Thu, Dec 21, 2017 at 3:26 PM
Subject: Re: Is there any downside to using libgcc's backtrace?
To: gperftools <***@googlegroups.com>


Thanks for the response, Aliaksey. I forgot to mention that I was
indeed using the latest version of libunwind, which is 1.2.1. And I
built it from their git master branch.

Below is the exact line in libunwind where it crashes, in the file Ginit.c.
I will submit a bug report as well. I will also try gperftools with libgcc
backtrace. I already tried frame pointers and that works superbly, with no
crashes. Our software is a large multi-threaded CAD program where
performance is important. It also has a peak memory footprint of ~200
GB because we have to store and manipulate circuit layout objects.

static int
access_mem (unw_addr_space_t as, unw_word_t addr, unw_word_t *val, int write,
            void *arg)
{
  if (unlikely (write))
    {
      Debug (16, "mem[%016lx] <- %lx\n", addr, *val);
      *(unw_word_t *) addr = *val;
    }
  else
    {
      /* validate address */
      const struct cursor *c = (const struct cursor *)arg;
      if (likely (c != NULL) && unlikely (c->validate)
          && unlikely (validate_mem (addr))) {
        Debug (16, "mem[%016lx] -> invalid\n", addr);
        return -1;
      }
      *val = *(unw_word_t *) addr;
      Debug (16, "mem[%016lx] -> %lx\n", addr, *val);
    }
  return 0;
}

Post by krisschumi
Hi Aliaksey,
We're unable to use libunwind (because it crashes all the time) or
frame pointers (because they cause a 3% performance degradation when
compiled with -fno-omit-frame-pointer). Therefore, I want to configure
and compile gperftools with the "--enable-stacktrace-by-backtrace"
option. But then I see a note in the configure file saying "No
libunwind and no frame pointer, expect crashy profiler". Can you
explain why the profiler is more likely to crash with this approach?
Just so you know, there's not a single try/catch block in all of our
code. There's no mention of any crashes with libgcc's backtrace in
your wiki:
https://github.com/gperftools/gperftools/wiki/gperftools'-stacktrace-capturing-methods-and-their-issues

Post by Aliaksey Kandratsenka
Hi. I think you're likely to have more luck with an updated libunwind
than with libgcc or the backtrace() facility (glibc's backtrace uses
libgcc as well). I.e. try getting their latest release or even building
it from their git master branch.

If you still hit problems, it looks like the libunwind project isn't
dead, so it might be best to just report any crashing issues to them.
One possible issue (but arguably not an excuse for libunwind) is that
some asm functions in glibc don't bother with unwind annotations. Are
you crashing in something like memset/strlen/etc.? If so, and if the
newest libunwind doesn't help, then please mention that in the
libunwind bug report.

As for the weakness of libgcc, indeed the main risk is getting a cpu
profiler "tick" while an exception is being thrown. If you're certain
that you don't have exceptions, it might be worth a try. But note that
we have seen crash reports with backtrace()/libgcc as well. It could be
that they're not expecting backtracing/unwinding from a signal handler,
for example.

Thanks,
Krishna
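
For reference, the signal-handler concern raised above can be made concrete. The sketch below is not gperftools code; it only illustrates the glibc-documented caveat that the first backtrace() call loads libgcc dynamically (which usually calls malloc), so a profiler-style handler should make one warm-up call outside signal context first. All names (on_tick, capture_from_signal, MAX_FRAMES) are hypothetical, SIGUSR1 stands in for a profiling timer signal, and priming reduces but does not remove the risks discussed in this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* expose sigaction() under strict -std modes */
#endif
#include <assert.h>
#include <execinfo.h>
#include <signal.h>
#include <string.h>

#define MAX_FRAMES 64           /* hypothetical buffer size */

static void *g_frames[MAX_FRAMES];
static volatile sig_atomic_t g_depth = 0;

/* Stand-in for a CPU profiler "tick": capture the interrupted stack. */
static void on_tick(int signo)
{
    (void)signo;
    g_depth = backtrace(g_frames, MAX_FRAMES);
}

/* Primes backtrace(), delivers one SIGUSR1 to the calling thread, and
   returns the number of frames captured in the handler, or -1 on error. */
int capture_from_signal(void)
{
    void *warmup[1];
    struct sigaction sa, old;
    int n;

    /* Warm-up call in normal context, so the dynamic load of libgcc
       (and any malloc it triggers) does not happen inside the handler. */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0)
        return -1;

    if (raise(SIGUSR1) != 0)    /* handler has run by the time raise() returns */
        n = -1;
    else
        n = (int)g_depth;

    sigaction(SIGUSR1, &old, NULL);   /* restore the previous disposition */
    assert(n <= MAX_FRAMES);          /* never more frames than the buffer holds */
    return n;
}
```

Calling capture_from_signal() once should return a positive frame count on a glibc system. It does not address the other risk mentioned above, a tick arriving while an exception is being thrown.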
--
You received this message because you are subscribed to the Google Groups "gperftools" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/gperftools/3975b99c-0a07-4daf-a332-38ddf92e0260%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Todd Lipcon
2018-02-04 06:57:32 UTC
We recently hit a crash in our project on this same code path and found
that it's been fixed in libunwind trunk with commit
836c91c43d7a996028aa7e8d1f53630a6b8e7cbe. Specifically, the issue is that
the validate_mem call checks using mincore whether the address is mapped,
but this returns true even for a page that's mapped without read
permissions. It seems that there are various mappings with PROT_NONE --
maybe some sort of guard/fence pages or somesuch, we didn't really
investigate that bit.
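
To illustrate the behavior described above, here is a minimal Linux-only sketch. It is not taken from libunwind, and the helper names (mapped_per_mincore, readable_per_pipe, probe_prot_none) are made up for this example. It maps one PROT_NONE page, shows that mincore() still accepts it as mapped, and contrasts that with one way of probing actual readability: asking the kernel to copy a byte from the address into a pipe, which fails with EFAULT instead of faulting in user space. Whether the commit mentioned above uses this exact approach is not something this sketch claims.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for mincore() and MAP_ANONYMOUS */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if mincore() accepts the page containing addr as mapped.
   Protection bits are not consulted, which is the pitfall in question. */
static int mapped_per_mincore(const void *addr, size_t page_size)
{
    unsigned char vec;
    void *page;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    page = (void *)((uintptr_t)addr & ~((uintptr_t)page_size - 1));
    return mincore(page, page_size, &vec) == 0;
}

/* Returns 1 if one byte at addr is readable, 0 if not, -1 on setup error.
   The kernel performs the copy, so an unreadable address yields EFAULT
   from write() rather than a SIGSEGV in this process. */
static int readable_per_pipe(const void *addr)
{
    int fds[2];
    int ok;

    if (pipe(fds) != 0)
        return -1;
    ok = (write(fds[1], addr, 1) == 1);
    close(fds[0]);
    close(fds[1]);
    return ok;
}

/* Maps one PROT_NONE page and reports both verdicts. Returns 0 on success. */
int probe_prot_none(int *mincore_ok, int *pipe_ok)
{
    long ps = sysconf(_SC_PAGESIZE);
    void *p;

    if (ps <= 0)
        return -1;
    p = mmap(NULL, (size_t)ps, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    *mincore_ok = mapped_per_mincore(p, (size_t)ps);
    *pipe_ok = readable_per_pipe(p);
    munmap(p, (size_t)ps);
    return 0;
}
```

On Linux, probe_prot_none() is expected to set *mincore_ok to 1 and *pipe_ok to 0 for the same page, which is why a mincore-only check can let a read of such a page through.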

Our plan is to either upgrade to libunwind trunk (we static-link it) or
backport that patch specifically into libunwind 1.1a (the version that we
are using).

-Todd
