Discussion:
[Libunwind-devel] unw_get_proc_info is not signal-safe with debug-frames enabled
Doug Moore
7 years ago
It seems that unw_get_proc_info calls dwarf_make_proc_info, calls
fetch_proc_info, calls dwarf_find_proc_info, calls dl_iterate_phdr,
calls dwarf_callback, calls dwarf_find_debug_frame, calls calloc, and
calloc is not signal safe on the aarch64 device I'm testing on.

So would there be a problem with replacing the calloc/realloc memory
management of 'tab' in Gfind_proc_info-lsb.c with mmap and munmap and
memcpy?

Is the memory allocated for 'tab' freed anywhere?  It's not obvious that
it is.

Thanks,

Doug Moore

Rice University
Paul Pluzhnikov
7 years ago
Post by Doug Moore
It seems that unw_get_proc_info calls dwarf_make_proc_info, calls
fetch_proc_info, calls dwarf_find_proc_info, calls dl_iterate_phdr,
calls dwarf_callback, calls dwarf_find_debug_frame, calls calloc, and
calloc is not signal safe on the aarch64 device I'm testing on.
FWIW, dl_iterate_phdr in GLIBC is not async signal safe either (may
call malloc).

https://lists.nongnu.org/archive/html/libunwind-devel/2010-05/msg00006.html
https://lists.nongnu.org/archive/html/libunwind-devel/2016-02/msg00024.html

etc.
--
Paul Pluzhnikov
Sergey Korolev
7 years ago
Doug,

Can you try this patch?
http://lists.nongnu.org/archive/html/libunwind-devel/2018-06/msg00005.html
...
Doug Moore
7 years ago
Sergey,

Your patch does remove memory allocation from the file, but oddly, I
still have problems with malloc being invoked. The second time I
tested with your patch, I got hung here:

#0 0x0000ffffb5a85d74 in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x0000ffffb5a0ea38 in malloc () from /lib64/libc.so.6
#2 0x0000ffffb59c15bc in qsort_r () from /lib64/libc.so.6
#3 0x0000ffffb592c8b0 in _ULaarch64_dwarf_find_debug_frame (found=0,
    di_debug=di_debug@entry=0xffffeb09a080, ip=ip@entry=281473734119984,
    segbase=segbase@entry=281473734053888,
    obj_name=0xffffb67dc440 "/home/dougm/.local/lib/hpctoolkit/ext-libs/libdwarf.so.1",
    start=<optimized out>, end=<optimized out>)
    at dwarf/Gfind_proc_info-lsb.c:380
#4 0x0000ffffb592cab0 in _ULaarch64_dwarf_callback (info=0xffffeb099ef8,
    size=<optimized out>, ptr=0xffffeb09a018)
    at dwarf/Gfind_proc_info-lsb.c:667
#5 0x0000ffffb5abac88 in dl_iterate_phdr () from /lib64/libc.so.6
#6 0x0000ffffb592cff4 in _ULaarch64_dwarf_find_proc_info
    (as=0xffffb5948230 <local_addr_space>, ip=ip@entry=281473734119984,
    pi=pi@entry=0xffffeb09a530, need_unwind_info=1, arg=0xffffeb09a9a0)
    at dwarf/Gfind_proc_info-lsb.c:693
#7 0x0000ffffb592a158 in fetch_proc_info (c=c@entry=0xffffeb09a1d0,
    ip=281473734119984) at dwarf/Gparser.c:454
#8 0x0000ffffb592ba08 in _ULaarch64_dwarf_reg_states_iterate
    (c=0xffffeb09a1d0, cb=0xffffb6750844 <dwarf_reg_states_callback>,
    token=0xffffeb09a1b8) at dwarf/Gparser.c:1034

So, perhaps qsort isn't safe either, as odd as that seems.
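
The backtrace above shows qsort_r reaching malloc, so one way around it would be a sort that works strictly in place. Below is a rough sketch of an allocation-free heap sort, offered only as an illustration. It is not taken from libunwind or from either patch in this thread, and the struct entry layout and the names sift_down and sort_entries are hypothetical stand-ins for whatever table is being sorted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry, sorted by start_ip_offset. */
struct entry {
    int32_t start_ip_offset;
    int32_t fde_offset;
};

/* Restore the max-heap property for the subtree rooted at `root`,
   looking only at the first `n` elements. */
void sift_down(struct entry *a, size_t root, size_t n)
{
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n)
            return;
        /* Pick the larger of the two children. */
        if (child + 1 < n &&
            a[child].start_ip_offset < a[child + 1].start_ip_offset)
            child++;
        if (a[root].start_ip_offset >= a[child].start_ip_offset)
            return;
        struct entry tmp = a[root];
        a[root] = a[child];
        a[child] = tmp;
        root = child;
    }
}

/* In-place heap sort, ascending by start_ip_offset. It makes no
   library calls and allocates nothing beyond a few locals. */
void sort_entries(struct entry *a, size_t n)
{
    if (n < 2)
        return;
    /* Build a max-heap from the bottom up. */
    for (size_t i = n / 2; i-- > 0; )
        sift_down(a, i, n);
    /* Repeatedly move the current maximum to the end of the array. */
    for (size_t end = n - 1; end > 0; end--) {
        struct entry tmp = a[0];
        a[0] = a[end];
        a[end] = tmp;
        sift_down(a, 0, end);
    }
}
```

Heap sort is not stable, so entries with equal keys may come out in a different relative order than qsort would leave them.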

Doug
...
Sergey Korolev
7 years ago
Doug,

please try this patch also.
...
Doug Moore
7 years ago
Sergey,

The two patches together seem to have solved the problems of my test
case, which I've run 14 times now without a problem. I hope your
changes find their way into libunwind soon.

Thanks for the help.

Doug Moore
Rice University
...
Doug Moore
7 years ago
Sergey,

Thanks again.  We've applied both patches and with them solved some
customer problems.

I've put your two patches into a GitHub pull request, in the hope that
the libunwind team will accept that request so that the "official"
libunwind contains these fixes. The idea that we're depending on a
"patched" libunwind scares some people, and breaks some of their
development tools.

Doug Moore

Rice University
...