Discussion:
[Libunwind-devel] unw_get_proc_info is not signal-safe with debug-frames enabled
Doug Moore
2018-09-06 22:55:50 UTC
It seems that unw_get_proc_info calls dwarf_make_proc_info, calls
fetch_proc_info, calls dwarf_find_proc_info, calls dl_iterate_phdr,
calls dwarf_callback, calls dwarf_find_debug_frame, calls calloc, and
calloc is not signal safe on the aarch64 device I'm testing on.

So would there be a problem with replacing the calloc/realloc memory
management of 'tab' in Gfind_proc_info-lsb.c with mmap and munmap and
memcpy?
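
A minimal sketch of what such a replacement could look like, not the libunwind code: `struct mmap_buf`, `mmap_buf_grow`, and `mmap_buf_free` are hypothetical names invented for illustration. It assumes Linux, where `mmap` and `munmap` are thin system-call wrappers that take no allocator lock, although neither appears on the POSIX list of async-signal-safe functions. Anonymous mappings are zero-filled, which matches what calloc provides.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical growable buffer; not a libunwind type. */
struct mmap_buf {
    void  *data;   /* mapped region, or NULL when empty */
    size_t cap;    /* mapped size in bytes */
};

/* Grow buf to at least new_cap bytes, preserving its contents.
   Returns 0 on success, -1 on failure (buf is left unchanged). */
static int mmap_buf_grow(struct mmap_buf *buf, size_t new_cap)
{
    if (new_cap <= buf->cap)
        return 0;

    void *p = mmap(NULL, new_cap, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (buf->data != NULL) {
        memcpy(p, buf->data, buf->cap);   /* stands in for realloc's copy */
        munmap(buf->data, buf->cap);
    }
    buf->data = p;
    buf->cap = new_cap;
    return 0;
}

/* Release the mapping; the buffer can be grown again afterwards. */
static void mmap_buf_free(struct mmap_buf *buf)
{
    if (buf->data != NULL)
        munmap(buf->data, buf->cap);
    buf->data = NULL;
    buf->cap = 0;
}
```

Each growth step costs a system call and a full copy, so a caller would probably want to double the capacity rather than grow by one entry at a time.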

Is the memory allocated for 'tab' freed anywhere?  It's not obvious that
it is.

Thanks,

Doug Moore

Rice University
Paul Pluzhnikov
2018-09-06 23:49:56 UTC
Post by Doug Moore
It seems that unw_get_proc_info calls dwarf_make_proc_info, calls
fetch_proc_info, calls dwarf_find_proc_info, calls dl_iterate_phdr,
calls dwarf_callback, calls dwarf_find_debug_frame, calls calloc, and
calloc is not signal safe on the aarch64 device I'm testing on.
FWIW, dl_iterate_phdr in GLIBC is not async signal safe either (may
call malloc).

https://lists.nongnu.org/archive/html/libunwind-devel/2010-05/msg00006.html
https://lists.nongnu.org/archive/html/libunwind-devel/2016-02/msg00024.html

etc.
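
One common way to sidestep this, sketched here under stated assumptions and not taken from libunwind, is to call dl_iterate_phdr from normal context and copy what is needed into a fixed-size table that a signal handler can read later. The names `seg_table`, `cache_segments`, and `ip_is_mapped`, and the capacity `MAX_SEGMENTS`, are hypothetical.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEGMENTS 1024   /* arbitrary fixed capacity for this sketch */

struct seg_range { uintptr_t start, end; };

static struct seg_range seg_table[MAX_SEGMENTS];
static size_t seg_count;

/* dl_iterate_phdr callback: record the address range of each PT_LOAD
   segment of each loaded object. */
static int record_segments(struct dl_phdr_info *info, size_t size, void *arg)
{
    (void)size;
    (void)arg;
    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;
        if (seg_count == MAX_SEGMENTS)
            return 1;   /* table full: a nonzero return stops the iteration */
        seg_table[seg_count].start = info->dlpi_addr + ph->p_vaddr;
        seg_table[seg_count].end = seg_table[seg_count].start + ph->p_memsz;
        seg_count++;
    }
    return 0;
}

/* Call from normal (non-signal) context, e.g. at startup. */
static void cache_segments(void)
{
    seg_count = 0;
    dl_iterate_phdr(record_segments, NULL);
}

/* Reads only the static table, so it takes no locks and never allocates. */
static int ip_is_mapped(uintptr_t ip)
{
    for (size_t i = 0; i < seg_count; i++)
        if (ip >= seg_table[i].start && ip < seg_table[i].end)
            return 1;
    return 0;
}
```

The sketch ignores two real concerns: the table goes stale after dlopen or dlclose, and rebuilding it while a signal handler is reading it would need extra synchronization.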
--
Paul Pluzhnikov
Sergey Korolev
2018-09-07 07:37:45 UTC
Doug,

Can you try this patch?
http://lists.nongnu.org/archive/html/libunwind-devel/2018-06/msg00005.html
_______________________________________________
Libunwind-devel mailing list
https://lists.nongnu.org/mailman/listinfo/libunwind-devel
Doug Moore
2018-09-07 08:18:23 UTC
Sergey,

Your patch does remove memory allocation from the file, but oddly, I
still have problems with malloc being invoked. The second time I
tested with your patch, I got hung here:

#0 0x0000ffffb5a85d74 in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x0000ffffb5a0ea38 in malloc () from /lib64/libc.so.6
#2 0x0000ffffb59c15bc in qsort_r () from /lib64/libc.so.6
#3 0x0000ffffb592c8b0 in _ULaarch64_dwarf_find_debug_frame (found=0,
   di_debug=di_debug@entry=0xffffeb09a080, ip=ip@entry=281473734119984,
   segbase=segbase@entry=281473734053888, obj_name=0xffffb67dc440
   "/home/dougm/.local/lib/hpctoolkit/ext-libs/libdwarf.so.1",
   start=<optimized out>, end=<optimized out>)
   at dwarf/Gfind_proc_info-lsb.c:380
#4 0x0000ffffb592cab0 in _ULaarch64_dwarf_callback
   (info=0xffffeb099ef8, size=<optimized out>, ptr=0xffffeb09a018)
   at dwarf/Gfind_proc_info-lsb.c:667
#5 0x0000ffffb5abac88 in dl_iterate_phdr () from /lib64/libc.so.6
#6 0x0000ffffb592cff4 in _ULaarch64_dwarf_find_proc_info
   (as=0xffffb5948230 <local_addr_space>, ip=ip@entry=281473734119984,
   pi=pi@entry=0xffffeb09a530, need_unwind_info=1, arg=0xffffeb09a9a0)
   at dwarf/Gfind_proc_info-lsb.c:693
#7 0x0000ffffb592a158 in fetch_proc_info (c=c@entry=0xffffeb09a1d0,
   ip=281473734119984) at dwarf/Gparser.c:454
#8 0x0000ffffb592ba08 in _ULaarch64_dwarf_reg_states_iterate
   (c=0xffffeb09a1d0, cb=0xffffb6750844 <dwarf_reg_states_callback>,
   token=0xffffeb09a1b8) at dwarf/Gparser.c:1034

So, perhaps qsort isn't safe either, as odd as that seems.
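
The backtrace shows glibc's qsort_r reaching malloc, so any sort used on this path has to work in place. The follow-up patch is not reproduced in this thread; purely as an illustration, an allocation-free heapsort over a hypothetical `struct entry` (not the libunwind table type) could look like this:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry, for illustration only. */
struct entry {
    int32_t key;
    int32_t value;
};

/* Restore the max-heap property for the subtree rooted at index root,
   considering only the first n elements. */
static void sift_down(struct entry *a, size_t root, size_t n)
{
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n)
            return;
        if (child + 1 < n && a[child].key < a[child + 1].key)
            child++;                     /* pick the larger child */
        if (a[root].key >= a[child].key)
            return;
        struct entry tmp = a[root];
        a[root] = a[child];
        a[child] = tmp;
        root = child;
    }
}

/* Sort a[0..n) by ascending key, in place, without allocating. */
static void heapsort_entries(struct entry *a, size_t n)
{
    if (n < 2)
        return;
    for (size_t i = n / 2; i-- > 0; )
        sift_down(a, i, n);              /* build the heap */
    for (size_t end = n - 1; end > 0; end--) {
        struct entry tmp = a[0];         /* move the current max to the end */
        a[0] = a[end];
        a[end] = tmp;
        sift_down(a, 0, end);
    }
}
```

Heapsort uses constant extra space and no recursion, at the cost of not being stable; entries with equal keys may change relative order.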

Doug
Sergey Korolev
2018-09-07 12:44:05 UTC
Doug,

Please try this patch also.
Doug Moore
2018-09-07 16:09:12 UTC
Sergey,

The two patches together seem to have solved the problems in my test
case, which I've now run 14 times without a problem. I hope your
changes find their way into libunwind soon.

Thanks for the help.

Doug Moore
Rice University
Doug Moore
2018-09-13 22:28:17 UTC
Sergey,

Thanks again.  We've applied both patches and with them solved some
customer problems.

I've put your two patches into a GitHub pull request, in the hope that
the libunwind team will accept it so that the "official" libunwind
contains these fixes. The idea that we're depending on a "patched"
libunwind scares some people, and breaks some of their development
tools.

Doug Moore

Rice University