gdb built with musl libc segfault


Lluis Campos
 

Hi again,

On 04.04.2019 12:58, Lluis Campos wrote:

On 03.04.2019 18:59, Khem Raj wrote:
We have switched to using PIE by default in the last few releases. Can you
try master, but comment out
require conf/distro/include/security_flags.inc
in your distro conf file?
Hi Khem,

Thanks for your help.

Commenting out this line seems to work!

FYI,

Re-enabling the require of security_flags.inc and instead disabling the security flags only for my recipe is enough.

I have added to my conf/local.conf:

SECURITY_CFLAGS_pn-mender = "${SECURITY_NOPIE_CFLAGS}"
SECURITY_LDFLAGS_pn-mender = ""

So I will continue this way until I find the root cause in my cgo code.

Thanks again for your help.




I will try to figure out exactly which option is causing the issue.



Thanks!


Lluís Campos
mender.io
Northern Tech

--
_______________________________________________
yocto mailing list
yocto@yoctoproject.org
https://lists.yoctoproject.org/listinfo/yocto


Lluis Campos
 

On 03.04.2019 18:59, Khem Raj wrote:
We have switched to using PIE by default in the last few releases. Can you
try master, but comment out
require conf/distro/include/security_flags.inc
in your distro conf file?
Hi Khem,

Thanks for your help.

Commenting out this line seems to work!

I will try to figure out exactly which option is causing the issue.



Thanks!


Lluís Campos
mender.io
Northern Tech


Khem Raj
 

On Wed, Apr 3, 2019 at 9:59 AM Khem Raj <raj.khem@gmail.com> wrote:

We have switched to using PIE by default in the last few releases. Can you
try master, but comment out
require conf/distro/include/security_flags.inc
in your distro conf file?
I built a fresh image using Yoe distro master today, and gdb seems to
work fine on rpi3/musl:
https://github.com/YoeDistro/yoe-distro




Khem Raj
 

We have switched to using PIE by default in the last few releases. Can you
try master, but comment out
require conf/distro/include/security_flags.inc
in your distro conf file?



 

On 02/04/2019 14:51, Lluis Campos wrote:
Hi Paul,
On 02.04.2019 14:49, Paul Barker wrote:
Hi Lluis,

This is an issue I've seen before with runc and gdb.

In runc we saw SIGILL which I tracked down to some hideous setjmp/longjmp magic written in C. cgo is used to include this C code in with the Go code that comprises the rest of the application.
Our application is written in Go and we use CGO as well. So it sounds quite similar.
Does your application use setjmp/longjmp or C++ exceptions in the sections built with CGO?

I've dug into the runc disassembly and added extra prints, and I can see that it is at calls to those functions that the program counter jumps off into the weeds, resulting in SIGILL. In gdb, C++ exception handling is used around gdb_main(), and strace shows that the crash happens very early in execution, so I suspect it's setjmp/longjmp again, used by C++ exception handling.


In gdb we saw SIGSEGV which is what you've got above.

I think things are being correctly linked against musl but then there's some runtime issue in recent musl versions, possibly in conjunction with recent kernel headers.

Are you using the thud or master branch?
I am using the thud branch. I haven't tried master yet, but I will do so later today.
I've reproduced the issue on master. I've also confirmed that it does not occur on the sumo branch, even if we uprev musl to the same version used on master. It's likely a gcc/musl incompatibility of some kind introduced in a recent gcc version.

I'm passing this one to a colleague for now but I'll try to have another look myself next week. It's captured in our issue tracker for Oryx here: https://gitlab.com/oryx/oryx/issues/14.

Thanks,

--
Paul Barker
Managing Director & Principal Engineer
Beta Five Ltd


Lluis Campos
 

Hi Paul,


On 02.04.2019 14:49, Paul Barker wrote:
Hi Lluis,

This is an issue I've seen before with runc and gdb.

In runc we saw SIGILL which I tracked down to some hideous setjmp/longjmp magic written in C. cgo is used to include this C code in with the Go code that comprises the rest of the application.
Our application is written in Go and we use CGO as well. So it sounds quite similar.



In gdb we saw SIGSEGV which is what you've got above.

I think things are being correctly linked against musl but then there's some runtime issue in recent musl versions, possibly in conjunction with recent kernel headers.

Are you using the thud or master branch?
I am using the thud branch. I haven't tried master yet, but I will do so later today.


Thanks,

Lluís


 

Hi Lluis,

This is an issue I've seen before with runc and gdb.

In runc we saw SIGILL which I tracked down to some hideous setjmp/longjmp magic written in C. cgo is used to include this C code in with the Go code that comprises the rest of the application.

In gdb we saw SIGSEGV which is what you've got above.

I think things are being correctly linked against musl but then there's some runtime issue in recent musl versions, possibly in conjunction with recent kernel headers.

Are you using the thud or master branch?

--
Paul Barker
Managing Director & Principal Engineer
Beta Five Ltd


Lluis Campos
 

Hi all,

This is my very first question on the Yocto mailing list. Very excited! Please let me know if I should use another list for this.


I am building an image using musl libc instead of gnu libc. I am not using the yocto-tiny distro; instead, I achieve this by setting the following in my local.conf:

TCLIBC = "musl"


My app (mender) segfaults right at startup. See the output from strace:

root@raspberrypi3:~# strace mender
execve("/usr/bin/mender", ["mender"], 0x7ee65e10 /* 13 vars */) = 0
set_tls(0x76f1bffc)                     = 0
set_tid_address(0x76f1bfa0)             = 3020
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x530bc8} ---
+++ killed by SIGSEGV +++
Segmentation fault


To be able to debug the process, I added gdb to my image by adding the following to my local.conf:

CORE_IMAGE_EXTRA_INSTALL += "packagegroup-core-buildessential packagegroup-core-tools-debug"


Then, ironically, gdb itself also segfaults:

root@raspberrypi3:~# strace gdb 2>&1 | tail
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
getdents64(3, /* 6 entries */, 2048)    = 144
getdents64(3, /* 0 entries */, 2048)    = 0
close(3)                                = 0
ioctl(0, TIOCGWINSZ, {ws_row=25, ws_col=74, ws_xpixel=0, ws_ypixel=0}) = 0
getcwd("/home/root", 4096)              = 11
access("/usr/local/bin/gdb", X_OK)      = -1 ENOENT (No such file or directory)
access("/usr/bin/gdb", X_OK)            = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x7e35aff0} ---
+++ killed by SIGSEGV +++


So, what is going on here? My guess is that some recipes are being wrongly linked with gnu libc instead of musl, and then cannot run on my device.

Any ideas on how to debug the issue?

Thanks!


Lluís Campos
mender.io
Northern Tech