"stack smashing detected" when building aarch64 kernel


Rasmus Villemoes
 

I've recently started getting an internal compiler error when building
an aarch64 kernel. It only happens once in a while, and re-running the
task usually just succeeds, so I don't know how to reproduce or trigger
this at will.

Two examples:

===
CC [M] drivers/gpu/drm/nouveau/nvkm/subdev/fb/gm20b.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/rl.o
CC [M] drivers/gpu/drm/nouveau/nvkm/subdev/fb/gp100.o
*** stack smashing detected ***: <unknown> terminated
In file included from .../kernel-source/arch/arm64/include/asm/atomic.h:15,
from .../kernel-source/include/linux/atomic.h:7,
from
.../kernel-source/include/asm-generic/bitops/atomic.h:5,
from .../kernel-source/arch/arm64/include/asm/bitops.h:26,
from .../kernel-source/include/linux/bitops.h:29,
from .../kernel-source/include/linux/kernel.h:12,
from .../kernel-source/include/linux/uio.h:8,
from .../kernel-source/include/linux/socket.h:8,
from .../kernel-source/include/uapi/linux/if.h:25,
from .../kernel-source/net/mac80211/led.c:7:
.../kernel-source/include/net/inet_sock.h: In function 'inet_sk_state_load':
.../kernel-source/arch/arm64/include/asm/barrier.h:114:8: internal
compiler error: Aborted
114 | union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u; \
| ^
.../kernel-source/include/asm-generic/barrier.h:142:29: note: in
expansion of macro '__smp_load_acquire'
142 | #define smp_load_acquire(p) __smp_load_acquire(p)
| ^~~~~~~~~~~~~~~~~~
.../kernel-source/include/net/inet_sock.h:312:9: note: in expansion of
macro 'smp_load_acquire'
312 | return smp_load_acquire(&sk->sk_state);
| ^~~~~~~~~~~~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.
.../kernel-source/scripts/Makefile.build:279: recipe for target
'net/mac80211/led.o' failed
make[3]: *** [net/mac80211/led.o] Error 1
make[3]: *** Waiting for unfinished jobs....
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lag.o
===
CC [M] drivers/gpu/drm/nouveau/nvkm/nvfw/ls.o
CC [M] drivers/gpu/drm/drm_modeset_helper.o
CC [M] drivers/gpu/drm/drm_scdc_helper.o
*** stack smashing detected ***: <unknown> terminated
In file included from
.../kernel-source/include/linux/regulator/consumer.h:35,
from
.../kernel-source/drivers/gpu/drm/nouveau/include/nvif/os.h:27,
from
.../kernel-source/drivers/gpu/drm/nouveau/include/nvkm/core/os.h:4,
from
.../kernel-source/drivers/gpu/drm/nouveau/include/nvkm/core/oclass.h:3,
from
.../kernel-source/drivers/gpu/drm/nouveau/include/nvkm/core/device.h:4,
from
.../kernel-source/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h:4,
from
.../kernel-source/drivers/gpu/drm/nouveau/nvkm/nvfw/ls.c:22:
.../kernel-source/include/linux/suspend.h:364:36: internal compiler
error: Aborted
364 | extern void mark_free_pages(struct zone *zone);
| ^~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.
.../kernel-source/scripts/Makefile.build:280: recipe for target
'drivers/gpu/drm/nouveau/nvkm/nvfw/ls.o' failed
make[5]: *** [drivers/gpu/drm/nouveau/nvkm/nvfw/ls.o] Error 1
make[5]: *** Waiting for unfinished jobs....
CC [M] drivers/gpu/drm/drm_gem_framebuffer_helper.o
===

This is with hardknott, aarch64-oe-linux-gcc (GCC) 10.2.0, building
5.10.* kernels (5.10.45 and 5.10.65 in the cases above IIRC). The build
is visiting drivers/gpu/drm/ in both cases, but in the former case it's
not actually a TU in there that fails, but one in net/, so I'm not even
sure it it has to do with something peculiar to the drivers/gpu/drm/
modules.

Has anyone seem something like this, or any ideas for figuring out
what's going on?

Rasmus


Khem Raj
 

its hard to say what might be going on. Perhaps enable builds with V=1
so you can see if its always failing to compile at same file.
or atleast you can get one file where it fails then you can use
preprocessed file to build it in a loop and see if you can get it to
fail more.

On Thu, Sep 23, 2021 at 8:16 AM Rasmus Villemoes via
lists.yoctoproject.org
<rasmus.villemoes=prevas.dk@lists.yoctoproject.org> wrote:


I've recently started getting an internal compiler error when building
an aarch64 kernel. It only happens once in a while, and re-running the
task usually just succeeds, so I don't know how to reproduce or trigger
this at will.

Two examples:

===
CC [M] drivers/gpu/drm/nouveau/nvkm/subdev/fb/gm20b.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/rl.o
CC [M] drivers/gpu/drm/nouveau/nvkm/subdev/fb/gp100.o
*** stack smashing detected ***: <unknown> terminated
In file included from .../kernel-source/arch/arm64/include/asm/atomic.h:15,
from .../kernel-source/include/linux/atomic.h:7,
from
.../kernel-source/include/asm-generic/bitops/atomic.h:5,
from .../kernel-source/arch/arm64/include/asm/bitops.h:26,
from .../kernel-source/include/linux/bitops.h:29,
from .../kernel-source/include/linux/kernel.h:12,
from .../kernel-source/include/linux/uio.h:8,
from .../kernel-source/include/linux/socket.h:8,
from .../kernel-source/include/uapi/linux/if.h:25,
from .../kernel-source/net/mac80211/led.c:7:
.../kernel-source/include/net/inet_sock.h: In function 'inet_sk_state_load':
.../kernel-source/arch/arm64/include/asm/barrier.h:114:8: internal
compiler error: Aborted
114 | union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u; \
| ^
.../kernel-source/include/asm-generic/barrier.h:142:29: note: in
expansion of macro '__smp_load_acquire'
142 | #define smp_load_acquire(p) __smp_load_acquire(p)
| ^~~~~~~~~~~~~~~~~~
.../kernel-source/include/net/inet_sock.h:312:9: note: in expansion of
macro 'smp_load_acquire'
312 | return smp_load_acquire(&sk->sk_state);
| ^~~~~~~~~~~~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.
.../kernel-source/scripts/Makefile.build:279: recipe for target
'net/mac80211/led.o' failed
make[3]: *** [net/mac80211/led.o] Error 1
make[3]: *** Waiting for unfinished jobs....
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lag.o
===
CC [M] drivers/gpu/drm/nouveau/nvkm/nvfw/ls.o
CC [M] drivers/gpu/drm/drm_modeset_helper.o
CC [M] drivers/gpu/drm/drm_scdc_helper.o
*** stack smashing detected ***: <unknown> terminated
In file included from
.../kernel-source/include/linux/regulator/consumer.h:35,
from
.../kernel-source/drivers/gpu/drm/nouveau/include/nvif/os.h:27,
from
.../kernel-source/drivers/gpu/drm/nouveau/include/nvkm/core/os.h:4,
from
.../kernel-source/drivers/gpu/drm/nouveau/include/nvkm/core/oclass.h:3,
from
.../kernel-source/drivers/gpu/drm/nouveau/include/nvkm/core/device.h:4,
from
.../kernel-source/drivers/gpu/drm/nouveau/include/nvkm/core/subdev.h:4,
from
.../kernel-source/drivers/gpu/drm/nouveau/nvkm/nvfw/ls.c:22:
.../kernel-source/include/linux/suspend.h:364:36: internal compiler
error: Aborted
364 | extern void mark_free_pages(struct zone *zone);
| ^~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://gcc.gnu.org/bugs/> for instructions.
.../kernel-source/scripts/Makefile.build:280: recipe for target
'drivers/gpu/drm/nouveau/nvkm/nvfw/ls.o' failed
make[5]: *** [drivers/gpu/drm/nouveau/nvkm/nvfw/ls.o] Error 1
make[5]: *** Waiting for unfinished jobs....
CC [M] drivers/gpu/drm/drm_gem_framebuffer_helper.o
===

This is with hardknott, aarch64-oe-linux-gcc (GCC) 10.2.0, building
5.10.* kernels (5.10.45 and 5.10.65 in the cases above IIRC). The build
is visiting drivers/gpu/drm/ in both cases, but in the former case it's
not actually a TU in there that fails, but one in net/, so I'm not even
sure it it has to do with something peculiar to the drivers/gpu/drm/
modules.

Has anyone seem something like this, or any ideas for figuring out
what's going on?

Rasmus