qemux86-64 ltp backtrace


Richard Purdie
 

Hi,

We saw an LTP image failure today:

https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/2937

It was in the controller tests (cgroup ones). I logged in and was able to grab
the backtrace below. More logs are here:

https://autobuilder.yocto.io/pub/failed-builds-data/alma8-ty-2/qemux86-64-ltp-20220224/

(There were earlier OOMs in the build, which is normal for LTP testing, and
they occurred quite a while before this failure.)

Not sure if it is helpful/useful but wanted to log it and save the logs before
they disappeared.

Cheers,

Richard

[ 2117.798342] BUG: unable to handle page fault for address: 000027ff010040a8
[ 2117.802960] #PF: supervisor instruction fetch in kernel mode
[ 2117.803589] #PF: error_code(0x0010) - not-present page
[ 2117.804169] PGD 0 P4D 0
[ 2117.804471] Oops: 0010 [#1] PREEMPT SMP PTI
[ 2117.804945] CPU: 2 PID: 7 Comm: kworker/u8:0 Not tainted 5.15.22-yocto-standard #1
[ 2117.805782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[ 2117.807015] Workqueue: events_unbound flush_memcg_stats_dwork
[ 2117.807652] RIP: 0010:0x27ff010040a8
[ 2117.808052] Code: Unable to access opcode bytes at RIP 0x27ff0100407e.
[ 2117.808820] RSP: 0018:ffffaab680043dd8 EFLAGS: 00010002
[ 2117.809402] RAX: 000027ff010040a8 RBX: ffffa2b741a3e280 RCX: ffffa2b741119000
[ 2117.810185] RDX: 0000000000000026 RSI: 0000000000000003 RDI: ffffa2b741a3e280
[ 2117.810961] RBP: ffffaab680043e38 R08: 0000000000000000 R09: 0000000000000000
[ 2117.811740] R10: 0000000000000003 R11: 0000000000000018 R12: 0000000000000003
[ 2117.812520] R13: ffffffffb96ccd30 R14: ffffffffb96ccd30 R15: ffffffffb96ccfe0
[ 2117.813301] FS: 0000000000000000(0000) GS:ffffa2b77ed00000(0000) knlGS:0000000000000000
[ 2117.814186] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2117.814817] CR2: 000027ff010040a8 CR3: 0000000001242000 CR4: 00000000001506e0
[ 2117.815601] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2117.816383] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2117.817167] Call Trace:
[ 2117.817444] <TASK>
[ 2117.817684] ? cgroup_rstat_flush_locked+0x20b/0x2c0
[ 2117.818245] cgroup_rstat_flush_irqsafe+0x29/0x40
[ 2117.818764] __mem_cgroup_flush_stats+0x3e/0x60
[ 2117.819271] flush_memcg_stats_dwork+0xe/0x30
[ 2117.819757] process_one_work+0x1d5/0x3e0
[ 2117.820209] worker_thread+0x53/0x3f0
[ 2117.820616] ? rescuer_thread+0x360/0x360
[ 2117.821060] kthread+0x13b/0x160
[ 2117.821424] ? set_kthread_struct+0x50/0x50
[ 2117.821887] ret_from_fork+0x22/0x30
[ 2117.822290] </TASK>
[ 2117.822544] Modules linked in: bnep
[ 2117.822939] CR2: 000027ff010040a8
[ 2117.823319] ---[ end trace f11fe9e87c0e7a6f ]---
[ 2117.823832] RIP: 0010:0x27ff010040a8
[ 2117.824244] Code: Unable to access opcode bytes at RIP 0x27ff0100407e.
[ 2117.824958] RSP: 0018:ffffaab680043dd8 EFLAGS: 00010002
[ 2117.825544] RAX: 000027ff010040a8 RBX: ffffa2b741a3e280 RCX: ffffa2b741119000
[ 2117.826325] RDX: 0000000000000026 RSI: 0000000000000003 RDI: ffffa2b741a3e280
[ 2117.827101] RBP: ffffaab680043e38 R08: 0000000000000000 R09: 0000000000000000
[ 2117.827882] R10: 0000000000000003 R11: 0000000000000018 R12: 0000000000000003
[ 2117.828669] R13: ffffffffb96ccd30 R14: ffffffffb96ccd30 R15: ffffffffb96ccfe0
[ 2117.829450] FS: 0000000000000000(0000) GS:ffffa2b77ed00000(0000) knlGS:0000000000000000
[ 2117.830343] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2117.830974] CR2: 000027ff010040a8 CR3: 0000000001242000 CR4: 00000000001506e0
[ 2117.831756] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2117.832540] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2117.833320] note: kworker/u8:0[7] exited with preempt_count 3
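
For anyone reading the trace later: the "#PF: error_code(0x0010)" line can be decoded bit by bit. The following is only a minimal sketch, assuming the standard x86 page-fault error code layout; the table and the function name decode_pf_error_code are illustrative and are not taken from the kernel source.

```python
# Decode an x86 page-fault error code such as the 0x0010 in the oops above.
# Assumption: standard x86 #PF error code bit layout (bits 0 to 5 only).
# Each entry: (bit number, meaning if set, meaning if clear or None).
PF_BITS = [
    (0, "protection violation", "not-present page"),
    (1, "write access", "read access"),
    (2, "user mode", "supervisor mode"),
    (3, "reserved bit set in page table", None),
    (4, "instruction fetch", None),
    (5, "protection-key violation", None),
]


def decode_pf_error_code(code):
    """Return a list of human-readable descriptions for the error code."""
    descriptions = []
    for bit, if_set, if_clear in PF_BITS:
        text = if_set if code & (1 << bit) else if_clear
        if text is not None:
            descriptions.append(text)
    return descriptions


if __name__ == "__main__":
    print(decode_pf_error_code(0x0010))
```

For 0x0010 only bit 4 is set, which agrees with the two "#PF:" lines in the log: a supervisor-mode instruction fetch from a not-present page. Note also that RIP, RAX and CR2 all hold the same value, 000027ff010040a8.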


Bruce Ashfield <bruce.ashfield@...>
 

On Thu, Feb 24, 2022 at 6:04 AM Richard Purdie
<richard.purdie@...> wrote:

> Hi,
>
> We saw an LTP image failure today:
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/2937
>
> It was in the controller tests (cgroup ones). I logged in and was able to grab
> the backtrace below. More logs are here:
>
> https://autobuilder.yocto.io/pub/failed-builds-data/alma8-ty-2/qemux86-64-ltp-20220224/
>
> (There were earlier OOMs in the build, which is normal for LTP testing, and
> they occurred quite a while before this failure.)
>
> Not sure if it is helpful/useful but wanted to log it and save the logs before
> they disappeared.

There's not a lot of useful information in the trace, but even under
memory pressure / OOM, we shouldn't get a kernel oops, so it very well
could be a kernel issue when resources are low.

As we all know, there's probably not much we can do except watch and
see if it repeats, since it is unlikely we can reproduce it on demand.

Bruce




--
- Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end
- "Use the force Harry" - Gandalf, Star Trek II