Re: ltp failures on autobuilder


Randy MacLeod
 

On 2021-06-10 1:02 p.m., Richard Purdie wrote:
Noting down what we know about the ltp issue:
We've seen intermittent issues on the autobuilder where some ltp tests fail or
hang. I've been trying to figure out how to reproduce the issue and narrow down
the cause.
I was able to isolate a patch which reproduces the issue for me:
http://git.yoctoproject.org/cgit.cgi/poky-contrib/commit/?h=rpurdie/t222&id=d7d65aae104caa03afc28837b0abe0b486d5a8b8
with master-next, setting:
IMAGE_INSTALL_append = ' ltp'
TEST_SUITES = 'ping ssh ltp'
then
bitbake core-image-sato; bitbake core-image-sato -c testimage
where the issue shows up as a kernel "BUG:" in the logs in WORKDIR/testimage/qemu_*
The above patch runs the minimum of ltp tests I could find which replicate the issue.
I've reproduced this on 5.10.1 -> 5.10.42, 5.4.123 and 5.13-rc5.
(and we've ruled out linux-yocto with plain kernels)
Also reproduced on both qemu 6.0.0 and 5.2.0.
My build machine is an Ubuntu 20.04.2 LTS with:
Linux version 5.4.0-74-generic (buildd@lgw01-amd64-038) (gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)) #83-Ubuntu SMP Sat May 8 02:35:39 UTC 2021
I tried to reproduce this on an Ubuntu 18.04.3 system with:
Linux ala-lpggp3 5.4.0-72-generic #80~18.04.1-Ubuntu SMP Mon Apr 12 23:26:25 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Using poky-contrib:

$ git status
On branch rpurdie/t222
Your branch is up to date with 'origin/rpurdie/t222'.

nothing to commit, working tree clean

$ git log --oneline -3
d7d65aae10 (HEAD -> rpurdie/t222, origin/rpurdie/t222) ltp: Simplify for kernel crash reproducer
e175e2855d linx-yocto/5.10: re-import aufs to v5.10
753ae7dcd5 linux-yocto: test-only. override LINUX_VERSION for qemux86-64

---

My local.conf was generated, then I added:

IMAGE_INSTALL_append = ' ltp'
TEST_SUITES = 'ping ssh ltp'
INHERIT += "testimage"


and with 11 runs of:

$ bitbake core-image-sato -c testimage

I did not see the error in any of the qemu logs.

(cd tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage/; ls qemu_boot_log.202106101*)
qemu_boot_log.20210610155947 qemu_boot_log.20210610170535 qemu_boot_log.20210610172026 qemu_boot_log.20210610173508 qemu_boot_log.20210610174959 qemu_boot_log.20210610180444
qemu_boot_log.20210610165758 qemu_boot_log.20210610171302 qemu_boot_log.20210610172743 qemu_boot_log.20210610174235 qemu_boot_log.20210610175720

$ rgrep BUG: tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage/*
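
The grep above prints nothing when no log matches. For repeated runs, a per-log summary may be easier to read. This is a minimal sketch, not part of bitbake or the testimage class: count_bug_logs is a hypothetical helper name, and it assumes the logs are named qemu_boot_log.* as in the listing above.

```shell
#!/bin/sh
# Sketch: report which qemu boot logs contain a kernel "BUG:" line.
# count_bug_logs is a hypothetical helper, not an existing tool.
# $1: the testimage directory holding the qemu_boot_log.* files.
count_bug_logs() {
    dir=$1
    total=0
    hits=0
    for log in "$dir"/qemu_boot_log.*; do
        [ -f "$log" ] || continue      # skip if the glob matched nothing
        total=$((total + 1))
        if grep -q 'BUG:' "$log"; then
            hits=$((hits + 1))
            echo "BUG found in $log"
        fi
    done
    echo "$hits of $total boot logs contain BUG:"
}
```

It would be called as: count_bug_logs tmp/work/qemux86_64-poky-linux/core-image-sato/1.0-r0/testimage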


There is an OOM in the run, as RP sees as well, and the typical lead-up to it is:

[ 225.248350] hrtimer: interrupt took 4935186 ns
[ 249.250283] option changes via remount are deprecated (pid=3001 comm=mount)
[ 250.200586] option changes via remount are deprecated (pid=3019 comm=mount)
[ 283.695208] memcg_test_1 invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
[ 283.702108] CPU: 1 PID: 3798 Comm: memcg_test_1 Not tainted 5.10.42-yocto-standard #1
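
To compare OOM behaviour between machines, it may help to list which processes invoked the oom-killer in each log. This is a rough sketch: oom_invokers is a hypothetical helper name, and it assumes each kernel message sits on one line in the log file, with the process name directly before "invoked oom-killer:".

```shell
#!/bin/sh
# Sketch: count oom-killer invocations per process name in one log file.
# oom_invokers is a hypothetical helper, not an existing tool.
# $1: path to a qemu boot log.
oom_invokers() {
    awk '{
        # The process name is the field just before "invoked oom-killer:".
        for (i = 2; i <= NF; i++)
            if ($i == "invoked" && $(i + 1) ~ /^oom-killer/)
                count[$(i - 1)]++
    }
    END { for (name in count) print name, count[name] }' "$1"
}
```

Each output line is a process name followed by the number of times it triggered the oom-killer; the order of the lines is not guaranteed.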


Note that I am running this on a server without X11 forwarding.
Is your testing done on a local machine, Richard? I doubt it matters,
but I just want to be sure we understand how you are testing.


I am going to try on another server running Ubuntu 21.04.

../Randy


Cheers,
Richard

--
# Randy MacLeod
# Wind River Linux
