
Re: [PATCH 0/4] Re-implement prserv on top of asyncrpc

Paul Barker <pbarker@...>
 

On Mon, 31 May 2021 at 12:25, Richard Purdie
<richard.purdie@...> wrote:

Hi Paul,

On Fri, 2021-05-28 at 09:42 +0100, Paul Barker wrote:
These changes replace the old XML-based RPC system in prserv with the
new asyncrpc implementation originally used by hashserv. A couple of
improvements are required in asyncrpc to support this.

I finally stumbled across the issue which led to the hanging builds
seen on the autobuilder when testing the initial RFC series.
It was a fairly dumb mistake on my behalf and I'm not sure how it
didn't trigger in my initial testing! The
`PRServerClient.handle_export()` function was missing a call to
`self.write_message()` so the client just ended up stuck waiting for a
response that was never to come. This issue is fixed here.

I've run these changes through both `bitbake-selftest` and
`oe-selftest -a` and all looks good on my end. A couple of failures
were seen in oe-selftest but these are related to my host system
configuration (socat not installed, firewall blocking ports, etc.) so
I'm fairly confident they aren't caused by this patch series.
Thanks for these. Unfortunately I think there is still a gremlin somewhere
as this was included in an autobuilder test build that is showing as this:

https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/2203

i.e. all four selftests have not finished and I'd have expected them to
by now.
(╯°□°)╯︵ ┻━┻


I'm trying not to work today so I haven't debugged them or confirmed where
they are hanging but it seems likely related.
If you're planning to take the day off don't worry about investigating
these. I'll take a look at the patches again on Wednesday. I think the
best approach may be to add some timeouts and maybe more error
handling to the asyncrpc code I extracted from hashserv - if we can
turn these hangs into a proper error then we can reduce the amount of
autobuilder time they take to test and hopefully we'll get a better
insight into what is actually going wrong. My guess is that there's
something in the autobuilder config or just the level of load on the
machines which is aggravating this as the tests finish successfully on
my build machine (with a few expected test failures as noted
previously).
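
As a rough illustration of turning a hang into a proper error, here is a minimal sketch using plain asyncio. The names `recv_message` and `demo`, the newline-terminated framing and the 0.2 second timeout are assumptions made up for this example; they are not the asyncrpc API. The second reader never receives data, which mimics a server handler that forgets to write its reply.

```python
import asyncio


async def recv_message(reader, timeout):
    """Read one newline-terminated reply, raising instead of waiting forever."""
    try:
        line = await asyncio.wait_for(reader.readline(), timeout=timeout)
    except asyncio.TimeoutError:
        raise ConnectionError("no reply within %.1fs" % timeout) from None
    if not line:
        raise ConnectionError("connection closed before a reply arrived")
    return line.decode("utf-8").rstrip("\n")


async def demo():
    # A reader that already holds a reply (hypothetical payload).
    answered = asyncio.StreamReader()
    answered.feed_data(b'{"ok": true}\n')

    # A reader that never gets any data: the server "forgot" to respond.
    silent = asyncio.StreamReader()

    first = await recv_message(answered, timeout=0.2)
    try:
        await recv_message(silent, timeout=0.2)
        second = "reply"
    except ConnectionError as exc:
        second = str(exc)
    return first, second


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With something along these lines a missing response would show up as a test failure after the timeout rather than as a build that has to be killed.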

Thanks,

--
Paul Barker
Konsulko Group


Re: [PATCH 0/4] Re-implement prserv on top of asyncrpc

Richard Purdie
 

Hi Paul,

On Fri, 2021-05-28 at 09:42 +0100, Paul Barker wrote:
These changes replace the old XML-based RPC system in prserv with the
new asyncrpc implementation originally used by hashserv. A couple of
improvements are required in asyncrpc to support this.

I finally stumbled across the issue which led to the hanging builds
seen on the autobuilder when testing the initial RFC series.
It was a fairly dumb mistake on my behalf and I'm not sure how it
didn't trigger in my initial testing! The
`PRServerClient.handle_export()` function was missing a call to
`self.write_message()` so the client just ended up stuck waiting for a
response that was never to come. This issue is fixed here.

I've run these changes through both `bitbake-selftest` and
`oe-selftest -a` and all looks good on my end. A couple of failures
were seen in oe-selftest but these are related to my host system
configuration (socat not installed, firewall blocking ports, etc.) so
I'm fairly confident they aren't caused by this patch series.
Thanks for these. Unfortunately I think there is still a gremlin somewhere
as this was included in an autobuilder test build that is showing as this:

https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/2203

i.e. all four selftests have not finished and I'd have expected them to 
by now.

I'm trying not to work today so I haven't debugged them or confirmed where
they are hanging but it seems likely related.

Cheers,

Richard


SWAT Rotation

Alexandre Belloni
 

Hello 민재,

As discussed last week, you are on SWAT duty this week. Could you
confirm you'll be able to work on the topic?

I hope you moved without any issue.

Regards,

--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


qemuarm failure on autobuilder - analysis

Richard Purdie
 

Today's failure for analysis is a qemuarm failure where all the test images 
failed at the same time:

https://autobuilder.yoctoproject.org/typhoon/#/builders/53/builds/3493

We have Randy+team's logging for this here:

https://autobuilder.yocto.io/pub/non-release/20210526-2/testresults/qemuarm/2021-05-26--08-58/host_stats_1_top.txt

What is interesting is that the load average peaks at 300+ at the point the
three qemu-system-arm instances are running, compared to the usual 50-80.

What is it doing at the time? In parallel to qemuarm there appears to 
be:

reproducible-fedora
(in stage B, non-sstate, i.e. from scratch)
building llvm, webkitgtk, kernel-devsrc, qemu, kea
musl-x86-64
building webkitgtk, piglit, cmake, kernel-devsrc, llvm, stress-ng

which is a pretty heavy workload as those are all pretty heavy targets.

I question whether cmake's rpmbuild really should be using 7g of RES 
memory (15g VIRT).

The python bitbake-worker processes at near 100% cpu are interesting;
I suspect those are python tasks for recipes. The code is meant to
rename the process so we can better identify it, but that is a tricky
thing under Linux and this output hints it may not be working.
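
For reference, one common way to rename a process on Linux is `prctl(PR_SET_NAME)`. The sketch below is only an illustration of that mechanism; the thread does not say which path bitbake-worker actually takes, and `set_process_name` here is a hypothetical helper. The kernel keeps at most 15 bytes of the name, and the call only affects the calling thread, which is part of why this is tricky.

```python
import ctypes
import sys

PR_SET_NAME = 15  # values from <linux/prctl.h>
PR_GET_NAME = 16


def set_process_name(name):
    """Rename the calling thread and return the name the kernel reports.

    Returns None when not on Linux or when the call fails.
    """
    if not sys.platform.startswith("linux"):
        return None
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        # The kernel stores at most 15 bytes plus a terminating NUL.
        wanted = ctypes.create_string_buffer(name.encode("utf-8")[:15])
        if libc.prctl(PR_SET_NAME, wanted, 0, 0, 0) != 0:
            return None
        reported = ctypes.create_string_buffer(16)
        if libc.prctl(PR_GET_NAME, reported, 0, 0, 0) != 0:
            return None
        return reported.value.decode("utf-8", "replace")
    except (OSError, AttributeError):
        return None


if __name__ == "__main__":
    print(set_process_name("bb-task-demo"))
```

Reading the name back like this is a cheap way to check whether the rename took effect before relying on it in top output.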

Bottom line for this one suggests it was load related.
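
To spot this kind of peak without reading the whole file, something like the sketch below could be used. It assumes the host stats file is batch-mode top output with the usual "load average: a, b, c" header lines; that format is an assumption, and the sample lines are invented for the example rather than copied from the linked log.

```python
import re

LOAD_RE = re.compile(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)")


def peak_load(lines):
    """Return the highest 1-minute load average found, or None if there is none."""
    peak = None
    for line in lines:
        match = LOAD_RE.search(line)
        if match:
            one_minute = float(match.group(1))
            if peak is None or one_minute > peak:
                peak = one_minute
    return peak


# Invented sample in the assumed format, not real autobuilder data.
SAMPLE = [
    "top - 08:58:01 up 3 days,  1:02,  1 user,  load average: 62.10, 58.30, 55.00",
    "Tasks: 900 total,  40 running, 860 sleeping,   0 stopped,   0 zombie",
    "top - 09:10:01 up 3 days,  1:14,  1 user,  load average: 312.40, 250.10, 120.33",
    "top - 09:20:01 up 3 days,  1:24,  1 user,  load average: 75.00, 140.20, 118.90",
]

if __name__ == "__main__":
    print(peak_load(SAMPLE))
```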

Cheers,

Richard


Re: SWAT statistics for week 19

Alexandre Belloni
 

Hi,

On 25/05/2021 19:33:53-0400, Randy MacLeod wrote:
On 2021-05-18 6:26 p.m., Alexandre Belloni wrote:
Hi,

On 18/05/2021 23:21:53+0100, Ross Burton wrote:
Quick idea for swatbot: a top ten list of open bugs which have the
highest number of instances.
I'm maintaining a spreadsheet that goes a bit beyond that. I'm also
tracking the frequency of the bugs in the last months and we started to
close a few of the older AB-INT issues. I'll share that publicly soon.
Hi Alexandre,

Any update on the list/spreadsheet?
Ah sorry, I forgot to send it. It is up to date with what was triaged
last week:

https://docs.google.com/spreadsheets/d/1bviDvW1SRwflofKLx9SwPUTWE3sBkvL3eb1PKnmZJcM/edit?usp=sharing

Tony is getting going on valgrind and he'll start with:
   https://bugzilla.yoctoproject.org/show_bug.cgi?id=14294

   [Bug 14294] valgrind memcheck/tests/linux/timerfd-syscall ptest
intermittent failure

unless there's another ptest issue that is more urgent.
This is probably the best one to start with, the following one would be
14311 which has more occurrences but is about multiple (related) ptests.

Regards,

--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Re: SWAT statistics for week 19

Randy MacLeod
 

On 2021-05-18 6:26 p.m., Alexandre Belloni wrote:
Hi,

On 18/05/2021 23:21:53+0100, Ross Burton wrote:
Quick idea for swatbot: a top ten list of open bugs which have the
highest number of instances.
I'm maintaining a spreadsheet that goes a bit beyond that. I'm also
tracking the frequency of the bugs in the last months and we started to
close a few of the older AB-INT issues. I'll share that publicly soon.
Hi Alexandre,

Any update on the list/spreadsheet?

Tony is getting going on valgrind and he'll start with:
   https://bugzilla.yoctoproject.org/show_bug.cgi?id=14294

   [Bug 14294] valgrind memcheck/tests/linux/timerfd-syscall ptest intermittent failure

unless there's another ptest issue that is more urgent.


../Randy


Ross

On Tue, 18 May 2021 at 23:19, Alexandre Belloni
<alexandre.belloni@...> wrote:
Hello,

Here are the statistics for last week. Chee Yang was on SWAT duty.

160 failures were triaged:

* 119 by Chee Yang
- 38 for meson changes
- 24 for an issue in meta-arm after an upgrade of u-boot
- 11 for the btrfs-tools upgrade
- 6 for ovmf reproducibility issues
- 2 for meta-oe YP compatibility issues
- 4 new occurrences of bug 14310
- 4 new occurrences of bug 14251
- 3 new occurrences of bug 13802
- 3 new occurrences of bug 14273
- 2 new occurrences of bug 14208
- 2 new occurrences of bug 14381
- 1 new occurrence of bug 14145
- 1 new occurrence of bug 14163
- 1 new occurrence of bug 14165
- 1 new occurrence of bug 14177
- 1 new occurrence of bug 14197
- 1 new occurrence of bug 14201
- 1 new occurrence of bug 14250
- 1 new occurrence of bug 14294
- 1 new occurrence of bug 14296
- 1 new occurrence of bug 14311
- 4 occurrences of new bug 14388
- 2 occurrences of new bug 14393
- 1 occurrence of new bug 14389
- 1 occurrence of new bug 14390
- 1 occurrence of new bug 14391

* 41 by Richard
- 20 for an issue in meta-arm after an upgrade of u-boot
- 10 for issues he fixed
- 4 for the libepoxy upgrade
- 2 for YP compatibility issues in meta-AGL
- 2 for patches merged out of order
- 2 for branch names changed upstream
- 1 because gitlab was down
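
The "top ten" idea raised earlier in the thread can be sketched from the per-bug counts in this report alone. This is only an illustration: swatbot's data model is not shown here, so the input is a hand-built mapping of bug number to occurrences for this one week, and `top_bugs` is a hypothetical helper.

```python
# Bug number -> occurrences, copied from the list above (this week only).
OCCURRENCES = {
    14310: 4, 14251: 4, 13802: 3, 14273: 3, 14208: 2, 14381: 2,
    14145: 1, 14163: 1, 14165: 1, 14177: 1, 14197: 1, 14201: 1,
    14250: 1, 14294: 1, 14296: 1, 14311: 1,
    14388: 4, 14393: 2, 14389: 1, 14390: 1, 14391: 1,
}


def top_bugs(occurrences, n=10):
    """Return the n most frequent bugs, ties broken by lowest bug number."""
    ranked = sorted(occurrences.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


if __name__ == "__main__":
    for bug, count in top_bugs(OCCURRENCES):
        print("bug %d: %d" % (bug, count))
```

A useful version would need to accumulate counts over many weeks, which a single weekly report cannot provide.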

Regards,

--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com






--
# Randy MacLeod
# Wind River Linux


ubuntu2004-arm-1 load increase and possible instability

Michael Halstead <mhalstead@...>
 

The ubuntu2004-arm-1 worker has been unstable in the past and we reduced the number of simultaneous builds from 3 to 1 to see if that would stop the crashes. It didn't at first, but the crashes have now stopped, perhaps due to kernel updates. I'm planning to increase the simultaneous builds back to 3 when the controller is next idle. This may cause the crashes to begin again, and I want the SWAT team to be aware of the change.

--
Michael Halstead
Linux Foundation / Yocto Project
Systems Operations Engineer


Re: Further rcu stall on autobuilder

Richard Purdie
 

On Mon, 2021-05-24 at 15:29 +0100, Richard Purdie via lists.yoctoproject.org wrote:
On Mon, 2021-05-24 at 09:21 -0400, Bruce Ashfield wrote:
On Sun, May 23, 2021 at 12:56 PM Richard Purdie
<richard.purdie@...> wrote:

On Sun, 2021-05-23 at 12:51 -0400, Bruce Ashfield wrote:
On Sun, May 23, 2021 at 12:47 PM Richard Purdie
<richard.purdie@...> wrote:
A set of SRCREVs sounds like the best plan, I think it might be worth testing
to see if things improve or not.
I created the attached recipes. Built and booted on qemux86-64 with no
issues.

I assume you'll do the appropriate preferred version in the test
branches to make
sure they are used instead of 5.10 ?
About the time you were writing this, I'd hacked up:

http://git.yoctoproject.org/cgit.cgi/poky/commit/?h=master-next&id=de3e2253482b6d9df1137128a9fde35dec8fd915

and put it into a build on the autobuilder. It caused meta-arm to blow up
and I suspect there may be other fallout but we'll see...

FWIW, I checked with Alexandre and it seems all the rcu failure issues
are on qemuXXX builds but not qemuXXX-alt. The former is 5.10, the latter 
5.4.

I'm starting to strongly suspect there is some issue with 5.10 as we don't
see this with dunfell or with poky-alt :/. I'd wonder why nobody else has
noticed though...
I switched to Bruce's 5.12 patches. Unfortunately even with 5.12:

https://autobuilder.yoctoproject.org/typhoon/#/builders/81/builds/2118/steps/12/logs/stdio

:(

Also,
https://autobuilder.yoctoproject.org/typhoon/#/builders/110/builds/2362
and the corresponding:
https://autobuilder.yocto.io/pub/non-release/20210523-10/testresults/qemuarm-alt/2021-05-24--01-52/host_stats_1_top.txt
is interesting. That was a qemuarm-alt image (5.4 kernel) which could be a genuine load 
issue. It is getting 300% cpu though so hardly resource starved.

Ideas welcome at this point.

Cheers,

Richard


Re: Further ltp hang - kernel issue?

Bruce Ashfield <bruce.ashfield@...>
 

On Mon, May 24, 2021 at 11:31 AM Richard Purdie
<richard.purdie@...> wrote:

On Sun, 2021-05-23 at 07:42 -0400, Bruce Ashfield wrote:
On Sun, May 23, 2021 at 6:36 AM Richard Purdie
<richard.purdie@...> wrote:

On Sun, 2021-05-23 at 11:33 +0100, Richard Purdie via lists.yoctoproject.org wrote:
https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/1932
Oddly enough,
https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/1933
on centos7-ty-4 (master build) is locked up with pretty much exactly
the same issue/ps output/tests/dmesg.

The first one above was debian10-ty-1 with master-next.

Recent kernel version bump?
I can't think of anything specific that would cause those issues, but the
Wind River guys did report some bad iommu patches that were part of
5.10.37

I've merged .38, which has the fixes, but I haven't sent the bumps yet.
It is worth trying the attached SRCREV patch, to see if there's any
change in behaviour.
Thanks for the patch, I ran with it for a number of runs. I have not seen .38
break in the way master or master-next with .37 did. I've run several and 50%
of the time .37 would hang in ltp.

Can we upgrade to .38 ASAP please? :)
sent. I cherry picked it from my queue and sent it individually.

I'll continue testing the rest of my updates.

Bruce

This is obviously a separate issue to the rcu stalls but I also think that
is 5.10 related.

Cheers,

Richard


--
- Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end
- "Use the force Harry" - Gandalf, Star Trek II


Re: Further ltp hang - kernel issue?

Richard Purdie
 

On Sun, 2021-05-23 at 07:42 -0400, Bruce Ashfield wrote:
On Sun, May 23, 2021 at 6:36 AM Richard Purdie
<richard.purdie@...> wrote:

On Sun, 2021-05-23 at 11:33 +0100, Richard Purdie via lists.yoctoproject.org wrote:
https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/1932
Oddly enough,
https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/1933
on centos7-ty-4 (master build) is locked up with pretty much exactly
the same issue/ps output/tests/dmesg.

The first one above was debian10-ty-1 with master-next.

Recent kernel version bump?
I can't think of anything specific that would cause those issues, but the
Wind River guys did report some bad iommu patches that were part of
5.10.37

I've merged .38, which has the fixes, but I haven't sent the bumps yet.
It is worth trying the attached SRCREV patch, to see if there's any
change in behaviour.
Thanks for the patch, I ran with it for a number of runs. I have not seen .38
break in the way master or master-next with .37 did. I've run several and 50%
of the time .37 would hang in ltp.
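
As a rough sanity check on how much a run of clean results says, the sketch below computes the chance of seeing only passes if the hang rate were still 50%. It assumes the runs are independent, and the run counts are arbitrary examples, since the exact number of .38 runs is not stated here.

```python
def chance_of_clean_streak(hang_rate, runs):
    """Probability that `runs` independent runs all pass when each one
    hangs with probability `hang_rate`."""
    return (1.0 - hang_rate) ** runs


if __name__ == "__main__":
    for runs in (3, 5, 8):
        print(runs, chance_of_clean_streak(0.5, runs))
```

With a 50% hang rate, five clean runs in a row would happen by chance only about 3% of the time, so even a handful of passes is a fairly strong signal.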

Can we upgrade to .38 ASAP please? :)

This is obviously a separate issue to the rcu stalls but I also think that
is 5.10 related.

Cheers,

Richard


Re: Further rcu stall on autobuilder

Richard Purdie
 

On Mon, 2021-05-24 at 09:21 -0400, Bruce Ashfield wrote:
On Sun, May 23, 2021 at 12:56 PM Richard Purdie
<richard.purdie@...> wrote:

On Sun, 2021-05-23 at 12:51 -0400, Bruce Ashfield wrote:
On Sun, May 23, 2021 at 12:47 PM Richard Purdie
<richard.purdie@...> wrote:
A set of SRCREVs sounds like the best plan, I think it might be worth testing
to see if things improve or not.
I created the attached recipes. Built and booted on qemux86-64 with no
issues.

I assume you'll do the appropriate preferred version in the test
branches to make
sure they are used instead of 5.10 ?
About the time you were writing this, I'd hacked up:

http://git.yoctoproject.org/cgit.cgi/poky/commit/?h=master-next&id=de3e2253482b6d9df1137128a9fde35dec8fd915

and put it into a build on the autobuilder. It caused meta-arm to blow up
and I suspect there may be other fallout but we'll see...

FWIW, I checked with Alexandre and it seems all the rcu failure issues
are on qemuXXX builds but not qemuXXX-alt. The former is 5.10, the latter 
5.4.

I'm starting to strongly suspect there is some issue with 5.10 as we don't
see this with dunfell or with poky-alt :/. I'd wonder why nobody else has
noticed though...

Cheers,

Richard


Re: Further rcu stall on autobuilder

Bruce Ashfield <bruce.ashfield@...>
 

On Sun, May 23, 2021 at 12:56 PM Richard Purdie
<richard.purdie@...> wrote:

On Sun, 2021-05-23 at 12:51 -0400, Bruce Ashfield wrote:
On Sun, May 23, 2021 at 12:47 PM Richard Purdie
<richard.purdie@...> wrote:

We've got yet another rcu stall failure on the autobuilder:

https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/2123/steps/15/logs/stdio

and looking at the dmesg in the qemu log:

[ 20.424033] Freeing unused kernel image (rodata/data gap) memory: 652K
[ 20.425229] Run /sbin/init as init process
INIT: version 2.99 booting
FBIOPUT_VSCREENINFO failed, double buffering disabledStarting udev
[ 20.547298] udevd[161]: starting version 3.2.10
[ 20.553329] udevd[162]: starting eudev-3.2.10
[ 20.751260] EXT4-fs (vda): re-mounted. Opts: (null)
[ 20.752548] ext4 filesystem being remounted at / supports timestamps until 2038 (0x7fffffff)
INIT: Entering runlevel: 5
Configuring network interfaces... RTNETLINK answers: File exists
Starting random number generator daemon.
Starting OpenBSD Secure Shell server: sshd
done.
Starting rpcbind daemon...done.
starting statd: done
Starting atd: OK
[ 21.921925] Installing knfsd (copyright (C) 1996 okir@...).
starting 8 nfsd kernel threads: [ 23.066283] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 23.068096] NFSD: Using legacy client tracking operations.
[ 23.069086] NFSD: starting 90-second grace period (net f0000098)
done
starting mountd: [ 45.272151] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 45.273423] rcu: 1-...0: (10 ticks this GP) idle=7ba/1/0x4000000000000000 softirq=598/612 fqs=5249
[ 45.274951] (detected by 2, t=21002 jiffies, g=-195, q=13)
[ 138.202149] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 332.762209] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:

This is with the kvm clock source disabled (in master-next) and with Bruce's
5.10.38 upgrade so that kind of rules out either of those two things for this
issue. It also can't be the qemu platform or cpu emulation used since we've
changed that.

What is really odd is that it never actually prints the stalled tasks. That
seems really strange. It is obviously alive enough to print a stall message
later but stalls out and is terminated after 1500s.

Really open to ideas at this point. Should we try a newer kernel version
for testing in -next, see if we can isolate this to 5.10?
If you want to switch to linux-yocto-dev, it is on 5.12.x, and I have
a local 5.13-rcX version of -dev.

We could whip together a SRCREV recipe for it, if you don't want to
use the AUTOREV.

I'm not going to do a full versioned linux-yocto for 5.12, but we can
special case this if we want to go that route.
A set of SRCREVs sounds like the best plan, I think it might be worth testing
to see if things improve or not.
I created the attached recipes. Built and booted on qemux86-64 with no
issues.

I assume you'll do the appropriate preferred version in the test
branches to make
sure they are used instead of 5.10 ?

Bruce


What is also odd is that in that same build, another qemu instance
hung in syslinux loading bzImage. We've seen this before occasionally and
it seems to keep happening periodically. That would seem more like a qemu
bug yet we're on the latest qemu release :/.

In neither case did Randy's stall detector trigger as far as I can tell.

Cheers,

Richard

--
- Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end
- "Use the force Harry" - Gandalf, Star Trek II


Re: SWAT Rotation

Alexandre Belloni
 

Hello Jaga,

Are you available for SWAT this week?

I'll be looking at some of the failures today.

On 22/05/2021 11:44:56+0900, 김민재 wrote:
Hi Alexandre


I am sorry, I can't work next week because I am moving house this
weekend. I can start SWAT work on June 1st.

Can I delay my rotation by just one week?
Sure, no problem


--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Re: Further rcu stall on autobuilder

Richard Purdie
 

On Sun, 2021-05-23 at 12:51 -0400, Bruce Ashfield wrote:
On Sun, May 23, 2021 at 12:47 PM Richard Purdie
<richard.purdie@...> wrote:

We've got yet another rcu stall failure on the autobuilder:

https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/2123/steps/15/logs/stdio

and looking at the dmesg in the qemu log:

[ 20.424033] Freeing unused kernel image (rodata/data gap) memory: 652K
[ 20.425229] Run /sbin/init as init process
INIT: version 2.99 booting
FBIOPUT_VSCREENINFO failed, double buffering disabledStarting udev
[ 20.547298] udevd[161]: starting version 3.2.10
[ 20.553329] udevd[162]: starting eudev-3.2.10
[ 20.751260] EXT4-fs (vda): re-mounted. Opts: (null)
[ 20.752548] ext4 filesystem being remounted at / supports timestamps until 2038 (0x7fffffff)
INIT: Entering runlevel: 5
Configuring network interfaces... RTNETLINK answers: File exists
Starting random number generator daemon.
Starting OpenBSD Secure Shell server: sshd
done.
Starting rpcbind daemon...done.
starting statd: done
Starting atd: OK
[ 21.921925] Installing knfsd (copyright (C) 1996 okir@...).
starting 8 nfsd kernel threads: [ 23.066283] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 23.068096] NFSD: Using legacy client tracking operations.
[ 23.069086] NFSD: starting 90-second grace period (net f0000098)
done
starting mountd: [ 45.272151] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 45.273423] rcu: 1-...0: (10 ticks this GP) idle=7ba/1/0x4000000000000000 softirq=598/612 fqs=5249
[ 45.274951] (detected by 2, t=21002 jiffies, g=-195, q=13)
[ 138.202149] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 332.762209] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:

This is with the kvm clock source disabled (in master-next) and with Bruce's
5.10.38 upgrade so that kind of rules out either of those two things for this
issue. It also can't be the qemu platform or cpu emulation used since we've
changed that.

What is really odd is that it never actually prints the stalled tasks. That
seems really strange. It is obviously alive enough to print a stall message
later but stalls out and is terminated after 1500s.

Really open to ideas at this point. Should we try a newer kernel version
for testing in -next, see if we can isolate this to 5.10?
If you want to switch to linux-yocto-dev, it is on 5.12.x, and I have
a local 5.13-rcX version of -dev.

We could whip together a SRCREV recipe for it, if you don't want to
use the AUTOREV.

I'm not going to do a full versioned linux-yocto for 5.12, but we can
special case this if we want to go that route.
A set of SRCREVs sounds like the best plan, I think it might be worth testing
to see if things improve or not.

What is also odd is that in that same build, another qemu instance
hung in syslinux loading bzImage. We've seen this before occasionally and
it seems to keep happening periodically. That would seem more like a qemu
bug yet we're on the latest qemu release :/.

In neither case did Randy's stall detector trigger as far as I can tell.

Cheers,

Richard


Re: Further rcu stall on autobuilder

Bruce Ashfield <bruce.ashfield@...>
 

On Sun, May 23, 2021 at 12:47 PM Richard Purdie
<richard.purdie@...> wrote:

We've got yet another rcu stall failure on the autobuilder:

https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/2123/steps/15/logs/stdio

and looking at the dmesg in the qemu log:

[ 20.424033] Freeing unused kernel image (rodata/data gap) memory: 652K
[ 20.425229] Run /sbin/init as init process
INIT: version 2.99 booting
FBIOPUT_VSCREENINFO failed, double buffering disabledStarting udev
[ 20.547298] udevd[161]: starting version 3.2.10
[ 20.553329] udevd[162]: starting eudev-3.2.10
[ 20.751260] EXT4-fs (vda): re-mounted. Opts: (null)
[ 20.752548] ext4 filesystem being remounted at / supports timestamps until 2038 (0x7fffffff)
INIT: Entering runlevel: 5
Configuring network interfaces... RTNETLINK answers: File exists
Starting random number generator daemon.
Starting OpenBSD Secure Shell server: sshd
done.
Starting rpcbind daemon...done.
starting statd: done
Starting atd: OK
[ 21.921925] Installing knfsd (copyright (C) 1996 okir@...).
starting 8 nfsd kernel threads: [ 23.066283] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 23.068096] NFSD: Using legacy client tracking operations.
[ 23.069086] NFSD: starting 90-second grace period (net f0000098)
done
starting mountd: [ 45.272151] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 45.273423] rcu: 1-...0: (10 ticks this GP) idle=7ba/1/0x4000000000000000 softirq=598/612 fqs=5249
[ 45.274951] (detected by 2, t=21002 jiffies, g=-195, q=13)
[ 138.202149] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 332.762209] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:

This is with the kvm clock source disabled (in master-next) and with Bruce's
5.10.38 upgrade so that kind of rules out either of those two things for this
issue. It also can't be the qemu platform or cpu emulation used since we've
changed that.

What is really odd is that it never actually prints the stalled tasks. That
seems really strange. It is obviously alive enough to print a stall message
later but stalls out and is terminated after 1500s.

Really open to ideas at this point. Should we try a newer kernel version
for testing in -next, see if we can isolate this to 5.10?
If you want to switch to linux-yocto-dev, it is on 5.12.x, and I have
a local 5.13-rcX version of -dev.

We could whip together a SRCREV recipe for it, if you don't want to
use the AUTOREV.

I'm not going to do a full versioned linux-yocto for 5.12, but we can
special case this if we want to go that route.

Bruce



Cheers,

Richard

--
- Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end
- "Use the force Harry" - Gandalf, Star Trek II


Further rcu stall on autobuilder

Richard Purdie
 

We've got yet another rcu stall failure on the autobuilder:

https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/2123/steps/15/logs/stdio

and looking at the dmesg in the qemu log:

[ 20.424033] Freeing unused kernel image (rodata/data gap) memory: 652K
[ 20.425229] Run /sbin/init as init process
INIT: version 2.99 booting
FBIOPUT_VSCREENINFO failed, double buffering disabledStarting udev
[ 20.547298] udevd[161]: starting version 3.2.10
[ 20.553329] udevd[162]: starting eudev-3.2.10
[ 20.751260] EXT4-fs (vda): re-mounted. Opts: (null)
[ 20.752548] ext4 filesystem being remounted at / supports timestamps until 2038 (0x7fffffff)
INIT: Entering runlevel: 5
Configuring network interfaces... RTNETLINK answers: File exists
Starting random number generator daemon.
Starting OpenBSD Secure Shell server: sshd
done.
Starting rpcbind daemon...done.
starting statd: done
Starting atd: OK
[ 21.921925] Installing knfsd (copyright (C) 1996 okir@...).
starting 8 nfsd kernel threads: [ 23.066283] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[ 23.068096] NFSD: Using legacy client tracking operations.
[ 23.069086] NFSD: starting 90-second grace period (net f0000098)
done
starting mountd: [ 45.272151] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 45.273423] rcu: 1-...0: (10 ticks this GP) idle=7ba/1/0x4000000000000000 softirq=598/612 fqs=5249
[ 45.274951] (detected by 2, t=21002 jiffies, g=-195, q=13)
[ 138.202149] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 332.762209] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:

This is with the kvm clock source disabled (in master-next) and with Bruce's 
5.10.38 upgrade so that kind of rules out either of those two things for this
issue. It also can't be the qemu platform or cpu emulation used since we've
changed that.

What is really odd is that it never actually prints the stalled tasks. That
seems really strange. It is obviously alive enough to print a stall message
later but stalls out and is terminated after 1500s.
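
For triage, the stall reports can be pulled out of a qemu log with a small scanner like the sketch below. `find_rcu_stalls` is a hypothetical helper, and the regular expression is written against the lines quoted above only; other kernel versions may word the message differently.

```python
import re

STALL_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+rcu: INFO: (\S+) detected stalls")


def find_rcu_stalls(log_text):
    """Return (timestamp_seconds, rcu_flavour) for each stall report found."""
    stalls = []
    for line in log_text.splitlines():
        # search(), not match(): console output can precede the timestamp.
        match = STALL_RE.search(line)
        if match:
            stalls.append((float(match.group(1)), match.group(2)))
    return stalls


# Lines copied from the dmesg excerpt above.
SAMPLE = """\
[ 23.069086] NFSD: starting 90-second grace period (net f0000098)
done
starting mountd: [ 45.272151] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 45.273423] rcu: 1-...0: (10 ticks this GP) idle=7ba/1/0x4000000000000000 softirq=598/612 fqs=5249
[ 138.202149] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 332.762209] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
"""

if __name__ == "__main__":
    for timestamp, flavour in find_rcu_stalls(SAMPLE):
        print(timestamp, flavour)
```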

Really open to ideas at this point. Should we try a newer kernel version
for testing in -next, see if we can isolate this to 5.10?

Cheers,

Richard


Re: Further ltp hang - kernel issue?

Bruce Ashfield <bruce.ashfield@...>
 

On Sun, May 23, 2021 at 6:36 AM Richard Purdie
<richard.purdie@...> wrote:

On Sun, 2021-05-23 at 11:33 +0100, Richard Purdie via lists.yoctoproject.org wrote:
https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/1932
Oddly enough,
https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/1933
on centos7-ty-4 (master build) is locked up with pretty much exactly
the same issue/ps output/tests/dmesg.

The first one above was debian10-ty-1 with master-next.

Recent kernel version bump?
I can't think of anything specific that would cause those issues, but the
Wind River guys did report some bad iommu patches that were part of
5.10.37

I've merged .38, which has the fixes, but I haven't sent the bumps yet.
It is worth trying the attached SRCREV patch, to see if there's any
change in behaviour.

Bruce


Cheers,

Richard

--
- Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end
- "Use the force Harry" - Gandalf, Star Trek II


Re: Further ltp hang - kernel issue?

Richard Purdie
 

On Sun, 2021-05-23 at 11:33 +0100, Richard Purdie via lists.yoctoproject.org wrote:
https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/1932
Oddly enough, 
https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/1933 
on centos7-ty-4 (master build) is locked up with pretty much exactly 
the same issue/ps output/tests/dmesg.

The first one above was debian10-ty-1 with master-next.

Recent kernel version bump?

Cheers,

Richard


Further ltp hang - kernel issue?

Richard Purdie
 

https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/1932

We have another ltp hang on the autobuilder. ps ax output:

9261 ? I 0:00 [kworker/2:0-rcu_gp]
9298 ? I 0:01 [kworker/1:3-events]
11312 ? S 0:00 /sbin/syslogd -n -O /var/log/messages
11315 ? S 0:00 /sbin/klogd -n
13085 ? I 0:00 [kworker/2:4-mm_percpu_wq]
16291 ? S 0:00 /usr/sbin/dropbear -r /etc/dropbear/dropbear_rsa_host_key -p 22 -B
16292 ? S 0:00 /bin/sh /opt/ltp/runltp -f commands -p -q -r /opt/ltp -l /opt/ltp/results/commands -I 1 -d /opt/ltp
16334 ? S 0:00 /opt/ltp/bin/ltp-pan -q -e -S -a 16292 -n 16292 -p -f /opt/ltp/ltp-i2b5MCkGm9/alltests -l /opt/ltp/results/commands -C /opt/ltp/output/LTP_RUN_ON-commands.failed -T /opt/ltp/output/LTP_RUN_ON-commands.t
17878 ? S< 0:00 [loop0]
17884 ? D 0:00 mount -t ext2 /dev/loop0 mntpoint
17885 ? I< 0:00 [ext4-rsv-conver]
17906 ? S< 0:00 [loop1]
17912 ? D 0:00 mount -t ext3 /dev/loop1 mntpoint
17913 ? S 0:00 [jbd2/loop1-8]
17914 ? I< 0:00 [ext4-rsv-conver]
17916 ? I 0:00 [kworker/u8:2-events_unbound]
17936 ? S< 0:00 [loop2]
17942 ? D 0:00 mount -t ext4 /dev/loop2 mntpoint
17943 ? S 0:00 [jbd2/loop2-8]
17944 ? I< 0:00 [ext4-rsv-conver]
17965 ? S< 0:00 [loop3]
17967 ? D 0:00 grep -q /dev/loop3 /proc/mounts
17988 ? S< 0:00 [loop4]
17991 ? D 0:00 mkfs.vfat /dev/loop4
18011 ? S< 0:00 [loop5]
18013 ? D 0:00 grep -q /dev/loop5 /proc/mounts
18033 ? S< 0:00 [loop6]
18035 ? D 0:00 grep -q /dev/loop6 /proc/mounts
18548 ? D 0:00 unshare --mount mount --bind dir_A dir_B
18606 ? R 0:00 /usr/sbin/dropbear -r /etc/dropbear/dropbear_rsa_host_key -p 22 -B

so we're in loopback mount tests but the small piece of kernel dmesg below 
suggests this one is a kernel issue. I only pasted a sample, the buffer is filled
with these. Added Bruce to cc:, any ideas?
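
The processes of interest in the ps listing above are the ones in state D (uninterruptible sleep). A minimal sketch for filtering them out follows; `blocked_processes` is a hypothetical helper, and it assumes the five-column "PID TTY STAT TIME COMMAND" layout shown above.

```python
def blocked_processes(ps_output):
    """Return (pid, command) for every process whose STAT starts with 'D'."""
    blocked = []
    for line in ps_output.splitlines():
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # header, wrapped continuation line, or noise
        pid, _tty, stat, _time, command = fields
        if stat.startswith("D"):
            blocked.append((int(pid), command.strip()))
    return blocked


# A few lines copied from the listing above.
SAMPLE = """\
17878 ? S< 0:00 [loop0]
17884 ? D 0:00 mount -t ext2 /dev/loop0 mntpoint
17967 ? D 0:00 grep -q /dev/loop3 /proc/mounts
17991 ? D 0:00 mkfs.vfat /dev/loop4
18606 ? R 0:00 /usr/sbin/dropbear -r /etc/dropbear/dropbear_rsa_host_key -p 22 -B
"""

if __name__ == "__main__":
    for pid, command in blocked_processes(SAMPLE):
        print(pid, command)
```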


[ 2131.374105] RIP: 0010:d_alloc_parallel+0xd5/0x570
[ 2131.374650] Code: 00 48 89 83 a0 00 00 00 e8 28 9e 9c 00 4c 89 65 80 65 48 8b 04 25 00 6d 01 00 48 89 85 70 ff ff ff e8 3f f4 e7 ff 48 8b 43 30 <44> 8b a0 28 02 00 00 44 8b 3d ed 1c 3b 01 41 f6 c7 01 74 0f f3 90
[ 2131.376669] RSP: 0018:ffff8eb785587c50 EFLAGS: 00010202
[ 2131.377255] RAX: 0000000000000000 RBX: ffff88e8c147f900 RCX: 0000000000000000
[ 2131.378017] RDX: ffff88e8c147f9a0 RSI: ffffffff82433240 RDI: ffff88e8c147f958
[ 2131.378813] RBP: ffff8eb785587cf0 R08: 00000000000000c0 R09: ffff8eb785587dc0
[ 2131.379614] R10: ffff88e8c147f900 R11: ffffff8b919a989e R12: ffff88e8c147f900
[ 2131.380391] R13: 000000009655cd16 R14: ffffffff82bff370 R15: ffff88e8c147f958
[ 2131.381146] FS: 00007fd92f66b580(0000) GS:ffff88e8fec80000(0000) knlGS:0000000000000000
[ 2131.382038] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.382702] CR2: 0000000000000008 CR3: 0000000035466000 CR4: 00000000001506e0
[ 2131.384305] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.385060] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2131.423042] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 2131.426829] #PF: supervisor read access in kernel mode
[ 2131.427630] #PF: error_code(0x0000) - not-present page
[ 2131.428337] PGD 0 P4D 0
[ 2131.428692] Oops: 0000 [#1018] PREEMPT SMP PTI
[ 2131.429302] CPU: 3 PID: 15773 Comm: grep Tainted: G D 5.10.37-yocto-standard #1
[ 2131.430453] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 2131.431970] RIP: 0010:kernfs_sop_show_options+0x15/0x50
[ 2131.432674] Code: 10 48 c7 c0 f2 ff ff ff eb 97 cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 8b 46 30 48 85 c0 74 07 48 8b 80 48 02 00 00 <48> 8b 50 08 48 85 d2 48 0f 44 d0 31 c0 48 8b 72 50 48 8b 56 30 48
[ 2131.435158] RSP: 0018:ffff8eb78439fd58 EFLAGS: 00010246
[ 2131.435885] RAX: 0000000000000000 RBX: ffff88e8c1328ca0 RCX: 0000000000000000
[ 2131.436842] RDX: ffffffff814d9640 RSI: ffff88e8c147f900 RDI: ffff88e8d20369d8
[ 2131.437803] RBP: ffff8eb78439fda0 R08: 0000000000000008 R09: ffffffff82627360
[ 2131.438772] R10: 000000000000022b R11: 000000000000000c R12: ffff88e8d20369d8
[ 2131.439733] R13: ffff88e8daa3f000 R14: ffffffff8242b970 R15: ffff88e8c1328c80
[ 2131.440507] FS: 00007f1246cd2740(0000) GS:ffff88e8fed80000(0000) knlGS:0000000000000000
[ 2131.441366] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.441973] CR2: 0000000000000008 CR3: 000000003d71a000 CR4: 00000000001506e0
[ 2131.442738] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.443499] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2131.444266] Call Trace:
[ 2131.444555] ? show_vfsmnt+0x1b1/0x200
[ 2131.444943] m_show+0x1a/0x20
[ 2131.445274] seq_read_iter+0x2c8/0x490
[ 2131.445689] new_sync_read+0x10d/0x190
[ 2131.446078] vfs_read+0x128/0x180
[ 2131.446445] ksys_read+0x67/0xe0
[ 2131.446798] __x64_sys_read+0x19/0x20
[ 2131.447199] do_syscall_64+0x38/0x50
[ 2131.447596] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2131.448112] RIP: 0033:0x7f1246dc0052
[ 2131.449276] Code: c0 e9 d2 fe ff ff 48 8d 3d 6b e1 09 00 50 e8 c5 d3 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 2131.451250] RSP: 002b:00007fff2f611858 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 2131.452043] RAX: ffffffffffffffda RBX: 0000559259536320 RCX: 00007f1246dc0052
[ 2131.452817] RDX: 0000000000000400 RSI: 0000559259536500 RDI: 0000000000000003
[ 2131.453591] RBP: 00007f1246e8f300 R08: 0000000000000003 R09: 00007f1246e8da60
[ 2131.454348] R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000000
[ 2131.455090] R13: 0000000000000d68 R14: 00007f1246e8e700 R15: 0000000000000d68
[ 2131.455972] Modules linked in: bnep
[ 2131.456367] CR2: 0000000000000008
[ 2131.456766] ---[ end trace bec3d6eec9de38c7 ]---
[ 2131.457269] RIP: 0010:d_alloc_parallel+0xd5/0x570
[ 2131.457791] Code: 00 48 89 83 a0 00 00 00 e8 28 9e 9c 00 4c 89 65 80 65 48 8b 04 25 00 6d 01 00 48 89 85 70 ff ff ff e8 3f f4 e7 ff 48 8b 43 30 <44> 8b a0 28 02 00 00 44 8b 3d ed 1c 3b 01 41 f6 c7 01 74 0f f3 90
[ 2131.459769] RSP: 0018:ffff8eb785587c50 EFLAGS: 00010202
[ 2131.460332] RAX: 0000000000000000 RBX: ffff88e8c147f900 RCX: 0000000000000000
[ 2131.461082] RDX: ffff88e8c147f9a0 RSI: ffffffff82433240 RDI: ffff88e8c147f958
[ 2131.461865] RBP: ffff8eb785587cf0 R08: 00000000000000c0 R09: ffff8eb785587dc0
[ 2131.462654] R10: ffff88e8c147f900 R11: ffffff8b919a989e R12: ffff88e8c147f900
[ 2131.463417] R13: 000000009655cd16 R14: ffffffff82bff370 R15: ffff88e8c147f958
[ 2131.464174] FS: 00007f1246cd2740(0000) GS:ffff88e8fed80000(0000) knlGS:0000000000000000
[ 2131.465052] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.465686] CR2: 0000000000000008 CR3: 000000003d71a000 CR4: 00000000001506e0
[ 2131.466477] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.467258] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2131.473108] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 2131.475231] #PF: supervisor read access in kernel mode
[ 2131.476231] #PF: error_code(0x0000) - not-present page
[ 2131.477057] PGD 0 P4D 0
[ 2131.477479] Oops: 0000 [#1019] PREEMPT SMP PTI
[ 2131.478197] CPU: 0 PID: 15777 Comm: mount Tainted: G D 5.10.37-yocto-standard #1
[ 2131.479593] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 2131.481374] RIP: 0010:kernfs_sop_show_path+0x1c/0x60
[ 2131.482177] Code: b6 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 49 89 f0 48 8b 76 30 48 89 e5 48 85 f6 74 07 48 8b b6 48 02 00 00 <48> 8b 46 08 48 85 c0 48 0f 44 c6 48 8b 50 50 48 8b 42 30 48 85 c0
[ 2131.485150] RSP: 0018:ffff8eb7843b7d48 EFLAGS: 00010246
[ 2131.485938] RAX: ffffffff814d9700 RBX: ffff88e8c1328ca0 RCX: 0000000000000001
[ 2131.486660] RDX: 0000000000001000 RSI: 0000000000000000 RDI: ffff88e8c3efa2d0
[ 2131.487398] RBP: ffff8eb7843b7d48 R08: ffff88e8c147f900 R09: 0000000000000001
[ 2131.488123] R10: ffffffff8263d97c R11: ffff88e8dc3e1317 R12: ffff88e8c3efa2d0
[ 2131.488845] R13: ffff88e8daa3f000 R14: ffff88e8c3efa2f8 R15: ffff88e8fd7e2600
[ 2131.489567] FS: 00007f659ae24580(0000) GS:ffff88e8fec00000(0000) knlGS:0000000000000000
[ 2131.490386] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.490971] CR2: 0000000000000008 CR3: 000000001210e000 CR4: 00000000001506f0
[ 2131.491698] DR0: 0000000000000001 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.492420] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[ 2131.493140] Call Trace:
[ 2131.493401] show_mountinfo+0x8b/0x330
[ 2131.493799] m_show+0x1a/0x20
[ 2131.494108] seq_read_iter+0x2c8/0x490
[ 2131.494494] new_sync_read+0x10d/0x190
[ 2131.494880] vfs_read+0x128/0x180
[ 2131.495228] ksys_read+0x67/0xe0
[ 2131.495562] __x64_sys_read+0x19/0x20
[ 2131.495941] do_syscall_64+0x38/0x50
[ 2131.496312] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2131.496826] RIP: 0033:0x7f659af8d052
[ 2131.497209] Code: c0 e9 d2 fe ff ff 48 8d 3d 6b e1 09 00 50 e8 c5 d3 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 2131.499081] RSP: 002b:00007ffcce6434b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 2131.499854] RAX: ffffffffffffffda RBX: 0000000000004000 RCX: 00007f659af8d052
[ 2131.500575] RDX: 0000000000004000 RSI: 0000563edb40d310 RDI: 0000000000000003
[ 2131.501296] RBP: 00007ffcce643540 R08: 0000563edb40d310 R09: 00007f659b05aa60
[ 2131.502017] R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000003
[ 2131.502747] R13: 0000000000000000 R14: 00007f659ae24500 R15: 0000563edb40d310
[ 2131.503477] Modules linked in: bnep
[ 2131.503841] CR2: 0000000000000008
[ 2131.504206] ---[ end trace bec3d6eec9de38c8 ]---
[ 2131.504685] RIP: 0010:d_alloc_parallel+0xd5/0x570
[ 2131.505169] Code: 00 48 89 83 a0 00 00 00 e8 28 9e 9c 00 4c 89 65 80 65 48 8b 04 25 00 6d 01 00 48 89 85 70 ff ff ff e8 3f f4 e7 ff 48 8b 43 30 <44> 8b a0 28 02 00 00 44 8b 3d ed 1c 3b 01 41 f6 c7 01 74 0f f3 90
[ 2131.507080] RSP: 0018:ffff8eb785587c50 EFLAGS: 00010202
[ 2131.507630] RAX: 0000000000000000 RBX: ffff88e8c147f900 RCX: 0000000000000000
[ 2131.508362] RDX: ffff88e8c147f9a0 RSI: ffffffff82433240 RDI: ffff88e8c147f958
[ 2131.509089] RBP: ffff8eb785587cf0 R08: 00000000000000c0 R09: ffff8eb785587dc0
[ 2131.509816] R10: ffff88e8c147f900 R11: ffffff8b919a989e R12: ffff88e8c147f900
[ 2131.510549] R13: 000000009655cd16 R14: ffffffff82bff370 R15: ffff88e8c147f958
[ 2131.511304] FS: 00007f659ae24580(0000) GS:ffff88e8fec00000(0000) knlGS:0000000000000000
[ 2131.512133] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.512727] CR2: 0000000000000008 CR3: 000000001210e000 CR4: 00000000001506f0
[ 2131.514232] DR0: 0000000000000001 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.514962] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[ 2131.551760] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 2131.553085] #PF: supervisor read access in kernel mode
[ 2131.554013] #PF: error_code(0x0000) - not-present page
[ 2131.554944] PGD 0 P4D 0
[ 2131.555432] Oops: 0000 [#1020] PREEMPT SMP PTI
[ 2131.556229] CPU: 3 PID: 15802 Comm: grep Tainted: G D 5.10.37-yocto-standard #1
[ 2131.557750] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 2131.559732] RIP: 0010:kernfs_sop_show_options+0x15/0x50
[ 2131.560667] Code: 10 48 c7 c0 f2 ff ff ff eb 97 cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 8b 46 30 48 85 c0 74 07 48 8b 80 48 02 00 00 <48> 8b 50 08 48 85 d2 48 0f 44 d0 31 c0 48 8b 72 50 48 8b 56 30 48
[ 2131.563701] RSP: 0018:ffff8eb784427d58 EFLAGS: 00010246
[ 2131.564254] RAX: 0000000000000000 RBX: ffff88e8c1328ca0 RCX: 0000000000000000
[ 2131.565000] RDX: ffffffff814d9640 RSI: ffff88e8c147f900 RDI: ffff88e8d20369d8
[ 2131.565765] RBP: ffff8eb784427da0 R08: 0000000000000008 R09: ffffffff82627360
[ 2131.566514] R10: 000000000000022b R11: 000000000000000c R12: ffff88e8d20369d8
[ 2131.567264] R13: ffff88e8daa3f000 R14: ffffffff8242b970 R15: ffff88e8c1328c80
[ 2131.568013] FS: 00007f28130cc740(0000) GS:ffff88e8fed80000(0000) knlGS:0000000000000000
[ 2131.568873] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.569486] CR2: 0000000000000008 CR3: 0000000004708000 CR4: 00000000001506e0
[ 2131.570243] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.571002] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2131.571782] Call Trace:
[ 2131.572062] ? show_vfsmnt+0x1b1/0x200
[ 2131.572466] m_show+0x1a/0x20
[ 2131.572799] seq_read_iter+0x2c8/0x490
[ 2131.573204] new_sync_read+0x10d/0x190
[ 2131.573627] vfs_read+0x128/0x180
[ 2131.573971] ksys_read+0x67/0xe0
[ 2131.574327] __x64_sys_read+0x19/0x20
[ 2131.574726] do_syscall_64+0x38/0x50
[ 2131.575098] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2131.575773] RIP: 0033:0x7f28131ba052
[ 2131.576143] Code: c0 e9 d2 fe ff ff 48 8d 3d 6b e1 09 00 50 e8 c5 d3 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 2131.578113] RSP: 002b:00007fffc6474718 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 2131.579699] RAX: ffffffffffffffda RBX: 00005629054f3320 RCX: 00007f28131ba052
[ 2131.580455] RDX: 0000000000000400 RSI: 00005629054f3500 RDI: 0000000000000003
[ 2131.581214] RBP: 00007f2813289300 R08: 0000000000000003 R09: 00007f2813287a60
[ 2131.581962] R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000000
[ 2131.582729] R13: 0000000000000d68 R14: 00007f2813288700 R15: 0000000000000d68
[ 2131.583489] Modules linked in: bnep
[ 2131.583869] CR2: 0000000000000008
[ 2131.584254] ---[ end trace bec3d6eec9de38c9 ]---
[ 2131.584752] RIP: 0010:d_alloc_parallel+0xd5/0x570
[ 2131.585275] Code: 00 48 89 83 a0 00 00 00 e8 28 9e 9c 00 4c 89 65 80 65 48 8b 04 25 00 6d 01 00 48 89 85 70 ff ff ff e8 3f f4 e7 ff 48 8b 43 30 <44> 8b a0 28 02 00 00 44 8b 3d ed 1c 3b 01 41 f6 c7 01 74 0f f3 90
[ 2131.587273] RSP: 0018:ffff8eb785587c50 EFLAGS: 00010202
[ 2131.587845] RAX: 0000000000000000 RBX: ffff88e8c147f900 RCX: 0000000000000000
[ 2131.588620] RDX: ffff88e8c147f9a0 RSI: ffffffff82433240 RDI: ffff88e8c147f958
[ 2131.589385] RBP: ffff8eb785587cf0 R08: 00000000000000c0 R09: ffff8eb785587dc0
[ 2131.590137] R10: ffff88e8c147f900 R11: ffffff8b919a989e R12: ffff88e8c147f900
[ 2131.590909] R13: 000000009655cd16 R14: ffffffff82bff370 R15: ffff88e8c147f958
[ 2131.591706] FS: 00007f28130cc740(0000) GS:ffff88e8fed80000(0000) knlGS:0000000000000000
[ 2131.592577] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.593210] CR2: 0000000000000008 CR3: 0000000004708000 CR4: 00000000001506e0
[ 2131.593989] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.594785] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2131.600460] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 2131.602355] #PF: supervisor read access in kernel mode
[ 2131.603527] #PF: error_code(0x0000) - not-present page
[ 2131.604353] PGD 0 P4D 0
[ 2131.604782] Oops: 0000 [#1021] PREEMPT SMP PTI
[ 2131.605619] CPU: 0 PID: 15806 Comm: mount Tainted: G D 5.10.37-yocto-standard #1
[ 2131.607001] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 2131.608803] RIP: 0010:kernfs_sop_show_path+0x1c/0x60
[ 2131.609603] Code: b6 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 49 89 f0 48 8b 76 30 48 89 e5 48 85 f6 74 07 48 8b b6 48 02 00 00 <48> 8b 46 08 48 85 c0 48 0f 44 c6 48 8b 50 50 48 8b 42 30 48 85 c0
[ 2131.612590] RSP: 0018:ffff8eb78443fd48 EFLAGS: 00010246
[ 2131.613432] RAX: ffffffff814d9700 RBX: ffff88e8c1328ca0 RCX: 0000000000000001
[ 2131.614578] RDX: 0000000000001000 RSI: 0000000000000000 RDI: ffff88e8c3efa2d0
[ 2131.615575] RBP: ffff8eb78443fd48 R08: ffff88e8c147f900 R09: 0000000000000001
[ 2131.616319] R10: ffffffff8263d97c R11: ffff88e8dc3e1317 R12: ffff88e8c3efa2d0
[ 2131.617042] R13: ffff88e8daa3f000 R14: ffff88e8c3efa2f8 R15: ffff88e8fd7e2600
[ 2131.617768] FS: 00007f5eb7d5c580(0000) GS:ffff88e8fec00000(0000) knlGS:0000000000000000
[ 2131.618589] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.619207] CR2: 0000000000000008 CR3: 000000001131a000 CR4: 00000000001506f0
[ 2131.619955] DR0: 0000000000000001 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.620679] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[ 2131.621404] Call Trace:
[ 2131.621666] show_mountinfo+0x8b/0x330
[ 2131.622056] m_show+0x1a/0x20
[ 2131.622366] seq_read_iter+0x2c8/0x490
[ 2131.622755] new_sync_read+0x10d/0x190
[ 2131.623149] vfs_read+0x128/0x180
[ 2131.623540] ksys_read+0x67/0xe0
[ 2131.623875] __x64_sys_read+0x19/0x20
[ 2131.624258] do_syscall_64+0x38/0x50
[ 2131.624630] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2131.625147] RIP: 0033:0x7f5eb7ec5052
[ 2131.625519] Code: c0 e9 d2 fe ff ff 48 8d 3d 6b e1 09 00 50 e8 c5 d3 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 2131.627433] RSP: 002b:00007ffdd589ed58 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 2131.628220] RAX: ffffffffffffffda RBX: 0000000000004000 RCX: 00007f5eb7ec5052
[ 2131.628947] RDX: 0000000000004000 RSI: 00005621afedf310 RDI: 0000000000000003
[ 2131.629672] RBP: 00007ffdd589ede0 R08: 00005621afedf310 R09: 00007f5eb7f92a60
[ 2131.630399] R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000003
[ 2131.631122] R13: 0000000000000000 R14: 00007f5eb7d5c500 R15: 00005621afedf310
[ 2131.631890] Modules linked in: bnep
[ 2131.632255] CR2: 0000000000000008
[ 2131.632632] ---[ end trace bec3d6eec9de38ca ]---
[ 2131.633120] RIP: 0010:d_alloc_parallel+0xd5/0x570
[ 2131.633613] Code: 00 48 89 83 a0 00 00 00 e8 28 9e 9c 00 4c 89 65 80 65 48 8b 04 25 00 6d 01 00 48 89 85 70 ff ff ff e8 3f f4 e7 ff 48 8b 43 30 <44> 8b a0 28 02 00 00 44 8b 3d ed 1c 3b 01 41 f6 c7 01 74 0f f3 90
[ 2131.635650] RSP: 0018:ffff8eb785587c50 EFLAGS: 00010202
[ 2131.636194] RAX: 0000000000000000 RBX: ffff88e8c147f900 RCX: 0000000000000000
[ 2131.636924] RDX: ffff88e8c147f9a0 RSI: ffffffff82433240 RDI: ffff88e8c147f958
[ 2131.637661] RBP: ffff8eb785587cf0 R08: 00000000000000c0 R09: ffff8eb785587dc0
[ 2131.638406] R10: ffff88e8c147f900 R11: ffffff8b919a989e R12: ffff88e8c147f900
[ 2131.639136] R13: 000000009655cd16 R14: ffffffff82bff370 R15: ffff88e8c147f958
[ 2131.639919] FS: 00007f5eb7d5c580(0000) GS:ffff88e8fec00000(0000) knlGS:0000000000000000
[ 2131.640750] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.641353] CR2: 0000000000000008 CR3: 000000001131a000 CR4: 00000000001506f0
[ 2131.642090] DR0: 0000000000000001 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.642824] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
[ 2131.678894] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 2131.683398] #PF: supervisor read access in kernel mode
[ 2131.683942] #PF: error_code(0x0000) - not-present page
[ 2131.684492] PGD 0 P4D 0
[ 2131.684774] Oops: 0000 [#1022] PREEMPT SMP PTI
[ 2131.685247] CPU: 2 PID: 15831 Comm: grep Tainted: G D 5.10.37-yocto-standard #1
[ 2131.686238] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 2131.687474] RIP: 0010:kernfs_sop_show_options+0x15/0x50
[ 2131.688025] Code: 10 48 c7 c0 f2 ff ff ff eb 97 cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 48 8b 46 30 48 85 c0 74 07 48 8b 80 48 02 00 00 <48> 8b 50 08 48 85 d2 48 0f 44 d0 31 c0 48 8b 72 50 48 8b 56 30 48
[ 2131.689986] RSP: 0018:ffff8eb784497d58 EFLAGS: 00010246
[ 2131.690559] RAX: 0000000000000000 RBX: ffff88e8c1328ca0 RCX: 0000000000000000
[ 2131.691304] RDX: ffffffff814d9640 RSI: ffff88e8c147f900 RDI: ffff88e8c3e080f0
[ 2131.692045] RBP: ffff8eb784497da0 R08: 0000000000000008 R09: ffffffff82627360
[ 2131.692944] R10: 000000000000022b R11: 000000000000000c R12: ffff88e8c3e080f0
[ 2131.693799] R13: ffff88e8daa3f000 R14: ffffffff8242b970 R15: ffff88e8c1328c80
[ 2131.694580] FS: 00007fd2e715d740(0000) GS:ffff88e8fed00000(0000) knlGS:0000000000000000
[ 2131.695576] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.696176] CR2: 0000000000000008 CR3: 0000000014322000 CR4: 00000000001506e0
[ 2131.696937] DR0: 0000558661313290 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.697707] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 2131.698722] Call Trace:
[ 2131.699033] ? show_vfsmnt+0x1b1/0x200
[ 2131.699454] m_show+0x1a/0x20
[ 2131.699792] seq_read_iter+0x2c8/0x490
[ 2131.700180] new_sync_read+0x10d/0x190
[ 2131.700608] vfs_read+0x128/0x180
[ 2131.700954] ksys_read+0x67/0xe0
[ 2131.701312] __x64_sys_read+0x19/0x20
[ 2131.701713] do_syscall_64+0x38/0x50
[ 2131.702084] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2131.702645] RIP: 0033:0x7fd2e724b052
[ 2131.703015] Code: c0 e9 d2 fe ff ff 48 8d 3d 6b e1 09 00 50 e8 c5 d3 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 2131.705018] RSP: 002b:00007ffe0fbf7368 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 2131.705826] RAX: ffffffffffffffda RBX: 00005633a5489320 RCX: 00007fd2e724b052
[ 2131.706656] RDX: 0000000000000400 RSI: 00005633a5489500 RDI: 0000000000000003
[ 2131.707696] RBP: 00007fd2e731a300 R08: 0000000000000003 R09: 00007fd2e7318a60
[ 2131.708451] R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000000
[ 2131.709983] R13: 0000000000000d68 R14: 00007fd2e7319700 R15: 0000000000000d68
[ 2131.710763] Modules linked in: bnep
[ 2131.711127] CR2: 0000000000000008
[ 2131.711551] ---[ end trace bec3d6eec9de38cb ]---
[ 2131.712048] RIP: 0010:d_alloc_parallel+0xd5/0x570
[ 2131.712553] Code: 00 48 89 83 a0 00 00 00 e8 28 9e 9c 00 4c 89 65 80 65 48 8b 04 25 00 6d 01 00 48 89 85 70 ff ff ff e8 3f f4 e7 ff 48 8b 43 30 <44> 8b a0 28 02 00 00 44 8b 3d ed 1c 3b 01 41 f6 c7 01 74 0f f3 90
[ 2131.714535] RSP: 0018:ffff8eb785587c50 EFLAGS: 00010202
[ 2131.715097] RAX: 0000000000000000 RBX: ffff88e8c147f900 RCX: 0000000000000000
[ 2131.715874] RDX: ffff88e8c147f9a0 RSI: ffffffff82433240 RDI: ffff88e8c147f958
[ 2131.716655] RBP: ffff8eb785587cf0 R08: 00000000000000c0 R09: ffff8eb785587dc0
[ 2131.717412] R10: ffff88e8c147f900 R11: ffffff8b919a989e R12: ffff88e8c147f900
[ 2131.718169] R13: 000000009655cd16 R14: ffffffff82bff370 R15: ffff88e8c147f958
[ 2131.718940] FS: 00007fd2e715d740(0000) GS:ffff88e8fed00000(0000) knlGS:0000000000000000
[ 2131.719820] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.720438] CR2: 0000000000000008 CR3: 0000000014322000 CR4: 00000000001506e0
[ 2131.721210] DR0: 0000558661313290 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.721962] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 2131.728339] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 2131.730096] #PF: supervisor read access in kernel mode
[ 2131.731361] #PF: error_code(0x0000) - not-present page
[ 2131.732385] PGD 0 P4D 0
[ 2131.732907] Oops: 0000 [#1023] PREEMPT SMP PTI
[ 2131.733802] CPU: 3 PID: 15835 Comm: mount Tainted: G D 5.10.37-yocto-standard #1
[ 2131.735537] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 2131.737738] RIP: 0010:kernfs_sop_show_path+0x1c/0x60
[ 2131.738721] Code: b6 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 49 89 f0 48 8b 76 30 48 89 e5 48 85 f6 74 07 48 8b b6 48 02 00 00 <48> 8b 46 08 48 85 c0 48 0f 44 c6 48 8b 50 50 48 8b 42 30 48 85 c0
[ 2131.742383] RSP: 0018:ffff8eb78449fd48 EFLAGS: 00010246
[ 2131.743437] RAX: ffffffff814d9700 RBX: ffff88e8c1328ca0 RCX: 0000000000000001
[ 2131.744301] RDX: 0000000000001000 RSI: 0000000000000000 RDI: ffff88e8d20369d8
[ 2131.745041] RBP: ffff8eb78449fd48 R08: ffff88e8c147f900 R09: 0000000000000001
[ 2131.745803] R10: ffffffff8263d97c R11: ffff88e8f572f317 R12: ffff88e8d20369d8
[ 2131.746567] R13: ffff88e8daa3f000 R14: ffff88e8d2036a00 R15: ffff88e8ce719c00
[ 2131.747316] FS: 00007f4d5c6b7580(0000) GS:ffff88e8fed80000(0000) knlGS:0000000000000000
[ 2131.748152] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.748776] CR2: 0000000000000008 CR3: 00000000122e0000 CR4: 00000000001506e0
[ 2131.749540] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.750281] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2131.751023] Call Trace:
[ 2131.751305] show_mountinfo+0x8b/0x330
[ 2131.751715] m_show+0x1a/0x20
[ 2131.752024] seq_read_iter+0x2c8/0x490
[ 2131.752434] new_sync_read+0x10d/0x190
[ 2131.752837] vfs_read+0x128/0x180
[ 2131.753180] ksys_read+0x67/0xe0
[ 2131.753555] __x64_sys_read+0x19/0x20
[ 2131.753933] do_syscall_64+0x38/0x50
[ 2131.754322] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2131.754860] RIP: 0033:0x7f4d5c820052
[ 2131.755251] Code: c0 e9 d2 fe ff ff 48 8d 3d 6b e1 09 00 50 e8 c5 d3 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[ 2131.757310] RSP: 002b:00007ffda90edce8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 2131.758096] RAX: ffffffffffffffda RBX: 0000000000004000 RCX: 00007f4d5c820052
[ 2131.758856] RDX: 0000000000004000 RSI: 0000561bd49c6310 RDI: 0000000000000003
[ 2131.759624] RBP: 00007ffda90edd70 R08: 0000561bd49c6310 R09: 00007f4d5c8eda60
[ 2131.760369] R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000003
[ 2131.761109] R13: 0000000000000000 R14: 00007f4d5c6b7500 R15: 0000561bd49c6310
[ 2131.761871] Modules linked in: bnep
[ 2131.762251] CR2: 0000000000000008
[ 2131.762652] ---[ end trace bec3d6eec9de38cc ]---
[ 2131.763146] RIP: 0010:d_alloc_parallel+0xd5/0x570
[ 2131.763657] Code: 00 48 89 83 a0 00 00 00 e8 28 9e 9c 00 4c 89 65 80 65 48 8b 04 25 00 6d 01 00 48 89 85 70 ff ff ff e8 3f f4 e7 ff 48 8b 43 30 <44> 8b a0 28 02 00 00 44 8b 3d ed 1c 3b 01 41 f6 c7 01 74 0f f3 90
[ 2131.765645] RSP: 0018:ffff8eb785587c50 EFLAGS: 00010202
[ 2131.766251] RAX: 0000000000000000 RBX: ffff88e8c147f900 RCX: 0000000000000000
[ 2131.767005] RDX: ffff88e8c147f9a0 RSI: ffffffff82433240 RDI: ffff88e8c147f958
[ 2131.767784] RBP: ffff8eb785587cf0 R08: 00000000000000c0 R09: ffff8eb785587dc0
[ 2131.768569] R10: ffff88e8c147f900 R11: ffffff8b919a989e R12: ffff88e8c147f900
[ 2131.769320] R13: 000000009655cd16 R14: ffffffff82bff370 R15: ffff88e8c147f958
[ 2131.770077] FS: 00007f4d5c6b7580(0000) GS:ffff88e8fed80000(0000) knlGS:0000000000000000
[ 2131.770953] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2131.771597] CR2: 0000000000000008 CR3: 00000000122e0000 CR4: 00000000001506e0
[ 2131.772352] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2131.773108] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2136.795714] BUG: kernel NULL pointer dereference, address: 000000000000000d
[ 2136.798484] #PF: supervisor read access in kernel mode
[ 2136.799299] #PF: error_code(0x0000) - not-present page
[ 2136.799845] PGD 0 P4D 0
[ 2136.800129] Oops: 0000 [#1024] PREEMPT SMP PTI
[ 2136.800604] CPU: 1 PID: 15843 Comm: rm Tainted: G D 5.10.37-yocto-standard #1
[ 2136.801508] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 2136.802706] RIP: 0010:security_inode_getattr+0xd/0x50
[ 2136.803272] Code: 4d 85 f6 75 e3 31 c0 5b 41 5c 41 5d 41 5e 5d c3 31 c0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 47 08 48 8b 40 30 <f6> 40 0d 02 75 35 55 48 89 e5 41 54 49 89 fc 53 48 8b 1d dc 91 fe
[ 2136.805275] RSP: 0018:ffff8eb7844d7e20 EFLAGS: 00010246
[ 2136.805910] RAX: 0000000000000000 RBX: ffff8eb7844d7e80 RCX: 000000002a1ce201
[ 2136.806665] RDX: 000000002a1ce1c1 RSI: ffffffff8144b20c RDI: ffff8eb7844d7e30
[ 2136.807432] RBP: ffff8eb7844d7e70 R08: 0000000000000064 R09: 0000000000000000
[ 2136.808197] R10: ffff88e8c1481f00 R11: 000070756f726763 R12: 0000000000000000
[ 2136.808951] R13: 0000000000000005 R14: 0000564ec0f757e8 R15: 0000000000000900
[ 2136.809714] FS: 00007f4bdb1965c0(0000) GS:ffff88e8fec80000(0000) knlGS:0000000000000000
[ 2136.810575] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2136.811204] CR2: 000000000000000d CR3: 000000002cc44000 CR4: 00000000001506e0
[ 2136.812083] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2136.812843] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2136.813615] Call Trace:
[ 2136.813896] ? vfs_statx+0x87/0x120
[ 2136.814289] __do_sys_newfstatat+0x36/0x70
[ 2136.814738] ? fsnotify_find_mark+0x16/0x80
[ 2136.815210] ? iterate_dir+0x121/0x1c0
[ 2136.815634] ? fput+0x13/0x20
[ 2136.815970] ? filp_close+0x60/0x70
[ 2136.816362] __x64_sys_newfstatat+0x1c/0x20
[ 2136.816824] do_syscall_64+0x38/0x50
[ 2136.817225] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 2136.817785] RIP: 0033:0x7f4bdb0bb70e
[ 2136.818188] Code: 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff e9 07 00 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8b 15 29
[ 2136.820303] RSP: 002b:00007fffa2df5358 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
[ 2136.821123] RAX: ffffffffffffffda RBX: 0000564ec0f756e0 RCX: 00007f4bdb0bb70e
[ 2136.821897] RDX: 0000564ec0f75758 RSI: 0000564ec0f757e8 RDI: 0000000000000005
[ 2136.822667] RBP: 0000564ec0f742f0 R08: 0000000000000000 R09: 00007f4bdb189a60
[ 2136.823445] R10: 0000000000000100 R11: 0000000000000246 R12: 0000564ec0f75758
[ 2136.824234] R13: 0000000000000000 R14: 0000000000000000 R15: 0000564ec0f7d870
[ 2136.825004] Modules linked in: bnep
[ 2136.825393] CR2: 000000000000000d
[ 2136.826208] ---[ end trace bec3d6eec9de38cd ]---
[ 2136.826731] RIP: 0010:d_alloc_parallel+0xd5/0x570
[ 2136.827320] Code: 00 48 89 83 a0 00 00 00 e8 28 9e 9c 00 4c 89 65 80 65 48 8b 04 25 00 6d 01 00 48 89 85 70 ff ff ff e8 3f f4 e7 ff 48 8b 43 30 <44> 8b a0 28 02 00 00 44 8b 3d ed 1c 3b 01 41 f6 c7 01 74 0f f3 90
[ 2136.829415] RSP: 0018:ffff8eb785587c50 EFLAGS: 00010202
[ 2136.829986] RAX: 0000000000000000 RBX: ffff88e8c147f900 RCX: 0000000000000000
[ 2136.830836] RDX: ffff88e8c147f9a0 RSI: ffffffff82433240 RDI: ffff88e8c147f958
[ 2136.831689] RBP: ffff8eb785587cf0 R08: 00000000000000c0 R09: ffff8eb785587dc0
[ 2136.832512] R10: ffff88e8c147f900 R11: ffffff8b919a989e R12: ffff88e8c147f900
[ 2136.833351] R13: 000000009655cd16 R14: ffffffff82bff370 R15: ffff88e8c147f958
[ 2136.834121] FS: 00007f4bdb1965c0(0000) GS:ffff88e8fec80000(0000) knlGS:0000000000000000
[ 2136.835062] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2136.835757] CR2: 000000000000000d CR3: 000000002cc44000 CR4: 00000000001506e0
[ 2136.836574] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2136.837397] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2157.360870] EXT4-fs (loop0): mounting ext2 file system using the ext4 subsystem
[ 2157.363369] EXT4-fs (loop0): mounted filesystem without journal. Opts: (null)
[ 2157.364829] ext2 filesystem being mounted at /opt/ltp/ltp-i2b5MCkGm9/LTP_df01.8NtMZ3LBLa/mntpoint supports timestamps until 2038 (0x7fffffff)
[ 2457.550406] EXT4-fs (loop1): mounting ext3 file system using the ext4 subsystem
[ 2457.562617] EXT4-fs (loop1): mounted filesystem with ordered data mode. Opts: (null)
[ 2457.563892] ext3 filesystem being mounted at /opt/ltp/ltp-i2b5MCkGm9/LTP_df01.JKoAa3Ab5v/mntpoint supports timestamps until 2038 (0x7fffffff)
[ 2757.421401] EXT4-fs (loop2): mounted filesystem with ordered data mode. Opts: (null)
[ 2757.422417] ext4 filesystem being mounted at /opt/ltp/ltp-i2b5MCkGm9/LTP_df01.d8q4Q6tnah/mntpoint supports timestamps until 2038 (0x7fffffff)
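
Since the buffer is filled with near-identical oopses, a tally by faulting
function can be easier to read than the raw log. The following is a minimal
sketch rather than part of the original report: "summarize_oopses" is a
hypothetical helper name, and it assumes each oops prints an "Oops:" line
followed by a "Comm:" line and a kernel-mode "RIP: 0010:" line, as in the
sample above.

```shell
# Hypothetical helper: count oopses per faulting kernel function and command.
# Reads dmesg text from the files given as arguments, or from stdin if none.
summarize_oopses() {
    awk '
        # Each report starts with an "Oops:" line; remember we are inside one.
        /Oops: / { pending = 1; comm = "?"; next }

        # The "CPU: ... Comm: <name>" line names the process that faulted.
        pending && /Comm: / {
            for (i = 1; i < NF; i++) if ($i == "Comm:") comm = $(i + 1)
            next
        }

        # The first kernel-mode RIP after "Oops:" is the faulting function.
        # Later RIP lines (user-space RIP, post-trace repeat) are ignored.
        pending && /RIP: 0010:/ {
            sym = $0
            sub(/.*RIP: 0010:/, "", sym)   # strip everything before the symbol
            sub(/[+].*/, "", sym)          # strip the +offset/size suffix
            count[sym " " comm]++
            pending = 0
        }

        END { for (k in count) print count[k], k }
    ' "$@" | sort -k1,1nr -k2,2
}
```

It could be run as "dmesg | summarize_oopses" inside the guest, or as
"summarize_oopses saved-dmesg.txt" against a saved copy of the log. Each output
line is a count, the faulting function and the command that triggered it.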


How to provide info for a hung ltp build

Richard Purdie
 

Hi All,

I wanted to document what we need to do, as best practice, when debugging a
hung ltp build. The steps are:

a) ssh to the worker where the build is hanging

b) Look at the output of "ps ax" or similar and determine which process is
hanging. You can filter with "ps ax | grep /qemuarm64-ltp/" since the path for
an ltp build will contain its name (changing to x86 where appropriate).

c) From the qemu process command line, spot its IP address. Often it is 192.168.7.2
but the last digit can/will vary.

d) "ssh root@....2" to attempt to log in to the qemu VM. You may need to handle
host cert mismatches as normal for ssh.

e) Within the VM, spot where it is hanging. Often, "top" will show nothing actively
using the CPU. The output of "ps" is key, since it lets us spot which ltp test
is/was running. "cgroup_xattr" and "proc01" are two examples of test names
which we've seen hang and have now disabled. If you can't see what is hanging,
save the ps output into the bug and ping me+Alexandre for further analysis.

f) Another tip, if we know which process is hanging, is to run
"ls -la /proc/<pid>/fd", which lists the files the test has open.

I appreciate not everyone has worker ssh access so if you do not, please let 
someone who does (Alexandre, Ross, Micheal, Armin, Saul, myself) know if
you spot one of these.
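
Steps b) to f) above can be sketched as a few small shell helpers. This is an
illustrative sketch, not a tested procedure: the function names are
hypothetical, and "guest_ip" assumes the guest address appears on the qemu
command line as an "ip=" kernel argument, which may not hold for every
configuration. If it prints nothing, read the address off the command line by
hand.

```shell
# Step b: from saved "ps ax" output on stdin, keep only the lines whose
# command line mentions the ltp build directory (e.g. qemuarm64-ltp).
filter_ltp_build() {
    grep -F -- "/$1/"
}

# Step c: print the first address given as an "ip=" argument on stdin.
# ASSUMPTION: the guest address is passed as ip=<addr>... on the qemu
# command line; adjust the pattern if your command line differs.
guest_ip() {
    grep -oE 'ip=[0-9]+(\.[0-9]+){3}' | head -n 1 | cut -d= -f2
}

# Steps d and e: capture the process list from inside the guest.
# $1 is the guest IP address found in step c.
guest_ps() {
    ssh "root@$1" ps
}

# Step f: list the files a suspect process has open inside the guest.
# $1 is the guest IP address, $2 is the PID as seen inside the guest.
guest_open_files() {
    ssh "root@$1" "ls -la /proc/$2/fd"
}
```

On the worker this might be used as "ps ax > ps.txt", then
"filter_ltp_build qemuarm64-ltp < ps.txt | guest_ip" to get the address, then
"guest_ps <ip>" and "guest_open_files <ip> <pid>". Saving the ps output to a
file first keeps the filter's own grep out of the listing.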

Cheers,

Richard
