Re: Info for swat for hung build

Richard Purdie
 

On Sat, 2021-10-30 at 14:14 +0100, Richard Purdie via lists.yoctoproject.org
wrote:
On Sat, 2021-10-30 at 14:04 +0100, Richard Purdie via lists.yoctoproject.org
wrote:
I noticed:

https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2779

has hung. Since this will be gone by the time anyone looks, inspection on the
system suggests it is hung in perl do_compile building a core-image-minimal:

$ pstree -p 3934003
run.do_compile.(3934003)───make(3934023)───make(4035804)───make(909354)───true(909946)

909354 ? SN 0:00 make -C ext/XS-Typemap/ all PERL_CORE=1 LIBPERL=libperl.so.5.34.0 LINKTYPE=dynamic
909946 ? ZN 0:00 [true] <defunct>
3933716 ? SNs 0:02 python3 /home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/bin/bitbake-worker decafbad
3934003 ? SN 0:00 /bin/sh /home/pokybuild/yocto-worker/oe-selftest-centos/build/build-st-3687560/tmp/work/core2-64-poky-linux/perl/5.34.0-r0/temp/run.do_compile.3933716
3934023 ? SN 0:00 make -j 16 -l 52
4035804 ? SN 0:00 make perl nonxs_ext utilities extensions pods

so the true exit code was never looked at by make?

I think it was just starting to run wic tests (wic.Wic, not wic.Wic2).
This was on centos8-ty-2.

gdb -p 909354
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-16.el8
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 909354
Reading symbols from /usr/bin/make...Reading symbols from .gnu_debugdata for
/usr/bin/make...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
found)...done.
0x00007f79a1bc146d in pselect () from /lib64/libc.so.6
Missing separate debuginfos, use: yum debuginfo-install make-4.2.1-11.el8.x86_64
(gdb) bt
#0 0x00007f79a1bc146d in pselect () from /lib64/libc.so.6
#1 0x000055851c9f700d in jobserver_acquire ()
#2 0x000055851c9f3935 in new_job ()
#3 0x000055851c9ffc07 in update_file ()
#4 0x000055851ca00055 in check_dep ()
#5 0x000055851c9fefe6 in update_file ()
#6 0x000055851ca00055 in check_dep ()
#7 0x000055851c9fefe6 in update_file ()
#8 0x000055851ca00055 in check_dep ()
#9 0x000055851c9fefe6 in update_file ()
#10 0x000055851ca00055 in check_dep ()
#11 0x000055851c9fefe6 in update_file ()
#12 0x000055851ca00055 in check_dep ()
#13 0x000055851c9fefe6 in update_file ()
#14 0x000055851ca004cf in update_goal_chain ()
#15 0x000055851c9e3f56 in main ()
(gdb)

CentOS 8 has make 4.2.1, which made me wonder about:
https://git.savannah.gnu.org/cgit/make.git/commit/src/posixos.c?id=d79fe162c009788888faaf0317253b6f0cac7092

?

Sending a SIGCHLD (signal 17) to the process made it exit, and the build carried on.
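
For reference, the manual nudge described here (sending signal 17, which is SIGCHLD on x86-64 Linux) can be sketched in Python as follows. The function name is hypothetical and the PID has to be supplied by whoever is inspecting the worker; this is only an illustration of the step, not an autobuilder tool.

```python
import os
import signal


def nudge_with_sigchld(pid: int) -> None:
    """Send SIGCHLD to `pid`, like running `kill -CHLD <pid>` by hand.

    The default disposition of SIGCHLD is to ignore it, so a process that
    is not waiting for it should be unaffected. A process blocked in a
    signal-aware wait gets woken up and has a chance to reap its children.
    """
    os.kill(pid, signal.SIGCHLD)
```

Usage would be something like `nudge_with_sigchld(909354)` for the stuck make seen above, run as a user allowed to signal that process.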

Logs show it was stuck in wic.Wic.test_bootloader_config.

That means it could be the above bug, with something getting messed up in the job
counting? Or the SIGCHLD from the true process was somehow missed and there is a
race in make somewhere, but that seems less likely?
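
To make the "job counting" idea concrete, here is a toy Python model of a pipe-based job token pool, which is the general scheme the jobserver_acquire frame in the backtrace belongs to. It is a sketch under assumptions, not make's actual C code: the class and method names are made up, and the timeout exists only so the example terminates. The point is that if tokens are taken and never returned, the next acquire has nothing to read and just waits.

```python
import os
import select


class TokenPipe:
    """Toy job token pool: one byte in the pipe per free job slot."""

    def __init__(self, slots: int) -> None:
        self.read_fd, self.write_fd = os.pipe()
        os.write(self.write_fd, b"+" * slots)

    def acquire(self, timeout: float) -> bool:
        """Try to take one token, giving up after `timeout` seconds."""
        ready, _, _ = select.select([self.read_fd], [], [], timeout)
        if not ready:
            # No token in the pipe. A waiter without a timeout would sit
            # here until a token is returned or a signal interrupts it.
            return False
        os.read(self.read_fd, 1)
        return True

    def release(self) -> None:
        """Return one token to the pool."""
        os.write(self.write_fd, b"+")

    def close(self) -> None:
        os.close(self.read_fd)
        os.close(self.write_fd)
```

With `pool = TokenPipe(2)`, two calls to `pool.acquire(0.1)` succeed and a third returns False until something calls `pool.release()`. A lost token, or a missed wake-up for a child that has already exited, would look the same from the outside: a waiter stuck on the pipe.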

Cheers,

Richard


Re: Info for swat for hung build

Richard Purdie
 

On Sat, 2021-10-30 at 14:04 +0100, Richard Purdie via lists.yoctoproject.org
wrote:
I noticed:

https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2779

has hung. Since this will be gone by the time anyone looks, inspection on the
system suggests it is hung in perl do_compile building a core-image-minimal:

$ pstree -p 3934003
run.do_compile.(3934003)───make(3934023)───make(4035804)───make(909354)───true(909946)

909354 ? SN 0:00 make -C ext/XS-Typemap/ all PERL_CORE=1 LIBPERL=libperl.so.5.34.0 LINKTYPE=dynamic
909946 ? ZN 0:00 [true] <defunct>
3933716 ? SNs 0:02 python3 /home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/bin/bitbake-worker decafbad
3934003 ? SN 0:00 /bin/sh /home/pokybuild/yocto-worker/oe-selftest-centos/build/build-st-3687560/tmp/work/core2-64-poky-linux/perl/5.34.0-r0/temp/run.do_compile.3933716
3934023 ? SN 0:00 make -j 16 -l 52
4035804 ? SN 0:00 make perl nonxs_ext utilities extensions pods

so the true exit code was never looked at by make?

I think it was just starting to run wic tests (wic.Wic, not wic.Wic2).
This was on centos8-ty-2.

gdb -p 909354
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-16.el8
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 909354
Reading symbols from /usr/bin/make...Reading symbols from .gnu_debugdata for
/usr/bin/make...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
found)...done.
0x00007f79a1bc146d in pselect () from /lib64/libc.so.6
Missing separate debuginfos, use: yum debuginfo-install make-4.2.1-11.el8.x86_64
(gdb) bt
#0 0x00007f79a1bc146d in pselect () from /lib64/libc.so.6
#1 0x000055851c9f700d in jobserver_acquire ()
#2 0x000055851c9f3935 in new_job ()
#3 0x000055851c9ffc07 in update_file ()
#4 0x000055851ca00055 in check_dep ()
#5 0x000055851c9fefe6 in update_file ()
#6 0x000055851ca00055 in check_dep ()
#7 0x000055851c9fefe6 in update_file ()
#8 0x000055851ca00055 in check_dep ()
#9 0x000055851c9fefe6 in update_file ()
#10 0x000055851ca00055 in check_dep ()
#11 0x000055851c9fefe6 in update_file ()
#12 0x000055851ca00055 in check_dep ()
#13 0x000055851c9fefe6 in update_file ()
#14 0x000055851ca004cf in update_goal_chain ()
#15 0x000055851c9e3f56 in main ()
(gdb)

CentOS 8 has make 4.2.1, which made me wonder about:
https://git.savannah.gnu.org/cgit/make.git/commit/src/posixos.c?id=d79fe162c009788888faaf0317253b6f0cac7092

?

Cheers,

Richard


Info for swat for hung build

Richard Purdie
 

I noticed:

https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2779

has hung. Since this will be gone by the time anyone looks, inspection on the
system suggests it is hung in perl do_compile building a core-image-minimal:

$ pstree -p 3934003
run.do_compile.(3934003)───make(3934023)───make(4035804)───make(909354)───true(909946)

909354 ? SN 0:00 make -C ext/XS-Typemap/ all PERL_CORE=1 LIBPERL=libperl.so.5.34.0 LINKTYPE=dynamic
909946 ? ZN 0:00 [true] <defunct>
3933716 ? SNs 0:02 python3 /home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/bin/bitbake-worker decafbad
3934003 ? SN 0:00 /bin/sh /home/pokybuild/yocto-worker/oe-selftest-centos/build/build-st-3687560/tmp/work/core2-64-poky-linux/perl/5.34.0-r0/temp/run.do_compile.3933716
3934023 ? SN 0:00 make -j 16 -l 52
4035804 ? SN 0:00 make perl nonxs_ext utilities extensions pods

so the true exit code was never looked at by make?
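
When collecting this kind of evidence before a worker is cleaned up, the defunct children can be picked out of the ps listing mechanically. A minimal Python sketch, assuming the five-column `PID TTY STAT TIME COMMAND` layout shown above (the function name is hypothetical):

```python
from typing import List, Tuple


def find_zombies(ps_output: str) -> List[Tuple[int, str]]:
    """Return (pid, command) for each line whose STAT field starts with 'Z'."""
    zombies = []
    for line in ps_output.splitlines():
        # Columns: PID TTY STAT TIME COMMAND (the command may contain spaces).
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # header line, blank line or unrelated text
        pid, _tty, stat, _time, command = fields
        if stat.startswith("Z"):
            zombies.append((int(pid), command))
    return zombies
```

Fed the listing above, this reports PID 909946 (`[true] <defunct>`), which is the child that the make at PID 909354 has not reaped.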

I think it was just starting to run wic tests (wic.Wic, not wic.Wic2).

Cheers,

Richard


Re: Swatbot

Larson, Chris
 

Yes, please, I’d appreciate that.

--
Christopher “kergoth” Larson
chris_larson@..., chris.larson@..., kergoth@...
Founder - BitBake, OpenEmbedded, OpenZaurus
Senior Software Engineer, Mentor Graphics, a Siemens Business
On Oct 22, 2021, 10:53 AM -0700, Alexandre Belloni <alexandre.belloni@...>, wrote:

Hello,

On 22/10/2021 17:37:34+0000, Larson, Chris wrote:
I’m starting on triage for swat now, it’ll actually be my first time
doing it, so bear with me. I’m awaiting a response to a password reset
attempt on swatbot at the moment (though honestly not sure I ever had
an account... do I need to register? 😊).


I sent your account password on Tue, 8 Jun 2021 21:56:26 +0200, I can
resend if you need.


--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com






Re: Swatbot

Alexandre Belloni
 

Hello,

On 22/10/2021 17:37:34+0000, Larson, Chris wrote:
I’m starting on triage for swat now, it’ll actually be my first time
doing it, so bear with me. I’m awaiting a response to a password reset
attempt on swatbot at the moment (though honestly not sure I ever had
an account... do I need to register? 😊).

I sent your account password on Tue, 8 Jun 2021 21:56:26 +0200, I can
resend if you need.


--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Swatbot

Larson, Chris
 

I’m starting on triage for swat now, it’ll actually be my first time doing it, so bear with me. I’m awaiting a response to a password reset attempt on swatbot at the moment (though honestly not sure I ever had an account... do I need to register? 😊).

 

-Chris


Re: Intermittent network issue on centos7-ty-4

Richard Purdie
 

On Tue, 2021-10-19 at 13:24 +0100, Richard Purdie via lists.yoctoproject.org
wrote:
On Tue, 2021-10-19 at 12:43 +0100, Richard Purdie via lists.yoctoproject.org
wrote:
Yesterday there was some talk in IRC about a brief networking failure on
centos7-ty-4. The build was:

https://autobuilder.yoctoproject.org/typhoon/#/builders/104/builds/3083

Showing these warnings:

stdio: WARNING: gtk+3-3.24.30-r0 do_package: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known
stdio: WARNING: gtk+3-3.24.30-r0 do_packagedata: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known

The worry is why DNS was suddenly not working. Looking at the log:

/home/pokybuild/yocto-worker/genericx86-64-alt/build/build-renamed/tmp/work/core2-64-poky-linux/gtk+3/3.24.30-r0/temp/log.do_package.41319

The warning is near the end of that and the log timestamp is Oct 18 17:11 (i.e.
when it was last written to).

We didn't spot anything yesterday but I realised the systemd journalctl
timestamps are an hour different from the timestamp I displayed from the
filesystem, thanks to daylight saving. Looking at journalctl with an hour's
offset, I see:

Oct 18 16:00:03 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Started Session 483 of user root.
Oct 18 16:01:03 centos7-ty-4.yocto.io CROND[25699]: (root) CMD (run-parts /etc/cron.hourly)
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25822]: starting 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25923]: finished 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPREQUEST on enp4s0f0 to 172.29.10.1 port 67 (xid=0x1579bf79)
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPACK from 172.29.10.1 (xid=0x1579bf79)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2367] dhcp4 (enp4s0f0): address 172.29.10.2
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): plen 24 (255.255.255.0)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): gateway 172.29.10.1
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): lease time 7200
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): hostname 'yct-ab02'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): nameserver '172.29.10.1'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): domain name 'yocto.io'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp (enp4s0f0): domain search 'pdx.yoctoproject.org.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp (enp4s0f0): domain search 'yocto.io.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp4 (enp4s0f0): state changed bound -> bound
Oct 18 16:05:18 centos7-ty-4.yocto.io dhclient[3905]: bound to 172.29.10.2 -- renewal in 3024 seconds.
Oct 18 16:05:18 centos7-ty-4.yocto.io dbus[1414]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-d
Oct 18 16:05:18 centos7-ty-4.yocto.io systemd[1]: Starting Network Manager Script Dispatcher Service...
Oct 18 16:05:19 centos7-ty-4.yocto.io dbus[1414]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Oct 18 16:05:19 centos7-ty-4.yocto.io systemd[1]: Started Network Manager Script Dispatcher Service.
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: new request (4 scripts)
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: start running ordered scripts...
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Started Session 484 of user root.
Oct 18 16:10:05 centos7-ty-4.yocto.io CROND[26785]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct 18 16:10:08 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.

i.e. it renewed the DHCP lease about that time and ran network manager scripts. 
This looks a bit like a smoking gun to me.

Not sure what to do about it but I think we may know where the networking
glitches come from.

Checking this again, I think the timestamps are not an hour different and I was
misreading some tests I made. The do_install task finished at 17:07 and the
do_package finished by 17:11, so we have a 4 minute window which does not match
the last DHCP refresh before the error, at 16:55. It was a nice idea :/.

Not sure if this means anything or not (run on centos7-ty-4 for around 10 mins):

-bash-4.2$ while true; do ping typhoon.yocto.io -c 1 > /dev/null || date; sleep 0.1; done
Tue 19 Oct 12:53:35 GMT 2021
Tue 19 Oct 12:54:32 GMT 2021
Tue 19 Oct 12:55:39 GMT 2021
Tue 19 Oct 12:57:01 GMT 2021
Tue 19 Oct 12:59:54 GMT 2021
Tue 19 Oct 13:00:46 GMT 2021
Tue 19 Oct 13:02:54 GMT 2021
Tue 19 Oct 13:03:11 GMT 2021
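
One limitation of the ping loop is that it cannot tell a failed name lookup from a lost packet. A rough Python sketch that only exercises name resolution, and reports the resolver error in the same form as the hash equivalence warning, might look like this. The function names and the injectable `resolve` parameter are assumptions made for illustration (the latter so the check can be exercised without a network):

```python
import socket
import time
from typing import Callable, Optional


def check_dns(host: str,
              resolve: Callable[..., object] = socket.getaddrinfo) -> Optional[str]:
    """Return None if `host` resolves, else a short description of the error."""
    try:
        resolve(host, None)
    except socket.gaierror as exc:
        return f"[Errno {exc.errno}] {exc.strerror}"
    return None


def watch(host: str, interval: float = 0.1, rounds: int = 6000) -> None:
    """Print a timestamp whenever resolving `host` fails."""
    for _ in range(rounds):
        error = check_dns(host)
        if error is not None:
            print(time.strftime("%a %d %b %H:%M:%S %Z %Y"), error)
        time.sleep(interval)
```

Running `watch("typhoon.yocto.io")` alongside the ping loop would show whether the failure times line up with resolver errors or only with ping failures. Any local caching of lookups could hide short outages, so a quiet run would not prove much.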

Cheers,

Richard


Re: Intermittent network issue on centos7-ty-4

Richard Purdie
 

On Tue, 2021-10-19 at 12:43 +0100, Richard Purdie via lists.yoctoproject.org
wrote:
Yesterday there was some talk in IRC about a brief networking failure on
centos7-ty-4. The build was:

https://autobuilder.yoctoproject.org/typhoon/#/builders/104/builds/3083

Showing these warnings:

stdio: WARNING: gtk+3-3.24.30-r0 do_package: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known
stdio: WARNING: gtk+3-3.24.30-r0 do_packagedata: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known

The worry is why DNS was suddenly not working. Looking at the log:

/home/pokybuild/yocto-worker/genericx86-64-alt/build/build-renamed/tmp/work/core2-64-poky-linux/gtk+3/3.24.30-r0/temp/log.do_package.41319

The warning is near the end of that and the log timestamp is Oct 18 17:11 (i.e.
when it was last written to).

We didn't spot anything yesterday but I realised the systemd journalctl
timestamps are an hour different from the timestamp I displayed from the
filesystem, thanks to daylight saving. Looking at journalctl with an hour's
offset, I see:

Oct 18 16:00:03 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Started Session 483 of user root.
Oct 18 16:01:03 centos7-ty-4.yocto.io CROND[25699]: (root) CMD (run-parts /etc/cron.hourly)
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25822]: starting 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25923]: finished 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPREQUEST on enp4s0f0 to 172.29.10.1 port 67 (xid=0x1579bf79)
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPACK from 172.29.10.1 (xid=0x1579bf79)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2367] dhcp4 (enp4s0f0): address 172.29.10.2
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): plen 24 (255.255.255.0)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): gateway 172.29.10.1
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): lease time 7200
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): hostname 'yct-ab02'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): nameserver '172.29.10.1'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): domain name 'yocto.io'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp (enp4s0f0): domain search 'pdx.yoctoproject.org.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp (enp4s0f0): domain search 'yocto.io.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp4 (enp4s0f0): state changed bound -> bound
Oct 18 16:05:18 centos7-ty-4.yocto.io dhclient[3905]: bound to 172.29.10.2 -- renewal in 3024 seconds.
Oct 18 16:05:18 centos7-ty-4.yocto.io dbus[1414]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-d
Oct 18 16:05:18 centos7-ty-4.yocto.io systemd[1]: Starting Network Manager Script Dispatcher Service...
Oct 18 16:05:19 centos7-ty-4.yocto.io dbus[1414]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Oct 18 16:05:19 centos7-ty-4.yocto.io systemd[1]: Started Network Manager Script Dispatcher Service.
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: new request (4 scripts)
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: start running ordered scripts...
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Started Session 484 of user root.
Oct 18 16:10:05 centos7-ty-4.yocto.io CROND[26785]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct 18 16:10:08 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.

i.e. it renewed the DHCP lease about that time and ran network manager scripts. 
This looks a bit like a smoking gun to me.

Not sure what to do about it but I think we may know where the networking
glitches come from.

Checking this again, I think the timestamps are not an hour different and I was
misreading some tests I made. The do_install task finished at 17:07 and the
do_package finished by 17:11, so we have a 4 minute window which does not match
the last DHCP refresh before the error, at 16:55. It was a nice idea :/.
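
The arithmetic behind that conclusion can be checked with a few lines of Python. The times are read off the journal excerpt and the task times quoted above, with seconds assumed to be zero where only minutes were given. The helper names are hypothetical.

```python
from datetime import datetime, timedelta


def next_renewal(bound_at: datetime, renewal_seconds: int) -> datetime:
    """Time at which dhclient said it would next renew the lease."""
    return bound_at + timedelta(seconds=renewal_seconds)


def in_window(moment: datetime, start: datetime, end: datetime) -> bool:
    """True if `moment` lies within [start, end]."""
    return start <= moment <= end


# "bound to 172.29.10.2 -- renewal in 3024 seconds." logged at 16:05:18
renewal = next_renewal(datetime(2021, 10, 18, 16, 5, 18), 3024)

# do_install finished at 17:07, do_package finished by 17:11
window_start = datetime(2021, 10, 18, 17, 7)
window_end = datetime(2021, 10, 18, 17, 11)

print(renewal)                                       # 2021-10-18 16:55:42
print(in_window(renewal, window_start, window_end))  # False
```

So the renewal scheduled from the 16:05 lease lands at 16:55:42, more than eleven minutes before the window in which the warning was logged.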

Cheers,

Richard


Intermittent network issue on centos7-ty-4

Richard Purdie
 

Yesterday there was some talk in IRC about a brief networking failure on
centos7-ty-4. The build was:

https://autobuilder.yoctoproject.org/typhoon/#/builders/104/builds/3083

Showing these warnings:

stdio: WARNING: gtk+3-3.24.30-r0 do_package: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known
stdio: WARNING: gtk+3-3.24.30-r0 do_packagedata: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known

The worry is why DNS was suddenly not working. Looking at the log:

/home/pokybuild/yocto-worker/genericx86-64-alt/build/build-renamed/tmp/work/core2-64-poky-linux/gtk+3/3.24.30-r0/temp/log.do_package.41319

The warning is near the end of that and the log timestamp is Oct 18 17:11 (i.e.
when it was last written to).

We didn't spot anything yesterday but I realised the systemd journalctl
timestamps are an hour different from the timestamp I displayed from the
filesystem, thanks to daylight saving. Looking at journalctl with an hour's
offset, I see:

Oct 18 16:00:03 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Started Session 483 of user root.
Oct 18 16:01:03 centos7-ty-4.yocto.io CROND[25699]: (root) CMD (run-parts /etc/cron.hourly)
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25822]: starting 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25923]: finished 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPREQUEST on enp4s0f0 to 172.29.10.1 port 67 (xid=0x1579bf79)
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPACK from 172.29.10.1 (xid=0x1579bf79)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2367] dhcp4 (enp4s0f0): address 172.29.10.2
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): plen 24 (255.255.255.0)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): gateway 172.29.10.1
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): lease time 7200
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): hostname 'yct-ab02'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): nameserver '172.29.10.1'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): domain name 'yocto.io'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp (enp4s0f0): domain search 'pdx.yoctoproject.org.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp (enp4s0f0): domain search 'yocto.io.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp4 (enp4s0f0): state changed bound -> bound
Oct 18 16:05:18 centos7-ty-4.yocto.io dhclient[3905]: bound to 172.29.10.2 -- renewal in 3024 seconds.
Oct 18 16:05:18 centos7-ty-4.yocto.io dbus[1414]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-d
Oct 18 16:05:18 centos7-ty-4.yocto.io systemd[1]: Starting Network Manager Script Dispatcher Service...
Oct 18 16:05:19 centos7-ty-4.yocto.io dbus[1414]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Oct 18 16:05:19 centos7-ty-4.yocto.io systemd[1]: Started Network Manager Script Dispatcher Service.
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: new request (4 scripts)
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: start running ordered scripts...
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Started Session 484 of user root.
Oct 18 16:10:05 centos7-ty-4.yocto.io CROND[26785]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct 18 16:10:08 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.

i.e. it renewed the DHCP lease about that time and ran network manager scripts. 
This looks a bit like a smoking gun to me.

Not sure what to do about it but I think we may know where the networking
glitches come from.

Cheers,

Richard


Re: SWAT Rotation schedule

Paul Eggleton
 

Yep, haven't got very far yet but I'm on it 😊

Cheers
Paul

-----Original Message-----
From: swat@... <swat@...> On Behalf Of Alexandre Belloni
Sent: Tuesday, 19 October 2021 9:06 am
To: Paul Eggleton <Paul.Eggleton@...>
Subject: SWAT Rotation schedule

Hello Paul,

A quick reminder that you are on SWAT duty but I see you started triaging yesterday ;)

Valerii, I've added you in the schedule following a discussion with
Oleksiy:

                                ┌──────┬────────────┐
                                │ Week │   Start    │
┌───────────────────────────────┼──────┼────────────┤
│ Alejandro Hernandez Samaniego │  37  │ 17/09/2021 │
│ Oleksiy Obitotskyy            │  38  │ 24/09/2021 │
│ Naveen Saini                  │  39  │ 01/10/2021 │
│ Thomas Perrot                 │  40  │ 08/10/2021 │
│ Paul Eggleton                 │  41  │ 15/10/2021 │
│ Christopher Larson            │  42  │ 22/10/2021 │
│ Jon Mason                     │  43  │ 29/10/2021 │
│ Lee Chee Yang                 │  44  │ 05/11/2021 │
│ Minjae Kim                    │  45  │ 12/11/2021 │
│ Jaga                          │  46  │ 19/11/2021 │
│ Leo Sandoval                  │  47  │ 26/11/2021 │
│ Ross Burton                   │  48  │ 03/12/2021 │
│ Valerii Chernous              │  49  │ 10/12/2021 │
│ Anibal Limon                  │  50  │ 17/12/2021 │
│ Saul Wold                     │  51  │ 24/12/2021 │
└───────────────────────────────┴──────┴────────────┘

Please check the table and let me know if you are not available for the selected week. SWAT duty will be from Friday to Thursday and the goal is to triage all the failures on swatbot before the weekly triage call happening at 2:30pm UTC.

Thanks!

--
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Hung meta-arm build

Richard Purdie
 

https://autobuilder.yoctoproject.org/typhoon/#/builders/113/builds/1579

On the worker (opensuse152-ty-1):

pstree -p shows:

├─Cooker(30044)─┬─Worker(5328)─┬─nativesdk-perl:(44339)───run.do_compile.(47448)───make(47505)───make(41311)─┬─make(47131)───true(387)
│ │ │ └─make(48106)───true(48776)
│ │ └─{Worker}(5476)
│ ├─pseudo(19403)───Worker (Fakeroo(19419)───{Worker (Fakeroo}(20828)
│ └─{Cooker}(34065)

and the state of the individual processes:

pokybuild@opensuse152-ty-1:~> ps ax | grep XXX
387 ? ZN 0:00 [true] <defunct>
48776 ? ZN 0:00 [true] <defunct>
47131 ? SN 0:00 make -C cpan/Unicode-Collate/ all PERL_CORE=1 LIBPERL=libperl.so.5.34.0 LINKTYPE=dynamic
48106 ? SN 0:00 make -C ext/I18N-Langinfo/ all PERL_CORE=1 LIBPERL=libperl.so.5.34.0 LINKTYPE=dynamic
47505 ? SN 0:00 make -j 16 -l 52
47448 ? SN 0:00 /bin/sh /home/pokybuild/yocto-worker/meta-arm/build/build/tmp/work/i686-nativesdk-pokysdk-linux/nativesdk-perl/5.34.0-r0/temp/run.do_compile.44339

so it looks like a make deadlock? :/

Cheers,

Richard


bitbake-worker failure traceback

Richard Purdie
 

I saw:

https://autobuilder.yoctoproject.org/typhoon/#/builders/108/builds/2167

failed and logged in to grab the cookerdaemon log whilst it was still there.
In the log there was this traceback:

Traceback (most recent call last):
  File "/home/pokybuild/yocto-worker/qemux86-world-alt/build/bitbake/bin/bitbake-worker", line 513, in <module>
    worker.serve()
  File "/home/pokybuild/yocto-worker/qemux86-world-alt/build/bitbake/bin/bitbake-worker", line 402, in serve
    self.handle_item(b"newtaskhashes", self.handle_newtaskhashes)
  File "/home/pokybuild/yocto-worker/qemux86-world-alt/build/bitbake/bin/bitbake-worker", line 420, in handle_item
    func(self.queue[(len(item) + 2):index])
  File "/home/pokybuild/yocto-worker/qemux86-world-alt/build/bitbake/bin/bitbake-worker", line 444, in handle_newtaskhashes
    self.workerdata["newhashes"] = pickle.loads(data)
_pickle.UnpicklingError: invalid load key, '\x00'.
invalid load key, '\x00'./home/pokybuild/yocto-worker/qemux86-world-alt/build/bitbake/lib/bb/runqueue.py:1329: ResourceWarning: unclosed file <_io.BufferedWriter name=131>
  self.worker = {}

Not sure why that could happen but it is helpful to have the data and we could
maybe improve that exception handling to dump the invalid data so the next time
it happens we get further info.
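
As a sketch of what "dump the invalid data" could look like, here is a small standalone Python helper that wraps pickle.loads and, on failure, re-raises with the length and a hex preview of the payload. This is an illustration only: the function name and the 64-byte preview limit are made up, and it is not the actual bitbake-worker code or a proposed patch.

```python
import pickle


def loads_or_dump(data: bytes, max_bytes: int = 64):
    """Unpickle `data`; on failure, re-raise with a hex preview of the bytes."""
    try:
        return pickle.loads(data)
    except pickle.UnpicklingError as exc:
        # Only UnpicklingError is handled here; truncated input can also
        # raise other exceptions such as EOFError.
        shown = min(len(data), max_bytes)
        preview = data[:max_bytes].hex(" ")
        raise pickle.UnpicklingError(
            f"{exc} (length {len(data)}, first {shown} bytes: {preview})"
        ) from exc
```

A payload starting with a NUL byte, as in the traceback above, would then produce an error that includes the leading bytes, which might show whether the data was truncated, misaligned in the queue, or something else entirely.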

Cheers,

Richard


Re: musl-qemux86: do_compile_kernelmodules() failed

Michael Halstead <mhalstead@...>
 

Yes, after the tmp directories were filled and we had inode exhaustion I changed the tmpfiles.d configuration to delete tmp files immediately. I expected this to happen on boot only but it also happens once every day. I've changed settings to only remove files 3 days old or older. No builds last over 3 days so I expect this will fix the problem. 
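
To see which files an age-based cleanup like this would touch, a rough Python check is sketched below. It assumes a file counts as "old" only when its newest timestamp (access, modification or status change) is past the cutoff. That matches my understanding of how age is normally evaluated for such cleanups, but treat it as an assumption and check the tmpfiles.d documentation for the exact rules. The function name is hypothetical.

```python
import os
import time
from typing import Optional

SECONDS_PER_DAY = 86400


def is_older_than(path: str, days: float, now: Optional[float] = None) -> bool:
    """True if every timestamp of `path` is at least `days` days in the past."""
    st = os.stat(path)
    newest = max(st.st_atime, st.st_mtime, st.st_ctime)
    if now is None:
        now = time.time()
    return now - newest >= days * SECONDS_PER_DAY
```

With a 3 day cutoff, a compiler temporary such as the /tmp/ccQzs5nT.s from the failing build would only become eligible long after the build that created it had finished.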


On Mon, Oct 4, 2021 at 5:24 AM Richard Purdie <richard.purdie@...> wrote:
Yes, it seems we have a problem on CentOS 8 where files in /tmp are being
removed unexpectedly for unknown reasons and causing a variety of build
failures.

Michael: Were there any recent changes in the updates to the CentOS 8 system
that would explain this?

Cheers,

Richard


On Mon, 2021-10-04 at 09:00 +0000, Naveen Saini wrote:
> It seems CentOS specific !
>  
> Same kernelmodules() failure during reproducible-centos:
> https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/766
>  
> And oe-selftest-centos failures too only on centos:
> https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2644
>  
> Regards,
> Naveen
>  
>  
> From: swat@... <swat@...>On Behalf Of
> Naveen Saini
> Sent: Monday, October 4, 2021 4:44 PM
> To: swat@...
> Cc: Bruce Ashfield <bruce.ashfield@...>
> Subject: [swat] musl-qemux86: do_compile_kernelmodules() failed
>  
> Hi Richard,
>  
> On master-next, do_compile_kernelmodules() is failing (with musl).
>  
> https://autobuilder.yoctoproject.org/typhoon/#/builders/64/builds/4099
>  
> Error log:
>  
> | Assembler messages:
> | Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory
> | make[3]: *** [/home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work-shared/qemux86/kernel-
> source/scripts/Makefile.build:271: drivers/net/tun.o] Error 1
> | make[2]: *** [/home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work-shared/qemux86/kernel-
> source/scripts/Makefile.build:514: drivers/net] Error 2
> | make[1]: *** [/home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:1858:
> drivers] Error 2
> | make: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-
> shared/qemux86/kernel-source/Makefile:220: __sub-make] Error 2
> | make: Leaving directory '/home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/linux-qemux86-standard-build'
> | ERROR: oe_runmake failed
> | WARNING: /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957:220 exit 1 from 'exit 1'
> | WARNING: Backtrace (BB generated script):
> |            #1: bbfatal_log, /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957, line 220
> |            #2: die, /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957, line 204
> |            #3: oe_runmake, /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957, line 199
> |            #4: do_compile_kernelmodules, /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957, line 170
> |            #5: main, /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957, line 223
> NOTE: recipe linux-yocto-5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0:task
> do_compile_kernelmodules: Failed
> ERROR: Task (/home/pokybuild/yocto-worker/musl-qemux86/build/meta/recipes-
> kernel/linux/linux-yocto_5.14.bb:do_compile_kernelmodules) failed with exit
> code '1'
>  
> Regards,
> Naveen




--
Michael Halstead
Linux Foundation / Yocto Project
Systems Operations Engineer


centos8 failure summary

Richard Purdie
 

Hi,

I tried to summarise all of the "/tmp" file issues we've seen. We have 4 of
them, 3 on centos8-ty-2 and one on centos8-ty-1. They were seen once in Python
code, twice in the assembler while building kernel modules, and once in
autoreconf (autom4te) over an m4 file.

No idea what is going on :(

Cheers,

Richard


centos8-ty-1 oe-selftest pseudo.Pseudo.test_pseudo_pyc_creation

https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2644/steps/14/logs/stdio

NOTE: Running task 7 of 8 (/tmp/devtoolqa7t333xgr/core-copy/meta/recipes-devtools/cdrtools/cdrtools-native_3.01.bb:do_install)
ERROR: Traceback (most recent call last):
File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/bin/bitbake-worker", line 253, in child
the_data = bb_cache.loadDataFull(fn, appends)
File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/cache.py", line 332, in loadDataFull
bb_data = self.load_bbfile(virtualfn, appends, virtonly=True)
File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/cache.py", line 345, in load_bbfile
datastores = parse_recipe(bb_data, bbfile, appends, mc)
File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/cache.py", line 308, in parse_recipe
bb_data = bb.parse.handle(bbfile, bb_data)
File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/parse/__init__.py", line 107, in handle
return h['handle'](fn, data, include)
File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/parse/parse_py/BBHandler.py", line 114, in handle
abs_fn = resolve_file(fn, d)
File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/parse/__init__.py", line 131, in resolve_file
raise IOError(errno.ENOENT, "file %s not found" % fn)
FileNotFoundError: [Errno 2] file /tmp/devtoolqa7t333xgr/core-copy/meta/recipes-devtools/cdrtools/cdrtools-native_3.01.bb not found
ERROR: Task (/tmp/devtoolqa7t333xgr/core-copy/meta/recipes-devtools/cdrtools/cdrtools-native_3.01.bb:do_install) failed with exit code '1'

centos8-ty-2 reproducible-centos

https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/766

/home/pokybuild/yocto-worker/reproducible-centos/build/meta/recipes-kernel/linux/linux-yocto_5.14.bb:do_compile_kernelmodules
/home/pokybuild/yocto-worker/reproducible-centos/build/meta/recipes-core/dbus/dbus_1.12.20.bb:do_configure

| CC [M] fs/nls/nls_cp737.o
| Assembler messages:
| Error: can't open /tmp/ccUONJmc.s for reading: No such file or directory
| make[3]: *** [/home/pokybuild/yocto-worker/reproducible-centos/build/build-st/reproducibleA/tmp/work-shared/qemux86-64/kernel-source/scripts/Makefile.build:271: fs/nls/nls_cp737.o] Error 1

| autoreconf: running: /home/pokybuild/yocto-worker/reproducible-centos/build/build-st/reproducibleA/tmp/work/core2-64-poky-linux/dbus/1.12.20-r0/recipe-sysroot-native/usr/bin/autoconf --include=/home/pokybuild/yocto-worker/reproducible-centos/build/build-st/reproducibleA/tmp/work/core2-64-poky-linux/dbus/1.12.20-r0/dbus-1.12.20/m4/ --include=/home/pokybuild/yocto-worker/reproducible-centos/build/build-st/reproducibleA/tmp/work/core2-64-poky-linux/dbus/1.12.20-r0/recipe-sysroot-native/usr/share/aclocal/ --force
| autom4te: error: cannot open /tmp/arGZ1Myy/am4tld4nPo/traces.m4: No such file or directory
| autoreconf: error: /home/pokybuild/yocto-worker/reproducible-centos/build/build-st/reproducibleA/tmp/work/core2-64-poky-linux/dbus/1.12.20-r0/recipe-sysroot-native/usr/bin/autoconf failed with exit status: 1

musl-qemux86 centos8-ty-2

https://autobuilder.yoctoproject.org/typhoon/#/builders/64/builds/4099/steps/11/logs/stdio

| Assembler messages:
| Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory
| make[3]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/scripts/Makefile.build:271: drivers/net/tun.o] Error 1
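
The failures quoted above all name a path under /tmp that had disappeared by the time a tool tried to open it. As a rough triage aid, the sketch below pulls such paths out of a log so that failures from different builders can be compared. It is only an illustration: the function name, the regular expression, and the two marker strings are assumptions based on the log excerpts in this message, not part of any autobuilder or swatbot tooling.

```python
import re

# Assumption: a missing temp file shows up as an absolute /tmp/... path on a
# line that also carries one of these phrases (taken from the logs above).
MISSING_MARKERS = ("No such file or directory", "not found")

# Match /tmp/... only when it starts a path, so build-tree paths such as
# .../build/tmp/work-shared/... are not picked up.
TMP_PATH = re.compile(r"(?<![\w./-])(/tmp/[^\s:']+)")


def missing_tmp_paths(log_text):
    """Return the /tmp paths named in 'missing file' lines, in first-seen order."""
    found = []
    for line in log_text.splitlines():
        if not any(marker in line for marker in MISSING_MARKERS):
            continue
        for path in TMP_PATH.findall(line):
            if path not in found:
                found.append(path)
    return found


if __name__ == "__main__":
    sample = (
        "| Assembler messages:\n"
        "| Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory\n"
        "| ERROR: oe_runmake failed\n"
    )
    for path in missing_tmp_paths(sample):
        print(path)
```

Running it over the stdio log of each failed step gives a short list of vanished files per build, which makes it easier to see whether the same kind of temporary file is affected each time.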


Re: musl-qemux86: do_compile_kernelmodules() failed

Richard Purdie
 

Yes, it seems we have a problem on Centos 8 where files in /tmp are being
removed unexpectedly for unknown reasons and causing a variety of build
failures.

Michael: Were there any recent changes in the updates to the Centos 8 system
that would explain this?

Cheers,

Richard

On Mon, 2021-10-04 at 09:00 +0000, Naveen Saini wrote:
It seems to be CentOS-specific!
 
Same kernelmodules() failure during reproducible-centos:
https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/766
 
There are oe-selftest-centos failures too, again only on CentOS:
https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2644
 
Regards,
Naveen
 
 
From: swat@... <swat@...> On Behalf Of Naveen Saini
Sent: Monday, October 4, 2021 4:44 PM
To: swat@...
Cc: Bruce Ashfield <bruce.ashfield@...>
Subject: [swat] musl-qemux86: do_compile_kernelmodules() failed
 
Hi Richard,
 
On master-next, do_compile_kernelmodules() is failing (with musl).
 
https://autobuilder.yoctoproject.org/typhoon/#/builders/64/builds/4099
 
Error log:
 
| Assembler messages:
| Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory
| make[3]: *** [/home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work-shared/qemux86/kernel-
source/scripts/Makefile.build:271: drivers/net/tun.o] Error 1
| make[2]: *** [/home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work-shared/qemux86/kernel-
source/scripts/Makefile.build:514: drivers/net] Error 2
| make[1]: *** [/home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:1858:
drivers] Error 2
| make: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-
shared/qemux86/kernel-source/Makefile:220: __sub-make] Error 2
| make: Leaving directory '/home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/linux-qemux86-standard-build'
| ERROR: oe_runmake failed
| WARNING: /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957:220 exit 1 from 'exit 1'
| WARNING: Backtrace (BB generated script):
|            #1: bbfatal_log, /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957, line 220
|            #2: die, /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957, line 204
|            #3: oe_runmake, /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957, line 199
|            #4: do_compile_kernelmodules, /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957, line 170
|            #5: main, /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957, line 223
NOTE: recipe linux-yocto-5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0:task
do_compile_kernelmodules: Failed
ERROR: Task (/home/pokybuild/yocto-worker/musl-qemux86/build/meta/recipes-
kernel/linux/linux-yocto_5.14.bb:do_compile_kernelmodules) failed with exit
code '1'
 
Regards,
Naveen


Re: musl-qemux86: do_compile_kernelmodules() failed

Naveen Saini
 

It seems to be CentOS-specific!

 

Same kernelmodules() failure during reproducible-centos:

https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/766

 

There are oe-selftest-centos failures too, again only on CentOS:

https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2644

 

Regards,

Naveen

 

 

From: swat@... <swat@...> On Behalf Of Naveen Saini
Sent: Monday, October 4, 2021 4:44 PM
To: swat@...
Cc: Bruce Ashfield <bruce.ashfield@...>
Subject: [swat] musl-qemux86: do_compile_kernelmodules() failed

 

Hi Richard,

 

On master-next, do_compile_kernelmodules() is failing (with musl).

 

https://autobuilder.yoctoproject.org/typhoon/#/builders/64/builds/4099

 

Error log:

 

| Assembler messages:

| Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory

| make[3]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/scripts/Makefile.build:271: drivers/net/tun.o] Error 1

| make[2]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/scripts/Makefile.build:514: drivers/net] Error 2

| make[1]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:1858: drivers] Error 2

| make: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:220: __sub-make] Error 2

| make: Leaving directory '/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/linux-qemux86-standard-build'

| ERROR: oe_runmake failed

| WARNING: /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957:220 exit 1 from 'exit 1'

| WARNING: Backtrace (BB generated script):

|            #1: bbfatal_log, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 220

|            #2: die, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 204

|            #3: oe_runmake, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 199

|            #4: do_compile_kernelmodules, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 170

|            #5: main, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 223

NOTE: recipe linux-yocto-5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0: task do_compile_kernelmodules: Failed

ERROR: Task (/home/pokybuild/yocto-worker/musl-qemux86/build/meta/recipes-kernel/linux/linux-yocto_5.14.bb:do_compile_kernelmodules) failed with exit code '1'

 

Regards,

Naveen


musl-qemux86: do_compile_kernelmodules() failed

Naveen Saini
 

Hi Richard,

 

On master-next, do_compile_kernelmodules() is failing (with musl).

 

https://autobuilder.yoctoproject.org/typhoon/#/builders/64/builds/4099

 

Error log:

 

| Assembler messages:

| Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory

| make[3]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/scripts/Makefile.build:271: drivers/net/tun.o] Error 1

| make[2]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/scripts/Makefile.build:514: drivers/net] Error 2

| make[1]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:1858: drivers] Error 2

| make: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:220: __sub-make] Error 2

| make: Leaving directory '/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/linux-qemux86-standard-build'

| ERROR: oe_runmake failed

| WARNING: /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957:220 exit 1 from 'exit 1'

| WARNING: Backtrace (BB generated script):

|            #1: bbfatal_log, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 220

|            #2: die, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 204

|            #3: oe_runmake, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 199

|            #4: do_compile_kernelmodules, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 170

|            #5: main, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 223

NOTE: recipe linux-yocto-5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0: task do_compile_kernelmodules: Failed

ERROR: Task (/home/pokybuild/yocto-worker/musl-qemux86/build/meta/recipes-kernel/linux/linux-yocto_5.14.bb:do_compile_kernelmodules) failed with exit code '1'

 

Regards,

Naveen


Re: The postinstall intercept hook failures on master-next

Naveen Saini
 

It is happening again on master now, for most builds:

https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/2685

 

Regards,

Naveen

 

From: swat@... <swat@...> On Behalf Of Naveen Saini
Sent: Friday, October 1, 2021 3:27 PM
To: swat@...
Subject: [swat] The postinstall intercept hook failures on master-next

 

Hi Richard,

 

I can see a number of postinstall intercept hook failures on master-next on oe-selftest builds on all distros and reproducible builds.

 

Error log:

do_rootfs: The postinstall intercept hook 'update_pixbuf_cache' failed

 

do_rootfs: The postinstall intercept hook 'update_font_cache' failed

 

do_rootfs: The postinstall intercept hook 'update_gio_module_cache' failed

 

do_rootfs: The postinstall intercept hook 'update_udev_hwdb' failed

 

https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/2678

 

https://autobuilder.yoctoproject.org/typhoon/#/builders/87/builds/2646

https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2636

https://autobuilder.yoctoproject.org/typhoon/#/builders/86/builds/2617

https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/2611

 

I already raised a bug for 'update_udev_hwdb':

https://bugzilla.yoctoproject.org/show_bug.cgi?id=14583

 

I cannot see any more logs to dig deeper. Should I raise bugs for all of them?

 

Regards,

Naveen

 

 


Re: YP Autobuilder/SwatBoart Enhancement

Alexandre Belloni
 

On 01/10/2021 12:52:46+0000, Naveen Saini wrote:
Ahh..right !! Thanks.

But still I think, in case of build cancellation, this provision can make debugging easier. If still this does not make much sense, I can close this bug.

I agree, this would make it easier, and I'm not dismissing the idea, but
it is not simple to implement in the current architecture. Let's leave
it open and discuss on Thursday.

Regards,
Naveen

-----Original Message-----
From: swat@... <swat@...> On Behalf
Of Alexandre Belloni
Sent: Friday, October 1, 2021 6:46 PM
To: swat@...
Subject: Re: [swat] YP Autobuilder/SwatBoart Enhancement

Hello,

On 01/10/2021 09:02:07+0000, Naveen Saini wrote:
Hi Team,

I created a bug#14584 to enhance Autobuilder/SwatBoat process.
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14584

Description:
Currently, when a user cancels a build, there is no provision to enter the
reason why it was cancelled. If there is no need to triage such cancelled
builds, then allow marking them 'Not for SWAT'.

Sometimes it becomes difficult to analyse the build failures when the
master-next branch is force-rebased (to drop a conflicting patch). In that
case, there is no way to know which commits were included in cancelled builds.
I explained on
https://wiki.yoctoproject.org/wiki/Yocto_Build_Failure_Swat_Team how to
get that:

$ git clone git://git.yoctoproject.org/poky
$ cd poky
$ git fetch origin 47482eff9897ccde946e9247724babc3a586d318
$ git log FETCH_HEAD

This should work unless the git garbage collector ran.
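
For anyone who needs to do this for several cancelled builds, the same steps can be wrapped in a small script. The following is only a sketch: the function names and the default "poky" working directory are hypothetical, and `git -C <dir>` is used in place of the separate `cd` step. As noted above, the fetch can still fail if the commit has already been garbage collected on the server.

```python
import subprocess


def commit_log_commands(repo_url, sha, workdir="poky"):
    """Build the git commands for the steps above as argv lists."""
    return [
        ["git", "clone", repo_url, workdir],
        # Fetch the exact commit the cancelled build used, even if no
        # branch points at it any more.
        ["git", "-C", workdir, "fetch", "origin", sha],
        ["git", "-C", workdir, "log", "FETCH_HEAD"],
    ]


def show_commit_log(repo_url, sha, workdir="poky"):
    """Run the commands in order, stopping at the first failure."""
    for argv in commit_log_commands(repo_url, sha, workdir):
        subprocess.run(argv, check=True)
```

For example, `show_commit_log("git://git.yoctoproject.org/poky", "47482eff9897ccde946e9247724babc3a586d318")` would repeat the commands quoted above.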


--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com







--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Re: YP Autobuilder/SwatBoart Enhancement

Naveen Saini
 

Ahh..right !! Thanks.

But still I think, in case of build cancellation, this provision can make debugging easier. If still this does not make much sense, I can close this bug.

Regards,
Naveen

-----Original Message-----
From: swat@... <swat@...> On Behalf
Of Alexandre Belloni
Sent: Friday, October 1, 2021 6:46 PM
To: swat@...
Subject: Re: [swat] YP Autobuilder/SwatBoart Enhancement

Hello,

On 01/10/2021 09:02:07+0000, Naveen Saini wrote:
Hi Team,

I created a bug#14584 to enhance Autobuilder/SwatBoat process.
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14584

Description:
Currently, when a user cancels a build, there is no provision to enter the
reason why it was cancelled. If there is no need to triage such cancelled
builds, then allow marking them 'Not for SWAT'.

Sometimes it becomes difficult to analyse the build failures when the
master-next branch is force-rebased (to drop a conflicting patch). In that
case, there is no way to know which commits were included in cancelled builds.
I explained on
https://wiki.yoctoproject.org/wiki/Yocto_Build_Failure_Swat_Team how to
get that:

$ git clone git://git.yoctoproject.org/poky
$ cd poky
$ git fetch origin 47482eff9897ccde946e9247724babc3a586d318
$ git log FETCH_HEAD

This should work unless the git garbage collector ran.


--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com