Re: Intermittent network issue on centos7-ty-4

Richard Purdie
 

On Tue, 2021-10-19 at 13:24 +0100, Richard Purdie via lists.yoctoproject.org
wrote:
On Tue, 2021-10-19 at 12:43 +0100, Richard Purdie via lists.yoctoproject.org
wrote:
Yesterday there was some talk on IRC about a brief networking failure on
centos7-ty-4. The build was:

https://autobuilder.yoctoproject.org/typhoon/#/builders/104/builds/3083

Showing these warnings:

stdio: WARNING: gtk+3-3.24.30-r0 do_package: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known
stdio: WARNING: gtk+3-3.24.30-r0 do_packagedata: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known

The worry being why was the DNS suddenly not working. Looking at the log:

/home/pokybuild/yocto-worker/genericx86-64-alt/build/build-renamed/tmp/work/core2-64-poky-linux/gtk+3/3.24.30-r0/temp/log.do_package.41319

The warning is near the end of that and the log timestamp is Oct 18 17:11 (i.e.
when it was last written to).

We didn't spot anything yesterday, but I realised the systemd journalctl
timestamps are an hour different from the timestamp I displayed from the
filesystem, thanks to daylight saving time. Looking at journalctl an hour
earlier, I see:

Oct 18 16:00:03 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Started Session 483 of user root.
Oct 18 16:01:03 centos7-ty-4.yocto.io CROND[25699]: (root) CMD (run-parts /etc/cron.hourly)
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25822]: starting 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25923]: finished 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPREQUEST on enp4s0f0 to 172.29.10.1 port 67 (xid=0x1579bf79)
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPACK from 172.29.10.1 (xid=0x1579bf79)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2367] dhcp4 (enp4s0f0): address 172.29.10.2
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): plen 24 (255.255.255.0)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): gateway 172.29.10.1
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): lease time 7200
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): hostname 'yct-ab02'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): nameserver '172.29.10.1'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): domain name 'yocto.io'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp (enp4s0f0): domain search 'pdx.yoctoproject.org.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp (enp4s0f0): domain search 'yocto.io.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp4 (enp4s0f0): state changed bound -> bound
Oct 18 16:05:18 centos7-ty-4.yocto.io dhclient[3905]: bound to 172.29.10.2 -- renewal in 3024 seconds.
Oct 18 16:05:18 centos7-ty-4.yocto.io dbus[1414]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-d
Oct 18 16:05:18 centos7-ty-4.yocto.io systemd[1]: Starting Network Manager Script Dispatcher Service...
Oct 18 16:05:19 centos7-ty-4.yocto.io dbus[1414]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Oct 18 16:05:19 centos7-ty-4.yocto.io systemd[1]: Started Network Manager Script Dispatcher Service.
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: new request (4 scripts)
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: start running ordered scripts...
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Started Session 484 of user root.
Oct 18 16:10:05 centos7-ty-4.yocto.io CROND[26785]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct 18 16:10:08 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.

i.e. it renewed the DHCP lease about that time and ran NetworkManager scripts.
This looks a bit like a smoking gun to me.

Not sure what to do about it but I think we may know where the networking
glitches come from.

Checking this again, I think the timestamps are not an hour different and I was
misreading some tests I made. The do_install task finished at 17:07 and
do_package finished by 17:11, so we have a 4-minute window, which does not match
the last DHCP refresh before the error, at 16:55. It was a nice idea :/.

Not sure if this means anything or not (run on centos7-ty-4 for around 10 mins):

-bash-4.2$ while true; do ping typhoon.yocto.io -c 1 > /dev/null || date; sleep 0.1; done
Tue 19 Oct 12:53:35 GMT 2021
Tue 19 Oct 12:54:32 GMT 2021
Tue 19 Oct 12:55:39 GMT 2021
Tue 19 Oct 12:57:01 GMT 2021
Tue 19 Oct 12:59:54 GMT 2021
Tue 19 Oct 13:00:46 GMT 2021
Tue 19 Oct 13:02:54 GMT 2021
Tue 19 Oct 13:03:11 GMT 2021
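The original warning was a name-resolution error rather than a ping timeout, so a probe that only exercises the resolver might separate DNS trouble from packet loss. A minimal sketch, assuming Python 3 is available on the worker; `probe_dns` and its parameters are hypothetical names for illustration, not part of any existing tooling:

```python
import socket
import time
from datetime import datetime, timezone


def probe_dns(host, port, attempts, interval=0.1,
              resolver=socket.getaddrinfo, sleep=time.sleep):
    """Resolve host repeatedly; return (UTC timestamp, error) for each failure."""
    failures = []
    for _ in range(attempts):
        try:
            resolver(host, port)
        except socket.gaierror as exc:
            # "[Errno -2] Name or service not known" surfaces as a gaierror
            failures.append((datetime.now(timezone.utc), exc))
        sleep(interval)
    return failures
```

Calling `probe_dns("typhoon.yocto.io", 8686, attempts=6000)` would cover roughly the same ten minutes as the ping loop. The resolver and sleep functions are parameters only so the loop can be exercised without touching the network.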

Cheers,

Richard


Re: Intermittent network issue on centos7-ty-4

Richard Purdie
 

On Tue, 2021-10-19 at 12:43 +0100, Richard Purdie via lists.yoctoproject.org
wrote:
Yesterday there was some talk on IRC about a brief networking failure on
centos7-ty-4. The build was:

https://autobuilder.yoctoproject.org/typhoon/#/builders/104/builds/3083

Showing these warnings:

stdio: WARNING: gtk+3-3.24.30-r0 do_package: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known
stdio: WARNING: gtk+3-3.24.30-r0 do_packagedata: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known

The worry being why was the DNS suddenly not working. Looking at the log:

/home/pokybuild/yocto-worker/genericx86-64-alt/build/build-renamed/tmp/work/core2-64-poky-linux/gtk+3/3.24.30-r0/temp/log.do_package.41319

The warning is near the end of that and the log timestamp is Oct 18 17:11 (i.e.
when it was last written to).

We didn't spot anything yesterday, but I realised the systemd journalctl
timestamps are an hour different from the timestamp I displayed from the
filesystem, thanks to daylight saving time. Looking at journalctl an hour
earlier, I see:

Oct 18 16:00:03 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Started Session 483 of user root.
Oct 18 16:01:03 centos7-ty-4.yocto.io CROND[25699]: (root) CMD (run-parts /etc/cron.hourly)
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25822]: starting 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25923]: finished 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPREQUEST on enp4s0f0 to 172.29.10.1 port 67 (xid=0x1579bf79)
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPACK from 172.29.10.1 (xid=0x1579bf79)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2367] dhcp4 (enp4s0f0): address 172.29.10.2
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): plen 24 (255.255.255.0)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): gateway 172.29.10.1
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): lease time 7200
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): hostname 'yct-ab02'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): nameserver '172.29.10.1'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): domain name 'yocto.io'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp (enp4s0f0): domain search 'pdx.yoctoproject.org.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp (enp4s0f0): domain search 'yocto.io.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp4 (enp4s0f0): state changed bound -> bound
Oct 18 16:05:18 centos7-ty-4.yocto.io dhclient[3905]: bound to 172.29.10.2 -- renewal in 3024 seconds.
Oct 18 16:05:18 centos7-ty-4.yocto.io dbus[1414]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-d
Oct 18 16:05:18 centos7-ty-4.yocto.io systemd[1]: Starting Network Manager Script Dispatcher Service...
Oct 18 16:05:19 centos7-ty-4.yocto.io dbus[1414]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Oct 18 16:05:19 centos7-ty-4.yocto.io systemd[1]: Started Network Manager Script Dispatcher Service.
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: new request (4 scripts)
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: start running ordered scripts...
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Started Session 484 of user root.
Oct 18 16:10:05 centos7-ty-4.yocto.io CROND[26785]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct 18 16:10:08 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.

i.e. it renewed the DHCP lease about that time and ran NetworkManager scripts.
This looks a bit like a smoking gun to me.

Not sure what to do about it but I think we may know where the networking
glitches come from.

Checking this again, I think the timestamps are not an hour different and I was
misreading some tests I made. The do_install task finished at 17:07 and
do_package finished by 17:11, so we have a 4-minute window, which does not match
the last DHCP refresh before the error, at 16:55. It was a nice idea :/.
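The 16:55 figure follows from the journal excerpt: the lease was bound at 16:05:18 with "renewal in 3024 seconds". A small sketch of that arithmetic, assuming the journal and the task times are on the same clock (which is the conclusion reached above):

```python
from datetime import datetime, timedelta

# Values copied from the journal excerpt and the task times quoted above
bound_at = datetime(2021, 10, 18, 16, 5, 18)   # "bound to 172.29.10.2"
renewal_in = timedelta(seconds=3024)           # "renewal in 3024 seconds"
window_start = datetime(2021, 10, 18, 17, 7)   # do_install finished
window_end = datetime(2021, 10, 18, 17, 11)    # do_package finished

next_renewal = bound_at + renewal_in
in_window = window_start <= next_renewal <= window_end

print(next_renewal.time(), in_window)  # 16:55:42 False
```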

Cheers,

Richard


Intermittent network issue on centos7-ty-4

Richard Purdie
 

Yesterday there was some talk on IRC about a brief networking failure on
centos7-ty-4. The build was:

https://autobuilder.yoctoproject.org/typhoon/#/builders/104/builds/3083

Showing these warnings:

stdio: WARNING: gtk+3-3.24.30-r0 do_package: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known
stdio: WARNING: gtk+3-3.24.30-r0 do_packagedata: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known

The worry being why was the DNS suddenly not working. Looking at the log:

/home/pokybuild/yocto-worker/genericx86-64-alt/build/build-renamed/tmp/work/core2-64-poky-linux/gtk+3/3.24.30-r0/temp/log.do_package.41319

The warning is near the end of that and the log timestamp is Oct 18 17:11 (i.e.
when it was last written to).

We didn't spot anything yesterday, but I realised the systemd journalctl
timestamps are an hour different from the timestamp I displayed from the
filesystem, thanks to daylight saving time. Looking at journalctl an hour
earlier, I see:

Oct 18 16:00:03 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Started Session 483 of user root.
Oct 18 16:01:03 centos7-ty-4.yocto.io CROND[25699]: (root) CMD (run-parts /etc/cron.hourly)
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25822]: starting 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25923]: finished 0anacron
Oct 18 16:01:04 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPREQUEST on enp4s0f0 to 172.29.10.1 port 67 (xid=0x1579bf79)
Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPACK from 172.29.10.1 (xid=0x1579bf79)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2367] dhcp4 (enp4s0f0): address 172.29.10.2
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): plen 24 (255.255.255.0)
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): gateway 172.29.10.1
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): lease time 7200
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): hostname 'yct-ab02'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): nameserver '172.29.10.1'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): domain name 'yocto.io'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp (enp4s0f0): domain search 'pdx.yoctoproject.org.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp (enp4s0f0): domain search 'yocto.io.'
Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp4 (enp4s0f0): state changed bound -> bound
Oct 18 16:05:18 centos7-ty-4.yocto.io dhclient[3905]: bound to 172.29.10.2 -- renewal in 3024 seconds.
Oct 18 16:05:18 centos7-ty-4.yocto.io dbus[1414]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-d
Oct 18 16:05:18 centos7-ty-4.yocto.io systemd[1]: Starting Network Manager Script Dispatcher Service...
Oct 18 16:05:19 centos7-ty-4.yocto.io dbus[1414]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Oct 18 16:05:19 centos7-ty-4.yocto.io systemd[1]: Started Network Manager Script Dispatcher Service.
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: new request (4 scripts)
Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: start running ordered scripts...
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root.
Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Started Session 484 of user root.
Oct 18 16:10:05 centos7-ty-4.yocto.io CROND[26785]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Oct 18 16:10:08 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root.

i.e. it renewed the DHCP lease about that time and ran NetworkManager scripts.
This looks a bit like a smoking gun to me.

Not sure what to do about it but I think we may know where the networking
glitches come from.
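The NetworkManager lines carry a Unix epoch in square brackets, which gives an independent check on which timezone the journal prefix is in. A small sketch: converting that epoch to UTC gives the same wall-clock time as the journal prefix on the same line, which suggests the journal here is displayed in UTC. It says nothing about how the filesystem timestamp was displayed.

```python
from datetime import datetime, timezone

# Epoch taken from "NetworkManager[1436]: <info> [1634573118.2367]"
nm_epoch = 1634573118
as_utc = datetime.fromtimestamp(nm_epoch, tz=timezone.utc)

print(as_utc.strftime("%b %d %H:%M:%S"))  # Oct 18 16:05:18
```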

Cheers,

Richard


Re: SWAT Rotation schedule

Paul Eggleton
 

Yep, haven't got very far yet but I'm on it 😊

Cheers
Paul

-----Original Message-----
From: swat@lists.yoctoproject.org <swat@lists.yoctoproject.org> On Behalf Of Alexandre Belloni
Sent: Tuesday, 19 October 2021 9:06 am
To: Paul Eggleton <Paul.Eggleton@microsoft.com>
Subject: SWAT Rotation schedule

Hello Paul,

A quick reminder that you are on SWAT duty but I see you started triaging yesterday ;)

Valerii, I've added you in the schedule following a discussion with
Oleksiy:

                                ┌──────┬────────────┐
                                │ Week │ Start      │
┌───────────────────────────────┼──────┼────────────┤
│ Alejandro Hernandez Samaniego │   37 │ 17/09/2021 │
│ Oleksiy Obitotskyy            │   38 │ 24/09/2021 │
│ Naveen Saini                  │   39 │ 01/10/2021 │
│ Thomas Perrot                 │   40 │ 08/10/2021 │
│ Paul Eggleton                 │   41 │ 15/10/2021 │
│ Christopher Larson            │   42 │ 22/10/2021 │
│ Jon Mason                     │   43 │ 29/10/2021 │
│ Lee Chee Yang                 │   44 │ 05/11/2021 │
│ Minjae Kim                    │   45 │ 12/11/2021 │
│ Jaga                          │   46 │ 19/11/2021 │
│ Leo Sandoval                  │   47 │ 26/11/2021 │
│ Ross Burton                   │   48 │ 03/12/2021 │
│ Valerii Chernous              │   49 │ 10/12/2021 │
│ Anibal Limon                  │   50 │ 17/12/2021 │
│ Saul Wold                     │   51 │ 24/12/2021 │
└───────────────────────────────┴──────┴────────────┘

Please check the table and let me know if you are not available for the selected week. SWAT duty will be from Friday to Thursday and the goal is to triage all the failures on swatbot before the weekly triage call happening at 2:30pm UTC.

Thanks!

--
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com/


Hung meta-arm build

Richard Purdie
 

https://autobuilder.yoctoproject.org/typhoon/#/builders/113/builds/1579

On the worker (opensuse152-ty-1):

pstree -p shows:

├─Cooker(30044)─┬─Worker(5328)─┬─nativesdk-perl:(44339)───run.do_compile.(47448)───make(47505)───make(41311)─┬─make(47131)───true(387)
│               │              │                                                                             └─make(48106)───true(48776)
│               │              └─{Worker}(5476)
│               ├─pseudo(19403)───Worker (Fakeroo(19419)───{Worker (Fakeroo}(20828)
│               └─{Cooker}(34065)

and the state of the individual processes:

pokybuild@opensuse152-ty-1:~> ps ax | grep XXX
387 ? ZN 0:00 [true] <defunct>
48776 ? ZN 0:00 [true] <defunct>
47131 ? SN 0:00 make -C cpan/Unicode-Collate/ all PERL_CORE=1 LIBPERL=libperl.so.5.34.0 LINKTYPE=dynamic
48106 ? SN 0:00 make -C ext/I18N-Langinfo/ all PERL_CORE=1 LIBPERL=libperl.so.5.34.0 LINKTYPE=dynamic
47505 ? SN 0:00 make -j 16 -l 52
47448 ? SN 0:00 /bin/sh /home/pokybuild/yocto-worker/meta-arm/build/build/tmp/work/i686-nativesdk-pokysdk-linux/nativesdk-perl/5.34.0-r0/temp/run.do_compile.44339

so it looks like a make deadlock? :/
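The two `true` processes are in state Z (defunct): they have exited, but the `make` that spawned each of them has not waited for it yet. If this needs checking across a whole process tree next time, something along these lines could help. This is a rough sketch assuming Linux `/proc`; the function names are made up for illustration.

```python
def parse_proc_stat(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state, ppid).

    comm may itself contain spaces and parentheses (e.g. "Worker (Fakeroo"),
    so locate it via the first "(" and the last ")".
    """
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left])
    comm = line[left + 1:right]
    state, ppid = line[right + 1:].split()[:2]
    return pid, comm, state, int(ppid)


def read_stat(pid):
    # Linux only: /proc/<pid>/stat is a single line
    with open(f"/proc/{pid}/stat") as f:
        return parse_proc_stat(f.read())


def unreaped_children(stats):
    """Map each parent pid to the zombie (state "Z") children it has not waited for."""
    result = {}
    for pid, _comm, state, ppid in stats:
        if state == "Z":
            result.setdefault(ppid, []).append(pid)
    return result
```

Feeding it `read_stat(pid)` for each pid of interest would show which parents are sitting on exited children.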

Cheers,

Richard


bitbake-worker failure traceback

Richard Purdie
 

I saw:

https://autobuilder.yoctoproject.org/typhoon/#/builders/108/builds/2167

failed and logged in to grab the cookerdaemon log whilst it was still there.
In the log there was this traceback:

Traceback (most recent call last):
  File "/home/pokybuild/yocto-worker/qemux86-world-alt/build/bitbake/bin/bitbake-worker", line 513, in <module>
    worker.serve()
  File "/home/pokybuild/yocto-worker/qemux86-world-alt/build/bitbake/bin/bitbake-worker", line 402, in serve
    self.handle_item(b"newtaskhashes", self.handle_newtaskhashes)
  File "/home/pokybuild/yocto-worker/qemux86-world-alt/build/bitbake/bin/bitbake-worker", line 420, in handle_item
    func(self.queue[(len(item) + 2):index])
  File "/home/pokybuild/yocto-worker/qemux86-world-alt/build/bitbake/bin/bitbake-worker", line 444, in handle_newtaskhashes
    self.workerdata["newhashes"] = pickle.loads(data)
_pickle.UnpicklingError: invalid load key, '\x00'.
invalid load key, '\x00'./home/pokybuild/yocto-worker/qemux86-world-alt/build/bitbake/lib/bb/runqueue.py:1329: ResourceWarning: unclosed file <_io.BufferedWriter name=131>
self.worker = {}

Not sure why that could happen but it is helpful to have the data and we could
maybe improve that exception handling to dump the invalid data so the next time
it happens we get further info.
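As a rough illustration of what dumping the invalid data could look like (this is not the bitbake-worker code; `loads_with_dump` and the logger name are hypothetical), the unpickle call could be wrapped so that a failure logs the length and a hex preview of the bytes before re-raising:

```python
import binascii
import logging
import pickle

logger = logging.getLogger("worker-debug")  # hypothetical logger name


def loads_with_dump(data, preview=64):
    """Unpickle data; on failure, log its length and a hex preview, then re-raise."""
    try:
        return pickle.loads(data)
    except pickle.UnpicklingError:
        head = binascii.hexlify(data[:preview]).decode("ascii")
        logger.error("unpickle failed: %d bytes, first %d as hex: %s",
                     len(data), min(preview, len(data)), head)
        raise
```

Only `UnpicklingError` is caught here because that is what the traceback shows; a truncated stream could raise other exceptions, which this sketch lets through untouched.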

Cheers,

Richard


Re: musl-qemux86: do_compile_kernelmodules() failed

Michael Halstead <mhalstead@...>
 

Yes, after the tmp directories filled up and we hit inode exhaustion, I changed the tmpfiles.d configuration to delete tmp files immediately. I expected this to happen only on boot, but it also happens once every day. I've changed the settings to remove only files that are 3 days old or older. No builds last over 3 days, so I expect this will fix the problem.
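To get a feel for what a 3-day age rule would touch on a given worker, a dry-run listing may be useful. The sketch below is only an approximation of the cleanup logic: it assumes a file is kept if any of its access, modification or status-change times is newer than the cutoff, and `stale_entries` is a hypothetical helper, not part of systemd.

```python
import os
import time


def stale_entries(root, max_age_days=3, now=None):
    """Yield files under root whose atime, mtime and ctime are all older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except FileNotFoundError:
                continue  # removed while we were walking
            # Assumption: any one recent timestamp keeps the file
            if max(st.st_atime, st.st_mtime, st.st_ctime) < cutoff:
                yield path
```

Running `list(stale_entries("/tmp"))` during a long build would show whether anything the build still needs falls under the rule.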


On Mon, Oct 4, 2021 at 5:24 AM Richard Purdie <richard.purdie@...> wrote:
Yes, it seems we have a problem on CentOS 8 where files in /tmp are being
removed unexpectedly for unknown reasons and causing a variety of build
failures.

Michael: Were there any recent changes in the updates to the CentOS 8 system
that would explain this?

Cheers,

Richard


On Mon, 2021-10-04 at 09:00 +0000, Naveen Saini wrote:
> It seems CentOS specific!
>
> Same kernelmodules() failure during reproducible-centos:
> https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/766
>
> And oe-selftest-centos failures too, only on CentOS:
> https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2644
>
> Regards,
> Naveen
>
>
> From: swat@... <swat@...> On Behalf Of
> Naveen Saini
> Sent: Monday, October 4, 2021 4:44 PM
> To: swat@...
> Cc: Bruce Ashfield <bruce.ashfield@...>
> Subject: [swat] musl-qemux86: do_compile_kernelmodules() failed
>
> Hi Richard,
>
> On master-next, do_compile_kernelmodules() is failing (with musl).
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/64/builds/4099
>
> Error log:
>
> | Assembler messages:
> | Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory
> | make[3]: *** [/home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work-shared/qemux86/kernel-
> source/scripts/Makefile.build:271: drivers/net/tun.o] Error 1
> | make[2]: *** [/home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work-shared/qemux86/kernel-
> source/scripts/Makefile.build:514: drivers/net] Error 2
> | make[1]: *** [/home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:1858:
> drivers] Error 2
> | make: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-
> shared/qemux86/kernel-source/Makefile:220: __sub-make] Error 2
> | make: Leaving directory '/home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/linux-qemux86-standard-build'
> | ERROR: oe_runmake failed
> | WARNING: /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957:220 exit 1 from 'exit 1'
> | WARNING: Backtrace (BB generated script):
> |            #1: bbfatal_log, /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957, line 220
> |            #2: die, /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957, line 204
> |            #3: oe_runmake, /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957, line 199
> |            #4: do_compile_kernelmodules, /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957, line 170
> |            #5: main, /home/pokybuild/yocto-worker/musl-
> qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
> yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
> r0/temp/run.do_compile_kernelmodules.1482957, line 223
> NOTE: recipe linux-yocto-5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0:task
> do_compile_kernelmodules: Failed
> ERROR: Task (/home/pokybuild/yocto-worker/musl-qemux86/build/meta/recipes-
> kernel/linux/linux-yocto_5.14.bb:do_compile_kernelmodules) failed with exit
> code '1'
>
> Regards,
> Naveen




--
Michael Halstead
Linux Foundation / Yocto Project
Systems Operations Engineer


centos8 failure summary

Richard Purdie
 

Hi,

I tried to summarise all of the "/tmp" file issues we've seen. We have 4 of
them, 3 on centos8-ty-2 and one on centos8-ty-1. They were seen once in Python
code, twice in the assembler during kernel module builds and once in automake
over an m4 file.

No idea what is going on :(

Cheers,

Richard


centos8-ty-1 oe-selftest pseudo.Pseudo.test_pseudo_pyc_creation

https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2644/steps/14/logs/stdio

NOTE: Running task 7 of 8 (/tmp/devtoolqa7t333xgr/core-copy/meta/recipes-devtools/cdrtools/cdrtools-native_3.01.bb:do_install)
ERROR: Traceback (most recent call last):
  File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/bin/bitbake-worker", line 253, in child
    the_data = bb_cache.loadDataFull(fn, appends)
  File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/cache.py", line 332, in loadDataFull
    bb_data = self.load_bbfile(virtualfn, appends, virtonly=True)
  File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/cache.py", line 345, in load_bbfile
    datastores = parse_recipe(bb_data, bbfile, appends, mc)
  File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/cache.py", line 308, in parse_recipe
    bb_data = bb.parse.handle(bbfile, bb_data)
  File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/parse/__init__.py", line 107, in handle
    return h['handle'](fn, data, include)
  File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/parse/parse_py/BBHandler.py", line 114, in handle
    abs_fn = resolve_file(fn, d)
  File "/home/pokybuild/yocto-worker/oe-selftest-centos/build/bitbake/lib/bb/parse/__init__.py", line 131, in resolve_file
    raise IOError(errno.ENOENT, "file %s not found" % fn)
FileNotFoundError: [Errno 2] file /tmp/devtoolqa7t333xgr/core-copy/meta/recipes-devtools/cdrtools/cdrtools-native_3.01.bb not found
ERROR: Task (/tmp/devtoolqa7t333xgr/core-copy/meta/recipes-devtools/cdrtools/cdrtools-native_3.01.bb:do_install) failed with exit code '1'

centos8-ty-2 reproducible-centos

https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/766

/home/pokybuild/yocto-worker/reproducible-centos/build/meta/recipes-kernel/linux/linux-yocto_5.14.bb:do_compile_kernelmodules
/home/pokybuild/yocto-worker/reproducible-centos/build/meta/recipes-core/dbus/dbus_1.12.20.bb:do_configure

| CC [M] fs/nls/nls_cp737.o
| Assembler messages:
| Error: can't open /tmp/ccUONJmc.s for reading: No such file or directory
| make[3]: *** [/home/pokybuild/yocto-worker/reproducible-centos/build/build-st/reproducibleA/tmp/work-shared/qemux86-64/kernel-source/scripts/Makefile.build:271: fs/nls/nls_cp737.o] Error 1

| autoreconf: running: /home/pokybuild/yocto-worker/reproducible-centos/build/build-st/reproducibleA/tmp/work/core2-64-poky-linux/dbus/1.12.20-r0/recipe-sysroot-native/usr/bin/autoconf --include=/home/pokybuild/yocto-worker/reproducible-centos/build/build-st/reproducibleA/tmp/work/core2-64-poky-linux/dbus/1.12.20-r0/dbus-1.12.20/m4/ --include=/home/pokybuild/yocto-worker/reproducible-centos/build/build-st/reproducibleA/tmp/work/core2-64-poky-linux/dbus/1.12.20-r0/recipe-sysroot-native/usr/share/aclocal/ --force
| autom4te: error: cannot open /tmp/arGZ1Myy/am4tld4nPo/traces.m4: No such file or directory
| autoreconf: error: /home/pokybuild/yocto-worker/reproducible-centos/build/build-st/reproducibleA/tmp/work/core2-64-poky-linux/dbus/1.12.20-r0/recipe-sysroot-native/usr/bin/autoconf failed with exit status: 1

musl-qemux86 centos8-ty-2

https://autobuilder.yoctoproject.org/typhoon/#/builders/64/builds/4099/steps/11/logs/stdio

| Assembler messages:
| Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory
| make[3]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/scripts/Makefile.build:271: drivers/net/tun.o] Error 1
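If more of these turn up, pulling the missing /tmp paths out of build logs automatically might make them easier to tally per worker. A rough sketch: the pattern below is only tuned to the three message shapes quoted in this summary, and `missing_tmp_paths` is a made-up helper name.

```python
import re

# Matches the "missing temp file" messages quoted in this summary
MISSING_TMP = re.compile(
    r"(?P<path>/tmp/[^\s:]+):?\s.*?(?:No such file or directory|not found)"
)


def missing_tmp_paths(lines):
    """Return the /tmp paths that log lines report as missing, in the order seen."""
    found = []
    for line in lines:
        m = MISSING_TMP.search(line)
        if m:
            found.append(m.group("path"))
    return found
```

Lines that merely mention a /tmp path, such as the "Running task" notes, are not picked up because they lack the error text.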


Re: musl-qemux86: do_compile_kernelmodules() failed

Richard Purdie
 

Yes, it seems we have a problem on CentOS 8 where files in /tmp are being
removed unexpectedly for unknown reasons and causing a variety of build
failures.

Michael: Were there any recent changes in the updates to the CentOS 8 system
that would explain this?

Cheers,

Richard

On Mon, 2021-10-04 at 09:00 +0000, Naveen Saini wrote:
It seems CentOS specific!

Same kernelmodules() failure during reproducible-centos:
https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/766

And oe-selftest-centos failures too, only on CentOS:
https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2644

Regards,
Naveen


From: swat@lists.yoctoproject.org <swat@lists.yoctoproject.org> On Behalf Of
Naveen Saini
Sent: Monday, October 4, 2021 4:44 PM
To: swat@lists.yoctoproject.org
Cc: Bruce Ashfield <bruce.ashfield@gmail.com>
Subject: [swat] musl-qemux86: do_compile_kernelmodules() failed

Hi Richard,

On master-next, do_compile_kernelmodules() is failing (with musl).

https://autobuilder.yoctoproject.org/typhoon/#/builders/64/builds/4099

Error log:

| Assembler messages:
| Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory
| make[3]: *** [/home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work-shared/qemux86/kernel-
source/scripts/Makefile.build:271: drivers/net/tun.o] Error 1
| make[2]: *** [/home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work-shared/qemux86/kernel-
source/scripts/Makefile.build:514: drivers/net] Error 2
| make[1]: *** [/home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:1858:
drivers] Error 2
| make: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-
shared/qemux86/kernel-source/Makefile:220: __sub-make] Error 2
| make: Leaving directory '/home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/linux-qemux86-standard-build'
| ERROR: oe_runmake failed
| WARNING: /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957:220 exit 1 from 'exit 1'
| WARNING: Backtrace (BB generated script):
|            #1: bbfatal_log, /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957, line 220
|            #2: die, /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957, line 204
|            #3: oe_runmake, /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957, line 199
|            #4: do_compile_kernelmodules, /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957, line 170
|            #5: main, /home/pokybuild/yocto-worker/musl-
qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-
yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-
r0/temp/run.do_compile_kernelmodules.1482957, line 223
NOTE: recipe linux-yocto-5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0:task
do_compile_kernelmodules: Failed
ERROR: Task (/home/pokybuild/yocto-worker/musl-qemux86/build/meta/recipes-
kernel/linux/linux-yocto_5.14.bb:do_compile_kernelmodules) failed with exit
code '1'

Regards,
Naveen


Re: musl-qemux86: do_compile_kernelmodules() failed

Naveen Saini
 

It seems to be CentOS-specific.

The same do_compile_kernelmodules() failure occurs in reproducible-centos:

https://autobuilder.yoctoproject.org/typhoon/#/builders/115/builds/766

There are also oe-selftest-centos failures, again only on CentOS:

https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2644

Regards,
Naveen

From: swat@... <swat@...> On Behalf Of Naveen Saini
Sent: Monday, October 4, 2021 4:44 PM
To: swat@...
Cc: Bruce Ashfield <bruce.ashfield@...>
Subject: [swat] musl-qemux86: do_compile_kernelmodules() failed

Hi Richard,

On master-next, do_compile_kernelmodules() is failing (with musl).

https://autobuilder.yoctoproject.org/typhoon/#/builders/64/builds/4099

Error log:

| Assembler messages:
| Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory
| make[3]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/scripts/Makefile.build:271: drivers/net/tun.o] Error 1
| make[2]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/scripts/Makefile.build:514: drivers/net] Error 2
| make[1]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:1858: drivers] Error 2
| make: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:220: __sub-make] Error 2
| make: Leaving directory '/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/linux-qemux86-standard-build'
| ERROR: oe_runmake failed
| WARNING: /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957:220 exit 1 from 'exit 1'
| WARNING: Backtrace (BB generated script):
|            #1: bbfatal_log, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 220
|            #2: die, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 204
|            #3: oe_runmake, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 199
|            #4: do_compile_kernelmodules, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 170
|            #5: main, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 223
NOTE: recipe linux-yocto-5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0: task do_compile_kernelmodules: Failed
ERROR: Task (/home/pokybuild/yocto-worker/musl-qemux86/build/meta/recipes-kernel/linux/linux-yocto_5.14.bb:do_compile_kernelmodules) failed with exit code '1'

Regards,
Naveen


musl-qemux86: do_compile_kernelmodules() failed

Naveen Saini
 

Hi Richard,

On master-next, do_compile_kernelmodules() is failing (with musl).

https://autobuilder.yoctoproject.org/typhoon/#/builders/64/builds/4099

Error log:

| Assembler messages:
| Error: can't open /tmp/ccQzs5nT.s for reading: No such file or directory
| make[3]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/scripts/Makefile.build:271: drivers/net/tun.o] Error 1
| make[2]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/scripts/Makefile.build:514: drivers/net] Error 2
| make[1]: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:1858: drivers] Error 2
| make: *** [/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work-shared/qemux86/kernel-source/Makefile:220: __sub-make] Error 2
| make: Leaving directory '/home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/linux-qemux86-standard-build'
| ERROR: oe_runmake failed
| WARNING: /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957:220 exit 1 from 'exit 1'
| WARNING: Backtrace (BB generated script):
|            #1: bbfatal_log, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 220
|            #2: die, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 204
|            #3: oe_runmake, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 199
|            #4: do_compile_kernelmodules, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 170
|            #5: main, /home/pokybuild/yocto-worker/musl-qemux86/build/build/tmp/work/qemux86-poky-linux-musl/linux-yocto/5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0/temp/run.do_compile_kernelmodules.1482957, line 223
NOTE: recipe linux-yocto-5.14.6+gitAUTOINC+42d2cf670e_7ae156be3b-r0: task do_compile_kernelmodules: Failed
ERROR: Task (/home/pokybuild/yocto-worker/musl-qemux86/build/meta/recipes-kernel/linux/linux-yocto_5.14.bb:do_compile_kernelmodules) failed with exit code '1'

Regards,
Naveen


Re: The postinstall intercept hook failures on master-next

Naveen Saini
 

It is happening again on master now, for most builds:

https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/2685

Regards,
Naveen

From: swat@... <swat@...> On Behalf Of Naveen Saini
Sent: Friday, October 1, 2021 3:27 PM
To: swat@...
Subject: [swat] The postinstall intercept hook failures on master-next

Hi Richard,

I can see a number of postinstall intercept hook failures on master-next, in oe-selftest builds on all distros and in reproducible builds.

Error log:

do_rootfs: The postinstall intercept hook 'update_pixbuf_cache' failed
do_rootfs: The postinstall intercept hook 'update_font_cache' failed
do_rootfs: The postinstall intercept hook 'update_gio_module_cache' failed
do_rootfs: The postinstall intercept hook 'update_udev_hwdb' failed

https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/2678
https://autobuilder.yoctoproject.org/typhoon/#/builders/87/builds/2646
https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2636
https://autobuilder.yoctoproject.org/typhoon/#/builders/86/builds/2617
https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/2611

I already raised a bug for 'update_udev_hwdb':

https://bugzilla.yoctoproject.org/show_bug.cgi?id=14583

I cannot see any more logs to dig deeper. Should I raise bugs for all of them?

Regards,
Naveen


Re: YP Autobuilder/SwatBoart Enhancement

Alexandre Belloni
 

On 01/10/2021 12:52:46+0000, Naveen Saini wrote:
Ahh..right !! Thanks.

But still I think, in case of build cancellation, this provision can make debugging easier. If still this does not make much sense, I can close this bug.
I agree, this would make it easier, and I'm not dismissing the idea, but
it is not simple to implement in the current architecture. Let's leave
it open and discuss it on Thursday.

Regards,
Naveen

-----Original Message-----
From: swat@lists.yoctoproject.org <swat@lists.yoctoproject.org> On Behalf
Of Alexandre Belloni
Sent: Friday, October 1, 2021 6:46 PM
To: swat@lists.yoctoproject.org
Subject: Re: [swat] YP Autobuilder/SwatBoart Enhancement

Hello,

On 01/10/2021 09:02:07+0000, Naveen Saini wrote:
Hi Team,

I created bug #14584 to enhance the Autobuilder/SwatBot process.
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14584

Description:
Currently, when a user cancels a build, there is no provision to enter the
reason why it was cancelled. If there is no need to triage such cancelled
builds, then allow 'Not for SWAT' to be enabled.

Sometimes it becomes difficult to analyse build failures when the master-next
branch is force-rebased (to drop a conflicting patch). In that case, there is
no way to know which commits were included in the cancelled builds.
I explained on
https://wiki.yoctoproject.org/wiki/Yocto_Build_Failure_Swat_Team how to
get that:

$ git clone git://git.yoctoproject.org/poky
$ cd poky
$ git fetch origin 47482eff9897ccde946e9247724babc3a586d318
$ git log FETCH_HEAD

This should work unless the git garbage collector ran.


--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com







--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Re: YP Autobuilder/SwatBoart Enhancement

Naveen Saini
 

Ahh..right !! Thanks.

But still I think, in case of build cancellation, this provision can make debugging easier. If still this does not make much sense, I can close this bug.

Regards,
Naveen

-----Original Message-----
From: swat@lists.yoctoproject.org <swat@lists.yoctoproject.org> On Behalf
Of Alexandre Belloni
Sent: Friday, October 1, 2021 6:46 PM
To: swat@lists.yoctoproject.org
Subject: Re: [swat] YP Autobuilder/SwatBoart Enhancement

Hello,

On 01/10/2021 09:02:07+0000, Naveen Saini wrote:
Hi Team,

I created bug #14584 to enhance the Autobuilder/SwatBot process.
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14584

Description:
Currently, when a user cancels a build, there is no provision to enter the
reason why it was cancelled. If there is no need to triage such cancelled
builds, then allow 'Not for SWAT' to be enabled.

Sometimes it becomes difficult to analyse build failures when the master-next
branch is force-rebased (to drop a conflicting patch). In that case, there is
no way to know which commits were included in the cancelled builds.
I explained on
https://wiki.yoctoproject.org/wiki/Yocto_Build_Failure_Swat_Team how to
get that:

$ git clone git://git.yoctoproject.org/poky
$ cd poky
$ git fetch origin 47482eff9897ccde946e9247724babc3a586d318
$ git log FETCH_HEAD

This should work unless the git garbage collector ran.


--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com




Re: YP Autobuilder/SwatBoart Enhancement

Alexandre Belloni
 

Hello,

On 01/10/2021 09:02:07+0000, Naveen Saini wrote:
Hi Team,

I created bug #14584 to enhance the Autobuilder/SwatBot process.
https://bugzilla.yoctoproject.org/show_bug.cgi?id=14584

Description:
Currently, when a user cancels a build, there is no provision to enter the reason why it was cancelled. If there is no need to triage such cancelled builds, then allow 'Not for SWAT' to be enabled.

Sometimes it becomes difficult to analyse build failures when the master-next branch is force-rebased (to drop a conflicting patch). In that case, there is no way to know which commits were included in the cancelled builds.
I explained on
https://wiki.yoctoproject.org/wiki/Yocto_Build_Failure_Swat_Team how to
get that:

$ git clone git://git.yoctoproject.org/poky
$ cd poky
$ git fetch origin 47482eff9897ccde946e9247724babc3a586d318
$ git log FETCH_HEAD

This should work unless the git garbage collector ran.
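
The fetch-by-hash steps above can be wrapped in a small shell helper. This is only a sketch under assumptions: commits_in_build is a hypothetical name, not part of any Yocto or autobuilder tooling, and it assumes an existing clone whose remote still holds the commit and has a master branch to compare against.

```shell
#!/bin/sh
# Sketch: list the commits a (possibly cancelled) build contained.
# commits_in_build is a hypothetical helper, not an existing tool.
# Arguments: <path to clone> <remote name> <commit sha> [base branch]
commits_in_build() {
    repo=$1
    remote=$2
    sha=$3
    base=${4:-master}   # assumed base branch; override if needed

    # Ask the remote for that exact commit. This fails once the server
    # has garbage-collected the object.
    git -C "$repo" fetch --quiet "$remote" "$sha" || return 1

    # FETCH_HEAD now names the build's commit; print what it adds on
    # top of the remote's base branch, oldest first.
    git -C "$repo" log --reverse --format='%h %s' "$remote/$base..FETCH_HEAD"
}
```

For the example above this would be called as: commits_in_build poky origin 47482eff9897ccde946e9247724babc3a586d318. Comparing against the base branch keeps the output to the patches that were only on master-next at the time of the build.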


--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


YP Autobuilder/SwatBoart Enhancement

Naveen Saini
 

Hi Team,

I created bug #14584 to enhance the Autobuilder/SwatBot process.

https://bugzilla.yoctoproject.org/show_bug.cgi?id=14584

Description:

Currently, when a user cancels a build, there is no provision to enter the reason why it was cancelled. If there is no need to triage such cancelled builds, then allow 'Not for SWAT' to be enabled.

Sometimes it becomes difficult to analyse build failures when the master-next branch is force-rebased (to drop a conflicting patch). In that case, there is no way to know which commits were included in the cancelled builds.

Regards,
Naveen


The postinstall intercept hook failures on master-next

Naveen Saini
 

Hi Richard,

I can see a number of postinstall intercept hook failures on master-next, in oe-selftest builds on all distros and in reproducible builds.

Error log:

do_rootfs: The postinstall intercept hook 'update_pixbuf_cache' failed
do_rootfs: The postinstall intercept hook 'update_font_cache' failed
do_rootfs: The postinstall intercept hook 'update_gio_module_cache' failed
do_rootfs: The postinstall intercept hook 'update_udev_hwdb' failed

https://autobuilder.yoctoproject.org/typhoon/#/builders/83/builds/2678
https://autobuilder.yoctoproject.org/typhoon/#/builders/87/builds/2646
https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/2636
https://autobuilder.yoctoproject.org/typhoon/#/builders/86/builds/2617
https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/2611

I already raised a bug for 'update_udev_hwdb':

https://bugzilla.yoctoproject.org/show_bug.cgi?id=14583

I cannot see any more logs to dig deeper. Should I raise bugs for all of them?

Regards,
Naveen


Re: SWAT Rotation schedule

Naveen Saini
 

Acknowledged.

Regards,
Naveen

-----Original Message-----
From: swat@lists.yoctoproject.org <swat@lists.yoctoproject.org> On Behalf
Of Alexandre Belloni
Sent: Thursday, September 30, 2021 7:13 AM
To: Saini, Naveen Kumar <naveen.kumar.saini@intel.com>
Cc: swat@lists.yoctoproject.org
Subject: Re: [swat] SWAT Rotation schedule

Hello Naveen,

It is a quick reminder that you will be on SWAT duty starting Friday October
1st.

On 16/09/2021 23:29:50+0200, Alexandre Belloni wrote:
Hello,

Following the summer break, I would like to organize SWAT duty rotation.
This time, I've prepared a semi randomized schedule so it is easier
for each one of you to plan for SWAT duty.

Please check the table and let me know if you are not available for
the selected week. SWAT duty will be from Friday to Thursday and the
goal is to triage all the failures on swatbot before the weekly triage
call happening at 2:30pm UTC.

┌───────────────────────────────┬──────┬────────────┐
│ Name                          │ Week │ Start      │
├───────────────────────────────┼──────┼────────────┤
│ Alejandro Hernandez Samaniego │   37 │ 17/09/2021 │
│ Oleksiy Obitotskyy            │   38 │ 24/09/2021 │
│ Naveen Saini                  │   39 │ 01/10/2021 │
│ Thomas Perrot                 │   40 │ 08/10/2021 │
│ Paul Eggleton                 │   41 │ 15/10/2021 │
│ Christopher Larson            │   42 │ 22/10/2021 │
│ Jon Mason                     │   43 │ 29/10/2021 │
│ Lee Chee Yang                 │   44 │ 05/11/2021 │
│ Minjae Kim                    │   45 │ 12/11/2021 │
│ Jaga                          │   46 │ 19/11/2021 │
│ Leo Sandoval                  │   47 │ 26/11/2021 │
│ Ross Burton                   │   48 │ 03/12/2021 │
│ Köry Maincent                 │   49 │ 10/12/2021 │
│ Anibal Limon                  │   50 │ 17/12/2021 │
│ Saul Wold                     │   51 │ 24/12/2021 │
└───────────────────────────────┴──────┴────────────┘

Alejandro you would be the next one on the list, starting this Friday,
can you confirm you are available?

There are currently 8 failures to triage on swatbot, I'm going to take
care of those.

Thanks!

--
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com




--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com




Re: SWAT Rotation schedule

Alexandre Belloni
 

Hello Naveen,

It is a quick reminder that you will be on SWAT duty starting Friday
October 1st.

On 16/09/2021 23:29:50+0200, Alexandre Belloni wrote:
Hello,

Following the summer break, I would like to organize SWAT duty rotation.
This time, I've prepared a semi randomized schedule so it is easier for
each one of you to plan for SWAT duty.

Please check the table and let me know if you are not available for the
selected week. SWAT duty will be from Friday to Thursday and the goal is
to triage all the failures on swatbot before the weekly triage call
happening at 2:30pm UTC.

┌───────────────────────────────┬──────┬────────────┐
│ Name                          │ Week │ Start      │
├───────────────────────────────┼──────┼────────────┤
│ Alejandro Hernandez Samaniego │   37 │ 17/09/2021 │
│ Oleksiy Obitotskyy            │   38 │ 24/09/2021 │
│ Naveen Saini                  │   39 │ 01/10/2021 │
│ Thomas Perrot                 │   40 │ 08/10/2021 │
│ Paul Eggleton                 │   41 │ 15/10/2021 │
│ Christopher Larson            │   42 │ 22/10/2021 │
│ Jon Mason                     │   43 │ 29/10/2021 │
│ Lee Chee Yang                 │   44 │ 05/11/2021 │
│ Minjae Kim                    │   45 │ 12/11/2021 │
│ Jaga                          │   46 │ 19/11/2021 │
│ Leo Sandoval                  │   47 │ 26/11/2021 │
│ Ross Burton                   │   48 │ 03/12/2021 │
│ Köry Maincent                 │   49 │ 10/12/2021 │
│ Anibal Limon                  │   50 │ 17/12/2021 │
│ Saul Wold                     │   51 │ 24/12/2021 │
└───────────────────────────────┴──────┴────────────┘

Alejandro you would be the next one on the list, starting this Friday,
can you confirm you are available?

There are currently 8 failures to triage on swatbot, I'm going to take
care of those.

Thanks!

--
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com




--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Re: SWAT Rotation schedule

Alexandre Belloni
 

On 23/09/2021 05:15:06+0000, Oleksiy Obitotskyy via lists.yoctoproject.org wrote:
Hi,

I will start as planned from 24/09/2021.
Thanks, there are currently 7 failures that are not triaged. The two
reproducible issues may already be fixed and we'll know that once the
nightly builds have finished.

Unfortunately, the logs for the qemuarm64 failure are already gone, as
there is already a (failing) qemuarm64 build running on the same worker.
The four other ones are one off issues that didn't repeat, I'll discuss
with Richard what to do with them.

Regards,

--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
