Date
1 - 3 of 3
Intermittent network issue on centos7-ty-4
Richard Purdie
Yesterday there was some talk in irc about a breif networking failure on
centos7-ty-4. The build was: https://autobuilder.yoctoproject.org/typhoon/#/builders/104/builds/3083 Showing these warnings: stdio: WARNING: gtk+3-3.24.30-r0 do_package: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known stdio: WARNING: gtk+3-3.24.30-r0 do_packagedata: Error contacting Hash Equivalence Server typhoon.yocto.io:8686: [Errno -2] Name or service not known The worry being why was the DNS suddenly not working. Looking at the log: /home/pokybuild/yocto-worker/genericx86-64-alt/build/build-renamed/tmp/work/core2-64-poky-linux/gtk+3/3.24.30-r0/temp/log.do_package.41319 The warning is near the end of that and the log timestamp is Oct 18 17:11 (i.e. when it was last written to). We didn't spot anything yesterday but I realised the systemd journalctl timestamps are an hour different that the timestamp I displayed from the filesystem thanks for daylight savings. Looking at journalctl an hour different, I see: Oct 18 16:00:03 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root. Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root. Oct 18 16:01:03 centos7-ty-4.yocto.io systemd[1]: Started Session 483 of user root. Oct 18 16:01:03 centos7-ty-4.yocto.io CROND[25699]: (root) CMD (run-parts /etc/cron.hourly) Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25822]: starting 0anacron Oct 18 16:01:04 centos7-ty-4.yocto.io run-parts(/etc/cron.hourly)[25923]: finished 0anacron Oct 18 16:01:04 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root. Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPREQUEST on enp4s0f0 to 172.29.10.1 port 67 (xid=0x1579bf79) Oct 18 16:05:17 centos7-ty-4.yocto.io dhclient[3905]: DHCPACK from 172.29.10.1 (xid=0x1579bf79) Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2367] dhcp4 (enp4s0f0): address 172.29.10.2 Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): plen 24 (255.255.255.0) Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): gateway 172.29.10.1 Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): lease time 7200 Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): hostname 'yct-ab02' Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): nameserver '172.29.10.1' Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp4 (enp4s0f0): domain name 'yocto.io' Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2373] dhcp (enp4s0f0): domain search 'pdx.yoctoproject.org.' Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp (enp4s0f0): domain search 'yocto.io.' Oct 18 16:05:18 centos7-ty-4.yocto.io NetworkManager[1436]: <info> [1634573118.2374] dhcp4 (enp4s0f0): state changed bound -> bound Oct 18 16:05:18 centos7-ty-4.yocto.io dhclient[3905]: bound to 172.29.10.2 -- renewal in 3024 seconds. Oct 18 16:05:18 centos7-ty-4.yocto.io dbus[1414]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-d Oct 18 16:05:18 centos7-ty-4.yocto.io systemd[1]: Starting Network Manager Script Dispatcher Service... Oct 18 16:05:19 centos7-ty-4.yocto.io dbus[1414]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher' Oct 18 16:05:19 centos7-ty-4.yocto.io systemd[1]: Started Network Manager Script Dispatcher Service. Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: new request (4 scripts) Oct 18 16:05:19 centos7-ty-4.yocto.io nm-dispatcher[24137]: req:1 'dhcp4-change' [enp4s0f0]: start running ordered scripts... Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Created slice User Slice of root. Oct 18 16:10:05 centos7-ty-4.yocto.io systemd[1]: Started Session 484 of user root. Oct 18 16:10:05 centos7-ty-4.yocto.io CROND[26785]: (root) CMD (/usr/lib64/sa/sa1 1 1) Oct 18 16:10:08 centos7-ty-4.yocto.io systemd[1]: Removed slice User Slice of root. i.e. it renewed the DHCP lease about that time and ran network manager scripts. This looks a bit like a smoking gun to me. Not sure what to do about it but I think we may know where the networking glitches come from. Cheers, Richard |
|
Richard Purdie
On Tue, 2021-10-19 at 12:43 +0100, Richard Purdie via lists.yoctoproject.org
wrote: Yesterday there was some talk in irc about a breif networking failure onChecking this again, I think the timestamps are not an hour different and I was misreading some tests I made. The do_install task finished at 17:07 and the do_package finished by 17:11 so we have a 4 minute window which does not match the last dhcp refresh before the error at 16:55. It was a nice idea :/. Cheers, Richard |
|
Richard Purdie
On Tue, 2021-10-19 at 13:24 +0100, Richard Purdie via lists.yoctoproject.org
wrote: On Tue, 2021-10-19 at 12:43 +0100, Richard Purdie via lists.yoctoproject.orgNot sure if this means anything or not (run on centos7-ty-4 for around 10 mins): -bash-4.2$ while true; do ping typhoon.yocto.io -c 1 > /dev/null || date; sleep 0.1; done Tue 19 Oct 12:53:35 GMT 2021 Tue 19 Oct 12:54:32 GMT 2021 Tue 19 Oct 12:55:39 GMT 2021 Tue 19 Oct 12:57:01 GMT 2021 Tue 19 Oct 12:59:54 GMT 2021 Tue 19 Oct 13:00:46 GMT 2021 Tue 19 Oct 13:02:54 GMT 2021 Tue 19 Oct 13:03:11 GMT 2021 Cheers, Richard |
|