Re: SWAT Rotation schedule
Jon Mason
Sorry, I was unaware. I'll see what I can do to make up for it.
The check-layer-nightly failure is because there are duplicate recipes for python3-sphinx in meta-virt and oe-core, and they are different versions:
./meta-virtualization/recipes-devtools/python/python3-sphinx_4.2.0.bb
./meta/recipes-devtools/python/python3-sphinx_4.4.0.bb
Given that the meta-virt one is older, it probably should go. I'll contact Bruce to see what he wants to do about it.
The meta-aws failure looks like a python3 issue. I'll take a look and, if it is not trivial, open a bug.
Thanks,
Jon
On Tue, Mar 22, 2022 at 8:24 PM Alexandre Belloni <alexandre.belloni@...> wrote:
|
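The duplicate python3-sphinx recipes reported in the reply above can be spotted with a short script. This is a minimal sketch, assuming recipes follow the usual `<name>_<version>.bb` file naming; the function name `find_duplicate_recipes` is made up for illustration and is not part of any Yocto tool.

```python
import re
from collections import defaultdict
from pathlib import Path

# Recipe files are assumed to be named <name>_<version>.bb; the version is
# whatever follows the last underscore. Files without a version are skipped.
RECIPE_RE = re.compile(r"^(?P<name>.+)_(?P<version>[^_]+)\.bb$")


def find_duplicate_recipes(layer_dirs):
    """Return {recipe name: sorted [(layer, version), ...]} for every recipe
    name that is provided by more than one of the given layer directories."""
    seen = defaultdict(set)
    for layer in layer_dirs:
        for path in Path(layer).rglob("*.bb"):
            match = RECIPE_RE.match(path.name)
            if match:
                seen[match.group("name")].add((str(layer), match.group("version")))
    return {
        name: sorted(entries)
        for name, entries in seen.items()
        if len({layer for layer, _ in entries}) > 1
    }
```

Called with the checkout paths of meta-virtualization and meta, it would list python3-sphinx together with both versions, which makes it easy to see which layer carries the older recipe.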
|
Re: SWAT Rotation schedule
Alexandre Belloni
Hello Jon,
A quick reminder that you are on duty this week. It would be great if you could sort out the meta-aws and check-layer-nightly failures. On 27/01/2022 01:09:12+0100, Alexandre Belloni wrote:
Hello everyone, --
Alexandre Belloni, co-owner and COO, Bootlin Embedded Linux and Kernel engineering https://bootlin.com |
|
Re: SWAT Rotation schedule
Paul Eggleton
Just to let you know I've been out today (in fact, since Friday) as my kids have been ill - nothing to worry about, but it's kept me from SWAT duties. Hopefully should be back on deck tomorrow.
Paul -----Original Message-----
From: Paul Eggleton
Sent: Friday, 4 March 2022 5:47 am
To: Alexandre Belloni <alexandre.belloni@...>
Cc: swat@...
Subject: RE: [swat] SWAT Rotation schedule
Hi Alexandre
Acknowledged, thanks.
Paul
-----Original Message-----
From: Alexandre Belloni <alexandre.belloni@...>
Sent: Friday, 4 March 2022 3:30 am
To: Paul Eggleton <Paul.Eggleton@...>
Cc: swat@...
Subject: Re: [swat] SWAT Rotation schedule
Hello Paul,
This is a reminder that you are on SWAT duty starting this Friday
On 27/01/2022 01:09:12+0100, Alexandre Belloni wrote:
--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com/ |
|
[swatbot] [PATCH] swatapp: Display build start and complete dates
Alexandre Belloni
Also switch the changelog timestamp format to be more readable
[YOCTO #14650]
Signed-off-by: Alexandre Belloni <alexandre.belloni@...>
---
 swatapp/templates/swatapp/changelog.html | 2 +-
 swatapp/templates/swatapp/index.html     | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/swatapp/templates/swatapp/changelog.html b/swatapp/templates/swatapp/changelog.html
index f28abaefe187..5011e469c451 100644
--- a/swatapp/templates/swatapp/changelog.html
+++ b/swatapp/templates/swatapp/changelog.html
@@ -5,7 +5,7 @@
  {% if changes %}
  <ul>
  {% for change in changes %}
- <li>{{ change.user.username }} changed {{ change.failure.build.targetname }}:{{ change.failure.stepname }} <a href="/collection/{{ change.failure.build.buildcollection.id }}/">{{ change.failure.build.buildid }}:{{ change.failure.id }}</a> from {{ change.get_oldstatus_display }} to {{ change.get_newstatus_display }} on {{ change.timestamp }}: {{ change.newnote }}</li>
+ <li>{{ change.user.username }} changed {{ change.failure.build.targetname }}:{{ change.failure.stepname }} <a href="/collection/{{ change.failure.build.buildcollection.id }}/">{{ change.failure.build.buildid }}:{{ change.failure.id }}</a> from {{ change.get_oldstatus_display }} to {{ change.get_newstatus_display }} on {{ change.timestamp|date:"Y-m-d H:m:s" }}: {{ change.newnote }}</li>
  {% endfor %}
  </ul>
diff --git a/swatapp/templates/swatapp/index.html b/swatapp/templates/swatapp/index.html
index a015e60b7bbf..0e331a9882d7 100644
--- a/swatapp/templates/swatapp/index.html
+++ b/swatapp/templates/swatapp/index.html
@@ -45,6 +45,9 @@ function togglefailures(source) {
  <br>
  <b>Owner:</b> {{ collection.owner }}
  <b>Reason:</b> {{ collection.reason }}
+ <br>
+ <b>Started:</b> {{ mainbuild.started|date:"Y-m-d H:m:s" }}
+ <b>Completed:</b> {{ mainbuild.completed|date:"Y-m-d H:m:s" }}
  </td>
  <td></td>
@@ -88,6 +91,7 @@ function togglefailures(source) {
  <td>
  <a href="{{ build.url }}">{{ build.targetname }}</a>
  {{ build.workername }}
+ {{ build.completed|date:"U" }}
  ({{ build.get_status_display }})
  </td>
  <td></td>
--
2.35.1 |
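One detail in the format strings of the patch above is worth checking: in Django's `date` template filter, `m` is the zero-padded month and `i` is the minutes, so "Y-m-d H:m:s" would likely print the month where the minutes are expected. The sketch below is not Django code. It is a small stand-in that maps only the format characters used in the patch onto Python's `strftime`, so the difference can be seen without a Django project. The helper name `render` and the sample timestamp are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Subset of Django `date` filter characters mapped to strftime directives.
# Only the characters that appear in the patch are covered here.
DJANGO_TO_STRFTIME = {
    "Y": "%Y",  # 4-digit year
    "m": "%m",  # month, zero-padded (not minutes)
    "d": "%d",  # day of month, zero-padded
    "H": "%H",  # hour, 24-hour clock
    "i": "%M",  # minutes
    "s": "%S",  # seconds
}


def render(fmt, when):
    """Render `when` using the small subset of Django format characters above."""
    out = []
    for ch in fmt:
        if ch == "U":
            # Seconds since the Unix epoch.
            out.append(str(int(when.timestamp())))
        elif ch in DJANGO_TO_STRFTIME:
            out.append(when.strftime(DJANGO_TO_STRFTIME[ch]))
        else:
            out.append(ch)
    return "".join(out)


sample = datetime(2022, 3, 4, 15, 30, 45, tzinfo=timezone.utc)
as_in_patch = render("Y-m-d H:m:s", sample)   # month lands in the minutes slot
with_minutes = render("Y-m-d H:i:s", sample)  # minutes rendered with "i"
```

If minutes are what the changelog and index pages should show, "Y-m-d H:i:s" is the format string to use.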
|
Re: SWAT Rotation schedule
Paul Eggleton
Hi Alexandre
Acknowledged, thanks. Paul -----Original Message-----
From: Alexandre Belloni <alexandre.belloni@...>
Sent: Friday, 4 March 2022 3:30 am
To: Paul Eggleton <Paul.Eggleton@...>
Cc: swat@...
Subject: Re: [swat] SWAT Rotation schedule
Hello Paul,
This is a reminder that you are on SWAT duty starting this Friday
On 27/01/2022 01:09:12+0100, Alexandre Belloni wrote:
--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com/ |
|
Re: SWAT Rotation schedule
Alexandre Belloni
Hello Paul,
This is a reminder that you are on SWAT duty starting this Friday On 27/01/2022 01:09:12+0100, Alexandre Belloni wrote:
--
Alexandre Belloni, co-owner and COO, Bootlin Embedded Linux and Kernel engineering https://bootlin.com |
|
Re: qemux86-64 ltp backtrace
Bruce Ashfield <bruce.ashfield@...>
On Thu, Feb 24, 2022 at 6:04 AM Richard Purdie <richard.purdie@...> wrote:
There's not a lot of useful information in the trace, but even under memory pressure / OOM, we shouldn't get a kernel oops, so it very well could be a kernel issue when resources are low.
As we all know, there's probably not much we can do except watch and see if it repeats, since it is unlikely we can reproduce it on demand.
Bruce
--
- Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end
- "Use the force Harry" - Gandalf, Star Trek II |
|
qemux86-64 ltp backtrace
Richard Purdie
Hi,
We saw an LTP image failure today:
https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/2937
It was in the controller tests (cgroup ones). I logged in and was able to grab the backtrace below. More logs are here:
https://autobuilder.yocto.io/pub/failed-builds-data/alma8-ty-2/qemux86-64-ltp-20220224/
(there were earlier OOMs in the build, normal for ltp testing and quite a while before this other failure).
Not sure if it is helpful/useful but wanted to log it and save the logs before they disappeared.
Cheers,
Richard
[ 2117.798342] BUG: unable to handle page fault for address: 000027ff010040a8
[ 2117.802960] #PF: supervisor instruction fetch in kernel mode
[ 2117.803589] #PF: error_code(0x0010) - not-present page
[ 2117.804169] PGD 0 P4D 0
[ 2117.804471] Oops: 0010 [#1] PREEMPT SMP PTI
[ 2117.804945] CPU: 2 PID: 7 Comm: kworker/u8:0 Not tainted 5.15.22-yocto-standard #1
[ 2117.805782] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[ 2117.807015] Workqueue: events_unbound flush_memcg_stats_dwork
[ 2117.807652] RIP: 0010:0x27ff010040a8
[ 2117.808052] Code: Unable to access opcode bytes at RIP 0x27ff0100407e.
[ 2117.808820] RSP: 0018:ffffaab680043dd8 EFLAGS: 00010002
[ 2117.809402] RAX: 000027ff010040a8 RBX: ffffa2b741a3e280 RCX: ffffa2b741119000
[ 2117.810185] RDX: 0000000000000026 RSI: 0000000000000003 RDI: ffffa2b741a3e280
[ 2117.810961] RBP: ffffaab680043e38 R08: 0000000000000000 R09: 0000000000000000
[ 2117.811740] R10: 0000000000000003 R11: 0000000000000018 R12: 0000000000000003
[ 2117.812520] R13: ffffffffb96ccd30 R14: ffffffffb96ccd30 R15: ffffffffb96ccfe0
[ 2117.813301] FS: 0000000000000000(0000) GS:ffffa2b77ed00000(0000) knlGS:0000000000000000
[ 2117.814186] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2117.814817] CR2: 000027ff010040a8 CR3: 0000000001242000 CR4: 00000000001506e0
[ 2117.815601] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2117.816383] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2117.817167] Call Trace:
[ 2117.817444] <TASK>
[ 2117.817684] ? cgroup_rstat_flush_locked+0x20b/0x2c0
[ 2117.818245] cgroup_rstat_flush_irqsafe+0x29/0x40
[ 2117.818764] __mem_cgroup_flush_stats+0x3e/0x60
[ 2117.819271] flush_memcg_stats_dwork+0xe/0x30
[ 2117.819757] process_one_work+0x1d5/0x3e0
[ 2117.820209] worker_thread+0x53/0x3f0
[ 2117.820616] ? rescuer_thread+0x360/0x360
[ 2117.821060] kthread+0x13b/0x160
[ 2117.821424] ? set_kthread_struct+0x50/0x50
[ 2117.821887] ret_from_fork+0x22/0x30
[ 2117.822290] </TASK>
[ 2117.822544] Modules linked in: bnep
[ 2117.822939] CR2: 000027ff010040a8
[ 2117.823319] ---[ end trace f11fe9e87c0e7a6f ]---
[ 2117.823832] RIP: 0010:0x27ff010040a8
[ 2117.824244] Code: Unable to access opcode bytes at RIP 0x27ff0100407e.
[ 2117.824958] RSP: 0018:ffffaab680043dd8 EFLAGS: 00010002
[ 2117.825544] RAX: 000027ff010040a8 RBX: ffffa2b741a3e280 RCX: ffffa2b741119000
[ 2117.826325] RDX: 0000000000000026 RSI: 0000000000000003 RDI: ffffa2b741a3e280
[ 2117.827101] RBP: ffffaab680043e38 R08: 0000000000000000 R09: 0000000000000000
[ 2117.827882] R10: 0000000000000003 R11: 0000000000000018 R12: 0000000000000003
[ 2117.828669] R13: ffffffffb96ccd30 R14: ffffffffb96ccd30 R15: ffffffffb96ccfe0
[ 2117.829450] FS: 0000000000000000(0000) GS:ffffa2b77ed00000(0000) knlGS:0000000000000000
[ 2117.830343] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2117.830974] CR2: 000027ff010040a8 CR3: 0000000001242000 CR4: 00000000001506e0
[ 2117.831756] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2117.832540] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2117.833320] note: kworker/u8:0[7] exited with preempt_count 3 |
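When comparing a backtrace like the one above against later failures, it can help to reduce it to the list of symbols in the call trace. The following is a rough sketch of such a helper, assuming the dmesg text has one entry per line and uses the `<TASK>` / `</TASK>` markers seen in this log; the names `call_trace` and `SAMPLE` are made up for illustration. Entries prefixed with `?` are ones the kernel's unwinder does not consider reliable, so they are flagged separately.

```python
import re

# Matches e.g. "[ 2117.818245] cgroup_rstat_flush_irqsafe+0x29/0x40",
# optionally with a "? " before the symbol.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def call_trace(dmesg_text):
    """Return [(symbol, reliable), ...] for the first call trace in the text."""
    frames = []
    in_trace = False
    for line in dmesg_text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        if "</TASK>" in line:
            break
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((match.group("sym"), match.group("guess") is None))
    return frames


# A few lines copied from the trace above.
SAMPLE = """\
[ 2117.817167] Call Trace:
[ 2117.817444] <TASK>
[ 2117.817684] ? cgroup_rstat_flush_locked+0x20b/0x2c0
[ 2117.818245] cgroup_rstat_flush_irqsafe+0x29/0x40
[ 2117.818764] __mem_cgroup_flush_stats+0x3e/0x60
[ 2117.822290] </TASK>
"""
frames = call_trace(SAMPLE)
```

Two oopses whose reliable frames match are a reasonable hint that the same issue has repeated, which fits the "watch and see if it repeats" approach suggested in the reply.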
|
Re: SWAT Rotation schedule
Naveen Saini
Hi Alex,
I will swap SWAT duty with Lee, Chee Yang. So on 25/02/2022 Chee Yang Lee will be on SWAT duty instead of me.
│ Naveen Saini │ 8 │ 25/02/2022 │
Regards,
Naveen
-----Original Message----- |
|
logs from qemux86-64-ltp failure on fedora35-ty-2
Richard Purdie
Hi,
We saw a failure in qemux86-64-ltp on fedora35-ty-2:
https://autobuilder.yoctoproject.org/typhoon/#/builders/95/builds/2905
I saved the logs here:
https://autobuilder.yocto.io/pub/failed-builds-data/fedora35-ty-2/qemux86-64-ltp-20220218/
I note there are a lot of rcu stalls in it. The controllers log stops after 4 cgroups tests.
I wanted to make sure the logs for the issue were saved.
Cheers,
Richard |
|
Re: dbus-wait git failures on the autobuilder
Michael Halstead <mhalstead@...>
On Thu, Feb 17, 2022 at 8:54 AM Richard Purdie <richard.purdie@...> wrote:
After the discussion in triage I tried to test some of the theories mentioned on
There are no rate limiting rules in place on that server. There is a load balancer in front of it that shows several connections reset around the time of the error. I don't have detailed logging about the connection resets, but I suspect the issues are on the autobuilder data center side, not on the git.yoctoproject.org mirror.
Michael Halstead Linux Foundation / Yocto Project Systems Operations Engineer |
|
dbus-wait git failures on the autobuilder
Richard Purdie
After the discussion in triage I tried to test some of the theories mentioned on
the call. There was this failed build:
https://autobuilder.yoctoproject.org/typhoon/#/builders/79/builds/3169/steps/14/logs/stdio
from alma8-ty-1.yocto.io, so I logged in there, restored the exact same build directory which failed as it hadn't been cleaned up yet and re-ran the test.
[pokybuild@alma8-ty-1 build]$ oe-selftest -r fetch.Fetch.test_git_mirrors
2022-02-17 16:46:05,158 - oe-selftest - INFO - Adding layer libraries:
2022-02-17 16:46:05,158 - oe-selftest - INFO - /home/pokybuild/yocto-worker/oe-selftest-centos/build/meta-poky/lib
2022-02-17 16:46:05,158 - oe-selftest - INFO - /home/pokybuild/yocto-worker/oe-selftest-centos/build/meta/lib
2022-02-17 16:46:05,159 - oe-selftest - INFO - /home/pokybuild/yocto-worker/oe-selftest-centos/build/meta-yocto-bsp/lib
2022-02-17 16:46:05,159 - oe-selftest - INFO - /home/pokybuild/yocto-worker/oe-selftest-centos/build/meta-selftest/lib
2022-02-17 16:46:05,160 - oe-selftest - INFO - Running bitbake -e to test the configuration is valid/parsable
2022-02-17 16:46:07,875 - oe-selftest - INFO - Adding: "include selftest.inc" in /home/pokybuild/yocto-worker/oe-selftest-centos/build/build-st/conf/local.conf
2022-02-17 16:46:07,875 - oe-selftest - INFO - Adding: "include bblayers.inc" in bblayers.conf
2022-02-17 16:46:07,876 - oe-selftest - INFO - test_git_mirrors (fetch.Fetch)
2022-02-17 16:47:18,909 - oe-selftest - INFO - ... ok
2022-02-17 16:47:19,960 - oe-selftest - INFO - ----------------------------------------------------------------------
2022-02-17 16:47:19,960 - oe-selftest - INFO - Ran 1 test in 73.472s
2022-02-17 16:47:19,960 - oe-selftest - INFO - OK
2022-02-17 16:47:24,497 - oe-selftest - INFO - RESULTS:
2022-02-17 16:47:24,498 - oe-selftest - INFO - RESULTS - fetch.Fetch.test_git_mirrors: PASSED (71.03s)
2022-02-17 16:47:24,502 - oe-selftest - INFO - SUMMARY:
2022-02-17 16:47:24,503 - oe-selftest - INFO - oe-selftest () - Ran 1 test in 73.473s
2022-02-17 16:47:24,503 - oe-selftest - INFO - oe-selftest - OK - All required tests passed (successes=1, skipped=0, failures=0, errors=0)
so it worked. This tells us that the issue is probably not a cert issue on the worker or something worker specific but some transient issue on the network or upstream server side of things :/.
Michael: Is there any reason git://git.yoctoproject.org/dbus-wait;branch=master would be unreliable to clone? Are we hitting some kind of rate limiting on the upstream git services?
This being some network/server issue does fit with several selftests failing at the same time with the same error (3 out of 4 selftests failed at the same time for the failure above).
Cheers,
Richard |
|
Re: SWAT Rotation schedule
Hi Alexandre,
I'm ready to start my shift on Friday.
Regards,
Oleksiy
________________________________________
From: Alexandre Belloni <alexandre.belloni@...>
Sent: Thursday, February 17, 2022 01:14
To: swat@...; Oleksiy Obitotskyi -X (oobitots - GLOBALLOGIC INC at Cisco)
Subject: Re: [swat] SWAT Rotation schedule
Hello Oleksiy,
Just a reminder that you will be on SWAT duty starting this Friday, please confirm you will be available.
On 27/01/2022 01:09:12+0100, Alexandre Belloni wrote:
Hello everyone,
--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com |
|
Re: SWAT Rotation schedule
Alexandre Belloni
Hello Oleksiy,
Just a reminder that you will be on SWAT duty starting this Friday, please confirm you will be available. On 27/01/2022 01:09:12+0100, Alexandre Belloni wrote:
Hello everyone, --
Alexandre Belloni, co-owner and COO, Bootlin Embedded Linux and Kernel engineering https://bootlin.com |
|
Re: Further info on the fetch issues in meta-aws
elberger@...
Hi Richard
Thank you for the great work and feedback. I think Nathan did complete the switchover on our master-next and we were planning on cutting it over to the mainline today or Monday. Will take the time to reconcile this before we do so. On Feb 10, 2022, at 18:04, Richard Purdie <richard.purdie@...> wrote: |
|
Failures in master-next build
Richard Purdie
Just to save some time for whoever is triaging the currently failing master-next build: I added a change to bitbake which shows errors for questionable git url usage. This appears to:
* show errors in meta-aws, which I reported to them
* fail in world builds (including source fetching) - I need to fix that
* fail in one of the oe-selftests - that looks like a bug in recipetool not setting PV correctly
The buildtools issue is known about and waiting for a new uninative release.
Cheers,
Richard |
|
Further info on the fetch issues in meta-aws
Richard Purdie
Hi Richard,
I hate to be the bearer of bad news but there is a bit of an issue with meta-aws and it may be more serious than we originally thought.
I've been trying to fix some fetcher issues and came up with a fix:
https://git.yoctoproject.org/poky/commit/?h=master-next&id=222dd76493ea18c4fde5d3d34eeda9257adf34a2
and some tests:
https://git.yoctoproject.org/poky/commit/?h=master-next&id=66ee0b94a88c6aaaddb3580f044a5cc5d35b7cc4
however the newly added error shows issues in meta-aws:
https://autobuilder.yoctoproject.org/typhoon/#/builders/122/builds/769/steps/12/logs/stdio
In the first commit linked above I explain why the lack of use of get_srcrev() is a problem:
"""
Where a git url uses a tag instead of a full source revision, breakage can currently occur in builds. Issues include:
* the revision being looked up in multiple tasks (fetch and unpack)
* the risk a different revision may be obtained in those tasks
* that some tasks may not be allowed to access the network
* that a revision may not be consistent throughout a given build
* rerunning a specific task may give inconsistent results
"""
and I suspect these are things you care about.
You don't have to set PV to include SRCPV, you could just call bb.fetch2.get_srcrev(d) manually in those recipes instead.
This probably explains some of the previous strange behaviour for some of the recipes.
Cheers,
Richard |
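The risk listed in the commit message above (a tag being looked up separately in fetch and unpack, and possibly resolving to different revisions) can be shown with a toy model. This is only an illustration of the general idea of resolving a tag once and reusing the result. It does not use or reproduce the bb.fetch2 implementation, and `FakeRemote`, `run_tasks_unpinned` and `run_tasks_pinned` are hypothetical names.

```python
class FakeRemote:
    """Stand-in for an upstream repository whose tag can be moved."""

    def __init__(self, tags):
        self.tags = dict(tags)
        self.lookups = 0  # counts how often the "network" is consulted

    def resolve(self, tag):
        self.lookups += 1
        return self.tags[tag]


def run_tasks_unpinned(remote, tag, move_tag_to=None):
    """Fetch and unpack each resolve the tag themselves, so they may disagree."""
    fetched = remote.resolve(tag)
    if move_tag_to is not None:
        remote.tags[tag] = move_tag_to  # upstream retags between the two tasks
    unpacked = remote.resolve(tag)
    return fetched, unpacked


def run_tasks_pinned(remote, tag, move_tag_to=None):
    """Resolve once up front and hand the same revision to both tasks."""
    pinned = remote.resolve(tag)
    if move_tag_to is not None:
        remote.tags[tag] = move_tag_to
    return pinned, pinned
```

In the unpinned variant a retag between the two tasks yields two different revisions and two lookups; in the pinned variant both tasks see the same revision and only one lookup is needed, which also matters for tasks that are not allowed network access.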
|
Re: Failure on AB with meta-agl-core
Scott is aware and on top of it.
js
Richard Purdie <richard.purdie@...> wrote on Fri, 4 Feb 2022, 18:51:
On Fri, 2022-02-04 at 09:49 -0800, Saul Wold wrote: |
|
Re: Failure on AB with meta-agl-core
Richard Purdie
On Fri, 2022-02-04 at 09:49 -0800, Saul Wold wrote:
Hi Richard,
I did already talk with Scott Murray and there are patches ready for this. I think meta-aws was ok?
Cheers,
Richard |
|
Failure on AB with meta-agl-core
Saul Wold
Hi Richard,
Reaching out due to a failure in meta-agl-core on the Autobuilder. Weston has been updated to 10.x, but the bbappend in meta-agl-core is still 9.x based. Do you have plans for updating? Thanks -- Sau! |
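One common way a recipe upgrade leaves an append behind is through the version in the bbappend file name. BitBake allows `%` as a wildcard only directly before `.bbappend`, so an append named for 9.x no longer applies once the recipe moves to 10.x. The sketch below is a simplified model of that matching rule, not BitBake's own code, and the file names in the usage note are hypothetical since the message above does not give them.

```python
def bbappend_applies(bbappend_name, recipe_name):
    """Simplified model: does a .bbappend file name match a .bb recipe file name?

    A trailing "%" in the append name (directly before ".bbappend") matches any
    remaining characters of the recipe name; otherwise names must be identical.
    """
    if not (bbappend_name.endswith(".bbappend") and recipe_name.endswith(".bb")):
        return False
    append_stem = bbappend_name[: -len(".bbappend")]
    recipe_stem = recipe_name[: -len(".bb")]
    if append_stem.endswith("%"):
        return recipe_stem.startswith(append_stem[:-1])
    return append_stem == recipe_stem
```

With hypothetical names, `bbappend_applies("weston_9.%.bbappend", "weston_10.0.0.bb")` is false while `bbappend_applies("weston_%.bbappend", "weston_10.0.0.bb")` is true. Whether a fully version-agnostic append is appropriate depends on what the append changes, so that choice is left to the layer maintainers.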
|