Thank you very much for the detailed answer.
Yes, you are right, it is mostly the same recipes that fail. But they also change from time to time.
Today it happened to me even without Jenkins and Docker, in a normal console build, with the recipe keymaps_1.0.bb.
For the nightly builds in Jenkins, my workaround at the moment is to delete build/tmp beforehand.
So far, the problem has not occurred again.
From: Trevor Woerner <twoerner@...>
Sent: Tuesday, March 29, 2022 18:23
To: Alexander Kanavin <alex.kanavin@...>
Cc: Matthias Klein <matthias.klein@...>; yocto@...
Subject: Re: [yocto] Strange sporadic build issues (incremental builds in docker container)
On Thu 2022-03-24 @ 09:31:25 AM, Alexander Kanavin wrote:
I don't. You need to inspect the build tree to find clues why the
Yes, I've been seeing exactly these issues as well.
I'm not using any sort of virtualization, I'm using Jenkins to do nightly builds directly on my host. My host machine is openSUSE 15.3. These problems started on Feb 21 for me.
Each of my builds starts by doing a "git pull" on each of the repositories, then kicks off a build if any of the repositories changed. A fresh build will always succeed. Doing a "clean" and rebuilding will (I believe) always succeed. My gut feeling is that it somehow has something to do with having an existing build, refreshing the repositories, then rebuilding.
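A minimal sketch of that "pull, then build only if something changed" step, written in Python purely for illustration. The helper names are hypothetical, and the actual Jenkins job may detect changes differently (for example via Jenkins' own SCM polling).

```python
import subprocess
from pathlib import Path


def head_commit(repo: Path) -> str:
    """Return the commit hash currently checked out in `repo`."""
    result = subprocess.run(
        ["git", "-C", str(repo), "rev-parse", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def pull_and_detect_change(repo: Path) -> bool:
    """Run `git pull` in `repo`; return True if HEAD moved as a result."""
    before = head_commit(repo)
    subprocess.run(["git", "-C", str(repo), "pull"], check=True)
    return head_commit(repo) != before


def any_repo_changed(repos, pull=pull_and_detect_change) -> bool:
    """Pull every repository, then report whether at least one changed.

    The list is built in full first so that every repository is
    refreshed, even after an earlier one already reported a change.
    """
    changed = [pull(repo) for repo in repos]
    return any(changed)
```

A build would then be started only when `any_repo_changed(...)` returns True, on top of whatever already exists in the build area.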
I spent weeks trying to find a reproducer. I wrote a script to check out one version of the repositories (before), build, check out a newer version of the repositories (after), and rebuild. Even when I used the exact same hashes that had failed on my Jenkins build and repeated this 20 times, in some cases I wasn't able to reproduce the error. I was able to find 1 reproducer involving a build for an imx28evk MACHINE, but even then, after 20 iterations, 13 were bad and 7 were good. I repeated that set of 20 builds many times and it was never 100% bad.
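The before/after loop can be sketched roughly as follows. This is not the actual script: the function names, the `build_cmd` wrapper, and the choice not to wipe the build area between iterations are all assumptions made for illustration.

```python
import subprocess


def run(cmd, cwd=None) -> bool:
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(cmd, cwd=cwd).returncode == 0


def checkout_all(repos, revisions, run):
    """Check out the given revision in each repository."""
    for repo in repos:
        run(["git", "checkout", revisions[repo]], cwd=repo)


def reproduce(repos, before, after, build_cmd, iterations=20, run=run):
    """Return (bad, good) counts for the rebuild after updating the layers.

    `before` and `after` map each repository path to a commit hash.
    `build_cmd` is a hypothetical wrapper that sets up the build
    environment and invokes bitbake. Assumption: the build area is
    reused as-is between iterations.
    """
    bad = good = 0
    for _ in range(iterations):
        checkout_all(repos, before, run)
        run(build_cmd)  # baseline build on the old revisions; not counted
        checkout_all(repos, after, run)
        if run(build_cmd):  # rebuild on top of the existing build
            good += 1
        else:
            bad += 1
    return bad, good
```

With the numbers above, a run of 20 iterations for the imx28evk case would have returned `(13, 7)`.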
My investigations led me to believe that it might be related to rm_work and/or BB_NUMBER_THREADS/PARALLEL_MAKE. In my Jenkins builds I enable 'INHERIT += "rm_work"' and I also limit BB_NUMBER_THREADS and set PARALLEL_MAKE. On the cmdline I was able to reduce the number of failures (sometimes to none) by removing the rm_work and BB_NUMBER_THREADS/PARALLEL_MAKE settings, but I was never able to eliminate them completely.
In Jenkins the build failures still felt as random as they were without the change, so I can't say that it's having much effect in Jenkins, but it seems to have some effect on the cmdline.
I can say this with certainty: Matthias says it seems that the specific recipe that fails is random, but it's not. In every case the recipe that fails is a recipe whose source files are contained in the meta layer itself. For me the failing recipes were always:
If you look at the recipes for those packages they do not have a SRC_URI that fetches code from some remote location then uses quilt to apply some patches.
In both cases all of the "source" code exists in the layer itself, and somehow quilt is involved in placing them in the build area.
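To make that pattern concrete, here is a rough check for "all sources live in the layer", assuming it is applied to the text of a recipe's SRC_URI value. The function name is hypothetical, and a real SRC_URI can contain variable references or overrides that this sketch ignores.

```python
def is_in_layer_only(src_uri: str) -> bool:
    """Return True if every SRC_URI entry is a local file:// entry.

    `src_uri` is the raw value of SRC_URI, with entries separated by
    whitespace. Line-continuation backslashes are skipped. An empty
    value counts as not matching the pattern.
    """
    entries = [entry for entry in src_uri.split() if entry != "\\"]
    return bool(entries) and all(entry.startswith("file://") for entry in entries)
```

A recipe whose SRC_URI contains any remote fetcher entry (git://, https://, and so on) would return False and falls outside the pattern.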
I have dozens and dozens of these failures recorded and it is always with a recipe that follows that pattern. But 99%-ish of the failures are with the two packages I listed above.
The failures aren't related to days when those packages change. The failures are just... sporadic.
So the issue is related to:
- recipes with in-layer sources
- quilt (being run twice (?))
- updating layers, and rebuilding in a build area with an existing build
- Feb 21 2022 (or thereabouts)
The issue might be related to:
- my build host?