Re: Strange sporadic build issues (incremental builds in docker container)
On Thu 2022-03-24 @ 09:31:25 AM, Alexander Kanavin wrote:
> I don't. You need to inspect the build tree to find clues why the

Yes, I've been seeing exactly these issues as well.
I'm not using any sort of virtualization; I'm using Jenkins to do nightly
builds directly on my host. My host machine runs openSUSE 15.3. These problems
started on Feb 21 for me.
Each of my builds starts by doing a "git pull" on each of the repositories,
then kicks off a build if any of the repositories changed. A fresh build will
always succeed. Doing a "clean" and rebuilding will (I believe) always
succeed. My gut feeling is that it has something to do with having an
existing build, refreshing the repositories, and then rebuilding.
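My nightly job's change detection can be sketched roughly like this (the layer
names and the "rebuilding" stand-in are invented; the real job is a Jenkins
pipeline that runs oe-init-build-env and bitbake):

```shell
#!/bin/bash
# Sketch of the nightly job: "git pull" each layer, and if any HEAD moved,
# kick off an incremental build in the existing build directory -- no clean.

layer_changed() {
    # Returns 0 (true) if "git pull" moved HEAD in the given layer checkout.
    local dir=$1 old new
    old=$(git -C "$dir" rev-parse HEAD)
    git -C "$dir" pull --ff-only >/dev/null
    new=$(git -C "$dir" rev-parse HEAD)
    [ "$old" != "$new" ]
}

nightly() {
    local changed=0 layer
    # Hypothetical layer list; the real job pulls each repository it tracks.
    for layer in poky meta-openembedded meta-mylayer; do
        if layer_changed "$layer"; then
            changed=1
        fi
    done
    if [ "$changed" = 1 ]; then
        # Stand-in for: . poky/oe-init-build-env build && bitbake <image>
        echo "rebuilding"
    fi
}
```

The key point is what the script does *not* do: there is no clean step between
the pull and the rebuild, so every nightly build is incremental.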
I spent weeks trying to find a reproducer. I wrote a script to check out one
version of the repositories (before), build, check out a newer version of the
repositories (after), and rebuild. Even when I used the exact same hashes that
had failed in my Jenkins builds and repeated the cycle 20 times, in some cases
I wasn't able to reproduce the error. I did find one reproducer involving a
build for an imx28evk MACHINE, but even then, out of 20 iterations, 13 were
bad and 7 were good. I repeated that set of 20 builds many times and it was
never 100% bad.
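The reproducer script looked roughly like this (the helper bodies below are
placeholders; in the real script they ran "git checkout" of the pinned hashes
in every layer and a full bitbake):

```shell
#!/bin/bash
# Sketch of the before/after reproducer loop.

checkout_layers() { :; }  # placeholder: git -C <layer> checkout "$1" per layer
run_build()       { :; }  # placeholder: . oe-init-build-env && bitbake <image>

reproduce() {
    local before=$1 after=$2 iterations=$3
    local good=0 bad=0 i
    for ((i = 1; i <= iterations; i++)); do
        checkout_layers "$before"   # known-good layer revisions
        run_build                   # first build: always succeeded for me
        checkout_layers "$after"    # revisions from the failing nightly
        if run_build; then          # the incremental rebuild is the flaky step
            good=$((good + 1))
        else
            bad=$((bad + 1))
        fi
    done
    echo "good=$good bad=$bad"
}
```

For the imx28evk case above, 20 iterations gave results like "good=7 bad=13",
and repeated runs never came out 100% bad.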
My investigations led me to believe it might be related to rm_work and/or
BB_NUMBER_THREADS/PARALLEL_MAKE. In my Jenkins builds I enable 'INHERIT +=
"rm_work"' and I also limit BB_NUMBER_THREADS and set PARALLEL_MAKE. On the
command line I was able to reduce the number of failures (sometimes to none in
a given run) by removing rm_work and the THREADS/PARALLEL settings, but never
completely eliminate them. In Jenkins the failures still felt as random as
they were without the change, so I can't say it's having much effect there,
but it does seem to have some effect on the command line.
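For concreteness, the relevant settings in my Jenkins local.conf look roughly
like this (the thread counts here are examples, not my exact values):

```conf
# local.conf fragment used by the Jenkins builds
INHERIT += "rm_work"
BB_NUMBER_THREADS = "8"
PARALLEL_MAKE = "-j 8"
```

Removing these on the command line is what reduced (but never eliminated) the
failures.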
I can say this with certainty: Matthias said the specific recipe that fails
seems random, but it's not. In every case the recipe that fails is one whose
source files are contained in the meta layer itself. For me the failing
recipes were always:
If you look at the recipes for those packages, they do not have a SRC_URI that
fetches code from some remote location and then applies patches with quilt. In
both cases all of the "source" code exists in the layer itself, and somehow
quilt is involved in placing it in the build area.
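To illustrate the pattern (the recipe contents below are invented, not the
real failing recipes): such a recipe's SRC_URI contains only file:// entries
shipped in the layer, and any .patch entries get applied by quilt, which I
believe is the default PATCHTOOL, during do_patch:

```conf
# Hypothetical recipe fragment following the failing pattern: nothing is
# fetched from a remote location; sources and patches all live in the layer.
SRC_URI = "file://mydaemon.c \
           file://mydaemon.init \
           file://0001-fix-startup-race.patch \
           "
S = "${WORKDIR}"
```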
I have dozens and dozens of these failures recorded and it is always a recipe
that follows this pattern. Roughly 99% of the failures are with the two
packages listed above.
The failures aren't related to days when those packages change. The failures
are just... sporadic.
So the issue is related to:
- recipes with in-layer sources
- quilt (being run twice (?))
- updating layers, and rebuilding in a build area with an existing build
- Feb 21 2022 (or thereabouts)
The issue might be related to:
- my build host?