Re: Files get sporadically lost for native packages


Randy MacLeod
 

On 2020-04-02 4:44 a.m., Konrad Weihmann wrote:

To answer your others questions... see below

On 02.04.20 05:42, Randy MacLeod wrote:
On 2020-03-28 8:26 a.m., Konrad Weihmann wrote:
Hi,

I'm facing the following error message sporadically on all branches I tried so far (master, zeus, warrior and thud)

The stack trace of python calls that resulted in this exception/failure was:
File: 'exec_python_func() autogenerated', lineno: 2, function: <module>
      0001:
  *** 0002:extend_recipe_sysroot(d)
      0003:
File: '/build/poky/meta/classes/staging.bbclass', lineno: 551, function: extend_recipe_sysroot
      0547:                    dest = newmanifest[l]
      0548:                    if l.endswith("/"):
      0549:                        staging_copydir(l, targetdir, dest, seendirs)
      0550:                        continue
  *** 0551:                    staging_copyfile(l, targetdir, dest, postinsts, seendirs)
      0552:
      0553:    bb.note("Installed into sysroot: %s" % str(msg_adding))
      0554:    bb.note("Skipping as already exists in sysroot: %s" % str(msg_exists))
      0555:
File: '/build/poky/meta/classes/staging.bbclass', lineno: 152, function: staging_copyfile
      0148:        os.symlink(linkto, dest)
      0149:        #bb.warn(c)
      0150:    else:
      0151:        try:
  *** 0152:            os.link(c, dest)
      0153:        except OSError as err:
      0154:            if err.errno == errno.EXDEV:
      0155:                bb.utils.copyfile(c, dest)
      0156:            else:
Exception: FileNotFoundError: [Errno 2] No such file or directory: '/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/__pycache__/__init__.cpython-37.pyc' -> '/build/poky/build/tmp/work/qemux86_64-mine-linux/core-image-minimal-mine/1.0-r0/recipe-sysroot-native/usr/lib/python3.7/site-packages/msgcheck/__pycache__/__init__.cpython-37.pyc'

I already had a look at the manifest

cat manifest-x86_64-python3-msgcheck-native.populate_sysroot
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/__init__.py
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/po.py
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/msgcheck.py
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/__pycache__/__init__.cpython-37.pyc
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/__pycache__/po.cpython-37.pyc
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/__pycache__/msgcheck.cpython-37.pyc
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/dependency_links.txt
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/requires.txt
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/top_level.txt
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/SOURCES.txt
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/PKG-INFO
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/entry_points.txt
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/bin/msgcheck
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/sysroot-providers/python3-msgcheck-native
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/__pycache__/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/sysroot-providers/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/share/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/bin/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/

which states the file should be there, but when doing

find /build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/__init__.py
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/__pycache__
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/__pycache__/po.cpython-37.pyc
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/__pycache__/msgcheck.cpython-37.pyc
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/po.py
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck/msgcheck.py
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/dependency_links.txt
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/requires.txt
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/top_level.txt
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/SOURCES.txt
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/PKG-INFO
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/lib/python3.7/site-packages/msgcheck-2.8-py3.7.egg-info/entry_points.txt
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/bin
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/bin/msgcheck
/build/poky/build/tmp/sysroots-components/x86_64/python3-msgcheck-native/usr/share

the file isn't there.

This happens to random python packages compiled as native (sometimes even for python-native itself), but (afaik) not for cross or target packages (I'm pretty sure because of the different packaging applied).
So I digged a little into the code classes/sstate.bbclass:sstate_install, which seems to create the sysroot-component dir and the manifest.
There is a gap between the manifest creation (line 285) and the hardlinking (line till 311).

Now my question is there any place where a file potentially could get lost? - at first glance there shouldn't be one - I have to admit that I don't fully understand all this subprocess magic in lib/oe/path.py:copyhardlinktree.
What I do to fix the issue is reopening the manifest and double check for missing files and remove them from the manifest, but this
feels wrong - so any advise is welcome...

Hope that someone more familiar with the topic could have a look.

Hi Konrad,

I'm not really familiar with that code but it's being run buy 1000s of
builder around the world so let's try to eliminate a few possibilities.

When did you start having this problem?
Since the start of the test distribution I'm working on. But also for plain poky builds if I forcefully inject all of the python-native site-packages via local.conf (DEPENDS_class-native += "..."), without actually using them in the recipe scope
How often do you think it's happening: 1 in 3 builds, 1 in 10?
See the other mail - looks like it heavily depends on the host
Tell us about your machine: OS,version, disk, CPUs, ram
See the other mail
Do you do anything special in your conf dir? Send local.conf perhaps.
No custom modification (just for testing the DEPENDS-injection)
Do you have any local bbappends or commits on top of poky or
in other layers?
No
Have you tried to simplify the build to eliminate problems
potentially caused by other layers?
I did - see above
Are you able to reproduce the problem on more than one build machine?
See the other mail
Are you able to reproduce the problem on a different Linux distro?
Not really - Debian 9 was fine all other Hosts are Ubuntu based
Are there other builds or users on the machine that may be causing
extra load?
No the hosts are just being poorly equipped - at least the ones that produce this issue


Hi Konrad,

Thanks for the detailed and complete replies.

I don't think I've seen this error and we do 100s of builds
per day using local many-core systems running Ubuntu-18.04
but with the builds in docker containers using a variety of
OS distributions.

My first *wild* guess is that the problem might go away on the Azure
systems if you allocate more memory. That might be an easy
test to do so that we can confirm that it happens more frequently
when there is a memory constraint. Can you do that test?

I've also BCCed someone who might know someone who
would be interested in fixing Azure + Yocto bugs. Let's see
if they can help. :)

It would also be helpful if you created a defect in:

   https://bugzilla.yoctoproject.org/

and hopefully add a patch in that defect including the -native recipes that
are required to make the problem happen.

Thanks,

../Randy




../Randy


Thanks

Konrad









    


-- 
# Randy MacLeod
# Wind River Linux

Join yocto@lists.yoctoproject.org to automatically receive all group messages.