Re: sstate causing stripped kernel vs symbols mismatch


Sean McKay
 

I don’t know offhand, but the kernel documentation seems relatively straightforward.

I can start investigating in that direction and see how complex it's actually going to be.
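For reference, the kernel's Documentation/kbuild/reproducible-builds.rst mostly comes down to pinning the build metadata that gets embedded in the image. A minimal sketch of what would need to be exported into the kernel build environment (the values are placeholders, and exactly where this hooks into the recipe is something I'd still have to work out):

    # Sketch only (untested): pin the strings the kernel embeds so repeated
    # builds of the same source produce byte-identical binaries.
    export KBUILD_BUILD_TIMESTAMP="Thu Jan  1 00:00:00 UTC 1970"
    export KBUILD_BUILD_USER="oe-user"
    export KBUILD_BUILD_HOST="oe-host"

That probably still leaves things like embedded absolute paths, so treat it as a starting point rather than the whole job.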

 

When you say that reproducible builds are turned on by default, is there a flag somewhere that turns them off which I'd need to gate these changes behind? Or can the changes be made global, so that reproducibility can't (easily) be turned off?
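For what it's worth, my current (unverified) understanding is that the knob lives in the reproducible_build class and the BUILD_REPRODUCIBLE_BINARIES variable, so gating would look roughly like this in local.conf or a distro config — treat the exact names as an assumption on my part:

    # Hypothetical gating sketch; assumes reproducible_build.bbclass and
    # BUILD_REPRODUCIBLE_BINARIES are still the control points.
    INHERIT += "reproducible_build"
    BUILD_REPRODUCIBLE_BINARIES = "1"
    # Setting this to "0" (or dropping the inherit) would be the opt-out
    # that any kernel changes might need to respect.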

 

 

Do we expect to generally be okay with letting this sort of race condition remain in sstate? I concede that it’s probably okay, since I think the kernel is the only thing with this kind of forking task tree behavior after do_compile, and if we get 100% reproducible builds working, it’s not overly relevant… but it seems like it probably deserves a warning somewhere in the documentation.

 

I can also bring this question to the next technical meeting (I know I just missed one) if it seems like the sort of thing we need to reach consensus on.

 

Cheers!

-Sean

 

 

 

From: Joshua Watt <jpewhacker@...>
Sent: Thursday, April 9, 2020 10:00 AM
To: McKay, Sean <sean.mckay@...>; yocto@...
Subject: Re: [yocto] sstate causing stripped kernel vs symbols mismatch

 

 

On 4/9/20 11:42 AM, Sean McKay wrote:

Anyone have any thoughts or guidance on this?

It seems like a pretty major bug to me.

 

We’re willing to put the work in to fix it, and if it’s not something the upstream community is interested in, I’ll just pick a solution for us and go with it.

But if it’s something that we’d like me to upstream, I’d like some feedback on which path I should start walking down before I start taking things apart.

 

We have had a recent push for reproducible builds (and they are now enabled by default). Do you have any idea how much effort it would take to make the kernel build reproducibly? It's something we probably want anyway, and can add to the automated testing infrastructure to ensure it doesn't regress.

 

 



 

Cheers!

-Sean

 

From: yocto@... <yocto@...> On Behalf Of Sean McKay
Sent: Tuesday, April 7, 2020 12:03 PM
To: yocto@...
Subject: [yocto] sstate causing stripped kernel vs symbols mismatch

 

Hi all,

 

We’ve discovered that (quite frequently) the kernel that we deploy doesn’t match the unstripped one that we’re saving for debug symbols. I’ve traced the issue to an sstate miss for the kernel do_deploy task combined with an sstate hit for do_package_write_rpm. (Side note: we know we have issues with sstate reuse/stamps including things they shouldn’t, which is why we hit this so much. We’re working on that too.)
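For anyone wanting to confirm the same miss/hit split locally, the signature tools are the quickest way I've found to see it (exact invocations may vary by release, so treat these as a sketch):

    # Show why the latest do_deploy signature differs from the cached one
    bitbake-diffsigs -t linux-yocto do_deploy

    # Or ask bitbake to report what it would rebuild and why
    bitbake -S printdiff core-image-minimal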

 

The result is that when our debug rootfs is created (where we added the kernel symbols), it gets the version of the kernel from the sstate-cached RPM files, but since do_deploy had an sstate miss, the entire kernel gets rebuilt to satisfy that dependency chain. Since the kernel doesn’t build reproducibly, the two resulting kernels don’t match each other for debug purposes.

 

So, I have two questions to start:

  1. What is the recommended way to get debug symbols for the kernel, since do_deploy doesn’t seem to have a debug counterpart? (This is why we originally just set things up to add the RPM to the generated debug rootfs; a rough sketch follows this list.)
  2. Does this seem like a bug that should be fixed? If so, what would be the recommended solution (more thoughts below)?
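For context on the debug rootfs mentioned in question 1, this is roughly the shape of what we set up — a simplified, illustrative sketch rather than our exact configuration:

    # local.conf sketch: generate a companion debug filesystem alongside the
    # normal image, then pull the unstripped kernel (the RPM mentioned above)
    # into it on top of the usual *-dbg packages.
    IMAGE_GEN_DEBUGFS = "1"
    IMAGE_FSTYPES_DEBUGFS = "tar.bz2"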

 

Even if there’s a task somewhere that does what I’m looking for, this seems like a bit of a bug. I generally feel like we want to be able to trust sstate, so the fact that forking dependencies that each generate their own sstate objects can be out of sync is a bit scary.

I’ve thought of several ways around this, but I can’t say I like any of them.

  • (extremely gross hack) Create a new task to use instead of do_deploy that depends on do_package_write_rpm. Unpack the restored (or built) RPMs and use those blobs to deploy the kernel and symbols to the image directory.
  • (gross hack with painful effects on build time) Disable sstate for do_package_write_rpm and do_deploy. Possibly replace with sstate logic for the kernel’s do_install step (side question – why doesn’t do_install generate sstate? It seems like it should be able to, since the point is to drop everything into the image directory)
  • (possibly better, but sounds hard) Change the sstate logic so that if anything downstream of a do_compile task needs to be rerun, everything downstream of it needs to be rerun and sstate reuse for that recipe is not allowed (basically all or nothing sstate). Maybe with a flag that’s allowed in the bitbake file to indicate that a recipe does have reproducible builds and that different pieces are allowed to come from sstate in that case.
  • (fix the symptoms but not the problem) Figure out how to get linux-yocto building in a reproducible fashion and pretend the problem doesn’t exist.

 

 

If you’re interested, this is quite easy to reproduce – these are my repro steps:

  • Check out a clean copy of zeus (22.0.2)
  • Add kernel-image to core-image-minimal in whatever fashion you choose (I just dumped it in the RDEPENDS for packagegroup-core-boot for testing)
  • bitbake core-image-minimal
  • bitbake -c clean core-image-minimal linux-yocto (or just wipe your whole build dir, since everything should come from sstate now)
  • Delete the sstate object(s) for linux-yocto’s deploy task.
  • bitbake core-image-minimal
  • Compare the BuildID hashes for the kernel in the two locations using file (you’ll need to use the kernel’s extract-vmlinux script to get it out of the bzImage); a readelf-based alternative is sketched after this list.
    • file tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/rootfs/boot/vmlinux-5.2.28-yocto-standard
    • ./tmp/work-shared/qemux86-64/kernel-source/scripts/extract-vmlinux tmp/deploy/images/qemux86-64/bzImage > vmlinux-deploy && file vmlinux-deploy
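If you’d rather not rely on file’s summary line, readelf shows the same GNU Build ID note directly (same paths as above):

    # The two Build IDs should match, but after the partial sstate hit they don't
    readelf -n tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/rootfs/boot/vmlinux-5.2.28-yocto-standard | grep "Build ID"
    readelf -n vmlinux-deploy | grep "Build ID"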

 

Anyone have thoughts or suggestions?

 

Cheers!

-Sean McKay



