Re: Integrating meta-doubleopen into OE

Joshua Watt

On 6/1/21 4:52 PM, Richard Purdie wrote:
On Tue, 2021-06-01 at 16:39 -0500, Joshua Watt wrote:
Given the recent interest in SBOM support for OE-core, I'm going to
start looking at integrating meta-doubleopen as an SBOM solution for OE.
I've taken a look around in the project, and I think it covers most of
the requirements that I think OE needs for SBOM support:

1) It provides almost all of the information about recipes and packages
that is considered "minimum viable" for SBOM support (based on the
presentations from Kate Stewart that I watched). The only field I've
noticed is lacking is the "Supplier Name" (SPDX "PackageSupplier"). I'm
by no means an SBOM expert, so I'm not quite clear what that field
should be or how to populate it, but I think we can worry about that later.

2) It's self-contained; there is no dependency on external components
or servers.

3) It produces SPDX JSON output, which is a standardized format and
should be fairly easy to translate to just about any other format.
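
As a rough illustration of points 1 and 3, here is a minimal Python sketch that renders a few fields of an SPDX 2.2 JSON document as SPDX tag-value text and lists packages with no supplier set. It is not taken from meta-doubleopen; the example document, the recipe name "example-recipe" and both function names are hypothetical, and only a handful of fields are mapped.

```python
# Sketch only: covers a small subset of SPDX 2.2 package fields.
# (JSON key, tag-value tag) pairs, in the order they are written out.
PACKAGE_TAGS = [
    ("name", "PackageName"),
    ("SPDXID", "SPDXID"),
    ("versionInfo", "PackageVersion"),
    ("supplier", "PackageSupplier"),
    ("downloadLocation", "PackageDownloadLocation"),
    ("licenseDeclared", "PackageLicenseDeclared"),
]


def spdx_json_to_tag_value(doc):
    """Render document-level and package-level fields as tag-value text."""
    lines = [
        f"SPDXVersion: {doc['spdxVersion']}",
        f"DataLicense: {doc['dataLicense']}",
        f"SPDXID: {doc['SPDXID']}",
        f"DocumentName: {doc['name']}",
    ]
    for pkg in doc.get("packages", []):
        lines.append("")  # blank line between sections for readability
        for key, tag in PACKAGE_TAGS:
            if key in pkg:  # fields absent from the JSON are simply skipped
                lines.append(f"{tag}: {pkg[key]}")
    return "\n".join(lines)


def packages_missing_supplier(doc):
    """Return the names of packages that have no 'supplier' value."""
    return [p["name"] for p in doc.get("packages", []) if not p.get("supplier")]


# Hypothetical document, shaped like SPDX 2.2 JSON but made up for this example.
EXAMPLE = {
    "spdxVersion": "SPDX-2.2",
    "dataLicense": "CC0-1.0",
    "SPDXID": "SPDXRef-DOCUMENT",
    "name": "example-image",
    "packages": [
        {
            "name": "example-recipe",
            "SPDXID": "SPDXRef-Package-example-recipe",
            "versionInfo": "1.0",
            "downloadLocation": "NOASSERTION",
            "licenseDeclared": "MIT",
        }
    ],
}

if __name__ == "__main__":
    print(spdx_json_to_tag_value(EXAMPLE))
    print("missing supplier:", packages_missing_supplier(EXAMPLE))
```

A real document would be read with `json.load()` from whatever file the layer writes; a full converter would also need to handle files, relationships and extracted license text.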

It's possible that meta-doubleopen does more than what would be
considered a minimum viable SBOM solution for OE, but it seems quite
useful and I suspect any extra functionality is something we would want
anyway. I'd be curious for Mikko (the author) to chime in and let us
know what works well and what doesn't with the layer.

Ideally, I think this could be brought into OE-core in some form; does
anyone have any thoughts on this idea? I think that the code as-is needs
a little bit of cleanup to make that happen, but moving it to core would
make that a little easier (it seems there is some duplication between
what meta-doubleopen is doing and other various components in OE).
I haven't spent as much time looking at the code as you probably have now
but I am conscious that we currently have a pile of code I really don't like
in OE-Core which has major issues and probably isn't fit for purpose.
In particular:

* the source archiver is problematic in various ways
Agreed. IMHO, one of the bigger problems with the source archiver is that it has so many modes of operation. Presumably, someone wanted each one, but it does make it ugly and hard to test. Technically, I believe that meta-doubleopen adds "yet another" archiver mode (since it does appear to do source archiving), but in practice, and with a little work, it looks like this overlaps the "patched" mode of archiver.bbclass; I think more evaluation is necessary to see if the overlap can be eliminated in one way or another.

* the populate_lic step is awkward and probably not useful
* we already generate manifests in other formats

I'm wondering if there is a way to inject more information into the packagedata
stores and maybe do something different with source code layout/archiving so
that we end up simplifying parts of the build by adding this support, rather
than adding to what can be a rather complex mess in places.

I've not got a clear thought out plan but I am worried about adding things
without trying to remove some of the legacy pieces. Any thoughts?
Yes, I think that something like this could reduce some of the legacy stuff that exists today. But, we can't do that until we have SBOM support in OE-core first :)

I think, based on some of this feedback, perhaps we can start by getting some of the cleanup done in meta-doubleopen and sort of use it as our "unofficial" SBOM solution? At the same time, we can do some work in OE-core toward some of the simplifications that will make it easier. Eventually I suspect we will encounter some point where it no longer makes sense for them to be separate (or it's just too hard to replace some legacy component without SBOM support in OE-core) and we can migrate at that time. I would really like to see SBOM support in OE-core, since I think it tells a good user story to be part of core (and TBH, shouldn't be all that complex for OE-core to pull off).

I'm also worried that if we generate the complex SPDX files that meta-doubleopen
does, we'll have people running away. We may need to default to something simpler
with the option of adding in a lot of the information, as unless you're handing
things off to fossology or other tools, it probably is overkill for most users
and, if it's the default, may actually put people off?
I need to look more into it, but I'm partial to sticking with SPDX if at all possible. If it can express everything we need I'd rather not invent some other format that is basically the same thing.
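
One way to picture a "simpler default" that is still SPDX: keep the package-level information and drop the per-file detail. The Python sketch below is only an illustration under assumptions, not something meta-doubleopen does. It assumes the SPDX 2.2 JSON layout with a top-level "files" list and per-package "hasFiles" references, the function name and example document are hypothetical, and the result has not been validated against the SPDX schema.

```python
import copy


def simplify_spdx(doc):
    """Return a package-level copy of an SPDX JSON document without file entries."""
    slim = copy.deepcopy(doc)  # leave the caller's document untouched

    # Remember which SPDX IDs belonged to files so relationships can be pruned.
    file_ids = {f["SPDXID"] for f in slim.pop("files", [])}

    for pkg in slim.get("packages", []):
        # These fields only make sense when file-level data is present.
        pkg.pop("hasFiles", None)
        pkg.pop("packageVerificationCode", None)
        pkg.pop("licenseInfoFromFiles", None)
        pkg["filesAnalyzed"] = False

    if "relationships" in slim:
        # Drop any relationship that points at or from a removed file.
        slim["relationships"] = [
            r
            for r in slim["relationships"]
            if r["spdxElementId"] not in file_ids
            and r["relatedSpdxElement"] not in file_ids
        ]
    return slim


# Hypothetical input, made up for this example.
EXAMPLE_FULL = {
    "spdxVersion": "SPDX-2.2",
    "SPDXID": "SPDXRef-DOCUMENT",
    "name": "example-recipe",
    "packages": [
        {
            "name": "example-recipe",
            "SPDXID": "SPDXRef-Package",
            "filesAnalyzed": True,
            "hasFiles": ["SPDXRef-File-1"],
            "licenseDeclared": "MIT",
        }
    ],
    "files": [{"SPDXID": "SPDXRef-File-1", "fileName": "./src/main.c"}],
    "relationships": [
        {
            "spdxElementId": "SPDXRef-DOCUMENT",
            "relationshipType": "DESCRIBES",
            "relatedSpdxElement": "SPDXRef-Package",
        },
        {
            "spdxElementId": "SPDXRef-Package",
            "relationshipType": "CONTAINS",
            "relatedSpdxElement": "SPDXRef-File-1",
        },
    ],
}

if __name__ == "__main__":
    slim = simplify_spdx(EXAMPLE_FULL)
    print(sorted(slim.keys()))
    print(len(slim["relationships"]), "relationship(s) kept")
```

The point of the sketch is that both the detailed and the reduced output can share one format, so a tool like fossology could consume the full version while most users see the smaller one.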


