
Re: Integrating meta-doubleopen into OE

Philip Balister
 

On 6/2/21 10:05 PM, Joshua Watt wrote:
On Wed, Jun 2, 2021 at 5:43 PM Philip Balister <philip@balister.org> wrote:

On 6/1/21 5:39 PM, Joshua Watt wrote:
All,

Given the recent interest in SBOM support for OE-core, I'm going to
start looking at integrating meta-doubleopen as an SBOM solution for OE.
I've taken a look around in the project, and I think it covers most of
the requirements that I think OE needs for SBOM support:

1) It provides almost all of the information about recipes and packages
that is considered "minimum viable" for SBOM support (based on the
presentations from Kate Stewart that I watched). The only field I've
noticed is lacking is the "Supplier Name" (SPDX "PackageSupplier"). I'm
by no means an SBOM expert, so I'm not quite clear what that field
should be or how to populate it, but I think we can worry about that later.

2) It's self-contained; there is no dependency on external components
or servers.

3) It produces SPDX JSON output, which is a standardized format and
should be fairly easily translated to just about any other format.


It's possible that meta-doubleopen does more than what would be
considered a minimum viable SBOM solution for OE, but it seems quite
useful and I suspect any extra functionality is something we would want
anyway. I'd be curious for Mikko (the author) to chime in and let us
know what works well and what doesn't with the layer.

Short term goal, get an OpenEmbedded presence at the next SBOM plugfest.
I feel this will get us important feedback we can use to shape SBOM
support in openembedded-core. Joshua, do you think we can build an SBOM
from openembedded-core and meta-doubleopen that would be good for the
plugfest?
I think that we can get something pretty good for the plugfest using
meta-doubleopen. Ideally, we would generate something that passes the
SPDX online validator (https://tools.spdx.org/app/validate/). I tried
with the current output, but it complained that our licenses were not
valid SPDX identifiers (which I think is a known issue in OE). I think
a simple mapping table for the GPL(+) licenses is all that is needed.
Hopefully that's all that is needed there, but we shall see. Beyond
that there are probably a few "nice to have" things that are not
strictly necessary, such as the package licenses, the Package
Supplier, and maybe some knobs to make the output smaller (it's about
500MB for core-image-minimal; although I'd *love* to see some of the
SPDX consumers parse it *evil laugh*). Perhaps we can see how much we
get done before the 17th (or earlier if necessary)? I think all of
these changes should be pretty simple and can easily be made in the
meta-doubleopen repository.
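
The mapping table mentioned above could look roughly like the following Python sketch. The table entries and the function name to_spdx_expression are illustrative assumptions, not code from meta-doubleopen or OE-core. It also assumes that the "&" and "|" operators used in OE LICENSE strings correspond to SPDX "AND" and "OR".

```python
import re

# Hypothetical mapping from OE-style license names to SPDX identifiers.
# The entries are examples only; a real table would need to be reviewed.
OE_TO_SPDX = {
    "GPLv2": "GPL-2.0-only",
    "GPLv2+": "GPL-2.0-or-later",
    "GPLv3": "GPL-3.0-only",
    "GPLv3+": "GPL-3.0-or-later",
    "LGPLv2.1": "LGPL-2.1-only",
    "LGPLv2.1+": "LGPL-2.1-or-later",
}

# Assumed correspondence between OE license operators and SPDX operators.
OPERATORS = {"&": "AND", "|": "OR"}


def to_spdx_expression(oe_license: str) -> str:
    """Translate an OE-style LICENSE string into an SPDX license expression.

    Names that are missing from the table are passed through unchanged.
    """
    # Split into parentheses, operators, and license names.
    tokens = re.findall(r"[()&|]|[^\s()&|]+", oe_license)
    out = []
    for tok in tokens:
        if tok in OPERATORS:
            out.append(OPERATORS[tok])
        elif tok in ("(", ")"):
            out.append(tok)
        else:
            out.append(OE_TO_SPDX.get(tok, tok))
    # Join with spaces, then tidy the spacing inside parentheses.
    return " ".join(out).replace("( ", "(").replace(" )", ")")


print(to_spdx_expression("GPLv2+ & (LGPLv2.1 | MIT)"))
```

Passing unknown names through unchanged keeps the sketch small; a stricter version could collect them so that the remaining invalid identifiers are easy to find before running the validator.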


A couple of links:

https://www.ntia.doc.gov/files/ntia/publications/ntia_sbom_tooling_2021-q2-checkpoint.pdf

https://docs.google.com/forms/d/e/1FAIpQLSdVOewc3uCZh39inX4X7QsA_jaQMqyrEiLFrWEZEpWxRCi3eQ/viewform
It's not quite clear to me: are they providing something we should
generate an SBOM from, or do we just make one from whatever and they
will pass it around?
I think we can just make an SBOM for a simple image (core-image-base)
and publish it so people can test it against their tools. Hopefully this
is an accurate thought :) I suppose we should check.

Philip




I am glad to help with registration and attend the meeting. I understand
several people have conflicts on the day of the plugfest who are better
prepared to talk about this stuff, but with some support I am glad to
submit the form.

And crap, I may not be available on the 17th. Let's see what we can
work out.

Philip



Ideally, I think this could be brought into OE-core in some form; does
anyone have any thoughts on this idea? I think that the code as-is needs
a little bit of cleanup to make that happen, but moving it to core would
make that a little easier (it seems there is some duplication between
what meta-doubleopen is doing and other various components in OE).


Thanks,

Joshua Watt





Re: Integrating meta-doubleopen into OE

Joshua Watt
 

On Wed, Jun 2, 2021 at 5:43 PM Philip Balister <philip@balister.org> wrote:

On 6/1/21 5:39 PM, Joshua Watt wrote:
All,

Given the recent interest in SBOM support for OE-core, I'm going to
start looking at integrating meta-doubleopen as an SBOM solution for OE.
I've taken a look around in the project, and I think it covers most of
the requirements that I think OE needs for SBOM support:

1) It provides almost all of the information about recipes and packages
that is considered "minimum viable" for SBOM support (based on the
presentations from Kate Stewart that I watched). The only field I've
noticed is lacking is the "Supplier Name" (SPDX "PackageSupplier"). I'm
by no means an SBOM expert, so I'm not quite clear what that field
should be or how to populate it, but I think we can worry about that later.

2) It's self-contained; there is no dependency on external components
or servers.

3) It produces SPDX JSON output, which is a standardized format and
should be fairly easily translated to just about any other format.


It's possible that meta-doubleopen does more than what would be
considered a minimum viable SBOM solution for OE, but it seems quite
useful and I suspect any extra functionality is something we would want
anyway. I'd be curious for Mikko (the author) to chime in and let us
know what works well and what doesn't with the layer.

Short term goal, get an OpenEmbedded presence at the next SBOM plugfest.
I feel this will get us important feedback we can use to shape SBOM
support in openembedded-core. Joshua, do you think we can build an SBOM
from openembedded-core and meta-doubleopen that would be good for the
plugfest?
I think that we can get something pretty good for the plugfest using
meta-doubleopen. Ideally, we would generate something that passes the
SPDX online validator (https://tools.spdx.org/app/validate/). I tried
with the current output, but it complained that our licenses were not
valid SPDX identifiers (which I think is a known issue in OE). I think
a simple mapping table for the GPL(+) licenses is all that is needed.
Hopefully that's all that is needed there, but we shall see. Beyond
that there are probably a few "nice to have" things that are not
strictly necessary, such as the package licenses, the Package
Supplier, and maybe some knobs to make the output smaller (it's about
500MB for core-image-minimal; although I'd *love* to see some of the
SPDX consumers parse it *evil laugh*). Perhaps we can see how much we
get done before the 17th (or earlier if necessary)? I think all of
these changes should be pretty simple and can easily be made in the
meta-doubleopen repository.


A couple of links:

https://www.ntia.doc.gov/files/ntia/publications/ntia_sbom_tooling_2021-q2-checkpoint.pdf

https://docs.google.com/forms/d/e/1FAIpQLSdVOewc3uCZh39inX4X7QsA_jaQMqyrEiLFrWEZEpWxRCi3eQ/viewform
It's not quite clear to me: are they providing something we should
generate an SBOM from, or do we just make one from whatever and they
will pass it around?


I am glad to help with registration and attend the meeting. I understand
several people have conflicts on the day of the plugfest who are better
prepared to talk about this stuff, but with some support I am glad to
submit the form.

And crap, I may not be available on the 17th. Let's see what we can
work out.

Philip



Ideally, I think this could be brought into OE-core in some form; does
anyone have any thoughts on this idea? I think that the code as-is needs
a little bit of cleanup to make that happen, but moving it to core would
make that a little easier (it seems there is some duplication between
what meta-doubleopen is doing and other various components in OE).


Thanks,

Joshua Watt





Re: Integrating meta-doubleopen into OE

Saul Wold
 

On 6/2/21 3:43 PM, Philip Balister wrote:
On 6/1/21 5:39 PM, Joshua Watt wrote:
All,

Given the recent interest in SBOM support for OE-core, I'm going to
start looking at integrating meta-doubleopen as an SBOM solution for OE.
I've taken a look around in the project, and I think it covers most of
the requirements that I think OE needs for SBOM support:

1) It provides almost all of the information about recipes and packages
that is considered "minimum viable" for SBOM support (based on the
presentations from Kate Stewart that I watched). The only field I've
noticed is lacking is the "Supplier Name" (SPDX "PackageSupplier"). I'm
by no means an SBOM expert, so I'm not quite clear what that field
should be or how to populate it, but I think we can worry about that later.

2) It's self-contained; there is no dependency on external components
or servers.

3) It produces SPDX JSON output, which is a standardized format and
should be fairly easily translated to just about any other format.


It's possible that meta-doubleopen does more than what would be
considered a minimum viable SBOM solution for OE, but it seems quite
useful and I suspect any extra functionality is something we would want
anyway. I'd be curious for Mikko (the author) to chime in and let us
know what works well and what doesn't with the layer.
Short term goal, get an OpenEmbedded presence at the next SBOM plugfest.
I feel this will get us important feedback we can use to shape SBOM
support in openembedded-core. Joshua, do you think we can build an SBOM
from openembedded-core and meta-doubleopen that would be good for the
plugfest?
I am available that day and do have an interest in SBOM; I am starting to read up on it and try out the meta-doubleopen layer. I don't want to duplicate effort.

Sau!

A couple of links:
https://www.ntia.doc.gov/files/ntia/publications/ntia_sbom_tooling_2021-q2-checkpoint.pdf
https://docs.google.com/forms/d/e/1FAIpQLSdVOewc3uCZh39inX4X7QsA_jaQMqyrEiLFrWEZEpWxRCi3eQ/viewform
I am glad to help with registration and attend the meeting. I understand
several people have conflicts on the day of the plugfest who are better
prepared to talk about this stuff, but with some support I am glad to
submit the form.
And crap, I may not be available on the 17th. Let's see what we can
work out.
Philip



Ideally, I think this could be brought into OE-core in some form; does
anyone have any thoughts on this idea? I think that the code as-is needs
a little bit of cleanup to make that happen, but moving it to core would
make that a little easier (it seems there is some duplication between
what meta-doubleopen is doing and other various components in OE).


Thanks,

Joshua Watt







--
Sau!


Re: Integrating meta-doubleopen into OE

Philip Balister
 

On 6/1/21 5:39 PM, Joshua Watt wrote:
All,

Given the recent interest in SBOM support for OE-core, I'm going to
start looking at integrating meta-doubleopen as an SBOM solution for OE.
I've taken a look around in the project, and I think it covers most of
the requirements that I think OE needs for SBOM support:

1) It provides almost all of the information about recipes and packages
that is considered "minimum viable" for SBOM support (based on the
presentations from Kate Stewart that I watched). The only field I've
noticed is lacking is the "Supplier Name" (SPDX "PackageSupplier"). I'm
by no means an SBOM expert, so I'm not quite clear what that field
should be or how to populate it, but I think we can worry about that later.

2) It's self-contained; there is no dependency on external components
or servers.

3) It produces SPDX JSON output, which is a standardized format and
should be fairly easily translated to just about any other format.


It's possible that meta-doubleopen does more than what would be
considered a minimum viable SBOM solution for OE, but it seems quite
useful and I suspect any extra functionality is something we would want
anyway. I'd be curious for Mikko (the author) to chime in and let us
know what works well and what doesn't with the layer.

Short term goal, get an OpenEmbedded presence at the next SBOM plugfest.
I feel this will get us important feedback we can use to shape SBOM
support in openembedded-core. Joshua, do you think we can build an SBOM
from openembedded-core and meta-doubleopen that would be good for the
plugfest?

A couple of links:

https://www.ntia.doc.gov/files/ntia/publications/ntia_sbom_tooling_2021-q2-checkpoint.pdf

https://docs.google.com/forms/d/e/1FAIpQLSdVOewc3uCZh39inX4X7QsA_jaQMqyrEiLFrWEZEpWxRCi3eQ/viewform

I am glad to help with registration and attend the meeting. I understand
several people have conflicts on the day of the plugfest who are better
prepared to talk about this stuff, but with some support I am glad to
submit the form.

And crap, I may not be available on the 17th. Let's see what we can
work out.

Philip



Ideally, I think this could be brought into OE-core in some form; does
anyone have any thoughts on this idea? I think that the code as-is needs
a little bit of cleanup to make that happen, but moving it to core would
make that a little easier (it seems there is some duplication between
what meta-doubleopen is doing and other various components in OE).


Thanks,

Joshua Watt





VS: [licensing] Integrating meta-doubleopen into OE

Mikko Murto
 

In particular:

* the source archiver is problematic in various ways
Agreed. IMHO, one of the bigger problems with the source archiver is that it
has so many modes of operation. Presumably, someone wanted each one,
but it does make it ugly and hard to test. Technically, I believe that the meta-
doubleopen adds "yet another" archiver mode (since it does appear to do
source archiving), but in practice and with a little work it looks like this
overlaps the "patched" mode of archiver.bbclass; I think more evaluation is
necessary to see if the overlap can be eliminated in one way or another.
Yep, there's an archiver step included, for which the layer may not be the best place. The reasoning for it is that we wanted an archive of the files described in the SPDX for the project's workflow to run them through the license scanner. One functionality that I'm not particularly happy about is that we also include the packaged files in the archive. When we only archived the source files and linked the packaged files to their source with dwarfsrcfiles, we noticed that files that are generated during the build process but are not binary are kind of lost here, so we had to upload them too for scanning.


I'm also worried that if we generate the complex SPDX files that
meta-doubleopen does, we'll have people running away. We may need to
default to something simpler with the option of adding in a lot of the
information as unless you're handing things off to fossology or other
tools, it probably is overkill for most users and if default may actually put
people off?

I need to look more into it, but I'm partial to sticking with SPDX if at all
possible. If it can express everything we need I'd rather not invent some
other format that is basically the same thing.
I think the keyword here is *complex* SPDX files. For many use cases an SBOM may simply be a list of the packages in an image. The current default saves a lot of information in addition to this, specifically information about all the source files and packaged files and their relationships to the packages, which is something we need for the current project. A valid SPDX document could omit this file level information and just include the package information, which may be a saner default for a lot of people, at least in the beginning. This is something that in my mind would be a good candidate for configuration: store only the package data by default, but include file level data if so specified in local.conf, for example.
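
As a rough illustration of the package-only default described above, the following Python sketch reduces an SPDX JSON document to package level unless file data is explicitly requested. The function name slim_document and the include_files flag are hypothetical, and the sketch assumes the SPDX 2.2 JSON layout with top-level "packages", "files" and "relationships" lists; it is not how meta-doubleopen implements anything.

```python
import copy


def slim_document(doc: dict, include_files: bool = False) -> dict:
    """Return a copy of an SPDX JSON document, package-level by default.

    include_files is a hypothetical switch standing in for a local.conf
    setting; when it is False, file entries and every relationship that
    touches a file are dropped.
    """
    slim = copy.deepcopy(doc)
    if include_files:
        return slim

    # Remember which SPDXIDs belong to files before removing them.
    file_ids = {f["SPDXID"] for f in slim.pop("files", [])}

    # Packages may list their files; drop that list as well.
    for pkg in slim.get("packages", []):
        pkg.pop("hasFiles", None)

    # Keep only relationships between non-file elements.
    slim["relationships"] = [
        rel
        for rel in slim.get("relationships", [])
        if rel["spdxElementId"] not in file_ids
        and rel["relatedSpdxElement"] not in file_ids
    ]
    return slim
```

Filtering after generation is only one option; skipping the file-level collection during the build would presumably save more time and disk space.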

Mikko


Re: Integrating meta-doubleopen into OE

Joshua Watt
 

On 6/1/21 4:52 PM, Richard Purdie wrote:
On Tue, 2021-06-01 at 16:39 -0500, Joshua Watt wrote:
Given the recent interest in SBOM support for OE-core, I'm going to
start looking at integrating meta-doubleopen as an SBOM solution for OE.
I've taken a look around in the project, and I think it covers most of
the requirements that I think OE needs for SBOM support:

1) It provides almost all of the information about recipes and packages
that is considered "minimum viable" for SBOM support (based on the
presentations from Kate Stewart that I watched). The only field I've
noticed is lacking is the "Supplier Name" (SPDX "PackageSupplier"). I'm
by no means an SBOM expert, so I'm not quite clear what that field
should be or how to populate it, but I think we can worry about that later.

2) It's self-contained; there is no dependency on external components
or servers.

3) It produces SPDX JSON output, which is a standardized format and
should be fairly easily translated to just about any other format.


It's possible that meta-doubleopen does more than what would be
considered a minimum viable SBOM solution for OE, but it seems quite
useful and I suspect any extra functionality is something we would want
anyway. I'd be curious for Mikko (the author) to chime in and let us
know what works well and what doesn't with the layer.


Ideally, I think this could be brought into OE-core in some form; does
anyone have any thoughts on this idea? I think that the code as-is needs
a little bit of cleanup to make that happen, but moving it to core would
make that a little easier (it seems there is some duplication between
what meta-doubleopen is doing and other various components in OE).
I haven't spent as much time looking at the code as you probably have now
but I am conscious that we currently have a pile of code I really don't like
in OE-Core which has major issues and probably isn't fit for purpose.
In particular:

* the source archiver is problematic in various ways
Agreed. IMHO, one of the bigger problems with the source archiver is that it has so many modes of operation. Presumably, someone wanted each one, but it does make it ugly and hard to test. Technically, I believe that the meta-doubleopen adds "yet another" archiver mode (since it does appear to do source archiving), but in practice and with a little work it looks like this overlaps the "patched" mode of archiver.bbclass; I think more evaluation is necessary to see if the overlap can be eliminated in one way or another.

* the populate_lic step is awkward and probably not useful
* we already generate manifests in other formats

I'm wondering if there is a way to inject more information into the packagedata
stores and maybe do something different with source code layout/archiving so
that we end up simplifying parts of the build by adding this support, rather
than adding to what can be a rather complex mess in places.

I've not got a clear thought out plan but I am worried about adding things
without trying to remove some of the legacy pieces. Any thoughts?
Yes, I think that something like this could reduce some of the legacy stuff that exists today. But, we can't do that until we have SBOM support in OE-core first :)

I think based on some of this feedback, perhaps we can start by getting some of the cleanup done in meta-doubleopen and sort of use it as our "unofficial" SBOM solution? At the same time, we can do some work in OE-core to work toward some of the simplifications that will make it easier. Eventually I suspect we will encounter some point where it no longer makes sense for them to be separate (or it's just too hard to replace some legacy component without SBOM support in OE-core) and we can migrate at that time. I would really like to see SBOM support in OE-core, since I think it tells a good user story to be part of core (and TBH, shouldn't be all that complex for OE-core to pull off).



I'm also worried that if we generate the complex SPDX files that meta-doubleopen
does, we'll have people running away. We may need to default to something simpler
with the option of adding in a lot of the information as unless you're handing
things off to fossology or other tools, it probably is overkill for most users
and if default may actually put people off?
I need to look more into it, but I'm partial to sticking with SPDX if at all possible. If it can express everything we need I'd rather not invent some other format that is basically the same thing.


Cheers,

Richard




Re: Integrating meta-doubleopen into OE

Peter Kjellerstedt
 

-----Original Message-----
From: licensing@lists.yoctoproject.org <licensing@lists.yoctoproject.org>
On Behalf Of Mikko Murto
Sent: den 2 juni 2021 11:50
To: Richard Purdie <richard.purdie@linuxfoundation.org>; Joshua Watt
<JPEWhacker@gmail.com>; licensing@lists.yoctoproject.org
Subject: VS: [licensing] Integrating meta-doubleopen into OE

Hi,

Given the recent interest in SBOM support for OE-core, I'm going to
start looking at integrating meta-doubleopen as an SBOM solution for
OE.

Sounds great, I'd be happy to assist in any way I can!

It's possible that meta-doubleopen does more than what would be
considered a minimum viable SBOM solution for OE, but it seems quite
useful and I suspect any extra functionality is something we would
want anyway. I'd be curious for Mikko (the author) to chime in
and let us know what works well and what doesn't with the layer.
In addition to what I think works well and what doesn't I'll try and
describe some of the things that are saved and conventions we're trying
out here also. If something seems odd, please let me know!

The basic data about packages works decently. Three different types of
packages are saved to the packages field in the SPDX, one package
describing the final image, packages for the recipes built and packages
for the sub-packages of the recipes. These are differentiated with
different SPDXIDs: core-image-minimal's image package is
"SPDXRef-Image-core-image-minimal-qemux86-64...", zlib's recipe is
"SPDXRef-Recipe-zlib" and the sub-package zlib-dev is
"SPDXRef-Package-zlib-dev". "SPDXRef" is
required by the SPDX spec and "Image", "Recipe" and "Package" identifies
which of the three the package is. For the recipes we currently save the
declared licensing information, but for the sub-packages we don't save
anything. We could save the information of the recipe for its sub-packages
as well; do you think this information would be accurate?
The licenses of packages default to the same LICENSE as specified for
the recipe. However, it is possible for a recipe to specify different
licenses for the packages by using, e.g., LICENSE_${PN}-dev = "...".
The licenses specified for a package must be a subset of the licenses
specified for the recipe. This is typically done when a recipe produces
a library that is, e.g., using the LGPL-2.1 license, while the main
application may be GPL-3.0. In that case the recipe's LICENSE would
be "GPL-3.0 & LGPL-2.1", while the lib package would use
LICENSE_lib${PN} = "LGPL-2.1". It could even be that it is only the
lib that uses LGPL-2.1, in which case the main package might have a
LICENSE_${PN} = "GPL-3.0".

//Peter
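
The rule described above can be sketched in Python as follows. This is only an illustration: a plain dict stands in for the recipe's variables with ${PN} already expanded, the recipe and package names are made up, and the helper names are hypothetical rather than anything from OE-core.

```python
import re


def license_names(expression: str) -> set:
    """Return the set of license names in an OE-style LICENSE expression."""
    return set(re.findall(r"[^\s()&|]+", expression))


def package_license(recipe_vars: dict, package: str) -> str:
    """Return the effective license of a package.

    A package-specific LICENSE_<package> entry wins; otherwise the
    recipe's LICENSE applies. The result must only use licenses that
    also appear in the recipe's LICENSE.
    """
    recipe_license = recipe_vars["LICENSE"]
    effective = recipe_vars.get(f"LICENSE_{package}", recipe_license)
    if not license_names(effective) <= license_names(recipe_license):
        raise ValueError(f"{package}: licenses are not a subset of the recipe LICENSE")
    return effective


# Hypothetical recipe "foo" with an application and a library package.
recipe_vars = {
    "LICENSE": "GPL-3.0 & LGPL-2.1",
    "LICENSE_libfoo": "LGPL-2.1",
    "LICENSE_foo": "GPL-3.0",
}
for pkg in ("foo", "libfoo", "foo-dev"):
    print(pkg, "->", package_license(recipe_vars, pkg))
```

With this lookup, the sub-packages in the SPDX output could carry their own declared license instead of nothing, falling back to the recipe's license where no override exists.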

These packages are linked to each other with SPDX relationships. Each
recipe is related to its sub-packages such as "SPDXRef-Recipe-zlib
GENERATES SPDXRef-Package-zlib-dev". The image and sub-packages are linked
with relationships like
"SPDXRef-Package-zlib-dev PACKAGE_OF SPDXRef-Image-core-image-minimal",
which are extracted from the IMAGE_MANIFEST.
This all works decently well, I think. One thing that could possibly be
useful additional information here would be some sort of dependency
information describing that a package is included because some other
package depends on it.

In the files we save two different types of files, files included in the
recipes' source and files packaged with the sub-packages. These are again
differentiated with the id, for example "SPDXRef-SourceFile-zlib-1" being
a file in zlib's source and "SPDXRef-PackagedFile-zlib-dev-1" being a file
packaged with the sub-package zlib-dev. These are also linked to the
packages with relationships such as
"SPDXRef-Recipe-zlib CONTAINS SPDXRef-SourceFile-1" and
"SPDXRef-Package-zlib-dev CONTAINS SPDXRef-PackagedFile-1". This seems
to also be in decent shape, if I've understood everything correctly.

The next bit of information is the one where I'm maybe the most uncertain.
For the binary files, we run the dwarfsrcfiles-utility to try to determine
the source files used to build those binaries. Then we try to find those
source files and link them to the binaries with relationships like
"SPDXRef-PackagedFile-zlib-dev-1 GENERATED_FROM SPDXRef-SourceFile-zlib-1".
This is done across package borders, so binaries are related to source
files from glib also for example. Locating these source files based on the
information from dwarfsrcfiles may have some problems. Not all files are
found. The logic for getting the file information is at
https://github.com/doubleopen-project/meta-doubleopen/blob/d4e1d9a4e566ba6e74789f8a9d2376dea808eef3/classes/create-srclist.bbclass#L48-L66
and the not found files are logged at
https://github.com/doubleopen-project/meta-doubleopen/blob/d4e1d9a4e566ba6e74789f8a9d2376dea808eef3/classes/combine-spdx.bbclass#L70.

Is this of any help? As said, if I can help in any way, please let me
know. If a call would be easier at some point, I'm available that way as
well.

I'm also worried that if we generate the complex SPDX files that meta-
doubleopen does, we'll have people running away. We may need to
default to something simpler with the option of adding in a lot of the
information as unless you're handing things off to fossology or other
tools, it probably is overkill for most users and if default may
actually put people off?

I agree that some of the data we currently gather may be quite a lot. We
started with that as it's required for the project we're working on, but
some feature gates could perhaps be used to limit what data is collected
and saved. For a lot of projects, just the packages and their declared
licensing data may very well be enough. For the project currently at hand,
detailed file level information is required. Just the packages may be a
sane default though.

Best regards,
Mikko


VS: [licensing] Integrating meta-doubleopen into OE

Mikko Murto
 

Hi,

Given the recent interest in SBOM support for OE-core, I'm going to
start looking at integrating meta-doubleopen as an SBOM solution for OE.
Sounds great, I'd be happy to assist in any way I can!

It's possible that meta-doubleopen does more than what would be
considered a minimum viable SBOM solution for OE, but it seems quite
useful and I suspect any extra functionality is something we would
want anyway. I'd be curious for Mikko (the author) to chime in
and let us know what works well and what doesn't with the layer.
In addition to what I think works well and what doesn't I'll try and describe some of the things that are saved and conventions we're trying out here also. If something seems odd, please let me know!

The basic data about packages works decently. Three different types of packages are saved to the packages field in the SPDX: one package describing the final image, packages for the recipes built, and packages for the sub-packages of the recipes. These are differentiated with different SPDXIDs: core-image-minimal's image package is "SPDXRef-Image-core-image-minimal-qemux86-64...", zlib's recipe is "SPDXRef-Recipe-zlib" and the sub-package zlib-dev is "SPDXRef-Package-zlib-dev". "SPDXRef" is required by the SPDX spec and "Image", "Recipe" and "Package" identify which of the three the package is. For the recipes we currently save the declared licensing information, but for the sub-packages we don't save anything. We could save the information of the recipe for its sub-packages as well; do you think this information would be accurate?

These packages are linked to each other with SPDX relationships. Each recipe is related to its sub-packages such as "SPDXRef-Recipe-zlib GENERATES SPDXRef-Package-zlib-dev". The image and sub-packages are linked with relationships like "SPDXRef-Package-zlib-dev PACKAGE_OF SPDXRef-Image-core-image-minimal", which are extracted from the IMAGE_MANIFEST. This all works decently well, I think. One thing that could possibly be useful additional information here would be some sort of dependency information describing that a package is included because some other package depends on it.
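
To make the naming and linking conventions above concrete, here is a small Python sketch. The helper names are hypothetical and the code is not taken from meta-doubleopen; it only assumes the SPDX JSON relationship fields "spdxElementId", "relationshipType" and "relatedSpdxElement".

```python
KINDS = ("Image", "Recipe", "Package")


def spdx_id(kind: str, name: str) -> str:
    """Build an SPDXID following the Image/Recipe/Package convention."""
    if kind not in KINDS:
        raise ValueError(f"unknown element kind: {kind}")
    return f"SPDXRef-{kind}-{name}"


def relationship(source: str, rel_type: str, target: str) -> dict:
    """Build one relationship entry in SPDX JSON form."""
    return {
        "spdxElementId": source,
        "relationshipType": rel_type,
        "relatedSpdxElement": target,
    }


def recipe_relationships(recipe, subpackages, image, installed):
    """Link a recipe to its sub-packages, and installed ones to the image.

    installed is the set of package names found in the image manifest.
    """
    rels = []
    for pkg in subpackages:
        rels.append(
            relationship(spdx_id("Recipe", recipe), "GENERATES", spdx_id("Package", pkg))
        )
        if pkg in installed:
            rels.append(
                relationship(spdx_id("Package", pkg), "PACKAGE_OF", spdx_id("Image", image))
            )
    return rels


for rel in recipe_relationships("zlib", ["zlib", "zlib-dev"], "core-image-minimal", {"zlib"}):
    print(rel["spdxElementId"], rel["relationshipType"], rel["relatedSpdxElement"])
```

The image name is shortened here; the real image ID carries the machine and further suffixes as shown above. Dependency information of the kind mentioned above would presumably be expressed as additional relationships between Package elements.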

In the files we save two different types of files, files included in the recipes' source and files packaged with the sub-packages. These are again differentiated with the id, for example "SPDXRef-SourceFile-zlib-1" being a file in zlib's source and "SPDXRef-PackagedFile-zlib-dev-1" being a file packaged with the sub-package zlib-dev. These are also linked to the packages with relationships such as "SPDXRef-Recipe-zlib CONTAINS SPDXRef-SourceFile-1" and "SPDXRef-Package-zlib-dev CONTAINS SPDXRef-PackagedFile-1". This seems to also be in a decent shape, if I've understood everything correctly.

The next bit of information is the one where I'm maybe the most uncertain. For the binary files, we run the dwarfsrcfiles utility to try to determine the source files used to build those binaries. Then we try to find those source files and link them to the binaries with relationships like "SPDXRef-PackagedFile-zlib-dev-1 GENERATED_FROM SPDXRef-SourceFile-zlib-1". This is done across package borders, so binaries are related to source files from glib also for example. Locating these source files based on the information from dwarfsrcfiles may have some problems. Not all files are found. The logic for getting the file information is at https://github.com/doubleopen-project/meta-doubleopen/blob/d4e1d9a4e566ba6e74789f8a9d2376dea808eef3/classes/create-srclist.bbclass#L48-L66 and the not found files are logged at https://github.com/doubleopen-project/meta-doubleopen/blob/d4e1d9a4e566ba6e74789f8a9d2376dea808eef3/classes/combine-spdx.bbclass#L70.
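
The linking step described above can be pictured with the following Python sketch. It is a simplification under stated assumptions, not the logic from the linked bbclass files: the source paths reported for a binary are assumed to be available as a plain list, source_index is a hypothetical lookup from source path to SPDXID covering all recipes, and the paths in the example are invented.

```python
def link_sources(binary_id: str, reported_paths: list, source_index: dict):
    """Relate a packaged binary to the source files it was built from.

    Returns the GENERATED_FROM relationships that could be resolved,
    plus the reported paths that had no match in the index.
    """
    relationships = []
    not_found = []
    for path in reported_paths:
        source_id = source_index.get(path)
        if source_id is None:
            # Keep unresolved paths so they can be logged and reviewed.
            not_found.append(path)
            continue
        relationships.append(
            {
                "spdxElementId": binary_id,
                "relationshipType": "GENERATED_FROM",
                "relatedSpdxElement": source_id,
            }
        )
    return relationships, not_found


# Hypothetical index; the lookup spans recipes, so a binary can be
# related to sources that belong to another recipe.
source_index = {"zlib-1.2.11/inflate.c": "SPDXRef-SourceFile-zlib-1"}
rels, missing = link_sources(
    "SPDXRef-PackagedFile-zlib-dev-1",
    ["zlib-1.2.11/inflate.c", "<built-in>"],
    source_index,
)
print(len(rels), "linked;", len(missing), "not found")
```

Returning the unresolved paths separately makes it easy to measure how many source files are missed, which seems to be the main open question with this step.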

Is this of any help? As said, if I can help in any way, please let me know. If a call would be easier at some point, I'm available that way as well.

I'm also worried that if we generate the complex SPDX files that meta-
doubleopen does, we'll have people running away. We may need to
default to something simpler with the option of adding in a lot of the
information as unless you're handing things off to fossology or other
tools, it probably is overkill for most users and if default may actually put people off?
I agree that some of the data we currently gather may be quite a lot. We started with that as it's required for the project we're working on, but some feature gates could perhaps be used to limit what data is collected and saved. For a lot of projects, just the packages and their declared licensing data may very well be enough. For the project currently at hand, detailed file level information is required. Just the packages may be a sane default though.

Best regards,
Mikko


Re: Integrating meta-doubleopen into OE

keydi
 

Given the recent interest in SBOM support for OE-core, I'm going to start
looking at integrating meta-doubleopen as an SBOM solution for OE.
I've taken a look around in the project, and I think it covers most of the
requirements that I think OE needs for SBOM support:

1) It provides almost all of the information about recipes and packages that is
considered "minimum viable" for SBOM support (based on the presentations
from Kate Stewart that I watched). The only field I've noticed is lacking is the
"Supplier Name" (SPDX "PackageSupplier"). I'm by no means an SBOM expert, so
I'm not quite clear what that field should be or how to populate it, but I think we
can worry about that later.

2) It's self-contained; there is no dependency on external components or
servers.

3) It produces SPDX JSON output, which is a standardized format and should be
fairly easily translated to just about any other format.


It's possible that meta-doubleopen does more than what would be
considered a minimum viable SBOM solution for OE, but it seems quite
useful and I suspect any extra functionality is something we would want
anyway. I'd be curious for Mikko (the author) to chime in and let us
know what works well and what doesn't with the layer.


Ideally, I think this could be brought into OE-core in some form; does
anyone have any thoughts on this idea? I think that the code as-is needs
a little bit of cleanup to make that happen, but moving it to core would
make that a little easier (it seems there is some duplication between
what meta-doubleopen is doing and other various components in OE).
A code base frequently incorporates a wide range of open source software (or more than that).
This results in a set of license types, of the same size or even larger, that projects need to adhere to.
Every license type seems to impose its own requirements on what is delivered to the customer.
There is also a huge set of commonly known copyleft license types.
Projects seem to need a tool chain capable of delivering the range of artifact types
that the complete set of copyleft licenses in use asks for.
This set of artifacts seems to be what a project needs to deliver to customers, which sounds to me like more than an SBOM.
Is this a proper view? If yes, which of these two components (an SBOM, or rather the set of artifacts for customer delivery for OSS compliance)
does OE aim to fulfill?


Re: Integrating meta-doubleopen into OE

Richard Purdie
 

On Tue, 2021-06-01 at 16:39 -0500, Joshua Watt wrote:
Given the recent interest in SBOM support for OE-core, I'm going to
start looking at integrating meta-doubleopen as an SBOM solution for OE.
I've taken a look around in the project, and I think it covers most of
the requirements that I think OE needs for SBOM support:

1) It provides almost all of the information about recipes and packages
that is considered "minimum viable" for SBOM support (based on the
presentations from Kate Stewart that I watched). The only field I've
noticed is lacking is the "Supplier Name" (SPDX "PackageSupplier"). I'm
by no means an SBOM expert, so I'm not quite clear what that field
should be or how to populate it, but I think we can worry about that later.

2) It's self-contained; there is no dependency on external components
or servers.

3) It produces SPDX JSON output, which is a standardized format and
should be fairly easily translated to just about any other format


It's possible that meta-doubleopen does more than what would be
considered a minimum viable SBOM solution for OE, but it seems quite
useful and I suspect any extra functionality is something we would want
anyway. I'd be curious for Mikko (the author) to chime in and let us
know what works well and what doesn't with the layer.


Ideally, I think this could be brought into OE-core in some form; does
anyone have any thoughts on this idea? I think that the code as-is needs
a little bit of cleanup to make that happen, but moving it to core would
make that a little easier (it seems there is some duplication between
what meta-doubleopen is doing and other various components in OE).
I haven't spent as much time looking at the code as you probably have now 
but I am conscious that we currently have a pile of code I really don't like
in OE-Core which has major issues and probably isn't fit for purpose. 
In particular:

* the source archiver is problematic in various ways
* the populate_lic step is awkward and probably not useful
* we already generate manifests in other formats

I'm wondering if there is a way to inject more information into the packagedata
stores and maybe do something different with source code layout/archiving so
that we end up simplifying parts of the build by adding this support, rather
than adding to what can be a rather complex mess in places.

I've not got a clear thought out plan but I am worried about adding things
without trying to remove some of the legacy pieces. Any thoughts?

I'm also worried that if we generate the complex SPDX files that meta-doubleopen
does, we'll have people running away. We may need to default to something simpler
with the option of adding in a lot of the information as unless you're handing
things off to fossology or other tools, it probably is overkill for most users
and if default may actually put people off?

Cheers,

Richard


Integrating meta-doubleopen into OE

Joshua Watt
 

All,

Given the recent interest in SBOM support for OE-core, I'm going to start looking at integrating meta-doubleopen as an SBOM solution for OE. I've taken a look around in the project, and I think it covers most of the requirements that I think OE needs for SBOM support:

1) It provides almost all of the information about recipes and packages that is considered "minimum viable" for SBOM support (based on the presentations from Kate Stewart that I watched). The only field I've noticed is lacking is the "Supplier Name" (SPDX "PackageSupplier"). I'm by no means an SBOM expert, so I'm not quite clear what that field should be or how to populate it, but I think we can worry about that later.

2) It's self-contained; there is no dependency on external components or servers.

3) It produces SPDX JSON output, which is a standardized format and should be fairly easily translated to just about any other format
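To illustrate point 3, here is a minimal sketch of translating one package entry from SPDX JSON into SPDX tag-value lines. It is not code from meta-doubleopen: the function name `package_to_tag_value` is hypothetical, only a handful of fields are handled, and writing NOASSERTION when no supplier is recorded is an assumption about how the "PackageSupplier" gap from point 1 could be handled for now.

```python
# SPDX 2.2 JSON package keys paired with their tag-value tags.
TAGS = [
    ("name", "PackageName"),
    ("SPDXID", "SPDXID"),
    ("versionInfo", "PackageVersion"),
    ("supplier", "PackageSupplier"),
    ("downloadLocation", "PackageDownloadLocation"),
    ("licenseDeclared", "PackageLicenseDeclared"),
]


def package_to_tag_value(pkg):
    """Render one SPDX JSON package dict as tag-value text."""
    lines = []
    for key, tag in TAGS:
        value = pkg.get(key)
        if value is None and key == "supplier":
            # Assumption: emit NOASSERTION until a real supplier is known.
            value = "NOASSERTION"
        if value is not None:
            lines.append(f"{tag}: {value}")
    return "\n".join(lines)


# Illustrative package data, not real layer output.
pkg = {
    "name": "zlib",
    "SPDXID": "SPDXRef-Package-zlib",
    "versionInfo": "1.2.11",
    "licenseDeclared": "Zlib",
}
print(package_to_tag_value(pkg))
```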


It's possible that meta-doubleopen does more than what would be considered a minimum viable SBOM solution for OE, but it seems quite useful and I suspect any extra functionality is something we would want anyway. I'd be curious for Mikko (the author) to chime in and let us know what works well and what doesn't with the layer.


Ideally, I think this could be brought into OE-core in some form; does anyone have any thoughts on this idea? I think that the code as-is needs a little bit of cleanup to make that happen, but moving it to core would make that a little easier (it seems there is some duplication between what meta-doubleopen is doing and other various components in OE).


Thanks,

Joshua Watt


Re: Urgent need for SPDX-formatted bill of materials.

Armin Kuster
 

On 5/18/21 3:02 PM, Richard Purdie wrote:
On Tue, 2021-05-18 at 17:28 -0400, Trevor Woerner wrote:
On Tue, May 18, 2021 at 4:59 PM akuster808 <akuster808@gmail.com> wrote:
On 5/18/21 11:32 AM, Trevor Woerner wrote:
I'd like to recommend this be a round-table topic for next week's OE
Developers Meeting?
If meta-doubleopen addresses the issue for folks, what is the topic of
this Round-table?
I'm still investigating and putting together a set of ideas.
I'd like to try and put some guidance around the discussion if I may?
There are a ton of things we could do here but I think the need
is comparatively clear and pressing. Discussions are good where
the outcome is unknown and options need to be explored. I think
some of this is relatively clear for the reasons I'll mention below.

meta-doubleopen says "This meta layer is intended for use with 
Double Open's open source license compliance workflow". *license*
workflow, we're talking about SBOMs. The fact they produce SPDX 
files isn't all that's required to create an SBOM. SPDX is just
a file format. In fact there's nothing in that layer that says 
anything about SBOM. From what I can tell, all meta-doubleopen is 
producing is an SPDX version of the various manifest files one 
would find if buildhistory is enabled.

SPDX is only one of several file formats that can be used to 
generate an SBOM in a standard way. It could be worth a discussion 
to at least mention the others.
Can I suggest we adopt the position that we aim for SPDX unless someone
produces a strong argument that something else has advantages?

The reason I say this is that it is the standard most projects are
consolidating around, it shows alignment with other work at the LF
and SPDX is aiming to become an ISO standard. To do something different
would put us in a difficult position IMO. People complain that LF 
projects don't collaborate enough, here we have an opportunity I want
to make work.


We could look at what an SBOM is, and what are the minimum required 
fields to produce an SBOM.
We do need to find out what the legislation says about this so we can
meet it.
Why? Neither OE nor YP sells to the US Government. This seems to be more of a
concern for commercial vendors. They are the ones who need to come to
the table and help support these efforts. Since this is such a new
thing, things will change. Remember, this is the US Government: just
because the President said this does not mean each Federal agency will
adopt it verbatim.

Another question for the round table: should we integrate this into 
oe-core, or leave it as a separate layer?
We need to be able to say that OE/YP generates SBOM manifests for
images out the box, preferably by default. If we don't do that, we will
lose out to projects which can claim this. I think that makes the 
decision clear.


The round table is also a great way of introducing this important topic
to the community at large. I bet you half the people attending the 
conference have never heard of an SBOM, but might be interested to 
know YP/OE is looking into integrating it into the build system 
(especially now that the US government has released an Executive Order 
regarding SBOMs, and that the EU is also looking into these sorts of things as well).

I'll look into inviting the DoubleOpen people to the meeting.

Joshua mentioned that the company he works for is also investigating 
generating SBOMs from YP/OE builds, so let's make sure everyone is 
working on one project, instead of scattering the community.

So there are a couple things we could talk about :-)
I think aspects which do need discussion are how to handle:

* SPDX data at the do_package/do_packagedata level
* SPDX data at the archiver and do_populate_lic level
* whether we can replace existing image manifests
* whether we need tooling to take an SPDX image manifest and process it to 
various forms for end user/tool use (e.g. actual file output or API?).
Transitioning to a more accepted format is something YP/OE should
discuss but don't do it to meet the needs of the US Government.

-armin



This probably translates into some kind of plan with different phases.

Cheers,

Richard





Re: Urgent need for SPDX-formatted bill of materials.

Trevor Woerner
 

On Wed, May 19, 2021 at 1:27 PM Saul Wold <Saul.Wold@...> wrote:
+1 on RP's comments here, see below

On 5/18/21 3:02 PM, Richard Purdie wrote:
> On Tue, 2021-05-18 at 17:28 -0400, Trevor Woerner wrote:
>> On Tue, May 18, 2021 at 4:59 PM akuster808 <akuster808@...> wrote:
>>> On 5/18/21 11:32 AM, Trevor Woerner wrote:
>>>> I'd like to recommend this be a round-table topic for next week's OE
>>>> Developers Meeting?
>>> If meta-doubleopen addresses the issue for folks, what is the topic of
>>> this Round-table?
>>>
>>
>>
>> I'm still investigating and putting together a set of ideas.
>
> I'd like to try and put some guidance around the discussion if I may?
> There are a ton of things we could do here but I think the need
> is comparatively clear and pressing. Discussions are good where
> the outcome is unknown and options need to be explored. I think
> some of this is relatively clear for the reasons I'll mention below.
>
>> meta-doubleopen says "This meta layer is intended for use with
>> Double Open's open source license compliance workflow". *license*
>> workflow, we're talking about SBOMs. The fact they produce SPDX
>> files isn't all that's required to create an SBOM. SPDX is just
>> a file format. In fact there's nothing in that layer that says
>> anything about SBOM. From what I can tell, all meta-doubleopen is
>> producing is an SPDX version of the various manifest files one
>> would find if buildhistory is enabled.
>>
>> SPDX is only one of several file formats that can be used to
>> generate an SBOM in a standard way. It could be worth a discussion
>> to at least mention the others.
>
> Can I suggest we adopt the position that we aim for SPDX unless someone
> produces a strong argument that something else has advantages?
>
> The reason I say this is that it is the standard most projects are
> consolidating around, it shows alignment with other work at the LF
> and SPDX is aiming to become an ISO standard. To do something different
> would put us in a difficult position IMO. People complain that LF
> projects don't collaborate enough, here we have an opportunity I want
> to make work.
>
>
>> We could look at what an SBOM is, and what are the minimum required
>> fields to produce an SBOM.
>
> We do need to find out what the legislation says about this so we can
> meet it.
>
https://www.ntia.gov/sbom may be the starting point, I am just getting
around to looking at it. I don't know if there is someone in this thread
(Kate maybe?) that can interpret legislation to technical!

>> Another question for the round table: should we integrate this into
>> oe-core, or leave it as a separate layer?
>
> We need to be able to say that OE/YP generates SBOM manifests for
> images out the box, preferably by default. If we don't do that, we will
> lose out to projects which can claim this. I think that makes the
> decision clear.
>
>
>> The round table is also a great way of introducing this important topic
>> to the community at large. I bet you half the people attending the
>> conference have never heard of an SBOM, but might be interested to
>> know YP/OE is looking into integrating it into the build system
>> (especially now that the US government has released an Executive Order
>> regarding SBOMs, and that the EU is also looking into these sorts of things as well).
>>
>> I'll look into inviting the DoubleOpen people to the meeting.
>>
>> Joshua mentioned that the company he works for is also investigating
>> generating SBOMs from YP/OE builds, so let's make sure everyone is
>> working on one project, instead of scattering the community.
>>
Windriver also needs to provide this data, so there are probably more
people interested for sure.

>> So there are a couple things we could talk about :-)
>
> I think aspects which do need discussion are how to handle:
>
> * SPDX data at the do_package/do_packagedata level
> * SPDX data at the archiver and do_populate_lic level
> * whether we can replace existing image manifests
> * whether we need tooling to take an SPDX image manifest and process it to
>    various forms for end user/tool use (e.g. actual file output or API?).
>
I recently submitted the image-manifest script which produces a JSON
output from the image manifest so limited to what is actually being
built. This was based on a tinfoil script from Paul Eggleton.

This currently just reads the recipe LICENSE info which is human
generated, we need to figure out how to do a better job with SPDX.
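One small piece of doing "a better job with SPDX" could be normalising the human-written recipe LICENSE string into an SPDX license expression. The sketch below assumes only that recipes combine license names with `&` and `|`; the `LEGACY` and `KNOWN` tables are tiny illustrative stand-ins, not the mapping OE actually ships, and the function name `to_spdx_expression` is hypothetical. Names that are not recognised are emitted as `LicenseRef-` identifiers rather than guessed.

```python
import re

# Illustrative tables only; a real mapping would be much larger.
LEGACY = {
    "GPLv2": "GPL-2.0-only",
    "GPLv2+": "GPL-2.0-or-later",
    "LGPLv2.1+": "LGPL-2.1-or-later",
}
KNOWN = {"MIT", "Zlib", "BSD-3-Clause", "Apache-2.0"} | set(LEGACY.values())
OPERATORS = {"&": "AND", "|": "OR"}


def to_spdx_expression(license_str):
    """Convert a recipe-style LICENSE string to an SPDX expression."""
    out = []
    # Tokens are parentheses, operators, or runs of other characters.
    for tok in re.findall(r"[()&|]|[^\s()&|]+", license_str):
        if tok in ("(", ")"):
            out.append(tok)
        elif tok in OPERATORS:
            out.append(OPERATORS[tok])
        elif tok in LEGACY:
            out.append(LEGACY[tok])
        elif tok in KNOWN:
            out.append(tok)
        else:
            # Unknown name: keep it, but as a document-local reference.
            out.append("LicenseRef-" + re.sub(r"[^A-Za-z0-9.-]", "-", tok))
    return " ".join(out).replace("( ", "(").replace(" )", ")")


print(to_spdx_expression("(GPLv2+ | MIT) & Zlib"))
```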

Sounds good.

Saul: will you be able to join us for the discussion next Tuesday?
The developers meeting is free to join (you don't have to be registered for the conference to attend)
 

> This probably translates into some kind of plan with different phases.
>
> Cheers,
>
> Richard
>
>
>
>
>

--
Sau!


Re: Urgent need for SPDX-formatted bill of materials.

Trevor Woerner
 

On Wed, May 19, 2021 at 2:29 AM Konrad Weihmann <kweihmann@...> wrote:
When is the round table going to happen exactly? I'd be interested to join the call, as I've been working on something aimed at exactly such an SBOM.

Next week, on Tuesday and Wednesday the Yocto Project is having a virtual Summit.

As part of the Summit, on Tuesday a 4.5 hour block has been set aside for a developers meeting/round-table on a bunch of different topics of interest to the project. The list of topics is still being worked out; once it is settled we'll assign time slots to the individual topics. I've added SBOM to the list of topics. The exact time will be worked out on Monday. https://www.openembedded.org/wiki/OEDVM_2021

The block of time set aside for the developer meeting is: 1530 UTC to 2000 UTC Tuesday May 25.
The details of the developers meeting (including the link to join) can be found here: https://pretalx.com/yocto-project-summit-2021/talk/BVZMYW/


Re: Urgent need for SPDX-formatted bill of materials.

Konrad Weihmann
 

When is the round table going to happen exactly? I'd be interested to join the call, as I've been working on something aimed at exactly such an SBOM.


Re: Urgent need for SPDX-formatted bill of materials.

Kate Stewart
 

"we only have to concern ourselves with producing a proper, compliant SBOM".

+1  Being able to generate the SBOM as a byproduct of the build is going to have the most trust. 
Yocto is in a unique position to do this,  and provide guidance on extending the next generation
of SPDX as well.   Richard convinced me a couple of years ago that the necessary information is present
in the debug info,  challenge is extracting it out and outputting the document.   

Possible approach 
- mark all licensing as NOASSERTION for now, and focus on the components and mapping
the relationships between them.  
- Next phase, add in the licensing information when it's available as SPDX headers (i.e. no scanning 
tools needed), use declared vs detected to separate out the info at the package level on what you're
getting from sources.
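A rough sketch of the two phases, using SPDX 2.2 JSON field names. It assumes that "declared" maps to `licenseDeclared` and "detected" to `licenseInfoFromFiles`; the helper names and the package data are made up for illustration and do not come from any existing layer.

```python
import json
import re

NOASSERTION = "NOASSERTION"


def spdx_id(name):
    # SPDX identifiers may only contain letters, digits, "." and "-".
    return "SPDXRef-Package-" + re.sub(r"[^A-Za-z0-9.-]", "-", name)


def make_package(name, version, declared=None, detected=None):
    """Phase 1: pass name and version only, so every license field is
    NOASSERTION. Phase 2: pass declared/detected once they are known."""
    return {
        "SPDXID": spdx_id(name),
        "name": name,
        "versionInfo": version,
        "downloadLocation": NOASSERTION,
        "licenseConcluded": NOASSERTION,
        "licenseDeclared": declared or NOASSERTION,
        "licenseInfoFromFiles": sorted(detected) if detected else [NOASSERTION],
    }


def make_relationships(depends):
    """depends maps a package name to the names it depends on."""
    return [
        {
            "spdxElementId": spdx_id(pkg),
            "relationshipType": "DEPENDS_ON",
            "relatedSpdxElement": spdx_id(dep),
        }
        for pkg, deps in sorted(depends.items())
        for dep in sorted(deps)
    ]


packages = [
    make_package("busybox", "1.33.0"),                                   # phase 1
    make_package("glibc", "2.33"),                                       # phase 1
    make_package("zlib", "1.2.11", declared="Zlib", detected={"Zlib"}),  # phase 2
]
relationships = make_relationships({"busybox": ["glibc"], "zlib": ["glibc"]})
print(json.dumps({"packages": packages, "relationships": relationships}, indent=2))
```

Starting with components and relationships means the document structure can be checked before any licensing data is trusted; the license fields are then filled in per package as the information becomes available.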

The example of how it's being done in Zephyr is based on hooking into CMake see:

Kubernetes approach:

AGL might be a good testbed for this capability with Yocto, as there is a PoC starting
in the Auto-ISAC,  and they'll be looking for SBOMs, so many eyes.

In terms of validating the output format produced - 
any document created conforms to the specification.

If there are questions about the way to partition the information, etc.
Steve Winslow and myself are happy to weigh in.

HTH,
Kate





On Tue, May 18, 2021 at 5:15 PM Trevor Woerner <twoerner@...> wrote:
Richard, this is all awesome! Thanks for your input :-)

On Tue, May 18, 2021 at 6:03 PM Richard Purdie <richard.purdie@...> wrote:
* whether we need tooling to take an SPDX image manifest and process it to 
  various forms for end user/tool use (e.g. actual file output or API?).

Kate Stewart recently did a webinar on this topic, you can find the video and slides:  

She also talked about this at the most recent FOSDEM:

I'm thinking of inviting her to the discussion.

If you look at her slides from the webinar, around slide 27 she talks about the ecosystem of tools for working with SBOMs depending on whether you're a producer, consumer, or user of a product. Given what she says, we only have to concern ourselves with producing a proper, compliant SBOM. Other tools in the ecosystem will handle the other things.




Re: Urgent need for SPDX-formatted bill of materials.

Trevor Woerner
 

The LF also recently released this blog post, which mentions YP in a positive light wrt SBOMs:


Re: Urgent need for SPDX-formatted bill of materials.

Trevor Woerner
 

Richard, this is all awesome! Thanks for your input :-)

On Tue, May 18, 2021 at 6:03 PM Richard Purdie <richard.purdie@...> wrote:
* whether we need tooling to take an SPDX image manifest and process it to 
  various forms for end user/tool use (e.g. actual file output or API?).

Kate Stewart recently did a webinar on this topic, you can find the video and slides:  

She also talked about this at the most recent FOSDEM:

I'm thinking of inviting her to the discussion.

If you look at her slides from the webinar, around slide 27 she talks about the ecosystem of tools for working with SBOMs depending on whether you're a producer, consumer, or user of a product. Given what she says, we only have to concern ourselves with producing a proper, compliant SBOM. Other tools in the ecosystem will handle the other things.


Re: Urgent need for SPDX-formatted bill of materials.

Richard Purdie
 

On Tue, 2021-05-18 at 17:28 -0400, Trevor Woerner wrote:
On Tue, May 18, 2021 at 4:59 PM akuster808 <akuster808@gmail.com> wrote:
On 5/18/21 11:32 AM, Trevor Woerner wrote:
I'd like to recommend this be a round-table topic for next week's OE
Developers Meeting?
If meta-doubleopen addresses the issue for folks, what is the topic of
this Round-table?

I'm still investigating and putting together a set of ideas.
I'd like to try and put some guidance around the discussion if I may?
There are a ton of things we could do here but I think the need
is comparatively clear and pressing. Discussions are good where
the outcome is unknown and options need to be explored. I think
some of this is relatively clear for the reasons I'll mention below.

meta-doubleopen says "This meta layer is intended for use with 
Double Open's open source license compliance workflow". *license*
workflow, we're talking about SBOMs. The fact they produce SPDX 
files isn't all that's required to create an SBOM. SPDX is just
a file format. In fact there's nothing in that layer that says 
anything about SBOM. From what I can tell, all meta-doubleopen is 
producing is an SPDX version of the various manifest files one 
would find if buildhistory is enabled.

SPDX is only one of several file formats that can be used to 
generate an SBOM in a standard way. It could be worth a discussion 
to at least mention the others.
Can I suggest we adopt the position that we aim for SPDX unless someone
produces a strong argument that something else has advantages?

The reason I say this is that it is the standard most projects are
consolidating around, it shows alignment with other work at the LF
and SPDX is aiming to become an ISO standard. To do something different
would put us in a difficult position IMO. People complain that LF 
projects don't collaborate enough, here we have an opportunity I want
to make work.


We could look at what an SBOM is, and what are the minimum required 
fields to produce an SBOM.
We do need to find out what the legislation says about this so we can
meet it.

Another question for the round table: should we integrate this into 
oe-core, or leave it as a separate layer?
We need to be able to say that OE/YP generates SBOM manifests for
images out the box, preferably by default. If we don't do that, we will
lose out to projects which can claim this. I think that makes the 
decision clear.


The round table is also a great way of introducing this important topic
to the community at large. I bet you half the people attending the 
conference have never heard of an SBOM, but might be interested to 
know YP/OE is looking into integrating it into the build system 
(especially now that the US government has released an Executive Order 
regarding SBOMs, and that the EU is also looking into these sorts of things as well).

I'll look into inviting the DoubleOpen people to the meeting.

Joshua mentioned that the company he works for is also investigating 
generating SBOMs from YP/OE builds, so let's make sure everyone is 
working on one project, instead of scattering the community.

So there are a couple things we could talk about :-)
I think aspects which do need discussion are how to handle:

* SPDX data at the do_package/do_packagedata level
* SPDX data at the archiver and do_populate_lic level
* whether we can replace existing image manifests
* whether we need tooling to take an SPDX image manifest and process it to 
various forms for end user/tool use (e.g. actual file output or API?).

This probably translates into some kind of plan with different phases.

Cheers,

Richard


Re: Urgent need for SPDX-formatted bill of materials.

Trevor Woerner
 

On Tue, May 18, 2021 at 4:59 PM akuster808 <akuster808@...> wrote:
On 5/18/21 11:32 AM, Trevor Woerner wrote:
> I'd like to recommend this be a round-table topic for next week's OE
> Developers Meeting?
If meta-doubleopen addresses the issue for folks, what is the topic of
this Round-table?

I'm still investigating and putting together a set of ideas.

meta-doubleopen says "This meta layer is intended for use with Double Open's open source license compliance workflow". *license* workflow, we're talking about SBOMs. The fact they produce SPDX files isn't all that's required to create an SBOM. SPDX is just a file format. In fact there's nothing in that layer that says anything about SBOM. From what I can tell, all meta-doubleopen is producing is an SPDX version of the various manifest files one would find if buildhistory is enabled.

SPDX is only one of several file formats that can be used to generate an SBOM in a standard way. It could be worth a discussion to at least mention the others.

We could look at what an SBOM is, and what are the minimum required fields to produce an SBOM.

Another question for the round table: should we integrate this into oe-core, or leave it as a separate layer?

The round table is also a great way of introducing this important topic to the community at large. I bet you half the people attending the conference have never heard of an SBOM, but might be interested to know YP/OE is looking into integrating it into the build system (especially now that the US government has released an Executive Order regarding SBOMs, and that the EU is also looking into these sorts of things as well).

I'll look into inviting the DoubleOpen people to the meeting.

Joshua mentioned that the company he works for is also investigating generating SBOMs from YP/OE builds, so let's make sure everyone is working on one project, instead of scattering the community.

So there are a couple things we could talk about :-)
 
 
>
> Saul, Randy, Joshua, (and anyone else) I'll offer to moderate if you'd
> be interested in joining the discussions?
>
>
>
