What to expect from a distributed sstate cache?


Mans Zigher <mans.zigher@...>
 

Hi,

This is maybe more related to BitBake, but I'll start by posting it here.
I am trying to make use of a distributed sstate cache for the first time,
but I am getting some unexpected results and wanted to hear whether my
expectations are wrong. Everything works as expected when a build node
uses an sstate cache produced by itself: I do a clean build and upload
the sstate cache from that build to our mirror. If I then do a complete
build using the mirror, I get a 99% hit rate, which is what I would
expect. But if I start a build on a different node using the same cache,
I only get a 16% hit rate. I am running the builds inside Docker, so the
environment should be identical. We have several build nodes in our CI;
they were actually cloned and all of them have the same HW. They all run
the builds in Docker, but it looks like they cannot share the sstate
cache and still get a 99% hit rate. This suggests to me that the sstate
cache hit rate is node dependent, so a cache cannot actually be shared
between different nodes, which is not what I expected. I have not been
able to find any information about this limitation. Any clarification
regarding what to expect from the sstate cache would be appreciated.
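
For reference, the nodes pull from the mirror through an SSTATE_MIRRORS
entry in local.conf roughly like this (the URL is a placeholder):

    SSTATE_MIRRORS ?= "file://.* http://sstate-mirror.example.com/PATH;downloadfilename=PATH"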

Thanks


Alexander Kanavin
 

The recommended setup is to use read/write NFS between build machines, so they all contribute to the cache directly. And yes, if the inputs to a task are identical, then there should be a cache hit.
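
For instance, each node's local.conf can point SSTATE_DIR at the shared
export (the path is only an example):

    SSTATE_DIR = "/nfs/yocto/sstate-cache"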

If you are getting cache misses where you are expecting a cache hit, then bitbake-diffsigs/bitbake-dumpsig may help you debug why.
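
For example, to see why a particular task re-ran (the recipe and task
names here are only placeholders):

    # compare the two most recent signatures recorded for a task
    bitbake-diffsigs -t zlib-native do_configure

    # or compare two saved signature files directly
    bitbake-diffsigs old.siginfo new.siginfo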

Alex




Mikko Rapeli
 

Hi,

We do something similar, except we rsync an sstate mirror to the build
nodes from the latest release before a build (and topics from Gerrit are
merged onto the latest release too, to avoid the sstate cache and build
tree getting too far out of sync).
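
Roughly like this before the build starts (hostnames and paths are made
up):

    # refresh the node's local copy of the sstate mirror
    rsync -a --delete sstate-server:/srv/sstate-mirror/ /srv/sstate-cache/

local.conf then points SSTATE_MIRRORS (or SSTATE_DIR) at /srv/sstate-cache.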

bitbake-diffsigs can tell you why things get rebuilt. The answers
should be there.

Also note that Docker images are not reproducible by default and might
end up having different patch versions of openssl etc. depending on who
built them and when. One way to work around this is to use e.g. the
snapshot.debian.org repos for Debian containers, which give you a
timestamped state of the full package repo used to generate the
container. I've done something similar, but manually, on top of
debootstrap to create a build rootfs tarball for LXC.
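
A minimal sketch of what that looks like in a Dockerfile (the snapshot
date and package list are just examples):

    FROM debian:buster
    # pin apt to one snapshot so every rebuild of the image sees the same package versions
    RUN echo 'deb http://snapshot.debian.org/archive/debian/20200527T000000Z buster main' > /etc/apt/sources.list && \
        apt-get -o Acquire::Check-Valid-Until=false update && \
        apt-get install -y --no-install-recommends \
            build-essential chrpath cpio diffstat gawk git python3 socat texinfo unzip wget xz-utils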

Hope this helps,

-Mikko


Mans Zigher <mans.zigher@...>
 

Hi,

Thanks for the input. Regarding Docker, we build the image ourselves and
use the same image on all nodes, so shouldn't the containers be identical
when the nodes start them?

Thanks,



Martin Jansa
 

There is no limitation like that, but it's quite easy to break. You mentioned an ugly BSP before; I wouldn't be surprised if it's broken there.

What has worked well for me over the years is using this script:
openembedded-core/scripts/sstate-diff-machines.sh
on the Jenkins node that produces the sstate cache. It not only checks for signature issues between MACHINEs (it can be used for a single MACHINE as well), but also creates a directory with all sstate signatures from our builds; this is then published as a tarball together with the built images.

Then, when I see unexpectedly low sstate reuse on some builder, I just run the same sstate-diff-machines.sh locally, fetch the tarball from the build I am trying to reproduce, and compare the contents with bitbake-diffsigs.
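
A typical invocation looks something like this (machine and target names
are just examples; check the script's --help for the exact options):

    openembedded-core/scripts/sstate-diff-machines.sh \
        --machines="qemux86-64 raspberrypi3" \
        --targets=core-image-minimal \
        --tmpdir=tmp \
        --analyze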

Cheers,



Mans Zigher <mans.zigher@...>
 

Thanks, I will have a look at that script.



Mike Looijmans
 

We're sharing the sstate cache across plain Ubuntu LTS machines, all installed from wherever, with no issues with low hit rates.

Even if the build server were running something like Ubuntu 14.04 and "my" machine 16.04, the hit rate would still be pretty good; it would only rebuild the "native" stuff, not the target binaries.

We just share the sstate cache over HTTP (Apache), so it's a one-way thing: you can get sstate objects from the build server, but not the other way around.
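
On the clients this is roughly just an SSTATE_MIRRORS entry pointing at
the server (hostname and path are examples); the server side is plain
Apache serving the sstate-cache directory:

    SSTATE_MIRRORS ?= "file://.* http://buildserver.example.com/sstate-cache/PATH;downloadfilename=PATH"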



Met vriendelijke groet / kind regards,

Mike Looijmans
System Expert


TOPIC Embedded Products B.V.
Materiaalweg 4, 5681 RJ Best
The Netherlands

T: +31 (0) 499 33 69 69
E: mike.looijmans@topicproducts.com
W: www.topicproducts.com

Please consider the environment before printing this e-mail
