[list address fixed, sorry]
We've been having bandwidth trouble with downloads.yoctoproject.org, so we did some quick analysis to see what the issue is. Basically, in speeding up the server, which was the rate limit, we hit the limits of the hosting pipe. I'd note a few things:
a) it isn't the sstate mirroring, it is nearly all being used by downloads.
b) 25% of all our bandwidth is going on "git2_sourceware.org.git.binutils-gdb.git.tar.gz" - i.e. downloading the source mirror binutils tarball
c) 15% is on git2_sourceware.org.git.glibc.git.tar.gz i.e. glibc
d) OE-Core has downloads.yoctoproject.org as a MIRROR
e) poky has it as a PREMIRROR
What are our options? As far as I can see we could:
a) increase the pipe from downloads.yoctoproject.org but that does come at a non-trivial cost to the project.
b) Seek help with hosting some of the larger mirror tarballs from people better able to host them and have that as a first premirror?
c) Switch the binutils and glibc recipes to tarballs and patches. I know Khem finds this less convenient and they keep moving back and forward but we keep running into this issue and having to switch back from git.
d) To soften the blow of c) we could add devupstream support to the recipes? We could script updating the recipe to add the patches?
e) We could drop the PREMIRRORS from poky. This would stop the SCM targets from hitting our mirrors first. That does transfer load to the upstream project SCMs though and I'm not sure that will be appreciated. I did send that patch, I'm not sure about it though.
We are going to need to do *something* though as the current situation can't continue. I'm open to other ideas...
Cheers,
Richard
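
For anyone wanting to try option e) locally, a minimal sketch follows. It assumes the setting goes into local.conf, that the distro configuration only supplies a default for PREMIRRORS, and that no inherited class re-adds entries. None of that has been verified in this thread.

```conf
# local.conf (sketch, untested): clear the distro-provided premirrors so that
# SCM fetches try the upstream repository first. The MIRRORS entries from
# OE-Core are left alone and still act as a fallback.
PREMIRRORS = ""
```

This only changes the order in which sources are tried. It does not reduce the amount of data a full git fetch transfers.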
On Wed, 2022-03-30 at 11:42 +0100, Richard Purdie via lists.yoctoproject.org wrote:
> What are our options? As far as I can see we could:
>
> a) increase the pipe from downloads.yoctoproject.org but that does come at a non-trivial cost to the project.
> b) Seek help with hosting some of the larger mirror tarballs from people better able to host them and have that as a first premirror?
> c) Switch the binutils and glibc recipes to tarballs and patches. I know Khem finds this less convenient and they keep moving back and forward but we keep running into this issue and having to switch back from git.
> d) To soften the blow of c) we could add devupstream support to the recipes? We could script updating the recipe to add the patches?
> e) We could drop the PREMIRRORS from poky. This would stop the SCM targets from hitting our mirrors first. That does transfer load to the upstream project SCMs though and I'm not sure that will be appreciated. I did send that patch, I'm not sure about it though.

I meant to add:

f) Switch the problematic recipes to use shallow clones with something like:

BB_GIT_SHALLOW:pn-binutils = "1"
BB_GIT_SHALLOW:pn-binutils-cross-${TARGET_ARCH} = "1"
BB_GIT_SHALLOW:pn-binutils-cross-canadian-${TRANSLATED_TARGET_ARCH} = "1"
BB_GIT_SHALLOW:pn-binutils-cross-testsuite = "1"
BB_GIT_SHALLOW:pn-binutils-crosssdk-${SDK_SYS} = "1"
BB_GIT_SHALLOW:pn-glibc = "1"

The challenge here is that in order to be effective, there needs to be a PREMIRROR setup with the shallow tarballs on it. This means we couldn't do e) above and have this have much effect unless we craft some very specific PREMIRROR entries too.

Cheers,

Richard
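
The shallow tarballs mentioned in f) have to be produced somewhere before a premirror can serve them. A minimal sketch of the settings on a machine that populates the mirror, using the documented BitBake variables. Whether the project's own mirror would be populated this way is an assumption.

```conf
# Mirror-population build (sketch): fetch with shallow support enabled and
# write shallow mirror tarballs into DL_DIR for later upload.
BB_GIT_SHALLOW = "1"
# Depth 1 is already the default; it is spelled out here for clarity.
BB_GIT_SHALLOW_DEPTH = "1"
BB_GENERATE_SHALLOW_TARBALLS = "1"
```

The per-recipe `BB_GIT_SHALLOW:pn-...` form from f) could be used in place of the global setting to limit this to binutils and glibc.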
On Wed, 30 Mar 2022 at 12:10, Richard Purdie <richard.purdie@...> wrote:
> f) Switch the problematic recipes to use shallow clones with something like:
>
> BB_GIT_SHALLOW:pn-binutils = "1"
> BB_GIT_SHALLOW:pn-binutils-cross-${TARGET_ARCH} = "1"
> BB_GIT_SHALLOW:pn-binutils-cross-canadian-${TRANSLATED_TARGET_ARCH} = "1"
> BB_GIT_SHALLOW:pn-binutils-cross-testsuite = "1"
> BB_GIT_SHALLOW:pn-binutils-crosssdk-${SDK_SYS} = "1"
> BB_GIT_SHALLOW:pn-glibc = "1"
>
> The challenge here is that in order to be effective, there needs to be a PREMIRROR setup with the shallow tarballs on it. This means we couldn't do e) above and have this have much effect unless we craft some very specific PREMIRROR entries too.

Even without premirrors this is a lot faster for glibc:

$ time git clone git://sourceware.org/git/glibc.git
Cloning into 'glibc'...
remote: Enumerating objects: 6956, done.
remote: Counting objects: 100% (6956/6956), done.
remote: Compressing objects: 100% (2938/2938), done.
remote: Total 670093 (delta 5328), reused 4750 (delta 3932), pack-reused 663137
Receiving objects: 100% (670093/670093), 205.19 MiB | 16.39 MiB/s, done.
Resolving deltas: 100% (573265/573265), done.
Updating files: 100% (19011/19011), done.

real 1m56.255s

$ time git clone git://sourceware.org/git/glibc.git --depth 1
Cloning into 'glibc'...
remote: Enumerating objects: 18809, done.
remote: Counting objects: 100% (18809/18809), done.
remote: Compressing objects: 100% (9704/9704), done.
remote: Total 18809 (delta 8812), reused 12185 (delta 7968), pack-reused 0
Receiving objects: 100% (18809/18809), 41.79 MiB | 11.96 MiB/s, done.
Resolving deltas: 100% (8812/8812), done.
Updating files: 100% (19011/19011), done.

real 0m8.701s

A full clone fetches 200MB and takes 2 minutes (a lot of that is actually resolving the deltas, not the fetch). A shallow clone of the current HEAD fetches 40MB and is done in 8 seconds.

Why would we need a premirror?

Ross
On Wed, 2022-03-30 at 12:18 +0100, Ross Burton wrote:
> On Wed, 30 Mar 2022 at 12:10, Richard Purdie <richard.purdie@...> wrote:
> > f) Switch the problematic recipes to use shallow clones with something like:
> >
> > BB_GIT_SHALLOW:pn-binutils = "1"
> > BB_GIT_SHALLOW:pn-binutils-cross-${TARGET_ARCH} = "1"
> > BB_GIT_SHALLOW:pn-binutils-cross-canadian-${TRANSLATED_TARGET_ARCH} = "1"
> > BB_GIT_SHALLOW:pn-binutils-cross-testsuite = "1"
> > BB_GIT_SHALLOW:pn-binutils-crosssdk-${SDK_SYS} = "1"
> > BB_GIT_SHALLOW:pn-glibc = "1"
> >
> > The challenge here is that in order to be effective, there needs to be a PREMIRROR setup with the shallow tarballs on it. This means we couldn't do e) above and have this have much effect unless we craft some very specific PREMIRROR entries too.
>
> Even without premirrors this is a lot faster for glibc:
>
> $ time git clone git://sourceware.org/git/glibc.git
> Cloning into 'glibc'...
> remote: Enumerating objects: 6956, done.
> remote: Counting objects: 100% (6956/6956), done.
> remote: Compressing objects: 100% (2938/2938), done.
> remote: Total 670093 (delta 5328), reused 4750 (delta 3932), pack-reused 663137
> Receiving objects: 100% (670093/670093), 205.19 MiB | 16.39 MiB/s, done.
> Resolving deltas: 100% (573265/573265), done.
> Updating files: 100% (19011/19011), done.
>
> real 1m56.255s
>
> $ time git clone git://sourceware.org/git/glibc.git --depth 1
> Cloning into 'glibc'...
> remote: Enumerating objects: 18809, done.
> remote: Counting objects: 100% (18809/18809), done.
> remote: Compressing objects: 100% (9704/9704), done.
> remote: Total 18809 (delta 8812), reused 12185 (delta 7968), pack-reused 0
> Receiving objects: 100% (18809/18809), 41.79 MiB | 11.96 MiB/s, done.
> Resolving deltas: 100% (8812/8812), done.
> Updating files: 100% (19011/19011), done.
>
> real 0m8.701s
>
> A full clone fetches 200MB and takes 2 minutes (a lot of that is actually resolving the deltas, not the fetch). A shallow clone of the current HEAD fetches 40MB and is done in 8 seconds.
>
> Why would we need a premirror?

The code doesn't do "--depth=1".

https://git.yoctoproject.org/poky/commit/?id=27d56982c7ba05e86a100b0cca2411ee5ac7a85e

"""
This implements support for shallow mirror tarballs, not shallow clones.
Supporting shallow clones directly is not really doable for us, as we'd need
to hardcode the depth between branch HEAD and the SRCREV, and that depth would
change as the branch is updated.
"""

Put another way, you didn't specify a revision in your clone above and if you try, it becomes rather tricky.

To make this work we therefore need a mirror with the shallow tarballs on it.

Just for info, the binutils mirror tarball is ~1.3GB, the shallow tarball is 65MB.

Cheers,

Richard
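
On the consuming side, the "very specific PREMIRROR entries" mentioned earlier might look roughly like the sketch below. The mirror URL is hypothetical, the entry format can differ between releases, and this has not been tested against the real mirror layout.

```conf
# Consumer build (sketch, untested): enable shallow handling for glibc and
# point only that repository at a mirror that hosts the shallow tarball.
BB_GIT_SHALLOW:pn-glibc = "1"

# mirror.example.com is a placeholder, not a real project server.
PREMIRRORS:prepend = "git://sourceware.org/git/glibc.git https://mirror.example.com/sources/ "
```

With an entry like this, other git recipes would not consult that mirror first, which is the point of keeping the pattern narrow.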
On 30/03/2022 11:42:46+0100, Richard Purdie wrote:
> [list address fixed, sorry]
>
> We've been having bandwidth trouble with downloads.yoctoproject.org, so we did some quick analysis to see what the issue is. Basically, in speeding up the server, which was the rate limit, we hit the limits of the hosting pipe. I'd note a few things:
>
> a) it isn't the sstate mirroring, it is nearly all being used by downloads.
> b) 25% of all our bandwidth is going on "git2_sourceware.org.git.binutils-gdb.git.tar.gz" - i.e. downloading the source mirror binutils tarball
> c) 15% is on git2_sourceware.org.git.glibc.git.tar.gz i.e. glibc
> d) OE-Core has downloads.yoctoproject.org as a MIRROR
> e) poky has it as a PREMIRROR
>
> What are our options? As far as I can see we could:
>
> a) increase the pipe from downloads.yoctoproject.org but that does come at a non-trivial cost to the project.
> b) Seek help with hosting some of the larger mirror tarballs from people better able to host them and have that as a first premirror?
> c) Switch the binutils and glibc recipes to tarballs and patches. I know Khem finds this less convenient and they keep moving back and forward but we keep running into this issue and having to switch back from git.
> d) To soften the blow of c) we could add devupstream support to the recipes? We could script updating the recipe to add the patches?
> e) We could drop the PREMIRRORS from poky. This would stop the SCM targets from hitting our mirrors first. That does transfer load to the upstream project SCMs though and I'm not sure that will be appreciated. I did send that patch, I'm not sure about it though.

I would simply drop PREMIRRORS, this is actually a privacy concern for some of our customers that didn't realize they are leaking the names of their internal git repositories to downloads.yoctoproject.org.

--
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com
On 3/30/22 09:53, Alexandre Belloni via lists.yoctoproject.org wrote:
> On 30/03/2022 11:42:46+0100, Richard Purdie wrote:
> > [list address fixed, sorry]
> >
> > We've been having bandwidth trouble with downloads.yoctoproject.org, so we did some quick analysis to see what the issue is. Basically, in speeding up the server, which was the rate limit, we hit the limits of the hosting pipe. I'd note a few things:
> >
> > a) it isn't the sstate mirroring, it is nearly all being used by downloads.
> > b) 25% of all our bandwidth is going on "git2_sourceware.org.git.binutils-gdb.git.tar.gz" - i.e. downloading the source mirror binutils tarball
> > c) 15% is on git2_sourceware.org.git.glibc.git.tar.gz i.e. glibc
> > d) OE-Core has downloads.yoctoproject.org as a MIRROR
> > e) poky has it as a PREMIRROR
> >
> > What are our options? As far as I can see we could:
> >
> > a) increase the pipe from downloads.yoctoproject.org but that does come at a non-trivial cost to the project.
> > b) Seek help with hosting some of the larger mirror tarballs from people better able to host them and have that as a first premirror?
> > c) Switch the binutils and glibc recipes to tarballs and patches. I know Khem finds this less convenient and they keep moving back and forward but we keep running into this issue and having to switch back from git.
> > d) To soften the blow of c) we could add devupstream support to the recipes? We could script updating the recipe to add the patches?
> > e) We could drop the PREMIRRORS from poky. This would stop the SCM targets from hitting our mirrors first. That does transfer load to the upstream project SCMs though and I'm not sure that will be appreciated. I did send that patch, I'm not sure about it though.
>
> I would simply drop PREMIRRORS, this is actually a privacy concern for some of our customers that didn't realize they are leaking the names of their internal git repositories to downloads.yoctoproject.org.

Indeed, that would be concerning for us as well. Would it be possible to ignore PREMIRRORS based on the recipe layer?

Alternatively, we could create blocklists for heavy packages that need to fetch from upstream first rather than drop PREMIRRORS completely. Sometimes, having a secondary source could save valuable time when the upstream is not responsive.
On Wed, 2022-03-30 at 10:05 -0400, Claude Bing wrote:
> On 3/30/22 09:53, Alexandre Belloni via lists.yoctoproject.org wrote:
> > On 30/03/2022 11:42:46+0100, Richard Purdie wrote:
> > > [list address fixed, sorry]
> > >
> > > We've been having bandwidth trouble with downloads.yoctoproject.org, so we did some quick analysis to see what the issue is. Basically, in speeding up the server, which was the rate limit, we hit the limits of the hosting pipe. I'd note a few things:
> > >
> > > a) it isn't the sstate mirroring, it is nearly all being used by downloads.
> > > b) 25% of all our bandwidth is going on "git2_sourceware.org.git.binutils-gdb.git.tar.gz" - i.e. downloading the source mirror binutils tarball
> > > c) 15% is on git2_sourceware.org.git.glibc.git.tar.gz i.e. glibc
> > > d) OE-Core has downloads.yoctoproject.org as a MIRROR
> > > e) poky has it as a PREMIRROR
> > >
> > > What are our options? As far as I can see we could:
> > >
> > > a) increase the pipe from downloads.yoctoproject.org but that does come at a non-trivial cost to the project.
> > > b) Seek help with hosting some of the larger mirror tarballs from people better able to host them and have that as a first premirror?
> > > c) Switch the binutils and glibc recipes to tarballs and patches. I know Khem finds this less convenient and they keep moving back and forward but we keep running into this issue and having to switch back from git.
> > > d) To soften the blow of c) we could add devupstream support to the recipes? We could script updating the recipe to add the patches?
> > > e) We could drop the PREMIRRORS from poky. This would stop the SCM targets from hitting our mirrors first. That does transfer load to the upstream project SCMs though and I'm not sure that will be appreciated. I did send that patch, I'm not sure about it though.
> >
> > I would simply drop PREMIRRORS, this is actually a privacy concern for some of our customers that didn't realize they are leaking the names of their internal git repositories to downloads.yoctoproject.org.
>
> Indeed, that would be concerning for us as well. Would it be possible to ignore PREMIRRORS based on the recipe layer?
>
> Alternatively, we could create blocklists for heavy packages that need to fetch from upstream first rather than drop PREMIRRORS completely. Sometimes, having a secondary source could save valuable time when the upstream is not responsive.

We don't have any support for "per-layer" overrides at this time, which would be the way to do that. It is something I think we probably do want to consider adding but I haven't had the bandwidth to look at it.

I'd note that these mirrors in PREMIRRORS are also in MIRRORS already in OE-Core so there is a fallback, it just controls the order they're tried in.

Cheers,

Richard
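
Per-layer overrides do not exist, but per-recipe (`pn-`) overrides do, so the "blocklist for heavy packages" idea could in principle be sketched as below. This is an untested assumption about how the override interacts with the distro's PREMIRRORS default and with any prepends. The recipe names are taken from the shallow-clone list earlier in the thread.

```conf
# Sketch (untested): make selected heavy recipes skip the premirrors so they
# fetch from upstream first. MIRRORS is untouched and remains a fallback.
PREMIRRORS:pn-glibc = ""
PREMIRRORS:pn-binutils = ""
PREMIRRORS:pn-binutils-cross-${TARGET_ARCH} = ""
```

As with the BB_GIT_SHALLOW list, each recipe variant has its own PN, so the cross, crosssdk and cross-canadian variants would each need an entry.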

Khem Raj
On Wed, Mar 30, 2022 at 4:29 AM Richard Purdie <richard.purdie@...> wrote:
> On Wed, 2022-03-30 at 12:18 +0100, Ross Burton wrote:
> > On Wed, 30 Mar 2022 at 12:10, Richard Purdie <richard.purdie@...> wrote:
> > > f) Switch the problematic recipes to use shallow clones with something like:
> > >
> > > BB_GIT_SHALLOW:pn-binutils = "1"
> > > BB_GIT_SHALLOW:pn-binutils-cross-${TARGET_ARCH} = "1"
> > > BB_GIT_SHALLOW:pn-binutils-cross-canadian-${TRANSLATED_TARGET_ARCH} = "1"
> > > BB_GIT_SHALLOW:pn-binutils-cross-testsuite = "1"
> > > BB_GIT_SHALLOW:pn-binutils-crosssdk-${SDK_SYS} = "1"
> > > BB_GIT_SHALLOW:pn-glibc = "1"
> > >
> > > The challenge here is that in order to be effective, there needs to be a PREMIRROR setup with the shallow tarballs on it. This means we couldn't do e) above and have this have much effect unless we craft some very specific PREMIRROR entries too.
> >
> > Even without premirrors this is a lot faster for glibc:
> >
> > $ time git clone git://sourceware.org/git/glibc.git
> > Cloning into 'glibc'...
> > remote: Enumerating objects: 6956, done.
> > remote: Counting objects: 100% (6956/6956), done.
> > remote: Compressing objects: 100% (2938/2938), done.
> > remote: Total 670093 (delta 5328), reused 4750 (delta 3932), pack-reused 663137
> > Receiving objects: 100% (670093/670093), 205.19 MiB | 16.39 MiB/s, done.
> > Resolving deltas: 100% (573265/573265), done.
> > Updating files: 100% (19011/19011), done.
> >
> > real 1m56.255s
> >
> > $ time git clone git://sourceware.org/git/glibc.git --depth 1
> > Cloning into 'glibc'...
> > remote: Enumerating objects: 18809, done.
> > remote: Counting objects: 100% (18809/18809), done.
> > remote: Compressing objects: 100% (9704/9704), done.
> > remote: Total 18809 (delta 8812), reused 12185 (delta 7968), pack-reused 0
> > Receiving objects: 100% (18809/18809), 41.79 MiB | 11.96 MiB/s, done.
> > Resolving deltas: 100% (8812/8812), done.
> > Updating files: 100% (19011/19011), done.
> >
> > real 0m8.701s
> >
> > A full clone fetches 200MB and takes 2 minutes (a lot of that is actually resolving the deltas, not the fetch). A shallow clone of the current HEAD fetches 40MB and is done in 8 seconds.
> >
> > Why would we need a premirror?
>
> The code doesn't do "--depth=1".
>
> https://git.yoctoproject.org/poky/commit/?id=27d56982c7ba05e86a100b0cca2411ee5ac7a85e
>
> """
> This implements support for shallow mirror tarballs, not shallow clones.
> Supporting shallow clones directly is not really doable for us, as we'd need
> to hardcode the depth between branch HEAD and the SRCREV, and that depth would
> change as the branch is updated.
> """
>
> Put another way, you didn't specify a revision in your clone above and if you try, it becomes rather tricky.
>
> To make this work we therefore need a mirror with the shallow tarballs on it.
>
> Just for info, the binutils mirror tarball is ~1.3GB, the shallow tarball is 65MB.

right, I think shallow clone should be default IMO for all git fetcher tarballs

> Cheers,
>
> Richard
On Wed, Mar 30, 2022 at 10:24 AM Khem Raj <raj.khem@...> wrote:
On Wed, Mar 30, 2022 at 4:29 AM Richard Purdie <richard.purdie@...> wrote:
>
> On Wed, 2022-03-30 at 12:18 +0100, Ross Burton wrote:
> > On Wed, 30 Mar 2022 at 12:10, Richard Purdie
> > <richard.purdie@...> wrote:
> > > f) Switch the problematic recipes to use shallow clones with something like:
> > >
> > > BB_GIT_SHALLOW:pn-binutils = "1"
> > > BB_GIT_SHALLOW:pn-binutils-cross-${TARGET_ARCH} = "1"
> > > BB_GIT_SHALLOW:pn-binutils-cross-canadian-${TRANSLATED_TARGET_ARCH} = "1"
> > > BB_GIT_SHALLOW:pn-binutils-cross-testsuite = "1"
> > > BB_GIT_SHALLOW:pn-binutils-crosssdk-${SDK_SYS} = "1"
> > > BB_GIT_SHALLOW:pn-glibc = "1"
> > >
> > > The challenge here is that in order to be effective, there needs to be a
> > > PREMIRROR setup with the shallow tarballs on it. This means we couldn't do e)
> > > above and have this have much effect unless we craft some very specific
> > > PREMIRROR entries too.
> >
> > Even without premirrors this is a lot faster for glibc:
> >
> > $ time git clone git://sourceware.org/git/glibc.git
> > Cloning into 'glibc'...
> > remote: Enumerating objects: 6956, done.
> > remote: Counting objects: 100% (6956/6956), done.
> > remote: Compressing objects: 100% (2938/2938), done.
> > remote: Total 670093 (delta 5328), reused 4750 (delta 3932), pack-reused 663137
> > Receiving objects: 100% (670093/670093), 205.19 MiB | 16.39 MiB/s, done.
> > Resolving deltas: 100% (573265/573265), done.
> > Updating files: 100% (19011/19011), done.
> >
> > real 1m56.255s
> >
> > $ time git clone git://sourceware.org/git/glibc.git --depth 1
> > Cloning into 'glibc'...
> > remote: Enumerating objects: 18809, done.
> > remote: Counting objects: 100% (18809/18809), done.
> > remote: Compressing objects: 100% (9704/9704), done.
> > remote: Total 18809 (delta 8812), reused 12185 (delta 7968), pack-reused 0
> > Receiving objects: 100% (18809/18809), 41.79 MiB | 11.96 MiB/s, done.
> > Resolving deltas: 100% (8812/8812), done.
> > Updating files: 100% (19011/19011), done.
> >
> > real 0m8.701s
> >
> > A full clone fetches 200MB and takes 2 minutes (a lot of that is
> > actually resolving the deltas, not the fetch). A shallow clone of the
> > current HEAD fetches 40MB and is done in 8 seconds.
> >
> > Why would we need a premirror?
>
> The code doesn't do "--depth=1".
>
> https://git.yoctoproject.org/poky/commit/?id=27d56982c7ba05e86a100b0cca2411ee5ac7a85e
>
> """
> This implements support for shallow mirror tarballs, not shallow clones.
> Supporting shallow clones directly is not really doable for us, as we'd need
> to hardcode the depth between branch HEAD and the SRCREV, and that depth would
> change as the branch is updated.
> """
>
> Put another way, you didn't specify a revision in your clone above and if you
> try, it becomes rather tricky.
>
> To make this work we therefore need a mirror with the shallow tarballs on it.
>
> Just for info, the binutils mirror tarball is ~1.3GB, the shallow tarball is
> 65MB.
right, I think shallow clone should be default IMO for all git fetcher tarballs

We've been using shallow git tarballs for all recipes for years at Mentor, definitely speeds up fetches from local mirrors and reduces how much we need to ship to customers to allow them to use BB_NO_NETWORK out of the box.

--
Christopher Larson
chris_larson@..., chris.larson@..., kergoth@...
Principal Software Engineer, Embedded Linux Solutions, Siemens Digital Industries Software
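
A customer-side setup along those lines might look like the sketch below. This is not Mentor's actual configuration. The path is hypothetical, and it assumes the shipped downloads directory already contains the shallow tarballs.

```conf
# Customer build (sketch): use shipped shallow tarballs with networking off.
BB_GIT_SHALLOW = "1"

# own-mirrors prepends SOURCE_MIRROR_URL to PREMIRRORS for all fetchers.
INHERIT += "own-mirrors"
# /opt/vendor/downloads is a placeholder for wherever the tarballs are shipped.
SOURCE_MIRROR_URL = "file:///opt/vendor/downloads"

BB_NO_NETWORK = "1"
```

Because the premirror is a local file:// location, the fetcher can satisfy git recipes from the shallow tarballs without any network access.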