bitbake controlling memory use


Trevor Gamblin
 


On 2021-06-07 3:27 p.m., Gmane Admin wrote:
[Please note: This e-mail is from an EXTERNAL e-mail address]

On 05-06-2021 at 15:35, Gmane Admin wrote:
On 14-04-2021 at 06:59, Richard Purdie wrote:
On Tue, 2021-04-13 at 21:14 -0400, Randy MacLeod wrote:
On 2021-04-11 12:19 p.m., Alexander Kanavin wrote:
make already has a -l option for limiting new instances if load average is
too high, so it's only natural to add a RAM limiter too.

    -l [N], --load-average[=N], --max-load[=N]
                                Don't start multiple jobs unless load is below N.

In any case, patches welcome :)
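
For illustration, the gating rule that `-l` applies can be sketched in a few lines of Python. This is only a rough model of the behaviour described in the GNU make manual, not make's actual code; the function name `may_start_job` and its parameters are made up for this example.

```python
import os
from typing import Optional


def may_start_job(running_jobs: int, max_load: float,
                  load: Optional[float] = None) -> bool:
    """Rough model of `make -l N`: decide whether one more job may start.

    `running_jobs`, `max_load` and `load` are hypothetical names used only
    in this sketch.
    """
    # With nothing running, a job is always allowed so the build can progress.
    if running_jobs == 0:
        return True
    # Otherwise compare the 1-minute load average against the limit.
    if load is None:
        load = os.getloadavg()[0]
    return load < max_load
```

A RAM limiter would follow the same shape, with available memory taking the place of the load average.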

During today's Yocto technical call (1),
we talked about approaches to limiting the system load and avoiding
swap and/or OOM events. Here's what (little!) I recall from the
discussion, 9 busy hours later.

In the short run, instead of independently maintaining changes to
configurations to limit parallelism or xz memory usage, etc, we
could develop an optional common include file where such limits
are shared across the community.

In the longer run, changes to how bitbake schedules work may be needed.

Richard says that there was a make/build server idea and maybe even a
patch from a while ago. It may be in one of his poky-contrib branches.
I took a few minutes to look but nothing popped up. A set of keywords to
search for might help me find it.

http://git.yoctoproject.org/cgit.cgi/poky-contrib/commit/?h=rpurdie/wipqueue4&id=d66a327fb6189db5de8bc489859235dcba306237

This patch resolves a starvation of a particular resource (execution
cores), which is good.
However, the problem I am facing is starvation of another resource (memory).

Cheers,

Richard


I like the idea. Unfortunately the patch doesn't apply to Gatesgarth, so
I couldn't test it. Any chance you would be doing a refresh?

OK, so I refreshed this patch myself and it seems to be working nicely
(3000 out of 4000 tasks complete), except for one thing: do_configure
for cmake-native fails and I don't see why. From the log:

loading initial cache file xxxxxx/out/linux64/build/tmp/work/x86_64-linux/cmake-native/3.18.2-r0/build/Bootstrap.cmk/InitialCacheFlags.cmake
-- The C compiler identification is GNU 10.3.0
-- The CXX compiler identification is GNU 10.3.0
-- Detecting C compiler ABI info
CMake Error: Generator: execution of make failed. Make command was: xxxxxx/out/linux64/poky/scripts/make-intercept/make cmTC_68352/fast &&
-- Detecting C compiler ABI info - failed
-- Check for working C compiler: xxxxxx/out/linux64/build/tmp/hosttools/gcc
CMake Error: Generator: execution of make failed. Make command was: xxxxxx/out/linux64/poky/scripts/make-intercept/make cmTC_f23a0/fast &&
-- Check for working C compiler: xxxxxx/out/linux64/build/tmp/hosttools/gcc - broken
CMake Error at Modules/CMakeTestCCompiler.cmake:66 (message):
  The C compiler

    "xxxxxx/out/linux64/build/tmp/hosttools/gcc"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: xxxxxx/tmp/work/x86_64-linux/cmake-native/3.18.2-r0/build/CMakeFiles/CMakeTmp

    Run Build Command(s):xxxxxx/out/linux64/poky/scripts/make-intercept/make cmTC_f23a0/fast && Permission denied
    Generator: execution of make failed. Make command was: xxxxxx/out/linux64/poky/scripts/make-intercept/make cmTC_f23a0/fast &&

  CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
  CMakeLists.txt:7 (project)

Crazy. I don't see why making a complete recipe works fine, while making
a test program during configure fails. Ideas?

When I encountered this failure with the patch, it was because the scripts/make-intercept/make script was not marked as executable. However, another failure appeared even after that was changed, claiming that an appropriate Makefile parser was not found on the system. Strangely, that one seems to be fixed by correcting the script's hashbang to use python3 instead of python. Please see my other response in this thread for further action.
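
As a quick way to check for both conditions, here is a minimal sketch. The helper name `check_wrapper` is hypothetical and is not part of the patch. It only inspects the owner-executable bit and the first line of the file.

```python
import os
import stat
from typing import List


def check_wrapper(path: str) -> List[str]:
    """Return a list of problems found with a wrapper script (hypothetical helper)."""
    problems = []
    # "Permission denied" in the CMake log is consistent with a missing x bit.
    if not os.stat(path).st_mode & stat.S_IXUSR:
        problems.append("not executable")
    # Read only the first line and look at the interpreter it names.
    with open(path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", "replace").strip()
    if not first_line.startswith("#!"):
        problems.append("missing hashbang")
    elif "python3" not in first_line:
        problems.append("hashbang does not use python3")
    return problems
```

Calling `check_wrapper("scripts/make-intercept/make")` from the top of the poky checkout should return an empty list once both fixes are in place.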

- Trevor






Gmane Admin
 

On 08-06-2021 at 21:08, Trevor Gamblin wrote:
On 2021-06-05 9:35 a.m., Gmane Admin wrote:
I like the idea. Unfortunately the patch doesn't apply to Gatesgarth, so
I couldn't test it. Any chance you would be doing a refresh?
I have reworked the patch and I'm doing some testing with it right now. Once I have collected some data (and possibly reworked it further, depending on results), perhaps I can have you test it out as well? That should be in the next day or two.
Sure. But note that I reworked it too (though maybe wrongly). It builds about 90% of my image, but fails building cmake.


- Trevor




Ferry Toth
 

Hi Trevor,

Gmane is really messing things up here, sorry about that. I need to create a new thread I'm afraid.

I'd like to try your reworked patch.

But note that I reworked it too (though maybe wrongly). It builds about 90% of my image, but fails building cmake-native. More accurately, it fails in do_configure while trying to build a small test program.

Ferry


Trevor Gamblin
 


On 2021-06-10 5:22 a.m., Ferry Toth wrote:

Hi Trevor,

Gmane is really messing things up here, sorry about that. I need to create a new thread I'm afraid.

I'd like to try your reworked patch.

But note that I reworked it too (though maybe wrongly). It builds about 90% of my image, but fails building cmake-native. More accurately, it fails in do_configure while trying to build a small test program.

Hi,

I've pushed the patch onto my fork of the poky repo at https://github.com/threexc/poky

Let me know how your testing turns out - I am still running tests as well, but it would be good to know how others' attempts turn out, and more changes could still end up being necessary.

- Trevor


Ferry




Ferry Toth
 

Hi,

On 10-06-2021 at 21:06, Trevor Gamblin wrote:


Hi,

I've pushed the patch onto my fork of the poky repo at https://github.com/threexc/poky

Let me know how your testing turns out - I am still running tests as well, but it would be good to know how others' attempts turn out, and more changes could still end up being necessary.
Your patch didn't apply cleanly on Gatesgarth, but the fix seemed trivial. With this it builds cmake-native fine, thanks!

You can find it here: https://github.com/htot/meta-intel-edison/commit/8abce2f6f752407c7b2831dabf37cc358ce55bc7

I will check whether any other build errors occur, and if not, I will time the image build with and without the patch to compare performance and see if it is worth the effort.

- Trevor


Ferry


Ferry Toth
 

Hi

On 10-06-2021 at 22:35, Ferry Toth wrote:
I will check whether any other build errors occur, and if not, I will time the image build with and without the patch to compare performance and see if it is worth the effort.
It works fine. To measure the build time, I first built https://github.com/htot/meta-intel-edison (gatesgarth), so that everything needed was downloaded and cached. Then, prior to each run, I ran `rm -rf out` and `rm -rf bbcache/sstate-cache/*` to force everything to rebuild, followed by `time bitbake -k edison-image`.

With patch:
real    218m12,686s
user    0m24,058s
sys     0m4,379s

Without:
real    219m36,944s
user    0m24,770s
sys     0m4,266s

Strange, I expected more.

This is on a 4-core/8-thread i7-3770 CPU @ 3.40GHz with 16 GB RAM, with nodejs restricted to -j 2 (so that alone takes ~60 min to build).
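
To put the two wall-clock figures side by side, here is a small sketch that parses the `real` values exactly as printed above (with the locale's decimal comma) and computes the difference. The helper name `parse_real` is made up for this example, and only the `real` lines are compared.

```python
import re


def parse_real(text: str) -> float:
    """Convert a `time` value such as '218m12,686s' to seconds (hypothetical helper)."""
    # Accept either a decimal comma or a decimal point before the milliseconds.
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


with_patch = parse_real("218m12,686s")
without_patch = parse_real("219m36,944s")
saved = without_patch - with_patch
print(f"saved {saved:.1f} s ({100 * saved / without_patch:.2f}% of the unpatched build)")
# prints: saved 84.3 s (0.64% of the unpatched build)
```

A difference of well under 1% on a single run each is probably within normal run-to-run variation.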

- Trevor


Ferry


Randy MacLeod
 

On 2021-06-12 12:31 p.m., Ferry Toth wrote:
Strange, I expected more.
Hi Ferry,

Thanks for the update.

Trevor and I saw similar (lack of) results.

Trevor even tried getting kea, which uses 'make', to finish the
'configure' stage for two builds in different dirs, and then running the two
'bitbake -c compile kea'
commands with and without the patch. The expectation was that, with the
job server patch and the right number of jobs, the two builds would
take longer. I don't know the exact timing, but there was no
noticeable difference.

We did strace things to confirm that the make wrapper was being called
and the actual make was being called by the wrapper. I suspect that
the next thing we try will be to patch 'make' to log when the jobserver
kicks in or to play with some make jobserver demo such as:
https://github.com/olsner/jobclient
to get some experience with how things are supposed to work and
to be able to strace a successful use of the job server feature.

A little RTFM / UTSL may also be required.
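
For anyone who wants to see the basic token-passing idea in isolation, here is a minimal Python sketch of a pipe-based job server. It is a toy model, not GNU make's implementation: the class name `PipeJobserver` is made up, the read end is set non-blocking only so the demo cannot hang (a real client would block or wait), and the implicit slot that each make process holds for itself is not modelled.

```python
import os


class PipeJobserver:
    """Toy pipe-based job server: one byte in the pipe equals one free slot."""

    def __init__(self, slots: int) -> None:
        self.read_fd, self.write_fd = os.pipe()
        # Non-blocking reads are a simplification for this demo only.
        os.set_blocking(self.read_fd, False)
        # Preload the pipe with one token per available slot.
        os.write(self.write_fd, b"+" * slots)

    def try_acquire(self) -> bool:
        """Take one token if available; return False when all slots are busy."""
        try:
            return len(os.read(self.read_fd, 1)) == 1
        except BlockingIOError:
            return False

    def release(self) -> None:
        """Give a token back so another job may start."""
        os.write(self.write_fd, b"+")

    def close(self) -> None:
        os.close(self.read_fd)
        os.close(self.write_fd)


server = PipeJobserver(slots=2)
results = [server.try_acquire() for _ in range(3)]  # third attempt finds no token
server.release()
results.append(server.try_acquire())  # a slot is free again
server.close()
print(results)  # [True, True, False, True]
```

Running something like this under strace shows the one-byte reads and writes on the pipe, which is the pattern to look for when checking whether the real job server is being used.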

../Randy





--
# Randy MacLeod
# Wind River Linux


Ferry Toth
 

Hi

On 13-06-2021 at 02:38, Randy MacLeod wrote:
Hi Ferry,

Thanks for the update.

Trevor and I saw similar (lack of) results.

Trevor even tried getting kea, which uses 'make', to finish the
'configure' stage for two builds in different dirs, and then running the two
   'bitbake -c compile kea'
commands with and without the patch. The expectation was that, with the
job server patch and the right number of jobs, the two builds would
take longer. I don't know the exact timing, but there was no
noticeable difference.

We did strace things to confirm that the make wrapper was being called
and the actual make was being called by the wrapper.

I watched the running processes in KDE ksysguard, and I believe the number of compilers was actually restricted with the patch. On the other hand, the server I tried was running munin, which logs the number of processes over time, and there I didn't see a difference. Confused.

I suspect that
the next thing we try will be to patch 'make' to log when the jobserver
kicks in or to play with some make jobserver demo such as:
   https://github.com/olsner/jobclient
to get some experience with how things are supposed to work and
to be able to strace a successful use of the job server feature.

I'm available for testing.

A little RTFM / UTSL may also be required.

../Randy



This is on a 4-core/8-thread i7-3770 CPU @ 3.40GHz with 16 GB RAM, with nodejs restricted to -j 2 (so that alone takes ~60 min to build).
I do like the jobserver idea a lot, especially if it would take memory restrictions into account as well. The problem with building nodejs (and probably rust as well) is that there is a lot to compile, so you really want -j 16. But then during linking, ld uses 4 GB per instance, and 5 instances are started. So on my machine I wouldn't want to start more than 3.
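
The arithmetic behind that limit can be sketched as follows. The function name `max_link_jobs` and the 4 GB reserve for the rest of the system are assumptions for this example, with the reserve chosen so that the figures above (16 GB RAM, 4 GB per ld instance) give 3. A real memory-aware job server would need to measure these values.

```python
def max_link_jobs(total_ram_gb: float, per_job_gb: float,
                  reserve_gb: float, cpu_jobs: int) -> int:
    """Cap the job count by memory as well as by CPU (hypothetical helper)."""
    # Memory left for jobs after setting aside the assumed reserve.
    usable_gb = max(total_ram_gb - reserve_gb, 0)
    by_memory = int(usable_gb // per_job_gb)
    # Always allow at least one job so the build can make progress.
    return max(1, min(cpu_jobs, by_memory))


print(max_link_jobs(total_ram_gb=16, per_job_gb=4, reserve_gb=4, cpu_jobs=16))  # 3
```

The difficulty is that the per-job figure differs a lot between compiling and linking, so a single static -j value is either too low for the compile phase or too high for the link phase.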

- Trevor


Ferry