[intel-iot-refkit] intel-iot-refkit in a single repo

Patrick Ohly patrick.ohly at intel.com
Wed Apr 26 09:52:06 PDT 2017


Our current "git submodule" approach has some known drawbacks:
     1. If any of the servers that host the submodules goes down or some
        change is made that removes the revision that we use (unlikely,
        but perhaps it happens by accident), then a new clone of
        intel-iot-refkit is unusable because the submodules are
        unavailable.
     2. Forking intel-iot-refkit is easy, but forking also the
        subcomponents (should one decide to do that - more on that next)
        is hard.

There are valid reasons for forking:
     1. Adding extra commits and doing a test PR is fairly common
        in refkit, followed by submitting the commits upstream.
     2. A downstream user of refkit may end up in situations where
        modifications are needed that are impossible (fixing a .bbclass)
        or awkward (version updates) via .bbappend files.

I experimented a bit with hosting all meta data inside the same
intel-iot-refkit.git repo, including all submodules in additional,
separate branches.

Initially I wanted to do it so that "git clone intel-iot-refkit.git"
would not show the additional branches by default, by storing them under
refs/modules which isn't fetched by default. However, "git submodule"
then also didn't fetch the necessary commits and I've not found a way to
specify what it fetches. Forking probably also would have been harder in
this approach.
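
The default-fetch behavior described above can be reproduced on a
throwaway repository. This is only a minimal sketch: the repository
names ("upstream", "clone") and the module name "demo" are made up for
illustration, and no real refkit content is involved.

```shell
#!/bin/sh
# Sketch: refs outside refs/heads and refs/tags are not fetched by a plain clone.
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for intel-iot-refkit.git with a single empty commit.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Publish the same commit twice: once under refs/modules (not covered by the
# default refspec) and once as an ordinary branch under refs/heads/modules.
git update-ref refs/modules/demo/master HEAD
git update-ref refs/heads/modules/demo/master HEAD
cd ..

git clone -q upstream clone
cd clone

# Count what arrived in the clone.
hidden_before=$(git for-each-ref --format='%(refname)' | grep -c '^refs/modules/' || true)
visible=$(git for-each-ref --format='%(refname)' | grep -c '^refs/remotes/origin/modules/' || true)

# An explicit refspec is needed to get the refs/modules namespace.
git fetch -q origin '+refs/modules/*:refs/modules/*'
hidden_after=$(git for-each-ref --format='%(refname)' | grep -c '^refs/modules/' || true)

echo "refs/modules before explicit fetch: $hidden_before"
echo "refs/heads/modules branches seen:   $visible"
echo "refs/modules after explicit fetch:  $hidden_after"
```

A plain "git fetch" with an explicit refspec can retrieve the extra
namespace, but that does not by itself change what "git submodule"
fetches.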

Instead, now the submodules are under refs/heads/modules and thus get
fetched. They are also visible, so one can do "git log
modules/openembedded-core/master" and tab-completion will work while
typing a branch name. I find myself switching between repos quite often,
so I can imagine that having everything in one repo will be convenient.

Tags are in refs/tags/<submodule name>/<tags of that submodule>, but
there aren't that many of those. Forking intel-iot-refkit automatically
also forks the submodules and .gitmodules is set up so that it pulls
from the forked repo (as long as the name is still intel-iot-refkit.git,
due to a submodule limitation - renaming works, but then one has to
edit .gitmodules).
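
One way to get this behavior is a relative submodule URL, which git
resolves against the superproject's own remote. The sketch below
assumes that this is how .gitmodules is set up; the submodule name
"component" and the "example-fork" account are hypothetical, and
nothing is contacted over the network.

```shell
#!/bin/sh
# Sketch: a relative URL in .gitmodules follows whatever fork was cloned.
set -e
work=$(mktemp -d)
cd "$work"

git init -q intel-iot-refkit
cd intel-iot-refkit
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# Hypothetical fork location; the URL is only resolved, never fetched.
git remote add origin git@github.com:example-fork/intel-iot-refkit.git

# Assumed .gitmodules entry: the submodule is served from the same repository,
# addressed relative to the remote URL of the superproject.
git config -f .gitmodules submodule.component.path component
git config -f .gitmodules submodule.component.url ../intel-iot-refkit.git

# Register a gitlink for the path so that "git submodule init" has work to do.
mkdir component
git update-index --add --cacheinfo 160000,"$(git rev-parse HEAD)",component

git submodule --quiet init
resolved=$(git config submodule.component.url)
echo "submodule URL resolved to: $resolved"
```

With such an entry the resolved URL should point at the same account
as the superproject's remote, which is also why the repository name has
to stay unchanged unless .gitmodules is edited.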

Cloning can be faster, depending on how it is done.

Here's the current approach:

$ time git clone --recursive git@github.com:intel/intel-iot-refkit.git
real	3m9.527s
user	0m15.804s
sys	0m4.844s
$ du -h -s intel-iot-refkit
244M	intel-iot-refkit

When fetching objects only once (via the --reference parameter), a
checkout is twice as fast and requires less disk space, but admittedly
it is also more complicated:

$ time sh -c 'git clone git@github.com:pohly/intel-iot-refkit.git && cd intel-iot-refkit && git submodule init && git submodule update --reference .'
real	1m24.184s
user	0m21.784s
sys	0m3.392s
$ du -h -s intel-iot-refkit
212M	intel-iot-refkit

The current approach (a plain recursive clone) still works, but becomes
slower and requires more disk space than before, because each
.git/modules/<submodule> directory is basically a full clone of the
entire intel-iot-refkit.git:

$ time git clone --recursive git@github.com:pohly/intel-iot-refkit.git
real	7m11.693s
user	1m31.348s
sys	0m23.556s
$ du -h -s intel-iot-refkit
1.2G	intel-iot-refkit

Is it worth exploring this further?

For this to work, we need to enhance our CI so that it updates
submodules in git@github.com:intel/intel-iot-refkit.git. This needs to
be done regularly (to ensure that developers pulling from it also get
recent subcomponents, in case they want to try out something with
them) and for each PR (because a PR might add a new submodule - in that
case it's enough to extend the local clone of the repo).

This can be done manually, too, but only by developers who have write
access to the repo. Pull requests probably won't work (because of the
extra merge commits).

For the record, here's how I created the branches and tags in an
unmodified intel-iot-refkit.git clone where all submodules had been
checked out:

git submodule foreach '
    cd ..
    remoteurl=$(git config submodule.$name.url)
    # Reuse a remote named after the submodule if present, else add one.
    if git config remote.$name.url >/dev/null; then
        git remote set-url $name $remoteurl
    else
        git remote add $name $remoteurl
    fi
    # Map upstream branches and tags into per-submodule namespaces.
    git config --unset-all remote.$name.fetch
    git config --add remote.$name.fetch "+refs/heads/*:refs/heads/modules/$name/*"
    git config --add remote.$name.fetch "+refs/tags/*:refs/tags/$name/*"
    git config remote.$name.tagOpt --no-tags
'
git push pohly +refs/heads/modules/*:refs/heads/modules/* +refs/tags/*:refs/tags/*
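
The refspec mapping used in the command above can be tried out on
throwaway repositories. The following sketch uses a made-up "component"
repo and a made-up superproject called "super". The "git fetch" step is
an assumption: the commands above only configure the remotes, so
something like it presumably has to run before the push.

```shell
#!/bin/sh
# Sketch: fetch a component's branches and tags into per-module namespaces.
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical upstream component with one branch and one tag.
git init -q component
cd component
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "component commit"
git tag v1.0
component_head=$(git rev-parse HEAD)
cd ..

# Hypothetical superproject standing in for intel-iot-refkit.
git init -q super
cd super
name=component
git remote add "$name" "$work/component"

# Same refspec mapping as in the foreach command above.
git config --unset-all "remote.$name.fetch"
git config --add "remote.$name.fetch" "+refs/heads/*:refs/heads/modules/$name/*"
git config --add "remote.$name.fetch" "+refs/tags/*:refs/tags/$name/*"
git config "remote.$name.tagOpt" --no-tags

# Assumed step: actually fetch, so that the namespaced refs exist locally.
git fetch -q "$name"

branch_head=$(git rev-parse --verify -q "refs/heads/modules/$name/master")
tag_head=$(git rev-parse --verify -q "refs/tags/$name/v1.0")
# Tags must only appear under the component prefix, not at the top level.
plain_tags=$(git tag -l 'v*' | grep -c . || true)

git for-each-ref --format='%(refname)'
```

After the fetch, "git log modules/component/master" works in the
superproject, and the component's tag is kept apart from the
superproject's own tags.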

Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.
