At the DebConf videoteam, we use ansible to manage our machines. Last fall in
Cambridge, we migrated our repositories to salsa.debian.org and I started
playing with the Gitlab CI. It's pretty powerful and helped us catch a bunch of
errors we had missed.
As it was my first time playing with continuous integration and docker, I had
trouble whenever our playbooks used systemd in one way or another, and I
couldn't figure out a way to have systemd run in the Gitlab docker runner.
Fast forward a few months and I lost another day and a half working on this issue. I haven't been able to make it work (my conclusion is that it's not currently possible), but I thought I would share what I learned in the process with others. Who knows, maybe someone will have a solution!
10 steps to failure
I started by creating a privileged Gitlab docker runner on a machine that is
dedicated to running Gitlab CI runners. To run systemd in docker you either
need to run privileged docker instances or to run them with the
--cap-add=SYS_ADMIN option.
If you were trying to run a docker container that runs with systemd directly,
you would do something like:
$ docker run -it --cap-add SYS_ADMIN -v /sys/fs/cgroup:/sys/fs/cgroup:ro debian-systemd
I tried replicating this behavior with the Gitlab runner by mounting the right
volumes in the runner and giving it the right cap permissions.
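A minimal sketch of what such a runner configuration might look like is below. It only prints a config.toml fragment so it can be read and diffed safely. The function name print_runner_docker_config and the runner name are hypothetical, and the url and token entries a real runner needs are deliberately left out. The privileged, cap_add and volumes keys are settings of the [runners.docker] section, and the values mirror the docker run command above.

```shell
#!/bin/sh
# Print a config.toml fragment for a docker executor that mirrors
# `docker run --cap-add SYS_ADMIN -v /sys/fs/cgroup:/sys/fs/cgroup:ro`.
# The function name and the runner name are hypothetical; the url and
# token entries required by a real runner are omitted on purpose.
print_runner_docker_config() {
    cat <<'EOF'
[[runners]]
  name = "systemd-test-runner"   # hypothetical name
  executor = "docker"
  [runners.docker]
    image = "debian-systemd"
    privileged = false           # set to true for a fully privileged runner
    cap_add = ["SYS_ADMIN"]
    volumes = ["/sys/fs/cgroup:/sys/fs/cgroup:ro"]
EOF
}

print_runner_docker_config
```

Even with this in place, the container is only allowed to run systemd. Whether systemd actually gets started depends on the entrypoint.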
The thing is, normally your docker container runs an entrypoint command such as
CMD ["/lib/systemd/systemd"]. To run its CI scripts, the Gitlab runner takes
that container but replaces the entrypoint command with:
sh -c 'if [ -x /usr/local/bin/bash ]; then\n\texec /usr/local/bin/bash \nelif [ -x /usr/bin/bash ]; then\n\texec /usr/bin/bash \nelif [ -x /bin/bash ]; then\n\texec /bin/bash \nelif [ -x /usr/local/bin/sh ]; then\n\texec /usr/local/bin/sh \nelif [ -x /usr/bin/sh ]; then\n\texec /usr/bin/sh \nelif [ -x /bin/sh ]; then\n\texec /bin/sh \nelse\n\techo shell not found\n\texit 1\nfi\n\n'
That is to say, it tries to run bash, falling back to sh if bash is not found.
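Unrolled, the runner's one-liner is just a search through a fixed list of paths. The sketch below follows the same order, with two deliberate differences: it prints the shell it would pick instead of exec'ing it, and it sends the "shell not found" message to stderr. The function name find_shell is made up for illustration.

```shell
#!/bin/sh
# Walk the same candidate list as the runner's detection script and
# print the first executable shell. Unlike the runner, this does NOT
# exec the shell; it only reports which one would be chosen.
find_shell() {
    for candidate in /usr/local/bin/bash /usr/bin/bash /bin/bash \
                     /usr/local/bin/sh /usr/bin/sh /bin/sh; do
        if [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    # The runner prints this message to stdout; stderr is used here so
    # that the function's stdout only ever contains a path.
    echo "shell not found" >&2
    return 1
}

find_shell
```

Whichever shell is found ends up as PID 1 in the container, which is what causes the trouble with systemd.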
If you try to run commands that require systemd such as systemctl status,
you'll end up with this error message since systemd is not running:
Failed to get D-Bus connection: Operation not permitted
Trying to run systemd manually once the container has been started won't work
either, since systemd needs to be PID 1 in order to work (and PID 1 is bash).
You end up with this error:
Trying to run as user instance, but the system has not been booted with systemd.
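A quick way to confirm what is sitting at PID 1 from inside a CI job is to read /proc/1/comm. The sketch below assumes a Linux container with /proc mounted. The function names pid1_name and systemd_is_pid1 are hypothetical, and the optional path argument exists only so the check can be tried against a sample file.

```shell
#!/bin/sh
# Print the name of the process holding PID 1.
# $1 (optional) is a path to read instead of /proc/1/comm; this hook is
# a hypothetical convenience for trying the function outside a container.
pid1_name() {
    cat "${1:-/proc/1/comm}" 2>/dev/null
}

# Succeed only when PID 1 is systemd.
systemd_is_pid1() {
    [ "$(pid1_name "$@")" = "systemd" ]
}

if systemd_is_pid1; then
    echo "PID 1 is systemd: systemctl should be usable"
else
    echo "PID 1 is '$(pid1_name)': systemctl will not work here"
fi
```

Running this as the first line of a job script makes the failure explicit before any playbook gets a chance to call systemctl.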
At this point, I came up with a bunch of creative solutions to try to bypass Gitlab's entrypoint takeover. Turns out you can tell the Gitlab runner to override the container's entrypoint with your own. Sadly, the runner then appends its long bash command right after.
For example, if you run a job with this gitlab-ci entry:
image:
  name: debian-systemd
  entrypoint: "/lib/systemd/systemd"
script:
  - /usr/local/bin/my-super-script
You will get this entrypoint:
/lib/systemd/systemd sh -c 'if [ -x /usr/local/bin/bash ]; then\n\texec /usr/local/bin/bash \nelif [ -x /usr/bin/bash ]; then\n\texec /usr/bin/bash \nelif [ -x /bin/bash ]; then\n\texec /bin/bash \nelif [ -x /usr/local/bin/sh ]; then\n\texec /usr/local/bin/sh \nelif [ -x /usr/bin/sh ]; then\n\texec /usr/bin/sh \nelif [ -x /bin/sh ]; then\n\texec /bin/sh \nelse\n\techo shell not found\n\texit 1\nfi\n\n'
This obviously fails. I then tried to be clever and use this entrypoint:
["/lib/systemd/systemd", "&&"]. This does not work either, since docker
requires the entrypoint to be only one command.
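To see why neither variant can work, it helps to look at the argument list the entrypoint process would receive. The sketch below is only a simulation in plain shell, not something the runner executes: show_argv is a hypothetical helper, and runner_script is a shortened stand-in for the runner's real detection script. It relies on the fact that docker's exec-form entrypoint and command are joined into a single argument vector, with no shell in between to interpret operators such as &&.

```shell
#!/bin/sh
# Print each argument on its own numbered line, the way a program
# would see its argv. Hypothetical helper for illustration only.
show_argv() {
    i=0
    for arg in "$@"; do
        printf 'argv[%d] = %s\n' "$i" "$arg"
        i=$((i + 1))
    done
}

# Shortened stand-in for the runner's shell-detection script.
runner_script='if [ -x /bin/bash ]; then exec /bin/bash; else exec /bin/sh; fi'

echo "entrypoint: /lib/systemd/systemd"
show_argv /lib/systemd/systemd sh -c "$runner_script"

echo "entrypoint: /lib/systemd/systemd &&"
# '&&' is quoted: without a shell to parse it, it is just another string.
show_argv /lib/systemd/systemd '&&' sh -c "$runner_script"
```

In both cases systemd is the only program started, and everything after it, including the literal && string, is passed to it as arguments rather than run as a second command.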
Someone pointed out to me that you could try to use exec /lib/systemd/systemd
to replace bash with systemd as PID 1, but that also fails with an error
telling you the system has not been booted with systemd.
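One thing worth keeping in mind about exec: it replaces the calling process and keeps that process's PID, so it can only hand PID 1 over if the shell calling exec is itself PID 1. If the job script ends up running in a child shell (an assumption about the runner that was not verified here), systemd would inherit that child's PID instead. The sketch below only demonstrates the PID-preserving behavior of exec; the function name pids_around_exec is hypothetical.

```shell
#!/bin/sh
# Start a shell that prints its PID, then exec's a second shell that
# prints its own PID. Because exec replaces the process image without
# creating a new process, both lines show the same number.
pids_around_exec() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

pids_around_exec
```

In a CI job, running echo $$ right before the exec line would show whether the shell executing the script really is PID 1.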
One more level down
Since it seems you can't run systemd in the Gitlab docker runner directly,
why not try to run systemd in docker in docker (dind)? dind is used quite a
lot in the Gitlab CI to build containers, so we thought it might work.
Sadly, we haven't been able to make this work either. You need to mount volumes
in docker to run systemd properly and it seems docker doesn't like to mount
volumes from a docker container that have already been mounted from the docker
host... Phew.
If you have been able to run systemd in the Gitlab docker runner, please
contact me!
Paths to explore
The only Gitlab runner executor I've used so far is the docker one, since it's
what most Gitlab instances run. I have no experience with it, but since there
is also an LXC executor, it might be possible to run Gitlab CI tests with
systemd this way.