Verifying systemd units on GitLab runner hosts

How to check the timers and services that deployment-all installs on CKI GitLab runner machines, and how to interpret the systemd journal output

Runner hosts provisioned through deployment-all (Beaker inventory, OpenStack playbooks, and similar) often carry systemd units and timers installed by Ansible—for example the deployment-all/ansible/roles/cki_gitlab_runner role ships maintenance .service and .timer pairs alongside gitlab-runner itself. The exact set can change as the role evolves; use the templates under that role in deployment-all as the source of truth for names and behaviour.

How to verify

SSH to the host, then run the commands below, replacing UNIT with the name of the unit you care about (for example a *.service or *.timer under /etc/systemd/system/):

systemctl status UNIT.service
journalctl -u UNIT.service -n 50 --no-pager

If the work is scheduled with a timer:

systemctl status UNIT.timer

systemctl list-timers --all lists every timer unit loaded on the host, including inactive ones, with the next and last trigger times and the unit each timer activates.
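
To see which service a given timer activates and when it last fired, the timer's properties can be read with systemctl show. The following is a minimal sketch, not part of deployment-all: the function name timer_report is hypothetical, and it assumes the host's systemd exposes the Unit, LastTriggerUSec and NextElapseUSecRealtime properties on timer units.

```shell
# timer_report TIMER
# Print the unit a timer activates plus its last and next trigger times,
# parsed from the Key=Value output of `systemctl show`.
timer_report() {
    local timer="$1" props key value
    local unit="" last="" next=""
    props=$(systemctl show "$timer" -p Unit -p LastTriggerUSec -p NextElapseUSecRealtime)
    while IFS='=' read -r key value; do
        case "$key" in
            Unit) unit="$value" ;;
            LastTriggerUSec) last="$value" ;;
            NextElapseUSecRealtime) next="$value" ;;
        esac
    done <<EOF
$props
EOF
    # Fall back to placeholders when a property is missing or empty.
    printf 'timer=%s activates=%s last=%s next=%s\n' \
        "$timer" "${unit:-unknown}" "${last:-n/a}" "${next:-n/a}"
}
```

Call it as timer_report UNIT.timer, then pass the reported service name to systemctl status and journalctl -u as above.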

What success usually looks like

  • Oneshot services (Type=oneshot): while the job runs, the unit shows activating (start). After a successful run it returns to inactive (dead) with status=0/SUCCESS, or stays active (exited) if the unit sets RemainAfterExit=yes. journalctl shows the job’s own messages followed by systemd’s Finished … line.
  • Long-running services: active (running) while healthy.

Exact log text depends on the unit; compare with a known-good host or with the scripts and unit files in deployment-all.
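
The states described above can be checked without reading the full status output. The sketch below is illustrative only: classify_unit and unit_health are hypothetical helper names, and the mapping is a rough reading of the rules in this section, not an official health check. Note that inactive (dead) with Result=success is also what a oneshot unit that has never run looks like, so confirm a recent run in the journal.

```shell
# classify_unit TYPE ACTIVE_STATE SUB_STATE RESULT
# Map systemd state properties to a coarse verdict: ok, running, failed, check.
classify_unit() {
    local type="$1" active="$2" sub="$3" result="$4"
    if [ "$active" = "failed" ]; then
        echo "failed"
    elif [ -n "$result" ] && [ "$result" != "success" ]; then
        echo "failed"
    else
        case "$type:$active:$sub" in
            # Finished oneshot job (second form needs RemainAfterExit=yes).
            oneshot:inactive:dead|oneshot:active:exited) echo "ok" ;;
            # Oneshot job still executing.
            oneshot:activating:*) echo "running" ;;
            # Healthy long-running service.
            *:active:running) echo "ok" ;;
            # Anything else needs a human look.
            *) echo "check" ;;
        esac
    fi
}

# unit_health UNIT
# Read the relevant properties from systemd and classify them.
unit_health() {
    local unit="$1" props key value
    local type="" active="" sub="" result=""
    props=$(systemctl show "$unit" -p Type -p ActiveState -p SubState -p Result)
    while IFS='=' read -r key value; do
        case "$key" in
            Type) type="$value" ;;
            ActiveState) active="$value" ;;
            SubState) sub="$value" ;;
            Result) result="$value" ;;
        esac
    done <<EOF
$props
EOF
    classify_unit "$type" "$active" "$sub" "$result"
}
```

Run unit_health UNIT.service on the host; any verdict other than ok or running is a cue to read systemctl status and the journal in full.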

When something fails

  • systemctl status shows failed and an exit code; journalctl -u is the first place to see stderr and script output.
  • Exit code 127 from a shell usually means “command not found”. One possible cause is the unit’s script receiving unexpected input (for example non-shell data where a command was expected), which the shell then tries to run as a command. The journal usually shows the first token the shell tried to execute.
  • After changing units in deployment-all, re-run the relevant playbook so the host picks up new unit files, helper scripts, and config. Ansible’s handlers may take care of reloading systemd; if a unit file was edited by hand on the host, run systemctl daemon-reload so that systemd rereads it.
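
As a quick aid for the exit codes mentioned above, the sketch below reads the main process's exit status from systemd and attaches a hint. The helper names are hypothetical. Besides 127, it covers two codes this page does not discuss: 126 (shell convention for "found but not executable") and 203 (systemd's own EXEC failure, reported as status=203/EXEC). Every other code is unit-specific.

```shell
# explain_exit_status CODE
# Translate a few well-known exit statuses into a short hint.
explain_exit_status() {
    case "$1" in
        0)   echo "success" ;;
        126) echo "command found but not executable (shell convention)" ;;
        127) echo "command not found (shell convention)" ;;
        203) echo "systemd could not execute the ExecStart= binary (203/EXEC)" ;;
        *)   echo "unit-specific; check the journal and the script" ;;
    esac
}

# last_exit_status UNIT
# ExecMainStatus holds the exit status of the service's main process.
last_exit_status() {
    systemctl show "$1" -p ExecMainStatus | cut -d= -f2
}
```

For example, explain_exit_status "$(last_exit_status UNIT.service)" prints the hint for the most recent run; the journal remains the authority on what actually happened.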

For root cause, align what the unit’s ExecStart does with what appears in the journal and with the template in the repository.
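
One way to line up the installed unit with its journal is to print the Exec directives systemd actually loaded next to the most recent log lines. This is a sketch under assumptions: exec_lines and exec_vs_journal are hypothetical helpers, and the set of directives filtered here (ExecStartPre, ExecStart, ExecStartPost, ExecStop) is a convenience choice, not an exhaustive list.

```shell
# exec_lines
# Keep only Exec*= directives from unit-file text read on stdin.
exec_lines() {
    grep -E '^Exec(StartPre|Start|StartPost|Stop)='
}

# exec_vs_journal UNIT
# Show what the installed unit runs, then what it last logged.
exec_vs_journal() {
    echo "== Exec directives for $1 =="
    # `systemctl cat` prints the unit file and drop-ins as installed on the host.
    systemctl cat "$1" | exec_lines
    echo "== Last journal lines for $1 =="
    journalctl -u "$1" -n 20 --no-pager
}
```

Run exec_vs_journal UNIT.service and compare the printed directives with the corresponding template in deployment-all; a mismatch suggests the playbook has not been re-run since the template changed.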