Fixing GitLab job system failures
How to investigate GitLab CI/CD job runner system failures
Problem
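A CKI GitLab CI/CD job fails with a runner system failure, as in the following screenshot:
[Screenshot: output of the failed job, including the gitlab-runner name]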
Steps
- Determine the gitlab-runner responsible for the job. This can be derived from the gitlab-runner name in the job output. In the screenshot above, wf-aws-aws-internal-b-dm-internal-build refers to the internal runner in AZ b.
- Log into the gitlab-runner machine via ansible_ssh.sh.
- Look at the output of the journal for the gitlab-runner via

  journalctl --since today --all --unit gitlab-runner

  Get started by looking for ERROR and red lines in the output.
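
The journal check in the last step can be narrowed down with a small filter. The following is a minimal sketch and not part of the CKI tooling: the function name runner_errors is hypothetical, and it assumes that the relevant lines contain the literal string ERROR, as suggested above. It reads journal text from standard input, so it can be combined with the journalctl command from the steps.

```shell
# Hypothetical helper (not part of the CKI tooling): keep only the lines
# that contain ERROR, each prefixed with its line number in the input.
runner_errors() {
    # grep exits with status 1 when nothing matches; treat that as
    # "no errors found" instead of a failure.
    grep -n 'ERROR' || true
}

# Possible usage on the gitlab-runner machine (assumes journalctl is available):
#   journalctl --since today --all --unit gitlab-runner --no-pager | runner_errors
```

journalctl highlights entries of priority err and above in red, so adding --priority err to the journalctl command is another way to limit the output to those entries. Whether the gitlab-runner messages on a given machine carry that priority is an assumption worth verifying before relying on it.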