cki_tools.rpm_cache

S3-backed Fedora download cache for RPMs and OCI container images

The rpm-cache consists of two AWS Lambda functions that together provide a transparent caching proxy for Fedora RPM downloads and OCI container image artifacts. Clients are sent to the upstream origin whenever it still has the artifact and fall back to the S3-cached copy once the origin no longer does (e.g. after upstream garbage collection). Non-cacheable requests are always forwarded to the origin.

Environment variable          Secret  Required  Description
ORIGIN_HOST                   no      no        upstream RPM mirror hostname, defaults to dl.fedoraproject.org
OCI_REGISTRY                  no      no        upstream OCI registry hostname, defaults to quay.io
OCI_ALLOWED_PREFIXES          no      no        space-separated image names to cache, defaults to fedora/fedora
S3_BUCKET_NAME                no      yes       S3 bucket for cached artifacts
UPLOADER_LAMBDA_ARN           no      yes       ARN of the uploader Lambda function (handler only)
CACHE_EXTENSIONS              no      no        space-separated cacheable extensions, defaults to .rpm
PRESIGNED_URL_EXPIRATION      no      no        presigned URL lifetime in seconds, defaults to 3600
CKI_DEPLOYMENT_ENVIRONMENT    no      no        deployment environment for Sentry tagging
ORIGIN_HEAD_TIMEOUT           no      no        HEAD request timeout in seconds, defaults to 5
SENTRY_DSN                    yes     no        Sentry DSN for error reporting
LAMBDA_HANDLER                no      yes       container image Lambda handler function name
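
As an illustration of how these settings and their defaults fit together, the following is a minimal sketch of reading them from the environment. The CacheConfig class and load_config function are hypothetical names chosen for this example and are not taken from the module itself; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    """Hypothetical container for a subset of the settings listed above."""

    origin_host: str
    oci_registry: str
    oci_allowed_prefixes: tuple
    cache_extensions: tuple
    presigned_url_expiration: int
    origin_head_timeout: float
    s3_bucket_name: str


def load_config(env=None):
    """Build a CacheConfig from a mapping, using os.environ when none is given."""
    env = os.environ if env is None else env
    return CacheConfig(
        origin_host=env.get("ORIGIN_HOST", "dl.fedoraproject.org"),
        oci_registry=env.get("OCI_REGISTRY", "quay.io"),
        # Space-separated values are split into tuples.
        oci_allowed_prefixes=tuple(
            env.get("OCI_ALLOWED_PREFIXES", "fedora/fedora").split()
        ),
        cache_extensions=tuple(env.get("CACHE_EXTENSIONS", ".rpm").split()),
        presigned_url_expiration=int(env.get("PRESIGNED_URL_EXPIRATION", "3600")),
        origin_head_timeout=float(env.get("ORIGIN_HEAD_TIMEOUT", "5")),
        # Required setting: a missing key raises KeyError.
        s3_bucket_name=env["S3_BUCKET_NAME"],
    )
```

Passing a plain dict instead of relying on os.environ keeps the sketch easy to exercise in unit tests.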

Lambda functions

The rpm-cache image provides two Lambda handlers selected via LAMBDA_HANDLER:

  • cki_tools.rpm_cache.handler_lambda: API Gateway entry point that routes RPM requests through the cache strategy described below and OCI requests through the v2 distribution API handler.
  • cki_tools.rpm_cache.uploader_lambda: async worker invoked by the handler on cache miss; downloads from origin and uploads to S3.
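
The asynchronous hand-off from the handler to the uploader could look roughly like the sketch below. It assumes a boto3 Lambda client is passed in (for example boto3.client("lambda")) and that the ARN comes from UPLOADER_LAMBDA_ARN. The trigger_uploader name and the {"path": ...} payload shape are assumptions made for illustration, not the module's actual interface.

```python
import json


def trigger_uploader(lambda_client, uploader_arn, path):
    """Fire-and-forget invocation of the uploader for one cache miss.

    lambda_client is expected to behave like boto3.client("lambda").
    The payload shape is hypothetical.
    """
    lambda_client.invoke(
        FunctionName=uploader_arn,
        # "Event" queues the invocation and returns without waiting for
        # the uploader to finish.
        InvocationType="Event",
        Payload=json.dumps({"path": path}).encode("utf-8"),
    )
```

Because the invocation type is Event, the handler can answer the client immediately while the upload proceeds in the background.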

RPM cache behavior

The handler uses a HEAD-first strategy to minimise S3 egress costs:

  • .rpm requests (origin has it): handler sends a HEAD request to origin first; if the origin returns 200, the client is redirected there directly (no S3 egress). If the RPM is not yet cached, the uploader is triggered asynchronously to populate the cache for future fallback.
  • .rpm requests (gone from origin, cached): when the HEAD returns 404 or times out and the RPM exists in S3, the handler redirects to a time-limited presigned S3 URL.
  • .rpm requests (gone from origin, not cached): handler redirects to origin as a last resort (the client will see the origin’s error).
  • Non-.rpm requests (any path whose extension is not listed in CACHE_EXTENSIONS): handler returns a 302 redirect to origin (no caching).
  • Errors: any S3/Lambda error falls back to a 302 redirect to origin, so the cache is never in the critical path.
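
The cases above can be summarised as a small decision function. This is only a sketch: the Decision type and decide_rpm name are hypothetical, the HEAD result and the S3 lookup are passed in as booleans rather than performed here, and the error fallback from the last bullet is not modelled.

```python
from typing import NamedTuple


class Decision(NamedTuple):
    target: str           # "origin" or "s3" (presigned URL)
    trigger_upload: bool  # whether to invoke the uploader asynchronously


def decide_rpm(path, origin_has_it, is_cached, cache_extensions=(".rpm",)):
    """Pick the redirect target for one request.

    origin_has_it: True when the HEAD request returned 200; False when it
                   returned 404 or timed out.
    is_cached:     True when the object already exists in S3.
    """
    if not path.endswith(tuple(cache_extensions)):
        # Non-cacheable request: always redirect to origin.
        return Decision("origin", False)
    if origin_has_it:
        # Origin serves the file; populate the cache if it is missing.
        return Decision("origin", not is_cached)
    if is_cached:
        # Gone from origin but cached: serve via presigned S3 URL.
        return Decision("s3", False)
    # Gone from origin and not cached: last resort, origin reports the error.
    return Decision("origin", False)
```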

OCI cache behavior

Requests with a v2/ path prefix are handled as OCI Distribution API v2 requests. The cache only processes digest-based manifest and blob pulls for images listed in OCI_ALLOWED_PREFIXES; everything else (tag lookups, other API endpoints, images not in the allowlist) is forwarded to the upstream registry transparently.

  • GET /v2/: health check, returns 200 with Docker-Distribution-API-Version: registry/2.0.
  • GET /v2/<name>/manifests/sha256:<digest> (allowed image): same strategy as RPMs – redirect to origin when available (triggering the uploader on cache miss), fall back to serving the manifest inline from S3 when the origin no longer has it.
  • GET /v2/<name>/blobs/sha256:<digest> (allowed image): same strategy as RPMs – redirect to origin when available, fall back to a presigned S3 URL when the origin no longer has it.
  • All other requests: forwarded to the upstream registry (302 redirect). This includes tag-based pulls, tag listings, and images not in OCI_ALLOWED_PREFIXES.
  • Errors: S3 failures are logged and the handler falls through to fetch from the upstream registry, so the cache is never in the critical path.
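
To make the routing rules concrete, the sketch below classifies a request path into the categories listed above. The classify_oci name, the regular expression, and the exact-match comparison against the allowlist are assumptions for illustration; the real handler may match image names differently.

```python
import re

# Digest-based manifest or blob pull, e.g.
# v2/fedora/fedora/manifests/sha256:abc123
_DIGEST_PULL = re.compile(
    r"^v2/(?P<name>.+)/(?P<kind>manifests|blobs)/(?P<digest>sha256:[0-9a-f]+)$"
)

_KINDS = {"manifests": "manifest", "blobs": "blob"}


def classify_oci(path, allowed=("fedora/fedora",)):
    """Return (category, image_name, digest) for a v2/ request path.

    category is "health", "manifest", "blob", or "forward".
    """
    if path.rstrip("/") == "v2":
        return ("health", None, None)
    match = _DIGEST_PULL.match(path)
    # Assumption: allowlist entries are compared as exact image names.
    if match and match["name"] in allowed:
        return (_KINDS[match["kind"]], match["name"], match["digest"])
    # Tag pulls, tag listings, and images outside the allowlist.
    return ("forward", None, None)
```

Only the "manifest" and "blob" categories would touch S3; everything classified as "forward" is redirected to the upstream registry.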

Image allowlisting

OCI caching is scoped to specific images via two independent mechanisms:

Server-side (OCI_ALLOWED_PREFIXES): the cache only stores and serves artifacts for image names matching this allowlist. Requests for other images are forwarded to the upstream registry without touching S3. This prevents the cache from filling up with unwanted images.

Client-side (registries.conf prefix): the prefix field in registries.conf controls which pulls the container runtime routes through the cache in the first place. Using a specific prefix (e.g. quay.io/fedora/fedora) ensures that only matching images are sent to the mirror.

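Both layers should agree: the image part of the client prefix (everything after the registry hostname, e.g. fedora/fedora in quay.io/fedora/fedora) should match, or be a subset of, the server-side OCI_ALLOWED_PREFIXES.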
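
A deployment check for this agreement might look like the following sketch. The prefix_is_covered helper is hypothetical, and it assumes that allowlist entries are exact image names without the registry hostname.

```python
def prefix_is_covered(client_prefix, registry="quay.io", allowed=("fedora/fedora",)):
    """Return True when the registries.conf prefix only routes allowed images.

    client_prefix is the registries.conf value, e.g. "quay.io/fedora/fedora".
    """
    host, _, image = client_prefix.partition("/")
    # A bare hostname or a namespace-only prefix would route images that the
    # server does not cache, so it is reported as not covered.
    return host == registry and image in allowed
```

For example, quay.io/fedora/fedora is covered by the default allowlist, while a bare quay.io prefix is not, because it would send every quay.io image through the mirror.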

Client configuration

To use the cache as a transparent mirror for quay.io, configure registries.conf to redirect digest-based pulls through the cache:

[[registry]]
location = "quay.io"
prefix = "quay.io/fedora/fedora"

[[registry.mirror]]
location = "rpm-cache.example.com"
pull-from-mirror = "digest-only"

With this configuration, buildah and skopeo will try the cache first for any digest-pinned pull of quay.io/fedora/fedora, falling back to quay.io directly if the cache is unavailable. Tag-based pulls bypass the mirror entirely due to pull-from-mirror = "digest-only".

To test without modifying the system-wide configuration, use a local config file:

buildah --registries-conf=/path/to/registries.conf pull quay.io/fedora/fedora@sha256:...
# or
CONTAINERS_REGISTRIES_CONF=/path/to/registries.conf skopeo copy \
    docker://quay.io/fedora/fedora@sha256:... oci:image:latest

Alternatively, address the cache directly as a registry (useful for ad-hoc testing without any registries.conf changes):

skopeo copy --src-tls-verify=false --src-no-creds \
    docker://cache-host:8080/fedora/fedora@sha256:... oci:image:latest

Request and response format

The handler receives API Gateway v2 (HTTP API) events with a {path+} catch-all route parameter. The path mirrors the origin URL structure:

  • Input: event["pathParameters"]["path"] – everything after the hostname, e.g. pub/fedora/linux/updates/44/Everything/x86_64/Packages/f/foo-1.0.fc44.x86_64.rpm or v2/fedora/fedora/manifests/sha256:abc123
  • 302 redirect: RPM and blob responses are redirects via the Location header, pointing to either the origin URL or a presigned S3 URL
  • 200: OCI health check and manifest responses are returned inline
  • 400: returned only when pathParameters or path is missing
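
The event and response shapes described above can be sketched as follows. The helper names are hypothetical, the handler here only covers the 400 and plain origin-redirect cases, and the https scheme for the origin URL is an assumption.

```python
def extract_path(event):
    """Return the {path+} parameter, or None when it is missing or empty."""
    params = event.get("pathParameters") or {}
    return params.get("path") or None


def redirect(location):
    """Build a 302 response in the API Gateway v2 proxy format."""
    return {"statusCode": 302, "headers": {"Location": location}, "body": ""}


def bad_request(message):
    """Build a 400 response for a malformed event."""
    return {"statusCode": 400, "body": message}


def handle(event, origin_host="dl.fedoraproject.org"):
    """Simplified handler: validate the event, then redirect to origin."""
    path = extract_path(event)
    if path is None:
        return bad_request("missing path")
    # Assumption: the origin is reached over https.
    return redirect(f"https://{origin_host}/{path}")
```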

Deployment

  • Container image: quay.io/cki/rpm-cache
  • S3 key prefix: RPMs are stored under cache/pub/...; OCI artifacts use flat content-addressed keys under cache/oci/<digest> (deduplicating across image names since OCI digests are content-addressed)
  • Deployment config: lives in the deployment-all repo as an Ansible playbook using the cki_aws_lambda role (same pattern as receiver)
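
The S3 key layout can be illustrated with two small helpers. The function names are hypothetical, and whether the stored OCI key keeps the sha256: prefix of the digest is an assumption; the source only states the cache/oci/<digest> pattern.

```python
def rpm_key(path):
    """Map an origin path such as pub/fedora/... to its S3 key under cache/."""
    return f"cache/{path.lstrip('/')}"


def oci_key(digest):
    """Map an OCI digest to a flat, content-addressed S3 key.

    The image name is deliberately not part of the key, so identical
    blobs shared by several images are stored once.
    Assumption: the digest is used verbatim, including "sha256:".
    """
    return f"cache/oci/{digest}"
```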

The image can also be run as a plain HTTP server for integration testing:

python -m cki_tools.rpm_cache serve

Testing

  • Unit tests: python -m pytest tests/test_rpm_cache.py
  • Integration tests: inttests/images/rpm-cache/ (runs via CI image inttest job, or locally with tox -e image -- inttests/images/rpm-cache). The OCI integration test deploys the image in HTTP serving mode and uses skopeo copy to pull a real Fedora image through the cache.

Design rationale

This module intentionally avoids depending on cki-lib to keep the Lambda image small (12 pip packages vs 41+). See the module docstring in cki_tools/rpm_cache.py for details.