Container Image Analysis & Recommendations

This document evaluates Platform-Mesh container images regarding security, maintenance, and potential improvements.

Executive Summary

Key Findings:

  • Security: 15 of 37 images (41%) have critical CVEs in the current release (0.1.1)
  • Maintenance: Core components use Renovate, but many images are outdated
  • Improvement Potential: 18 of 37 images (49%) can be replaced with Chainguard alternatives

1. CVE Status & Maintenance Assessment

Current State (Release 0.1.1)

Critical Security Issues:

  • 15 images with critical CVEs (see security-scan.md for details)
  • Most critical: PostgreSQL images (7-9 critical CVEs), Keycloak (4 critical), Flux controllers (4-7 critical)

Positive Aspects:

  • ✅ Renovate configured for core Platform-Mesh operator repositories
  • ✅ Commit signing enabled (including Renovate commits)
  • ✅ Some operators show clean CVE scans (platform-mesh-operator: 0 CVEs)

Concerns:

  • ⚠️ Many outdated images (see version gaps in images.md)
  • ⚠️ Inconsistent version pinning: the OCM controller is pinned by digest, while ETCD Druid uses the latest tag
  • ⚠️ No container image signing (required for SLSA compliance)
  • ⚠️ Release management appears ad-hoc (per-PR releases)
  • ⚠️ Changelogs are git diffs only

Maintenance & Update Process

Current Process:

  • Platform-Mesh releases are controlled via the helm-releases repository
  • "Upstream Images" repos build the Keycloak and PostgreSQL images on Bitnami base images
  • Builder images use golang:1.25 (not distroless/minimal)

Answered Questions:

Why "Upstream Images" repositories?

  • Platform-Mesh uses separate "upstream-images" repos because the Keycloak and PostgreSQL images are rebased on Bitnami images
  • Bitnami provides well-maintained base images for Keycloak and PostgreSQL
  • The custom repos allow Platform-Mesh to add specific configurations while benefiting from Bitnami's security patching

Open Questions:

  • Can individual images be updated independently for critical CVEs?
  • What's the approval process for image version changes?

2. Image Replacement & OCM ComponentVersion Integration

Current Architecture

Image Categories:

  1. Infrastructure: Kubernetes components, cert-manager, Flux, Crossplane
  2. Platform-Mesh Core: Operators (account, extension-manager, platform-mesh, security)
  3. Dependencies: Keycloak, PostgreSQL, OpenFGA, KCP
  4. Supporting: Mailpit, UI components (portal, marketplace-ui)

Current Challenges:

  • Multiple PostgreSQL versions in use (15.4.0, 17.6.0)
  • Separate "upstream-images" and "images" repositories for the same components
  • Mixed registry sources (ghcr.io, docker.io, quay.io, registry.k8s.io)

OCM Integration Feasibility

Requirements for OCM ComponentVersion:

  • Images must be accessible from target registries
  • Version pinning must be consistent
  • Private registry support needed (for Chainguard)

Recommendations:

  • ✅ Consolidate image versions (standardize on PostgreSQL 17.6.0)
  • ✅ Implement consistent digest-based pinning
  • ✅ Add support for imagePullSecrets in Helm charts/OCM configuration
  • ✅ Consider OCM transport mechanisms for private registries
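
To make the digest-pinning recommendation checkable, a small helper can classify how each image reference is pinned. This is a minimal sketch, not part of the Platform-Mesh tooling: the function name pin_kind is hypothetical, and it only inspects the reference string (it does not contact a registry).

```shell
# Classify an image reference as "digest", "latest", or "tag".
# Hypothetical helper for illustration; POSIX sh compatible.
pin_kind() {
  case "$1" in
    *@sha256:*) echo digest ;;   # immutable, preferred
    *:latest)   echo latest ;;   # floating tag
    *)
      # Look for a tag only in the last path segment, so a registry port
      # such as localhost:5000/app is not mistaken for a tag.
      case "${1##*/}" in
        *:*) echo tag ;;
        *)   echo latest ;;      # no tag given: runtimes default to :latest
      esac ;;
  esac
}

pin_kind "ghcr.io/fluxcd/helm-controller:v1.4.2"       # tag
pin_kind "cgr.dev/chainguard-private/traefik:latest"   # latest
```

Running this over every imageReference in the constructors would give a quick list of references that still need a digest.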

How Easy Is It to Replace Images in OCM?

Based on analysis of the platform-mesh/ocm repository, here's the detailed assessment:

Difficulty Rating: 🟡 MODERATE (with proper tooling setup)

OCM Component Structure

Platform-Mesh uses the Open Component Model (OCM) with the following architecture:

  1. Component Descriptors defined in component-constructor.yaml files
  2. Template-based versioning using Go template syntax: {{ .VARIABLE_NAME }}
  3. CTF (Common Transfer Format) archives for component storage and transfer
  4. OCI registry storage (ghcr.io and compatible registries)

Image Definition Example

components:
  - name: traefik
    version: {{ .TRAEFIK_VERSION }}
    resources:
      - name: traefik-image
        type: ociImage
        relation: external
        version: {{ .TRAEFIK_IMAGE_VERSION }}
        access:
          type: ociArtifact
          imageReference: "traefik:{{ .TRAEFIK_IMAGE_VERSION }}"

Image Replacement Process

Step 1: Update Component Constructor

# Modify imageReference in constructor/component-constructor.yaml
access:
  type: ociArtifact
  imageReference: "cgr.dev/chainguard-private/traefik:latest" # ← Replace here

Step 2: Update Version Variables

# Update template variables (typically in CI/CD or environment config)
export TRAEFIK_IMAGE_VERSION="latest"
export TRAEFIK_VERSION="v3.6.7"
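
Before packaging, it can help to confirm that every {{ .NAME }} placeholder in a constructor file has a value exported. The following is a sketch under assumptions: the function name missing_template_vars is hypothetical, and it only handles simple placeholders of the form shown in the example above, not arbitrary Go template expressions.

```shell
# List template variables referenced in a file that are not exported
# in the current environment. Hypothetical pre-flight check.
missing_template_vars() {
  grep -oE '\{\{ *\.[A-Za-z_][A-Za-z0-9_]* *\}\}' "$1" |
    tr -d '{} .' | sort -u |
    while read -r name; do
      # printenv fails for names that are not exported
      printenv "$name" >/dev/null || echo "$name"
    done
}
```

For example, `missing_template_vars constructor/component-constructor.yaml` prints one unset variable name per line, and prints nothing when all placeholders are covered.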

Step 3: Package Component

# Create/update OCM component using OCM CLI
ocm add componentversions --create --file <archive> component-constructor.yaml

Step 4: Transfer to Registry

# Publish to OCI registry
ocm transfer ctf <archive> ghcr.io/platform-mesh
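
Steps 3 and 4 can be wrapped in a small script with a dry-run mode, so the exact commands can be reviewed before anything is pushed. This is a sketch only: the function names and the DRY_RUN variable are hypothetical, and the two ocm invocations are the ones shown in the steps above.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: package_and_transfer <archive> <constructor> <target-registry>
package_and_transfer() {
  run ocm add componentversions --create --file "$1" "$2" &&
    run ocm transfer ctf "$1" "$3"
}

package_and_transfer ./ctf component-constructor.yaml ghcr.io/platform-mesh
```

With the default DRY_RUN=1 the script only echoes the two commands; setting DRY_RUN=0 executes them and stops if packaging fails.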

Automation & CI/CD

Platform-Mesh uses GitHub Actions workflows for automated packaging:

  • package_transfer.yaml: Matrix build across multiple component constructors
  • re-build-configmaps.yml: Generates ConfigMaps from Helmfile configurations
  • re-get-version.yml: Semantic versioning (auto-increment or manual)
  • draft-release.sh: Automated release notes generation

Ease of Replacement by Image Category

| Category | Difficulty | Reason |
|---|---|---|
| Infrastructure (Traefik, cert-manager) | 🟢 Easy | Simple imageReference change, no dependencies |
| Bitnami-based (PostgreSQL, Keycloak) | 🟡 Moderate | Need compatibility testing, multiple components depend on these |
| Platform-Mesh operators | 🟠 Complex | Require rebuild with new base images, extensive testing |
| Chainguard migration | 🟡 Moderate | Requires imagePullSecrets configuration, registry authentication |

Key Benefits of OCM Approach

Separation of Concerns

  • Image definitions in Helmfiles
  • Component metadata in constructors
  • Deployment configs in ConfigMaps

Version Templating

  • Single variable controls multiple image references
  • Environment-specific overrides possible
  • Consistent versioning across components

Multi-Registry Support

  • Works with any OCI-compliant registry
  • Easy migration between registries
  • Private registry authentication supported

GitOps Integration

  • Flux CD pulls from OCI registries
  • KRO ResourceGraphDefinition orchestrates deployments
  • Automated updates via Renovate

Challenges & Solutions

Challenge 1: Template Variable Management

Problem: Variables scattered across multiple files
Solution: Centralize in single .env or values file
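
One way to centralize the variables is a single KEY=value file that every workflow sources before packaging. A minimal sketch, assuming a hypothetical file named versions.env; the function name load_versions is also illustrative.

```shell
# Export every variable assigned in the given KEY=value file.
# Pass a path containing a slash (e.g. ./versions.env), because the
# "." builtin searches PATH for bare file names.
load_versions() {
  set -a          # auto-export all variables assigned while sourcing
  . "$1"
  set +a
}
```

A versions.env file would then hold lines such as `TRAEFIK_VERSION=v3.6.7`, and `load_versions ./versions.env` makes them available to later commands in the same shell.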

Challenge 2: ImagePullSecrets for Private Registries

# Add to Helm chart values or KRO configuration
imagePullSecrets:
  - name: chainguard-registry-creds
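
The secret referenced above has to exist in the target namespace. As a rough sketch, the payload of a kubernetes.io/dockerconfigjson secret can be generated as follows; the function name is hypothetical, and how Chainguard credentials are issued (for example as a pull token) is an assumption to verify against your account setup.

```shell
# Print a Docker config JSON for one registry.
# Usage: dockerconfigjson <registry> <username> <password>
dockerconfigjson() {
  # base64 may wrap long output, so strip newlines
  auth=$(printf '%s:%s' "$2" "$3" | base64 | tr -d '\n')
  printf '{"auths":{"%s":{"auth":"%s"}}}\n' "$1" "$auth"
}
```

In practice, `kubectl create secret docker-registry chainguard-registry-creds --docker-server=cgr.dev --docker-username=<user> --docker-password=<token>` builds an equivalent secret directly; the helper is mainly useful when the secret is rendered by other tooling.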

Challenge 3: Version Synchronization

Problem: Multiple components reference same image at different versions
Solution: Use shared template variables, centralized version management
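
Version drift of this kind can be detected mechanically from a flat list of image references. The sketch below reads one reference per line and reports repositories that appear with more than one tag; the function name is hypothetical, digest-pinned references are skipped, and the repository path used for PostgreSQL 15.4.0 in the example is an assumption for illustration.

```shell
# Report repositories that appear with more than one tag.
# Input: image references on stdin, one per line.
find_version_drift() {
  awk '
    {
      ref = $1
      if (index(ref, "@") > 0) next        # digest-pinned: skip
      n = split(ref, parts, "/")
      last = parts[n]
      i = index(last, ":")
      if (i == 0) next                     # untagged reference: skip
      tag = substr(last, i + 1)
      repo = substr(ref, 1, length(ref) - length(last) + i - 1)
      if (!((repo, tag) in seen)) {
        seen[repo, tag] = 1
        count[repo]++
        tags[repo] = tags[repo] (tags[repo] == "" ? "" : ",") tag
      }
    }
    END {
      for (repo in count)
        if (count[repo] > 1) print repo ": " tags[repo]
    }
  ' | sort
}

printf '%s\n' \
  "ghcr.io/platform-mesh/upstream-images/postgresql:15.4.0" \
  "ghcr.io/platform-mesh/upstream-images/postgresql:17.6.0" \
  "docker.io/traefik:v3.6.0" | find_version_drift
```

An empty result means every repository in the list is referenced at a single tag.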

Practical Migration Steps

Phase 1: Test Single Image (2-4 hours)

# 1. Fork and modify component-constructor.yaml
# 2. Update imageReference for one infrastructure image
# 3. Test in KIND cluster
# 4. Validate functionality

Phase 2: Batch Infrastructure Updates (1 week)

# Update all infrastructure images (cert-manager, Flux, CoreDNS)
# Use existing Renovate workflows for automation
# Deploy to test environment

Phase 3: Core Dependencies (2-3 weeks)

# Replace PostgreSQL and Keycloak with Chainguard
# Extensive integration testing
# Database migration validation
# Authentication flow verification

Required Tools & Access

  • OCM CLI: ocm command-line tool
  • Registry Credentials: Write access to ghcr.io/platform-mesh
  • GitHub Access: Ability to modify constructor files and workflows
  • Kubernetes Cluster: For testing (KIND cluster already configured)
  • ImagePullSecrets: For private registries (Chainguard)

3. Chainguard Images Evaluation

Coverage Analysis

Available Chainguard Replacements: 18 of 37 images (49%)

See Chainguard Comparison Table below for detailed CVE comparison.

Benefits of Chainguard Migration

Security Improvements:

  • All 18 Chainguard images have 0 critical CVEs, and 14 of 18 report no CVEs at any severity
  • Significantly reduced attack surface (minimal/distroless base)
  • Regular security updates and scanning

Operational Benefits:

  • Consistent base image across components
  • Better supply chain security
  • SLSA compliance potential

Migration Considerations

Technical Requirements:

  • Private registry credentials (imagePullSecrets)
  • Helm chart updates for registry configuration
  • OCM component descriptor updates
  • Testing compatibility with existing deployments

Compatibility Risks:

  • Keycloak: 2 high CVEs in Chainguard version (vs 4 critical in current)
  • PostgreSQL: 1 high CVE in Chainguard (vs 7 critical in current)
  • Flux controllers: 0 critical or high CVEs in Chainguard, with 6 medium CVEs remaining in source-controller (vs 4-7 critical in current)

Blockers:

  • No Chainguard images available for: KCP, Platform-Mesh operators, UI components
  • OpenFGA has a Chainguard image (see the comparison table), and upstream v1.11.3 also has 0 CVEs

Phase 1 - Quick Wins (Low Risk):

  • Replace infrastructure images: cert-manager (3 images), CoreDNS, etcd
  • Replace Flux controllers (3 images)
  • Replace Traefik, Mailpit

Phase 2 - Core Dependencies (Medium Risk):

  • PostgreSQL replacement (test compatibility first)
  • Keycloak replacement (validate authentication flows)

Phase 3 - Build Optimization (High Effort):

  • Rebuild Platform-Mesh operators with Wolfi/distroless base
  • Implement image signing
  • Standardize builder images

Chainguard Comparison Table

Images that can be replaced with Chainguard alternatives:

| Component | Current Image | Current CVEs (C/H/M/L) | Chainguard Alternative | Chainguard CVEs (C/H/M/L) | Improvement |
|---|---|---|---|---|---|
| Keycloak | ghcr.io/platform-mesh/upstream-images/keycloak:26.3.3 | 4/44/51/21 🔴 | cgr.dev/chainguard-private/keycloak:latest | 0/2/4/0 🟠 | ⬇️ 95% reduction |
| OpenFGA | openfga/openfga:v1.9.0 | 0/30/12/0 🟠 | cgr.dev/chainguard-private/openfga:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Flux Kustomize | ghcr.io/fluxcd/kustomize-controller:v1.7.1 | 7/38/14/12 🔴 | cgr.dev/chainguard-private/flux-kustomize-controller:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Flux Helm | ghcr.io/fluxcd/helm-controller:v1.4.2 | 6/27/3/6 🔴 | cgr.dev/chainguard-private/flux-helm-controller:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Flux Source | ghcr.io/fluxcd/source-controller:v1.7.2 | 6/29/3/6 🔴 | cgr.dev/chainguard-private/flux-source-controller:latest | 0/0/6/0 🟠 | ⬇️ 86% reduction |
| PostgreSQL | ghcr.io/platform-mesh/upstream-images/postgresql:17.6.0 | 7/35/20/18 🔴 | cgr.dev/chainguard-private/postgres:latest | 0/1/1/0 🟠 | ⬇️ 97% reduction |
| Cert-Manager CA | quay.io/jetstack/cert-manager-cainjector:v1.19.1 | 0/5/0/0 🟠 | cgr.dev/chainguard-private/cert-manager-cainjector:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Cert-Manager Controller | quay.io/jetstack/cert-manager-controller:v1.19.1 | 0/5/0/0 🟠 | cgr.dev/chainguard-private/cert-manager-controller:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Cert-Manager Webhook | quay.io/jetstack/cert-manager-webhook:v1.19.1 | 0/5/0/0 🟠 | cgr.dev/chainguard-private/cert-manager-webhook:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| CoreDNS | registry.k8s.io/coredns/coredns:v1.12.1 | 0/17/8/0 🟠 | cgr.dev/chainguard-private/coredns:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| etcd | registry.k8s.io/etcd:3.6.4-0 | 0/91/25/0 🟠 | cgr.dev/chainguard-private/etcd:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Kube Controller Mgr | registry.k8s.io/kube-controller-manager:v1.34.0 | 0/24/4/0 🟠 | cgr.dev/chainguard-private/kubernetes-kube-controller-manager:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Kube API Server | registry.k8s.io/kube-apiserver:v1.34.0 | 0/24/4/0 🟠 | cgr.dev/chainguard-private/kubernetes-kube-apiserver:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Kube Proxy | registry.k8s.io/kube-proxy:v1.34.0 | 2/32/6/2 🔴 | cgr.dev/chainguard-private/kubernetes-kube-proxy:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Kube Scheduler | registry.k8s.io/kube-scheduler:v1.34.0 | 0/23/4/0 🟠 | cgr.dev/chainguard-private/kubernetes-kube-scheduler:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Crossplane | xpkg.crossplane.io/crossplane/crossplane:v1.20.1 | 0/16/5/0 🟠 | cgr.dev/chainguard-private/crossplane:latest | 0/0/2/0 🟢 | ⬇️ 90% reduction |
| Traefik | docker.io/traefik:v3.6.0 | 4/20/3/6 🔴 | cgr.dev/chainguard-private/traefik:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |
| Mailpit | axllent/mailpit:v1.27.9 | 6/33/4/6 🔴 | cgr.dev/chainguard-private/mailpit:latest | 0/0/0/0 🟢 | ⬇️ 100% clean |

Legend: 🔴 Critical CVEs | 🟠 High CVEs | 🟢 Clean or Low Risk
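
The Improvement column can be recomputed from the C/H/M/L counts. The sketch below assumes the percentage is the drop in total CVE count across all severities, truncated to a whole number; that derivation is an assumption, and the function names are hypothetical.

```shell
# Sum a "C/H/M/L" CVE count string such as "4/44/51/21".
sum_cves() {
  echo "$1" | awk -F/ '{ print $1 + $2 + $3 + $4 }'
}

# Percentage reduction in total CVEs, truncated to a whole number.
# Usage: cve_reduction <current C/H/M/L> <replacement C/H/M/L>
cve_reduction() {
  before=$(sum_cves "$1")
  after=$(sum_cves "$2")
  if [ "$before" -eq 0 ]; then
    echo "n/a"          # nothing to reduce
  else
    echo $(( 100 * ($before - $after) / $before ))
  fi
}

cve_reduction "4/44/51/21" "0/2/4/0"    # Keycloak row: prints 95
cve_reduction "0/16/5/0" "0/0/2/0"      # Crossplane row: prints 90
```

Because all severities are weighted equally here, a severity-weighted score would give different numbers; the totals are only a coarse indicator.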

Action Items

Immediate (High Priority)

  1. Set up OCM tooling and access
       • Install OCM CLI (ocm)
       • Configure registry credentials for ghcr.io/platform-mesh
       • Set up Chainguard registry access for testing

  2. Update outdated infrastructure images to latest upstream versions (many CVE fixes available)
       • Traefik: v3.6.0 → v3.6.7 (via OCM component-constructor.yaml)
       • Cert-manager: v1.19.1 → v1.19.2
       • Flux controllers: Update all three to latest versions

  3. Standardize PostgreSQL version across all components
       • Currently using both 15.4.0 and 17.6.0
       • Migrate all to PostgreSQL 17.6.0-debian-12-r4
       • Update OCM component descriptors accordingly

  4. Replace ETCD Druid latest tag with pinned version v0.34.0

  5. Enable imagePullSecrets configuration for private registry support
       • Add to Helm charts/KRO ResourceGraphDefinition
       • Required for Chainguard image migration

Short-term (2-4 weeks)

  1. Pilot Chainguard images for cert-manager and Flux controllers (low risk, high security gain)
       • Update component-constructor.yaml with Chainguard imageReferences
       • Configure imagePullSecrets for cgr.dev/chainguard-private
       • Test in KIND cluster environment
       • Package and transfer to OCM registry

  2. Centralize OCM version management
       • Create single source of truth for image versions (e.g., .env file)
       • Update CI/CD workflows to use centralized variables
       • Ensure template variables ({{ .VERSION }}) are consistently applied

  3. Implement image signing for Platform-Mesh operators
       • Use cosign for container image signatures
       • Integrate with OCM component descriptors
       • Add verification to deployment pipelines

  4. Document OCM image update process and approval workflows
       • Create runbook for updating images via component-constructor.yaml
       • Define approval process for CVE-related emergency updates
       • Establish testing requirements before OCM registry push

  5. Audit and consolidate "upstream-images" vs "images" repositories
       • Document Bitnami rebasing strategy
       • Evaluate if separate repos are still necessary
       • Consider direct Chainguard migration instead of Bitnami rebasing

Medium-term (1-3 months)

  1. Migrate critical dependencies to Chainguard (PostgreSQL, Keycloak) with thorough testing
  2. Rebuild Platform-Mesh operators with Wolfi/distroless base images
  3. Implement automated CVE scanning in CI/CD pipelines
  4. Establish release cadence and changelog automation

References

Internal Documentation

Platform-Mesh OCM Resources

External Resources