Evaluation

This document summarizes the findings from evaluating the Keycloak deployment and integration in Platform-Mesh, including advantages, disadvantages, and open questions.

Advantages

Finding | Description
Bitnami Charts & Images | Uses the well-maintained Bitnami Helm charts and container images, reducing the custom maintenance burden.
No External IdP Dependency | A self-managed Keycloak instance avoids depending on external IdPs, whose configuration drift could cause system-wide issues.
Operator-Managed Lifecycle | The Security Operator manages realms, users, and clients programmatically. Whether it only bootstraps these resources or manages their full lifecycle still needs to be clarified (see Open Questions).

Disadvantages

Finding | Description
Mirrored Bitnami Images | Bitnami images are mirrored internally, presumably to work around the "latest-only-free" limitation (only the latest image tags are available free of charge).
Image Replacement Difficulty | Experience with PostgreSQL has shown that Bitnami images cannot easily be swapped for alternatives (e.g., Chainguard), because the Bitnami images include additional initialization logic.
Weak Auto-Generated Secrets | The Bitnami chart auto-generates relatively weak secrets (10 characters, no special characters).
Overly Permissive Service Accounts | The Security Operator and the IAM Service use service accounts with full admin privileges on Keycloak.
Java Technology Stack | Keycloak is built on Java, which may present a knowledge gap for teams without Java expertise.
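
To put the weak-secret finding in numbers, the following rough sketch compares the brute-force search space of a 10-character secret without special characters with that of a 32-character secret that includes them. The alphabet sizes (62 alphanumeric symbols and 94 printable ASCII symbols) are illustrative assumptions, not values read from the chart.

```python
import math


def entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy in bits of a uniformly random secret of the given length."""
    return length * math.log2(alphabet_size)


# Assumption: the current default draws from a-z, A-Z and 0-9 (62 symbols).
current = entropy_bits(length=10, alphabet_size=62)

# Assumption: a stronger secret draws from 94 printable ASCII symbols.
proposed = entropy_bits(length=32, alphabet_size=94)

print(f"current default: {current:.1f} bits")
print(f"32 characters:   {proposed:.1f} bits")
```

Under these assumptions the current default provides roughly 60 bits of entropy, while a 32-character secret with special characters provides roughly 210 bits.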

Neutral Observations

Finding | Description
Minimal Chart Customization | The deployment currently uses mostly Bitnami chart defaults with minimal customization.
Open Network Access | By default, the NetworkPolicy allows all pods to reach the PostgreSQL port and the Keycloak UI/API. The Helm chart supports configuration to restrict this.
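
As an illustration of what a tighter rule could look like, the sketch below builds a Kubernetes NetworkPolicy manifest that admits ingress to PostgreSQL only from Keycloak pods. The policy name, namespace, and pod labels are hypothetical and would have to match the labels actually set by the deployed charts; the function itself is only a helper for this example.

```python
import json


def restrict_ingress_policy(name, namespace, target_labels, allowed_labels, port):
    """Build a NetworkPolicy that admits ingress to the target pods only
    from pods carrying the allowed labels, on a single TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods the policy applies to.
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # A bare podSelector only matches pods in the
                    # policy's own namespace.
                    "from": [{"podSelector": {"matchLabels": allowed_labels}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical names and labels, for illustration only.
policy = restrict_ingress_policy(
    name="postgresql-from-keycloak",
    namespace="platform-mesh-system",
    target_labels={"app.kubernetes.io/name": "postgresql"},
    allowed_labels={"app.kubernetes.io/name": "keycloak"},
    port=5432,
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be applied with `kubectl apply -f`. NetworkPolicies are additive, so a rule like this only takes effect once the permissive default rule is disabled or tightened as well.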

Open Questions

Question | Context
Backup & Recovery Strategy? | No backup/recovery strategy is currently defined. Should Velero be used for Keycloak state?
Bootstrap vs. Full Management? | Does the Security Operator only bootstrap Keycloak resources, or does it continuously reconcile them?
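
The practical difference between the two models in the second question can be shown with a toy planner. This is a simplified model written for this document, not the Security Operator's actual logic; the client names and settings are made up, and whether a reconciler should delete resources it does not know about is itself a design decision.

```python
def plan(desired: dict, actual: dict, mode: str) -> list:
    """Return the actions needed to move `actual` toward `desired`.

    mode="bootstrap": only create resources that are missing.
    mode="reconcile": also correct drift and remove unknown resources.
    """
    if mode not in ("bootstrap", "reconcile"):
        raise ValueError(f"unknown mode: {mode}")

    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif mode == "reconcile" and actual[name] != spec:
            actions.append(("update", name))
    if mode == "reconcile":
        for name in actual:
            if name not in desired:
                actions.append(("delete", name))
    return actions


# Hypothetical Keycloak clients, for illustration only.
desired = {
    "portal": {"redirectUris": ["https://portal.example/*"]},
    "cli": {"publicClient": True},
}
actual = {
    "portal": {"redirectUris": ["*"]},  # drifted from the desired spec
    "legacy": {},                        # not part of the desired state
}

print(plan(desired, actual, mode="bootstrap"))
print(plan(desired, actual, mode="reconcile"))
```

In bootstrap mode only the creation of `cli` is planned, so the drifted `portal` client and the unknown `legacy` client stay as they are. In reconcile mode the plan also updates `portal` and deletes `legacy`, which matters for how manual changes in the Keycloak UI are treated.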

Analysis

When Self-Managed Keycloak Makes Sense

Self-managing Keycloak within Platform-Mesh provides value when:

  • Full control over IdP configuration and lifecycle is required
  • Air-gapped or isolated environments cannot reach external IdPs
  • Multi-tenant realm isolation is needed with programmatic provisioning
  • Tight integration with platform operators is needed for user/client lifecycle management

When External IdP May Be Preferable

Consider using an external IdP when:

  • Managed IdP services (e.g., Auth0, Okta, Microsoft Entra ID, formerly Azure AD) are already in use
  • Operational overhead of managing Keycloak is not justified
  • Enterprise SSO integration is already established elsewhere
  • High availability requirements exceed what can be self-managed

Short-Term

  1. Strengthen secret generation - Override Bitnami defaults to generate stronger secrets (special characters, minimum 32 characters)
  2. Review service account permissions - Apply principle of least privilege to Security Operator and IAM Service Keycloak clients
  3. Restrict network access - Configure NetworkPolicies to limit PostgreSQL and Keycloak API access to required components only
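
A minimal sketch of the first item, assuming the secret is generated outside the chart and then supplied to it (for example through a pre-created Kubernetes Secret, if the chart in use supports referencing one). The set of special characters is an assumption and may need to be narrowed if a consumer, such as a connection string, cannot handle some of them.

```python
import secrets
import string

# Assumption: these special characters are acceptable to all consumers.
SPECIALS = "!@#$%^&*()-_=+"
ALPHABET = string.ascii_letters + string.digits + SPECIALS


def generate_secret(length: int = 32) -> str:
    """Generate a random secret with at least one lowercase letter,
    uppercase letter, digit, and special character."""
    if length < 32:
        raise ValueError("length must be at least 32")
    while True:
        # secrets.choice uses a cryptographically secure random source.
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if (
            any(c.islower() for c in candidate)
            and any(c.isupper() for c in candidate)
            and any(c.isdigit() for c in candidate)
            and any(c in SPECIALS for c in candidate)
        ):
            return candidate


if __name__ == "__main__":
    print(generate_secret())
```

Regenerating the whole candidate until every character class is present keeps the accepted secrets uniformly distributed, which patching individual characters into a fixed position would not.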

Medium-Term

  1. Define backup strategy - Evaluate Velero or PostgreSQL-native backups for Keycloak data
  2. Document lifecycle management - Clarify whether Security Operator bootstraps or continuously reconciles Keycloak resources
  3. Evaluate image alternatives - Assess the feasibility of using hardened images (Chainguard, Iron Bank) while retaining the required initialization logic
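
For the PostgreSQL-native option in the first item, a backup job could be as small as a wrapper around `pg_dump`. The sketch below only assembles and runs the command; the host, user, database name, and output directory are hypothetical, and the password is expected to come from the standard `PGPASSWORD` environment variable or a `.pgpass` file.

```python
import datetime
import subprocess


def pg_dump_command(host: str, user: str, database: str, out_dir: str) -> list:
    """Build a pg_dump command that writes a timestamped custom-format dump."""
    stamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out_file = f"{out_dir}/{database}-{stamp}.dump"
    return [
        "pg_dump",
        "--host", host,
        "--username", user,
        "--format", "custom",  # custom format can be restored with pg_restore
        "--file", out_file,
        database,
    ]


def run_backup(host: str, user: str, database: str, out_dir: str) -> None:
    """Run the dump and raise CalledProcessError if pg_dump fails."""
    subprocess.run(pg_dump_command(host, user, database, out_dir), check=True)


if __name__ == "__main__":
    # Hypothetical connection details, for illustration only.
    print(" ".join(pg_dump_command("keycloak-postgresql", "keycloak", "keycloak", "/backups")))
```

A database dump captures realms, users, and clients stored in PostgreSQL, whereas Velero would back up Kubernetes resources and volumes, so the two options are not mutually exclusive.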

Long-Term

  1. High availability - Plan for Keycloak HA deployment with multiple replicas and session replication
  2. Monitoring & alerting - Collect Keycloak metrics and alert on signals such as authentication failures and latency
  3. Disaster recovery testing - Regularly test backup/restore procedures

Component Trade-off Matrix

Aspect | Self-Managed Keycloak | External IdP (e.g., Auth0, Okta)
Complexity | High | Low
Operational overhead | Significant | Minimal
Control | Full | Limited
Cost | Infrastructure only | License/subscription fees
Multi-tenancy | Native (realms) | Depends on provider
Customization | Unlimited | Provider constraints
Compliance | Self-managed | Provider-dependent

Risk Mitigation

Risk | Mitigation
Data loss | Implement automated backups with tested restore procedures
Weak credentials | Override Bitnami secret generation with strong defaults
Privilege escalation | Reduce service account permissions to minimum required
Network exposure | Restrict access via NetworkPolicies
Java vulnerabilities | Keep Keycloak updated, monitor CVEs
Knowledge gap | Document operational procedures, consider training