Resource Broker - Overview & Architecture Analysis¶
Relevance for CloudAPI Project
NOT IMMEDIATELY RELEVANT - Resource Broker is a future enhancement ("cherry on top") for our CloudAPI/CLI/Portal scenario.
Current Priority: CloudAPI, CLI, Portal implementation
Future Consideration: Resource Broker as cloud broker layer once core CloudAPI is stable
Why Document Now?: Understanding Platform Mesh's full capabilities helps inform our architecture decisions, even though we won't implement this advanced feature initially.
Executive Summary
Resource Broker is a Kubernetes operator that abstracts multiple vendor APIs into unified, transferable specifications, enabling zero-downtime resource migration and dynamic provider selection. It solves vendor lock-in by allowing platforms to standardize how databases, certificates, and similar resources are requested and provisioned across diverse backends.
Key Innovation: Change a single field (e.g., domain suffix) and resources automatically migrate between providers without application code changes.
Our Use Case: After CloudAPI/CLI/Portal are stable, Resource Broker could provide multi-cloud brokering capabilities - automatically routing cloud resources (compute, storage, databases) to different providers based on policies.
Status: Alpha (v1alpha1) | License: Apache 2.0 | Language: Go (84%)
Repository: https://github.com/platform-mesh/resource-broker
Quick Facts¶
| Category | Details |
|---|---|
| Purpose | Multi-provider resource abstraction & zero-downtime migration |
| Maturity | Alpha (active development, 360 commits) |
| API Version | v1alpha1 (broker.platform-mesh.io) |
| Core CRDs | AcceptAPI, Migration, MigrationConfiguration |
| Integration | KCP APIExport/APIBinding, Virtual Workspaces |
| Key Feature | Zero-downtime lifecycle management with staged migrations |
| Use Cases | Multi-tenant SaaS, hybrid cloud, DR, cost optimization |
| Dependencies | KCP, cert-manager (example), KRO |
What is Resource Broker?¶
Core Value Proposition
Organizations can offer generic APIs while dynamically routing fulfillment to different providers based on policies and capabilities. No vendor lock-in, no manual migrations, no downtime.
Resource Broker abstracts multiple vendor APIs into unified specifications, solving the fundamental problem that "running services requires other services".
The Problem It Solves¶
Traditional Approach Problems
Vendor Lock-In
- Applications directly depend on specific provider APIs (AWS RDS, Azure Database)
- Changing providers requires code refactoring
Manual Migrations
- Moving workloads requires downtime windows
- Complex coordination across teams
- High risk of data loss or service disruption
Platform Complexity
- Each provider has different APIs and configurations
- Operations teams must learn multiple systems
- No unified monitoring or management
No Abstraction
- Consumers must know provider-specific details
- Cannot switch providers without application changes
- Testing against multiple providers is difficult
Resource Broker Solution¶
The Magic of Abstraction
Change app.internal.corp → app.corp.com and the broker automatically switches providers without code changes, downtime, or manual intervention.
┌─────────────────────────────────────────────────────────────┐
│ Consumer Application │
│ (Generic Certificate API) │
│ │
│ kubectl apply -f certificate.yaml │
│ spec: │
│ fqdn: app.internal.corp ← Change this one field! │
└────────────────────────┬────────────────────────────────────┘
│
│ Creates: Certificate
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Resource Broker │
│ (Routing & Lifecycle Management) │
│ │
│ • Filter Matching • Zero-Downtime Migration │
│ • Provider Selection • Automated Cutover │
└──────┬──────────────────────────────────┬───────────────────┘
│ │
│ *.internal.corp │ *.corp.com
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ InternalCA │ │ ExternalCA │
│ Provider │ │ Provider │
│ (cert-manager) │ │ (Let's Encrypt)│
│ Private PKI │ │ Public CA │
└─────────────────┘ └─────────────────┘
Real-World Scenario
Before Resource Broker: Migrating from internal CA to Let's Encrypt
- Update application code (certificate paths, CA bundles)
- Change deployment configs (secrets, volumes)
- Schedule downtime window (coordinate with stakeholders)
- Manual certificate reissuance (for all apps)
- Update secret references (in multiple places)
- Test thoroughly (high risk)
- Downtime: 2-4 hours
With Resource Broker:
- Change one field: fqdn: app.corp.com (or a label)
- Broker detects change and matches new provider
- Routes to Let's Encrypt automatically
- New certificate issued in background
- Zero downtime cutover
- Automatic secret sync to consumer
- Downtime: 0 seconds
Architecture¶
Three-Tier Design Philosophy
Resource Broker uses a separation-of-concerns architecture in which consumers, coordination logic, and providers are completely decoupled. This enables dynamic routing, zero-downtime migrations, and vendor independence.
Three Distinct Roles¶
1. Platform/Coordination Cluster¶
Platform Layer Details
Purpose: Central control plane for routing and lifecycle management
Hosts:
- Resource Broker Operator
- KCP Control Plane
- APIExports (AcceptAPI, Generic Resource APIs)
- Virtual Workspaces
Responsibilities:
- Route requests to matching providers
- Record migration states
- Orchestrate zero-downtime cutover
- Manage provider health and availability
- Enforce routing policies
Location: Typically shared infrastructure cluster
2. Consumer Clusters/Workspaces¶
Consumer Layer Details
Purpose: Where users create high-level resource requests
User Experience:
# Generic API - no provider-specific details!
apiVersion: example.com/v1alpha1
kind: Certificate
metadata:
name: my-app-cert
spec:
fqdn: app.internal.corp
validity: 90d
Benefits:
- Simplified API: No provider knowledge needed
- Abstraction: Same API across all providers
- Portability: Move between providers seamlessly
- Consistency: Uniform resource model
- Security: No direct provider access needed
Location: Per-team/per-tenant workspaces
3. Provider Clusters/Workspaces¶
Provider Layer Details
Purpose: Execute actual resource provisioning
Components:
- Specialized controllers (cert-manager, database operators)
- AcceptAPI declarations (capability advertising)
- Physical infrastructure (databases, CAs, storage)
Capabilities:
# Provider declares: "I can serve Certificates for *.internal.corp"
kind: AcceptAPI
spec:
gvr:
resource: certificates
filters:
- key: spec.fqdn
suffix: ".internal.corp"
Benefits:
- Specialization: Each provider optimized for specific use cases
- Isolation: Providers don't see other providers
- Flexibility: Add/remove providers dynamically
- Cost Control: Route to cheaper providers for dev/test
Location: Per-provider infrastructure (cloud regions, on-prem, etc.)
Component Diagram¶
┌────────────────────────────────────────────────────────────────┐
│ Platform Cluster (KCP) │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Resource Broker Operator │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌───────────────┐ │ │
│ │ │ Routing │ │ Migration │ │ Provider │ │ │
│ │ │ Logic │ │ Orchestrator │ │ Selection │ │ │
│ │ └──────────────┘ └──────────────┘ └───────────────┘ │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────┐ ┌───────────────────────────┐ │
│ │ AcceptAPI │ │ Generic Resource APIs │ │
│ │ (APIExport) │ │ (Certificate, Database) │ │
│ │ │ │ (APIExport) │ │
│ │ Providers bind │ │ Consumers bind │ │
│ │ to declare caps │ │ to create resources │ │
│ └──────────────────┘ └───────────────────────────┘ │
│ │
└────────┬────────────────────────────────────────┬──────────────┘
│ │
│ APIBinding │ APIBinding
│ (Provider capabilities) │ (Consumer requests)
│ │
┌────▼─────────────────┐ ┌───────▼──────────────┐
│ Provider Workspace │ │ Consumer Workspace │
│ (InternalCA) │ │ (Team-A) │
│ │ │ │
│ ┌────────────────┐ │ │ ┌─────────────────┐│
│ │ AcceptAPI │ │ │ │ Certificate ││
│ │ - GVR: Cert │ │ │ │ - FQDN: app... ││
│ │ - Filter: │ │ │ │ - Status: Ready││
│ │ *.internal │ │ │ └─────────────────┘│
│ └────────────────┘ │ │ │
│ │ │ Secret synced │
│ cert-manager + │ │ automatically │
│ KRO │ │ │
│ │ └──────────────────────┘
│ Physical PKI │
│ │
└──────────────────────┘
Key Features¶
1. AcceptAPI Declaration¶
Provider Capability Advertising
Providers declare which APIs they can service with optional filtering based on resource properties.
Example: Accept certificates only for internal domains
apiVersion: broker.platform-mesh.io/v1alpha1
kind: AcceptAPI
metadata:
name: internal-ca-certificates
namespace: internal-ca-provider
spec:
gvr:
group: example.com
version: v1alpha1
resource: certificates
filters:
- key: spec.fqdn
suffix: ".internal.corp"
- key: spec.certType
valueIn: ["server", "client"]
status:
conditions:
- type: Available
status: "True"
Filter Capabilities
Suffix Matching: suffix: ".internal.corp" → Matches app.internal.corp
Value Lists: valueIn: ["postgres", "mysql"] → Matches specific values
Numeric Ranges: boundary: {min: 10, max: 1000} → Storage between 10-1000 GB
Multiple Filters: Logical AND - resource must satisfy ALL filters
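The filter rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the operator's actual matching code (which is written in Go): the `Filter` class, `lookup`, `filter_matches`, and `accepts` are hypothetical names, and two behaviours are assumptions not stated in the source: boundaries are treated as inclusive, and an empty filter list accepts every resource.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Filter:
    """One AcceptAPI filter entry (hypothetical Python mirror of the YAML)."""
    key: str                            # dotted path, e.g. "spec.fqdn"
    suffix: Optional[str] = None        # YAML: suffix
    value_in: Optional[list] = None     # YAML: valueIn
    boundary: Optional[dict] = None     # YAML: boundary {min, max}


def lookup(resource: dict, dotted_key: str) -> Any:
    """Walk a dotted path through nested dicts; return None if a part is missing."""
    current: Any = resource
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def filter_matches(resource: dict, f: Filter) -> bool:
    """Check a single filter against the resource."""
    value = lookup(resource, f.key)
    if value is None:
        return False
    if f.suffix is not None and not str(value).endswith(f.suffix):
        return False
    if f.value_in is not None and value not in f.value_in:
        return False
    if f.boundary is not None:
        if not isinstance(value, (int, float)):
            return False
        # Assumption: min and max are both inclusive.
        if not (f.boundary["min"] <= value <= f.boundary["max"]):
            return False
    return True


def accepts(resource: dict, filters: List[Filter]) -> bool:
    """Logical AND over all filters (assumption: no filters means accept all)."""
    return all(filter_matches(resource, f) for f in filters)


internal_ca = [
    Filter(key="spec.fqdn", suffix=".internal.corp"),
    Filter(key="spec.certType", value_in=["server", "client"]),
]
cert = {"spec": {"fqdn": "app.internal.corp", "certType": "server"}}
print(accepts(cert, internal_ca))
```

Because the filters are combined with AND, changing `spec.fqdn` to `app.corp.com` is enough for this provider to stop matching, even though `spec.certType` still does.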
2. Zero-Downtime Lifecycle Management¶
The Killer Feature
When a provider can no longer back a resource, the operator provisions a replacement elsewhere before switching consumers over. Critical for stateful services like databases.
How it Works:
sequenceDiagram
participant C as Consumer
participant RB as Resource Broker
participant P1 as Provider A
participant P2 as Provider B
C->>RB: Resource no longer matches Provider A
RB->>P2: Provision replacement on Provider B
P2->>RB: Resource Ready
RB->>RB: Initial data copy (if stateful)
RB->>C: Switch pointer to Provider B
RB->>P1: Cleanup old resource
- Detect Change: Resource properties no longer match current provider
- Provision New: Create replacement on matching provider (background)
- Data Copy: Replicate state if stateful (databases, storage)
- Cutover: Switch consumer reference (atomic operation)
- Cleanup: Remove old resource after confirmation
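The ordering of these steps (provision first, switch second, delete last) is what keeps the consumer served throughout. The sketch below simulates that order in memory; `Provider`, `ConsumerRef`, and `migrate` are hypothetical names for illustration only, and real data replication is reduced to a dictionary copy.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Provider:
    """Hypothetical stand-in for a provider workspace."""
    name: str
    resources: Dict[str, dict] = field(default_factory=dict)


@dataclass
class ConsumerRef:
    """What the consumer sees: a resource name and the provider backing it."""
    resource: str
    provider: Provider


def migrate(ref: ConsumerRef, target: Provider, stateful: bool = True) -> List[str]:
    """Move ref.resource to target and return the steps in execution order."""
    steps: List[str] = []
    source = ref.provider

    # Provision New: the replacement exists before anything is switched.
    target.resources[ref.resource] = {}
    steps.append("provision")

    # Data Copy: only needed for stateful resources.
    if stateful:
        target.resources[ref.resource].update(source.resources[ref.resource])
        steps.append("copy")

    # Cutover: a single reference swap on the consumer side.
    ref.provider = target
    steps.append("cutover")

    # Cleanup: the old resource is removed last.
    del source.resources[ref.resource]
    steps.append("cleanup")
    return steps


provider_a = Provider("provider-a", {"production-db": {"rows": 1200}})
provider_b = Provider("provider-b")
ref = ConsumerRef("production-db", provider_a)
steps = migrate(ref, provider_b)
print(steps, ref.provider.name)
```

At every point in `migrate`, `ref.provider` refers to a provider that actually holds the resource, which is the property the list above describes.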
Database Migration Example
Scenario: Move PostgreSQL from AWS RDS to Azure Database
Traditional Approach:
- Downtime: 4-6 hours
- Risk: Data loss, connection errors
- Team coordination: DBAs, DevOps, App teams
With Resource Broker:
- Downtime: 0 seconds (read-only mode: ~30 seconds)
- Automated: Data copy → Cutover → Cleanup
- Single operator action: Update a label or field
3. Migration Framework¶
Staged Migration Control
Controlled transitions between providers via Migration and MigrationConfiguration resources with defined stages.
Migration Stages:
stages:
1. Initial Data Copy
├─ Backup current data
├─ Replicate to new provider
└─ Verify consistency
2. Cutover
├─ Enable read-only mode (brief)
├─ Final delta sync
├─ Switch active pointer
└─ Verify application connectivity
3. Finalize
├─ Monitor new provider
├─ Cleanup old resources
└─ Remove migration artifacts
Example MigrationConfiguration:
apiVersion: broker.platform-mesh.io/v1alpha1
kind: MigrationConfiguration
metadata:
name: database-aws-to-azure
spec:
from:
group: database.example.com
version: v1alpha1
kind: Database
to:
group: azure.database.example.com
version: v1alpha1
kind: AzureDatabase
stages:
- name: initial-copy
successConditions:
- "status.copyProgress == 100"
- "status.consistencyCheck == 'passed'"
templates:
backup-job: |
apiVersion: batch/v1
kind: Job
metadata:
name: db-backup-{{ .From.Name }}
spec:
template:
spec:
containers:
- name: pg-dump
image: postgres:15
command: ["pg_dump", "-h", "{{ .From.Host }}"]
progress: true # Allow progression to next stage
- name: cutover
successConditions:
- "status.ready == true"
- "status.connections > 0"
templates:
switch-config: |
# Update connection strings
# Enable new endpoint
Success Conditions
Each stage has CEL expressions that must evaluate to true before progressing. This ensures data integrity and prevents premature cutover.
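As a rough illustration of this gating, the sketch below walks the stages in order and stops at the first one whose conditions are not all satisfied. It deliberately swaps CEL for plain Python predicates so that it runs without extra libraries; the `STAGES` table and `current_stage` function are hypothetical, and the condition values are taken from the example MigrationConfiguration above.

```python
from typing import Callable, List, Tuple

Condition = Callable[[dict], bool]

# Python predicates standing in for the CEL successConditions of each stage.
STAGES: List[Tuple[str, List[Condition]]] = [
    ("initial-copy", [
        lambda status: status.get("copyProgress") == 100,
        lambda status: status.get("consistencyCheck") == "passed",
    ]),
    ("cutover", [
        lambda status: status.get("ready") is True,
        lambda status: status.get("connections", 0) > 0,
    ]),
]


def current_stage(status: dict) -> str:
    """Return the first stage that is still blocked, or "finalize" if none is."""
    for name, conditions in STAGES:
        if not all(condition(status) for condition in conditions):
            return name
    return "finalize"


print(current_stage({"copyProgress": 100, "consistencyCheck": "failed"}))
```

A copy that reaches 100% but fails its consistency check keeps the migration in `initial-copy`, so cutover cannot begin on inconsistent data.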
Custom Resources (CRDs)¶
AcceptAPI¶
Purpose
Provider capability declaration - advertises which APIs a provider can serve
Structure:
apiVersion: broker.platform-mesh.io/v1alpha1
kind: AcceptAPI
metadata:
name: my-provider-api
namespace: provider-workspace
spec:
gvr: # GroupVersionResource to accept
group: example.com
version: v1alpha1
resource: databases
filters: # Optional resource constraints
- key: spec.engine
valueIn: ["postgres", "mysql"]
- key: spec.storage
boundary:
min: 10
max: 1000
status:
conditions:
- type: Available
status: "True"
reason: ProviderReady
Key Fields:
- spec.gvr: The API this provider can serve
- spec.filters: Optional constraints (which resources to accept)
- status.conditions: Provider availability status
Migration¶
Purpose
Resource migration between providers with zero downtime
Structure:
apiVersion: broker.platform-mesh.io/v1alpha1
kind: Migration
metadata:
name: db-migration-aws-to-azure
spec:
from: # Source resource
gvk:
group: example.com
version: v1alpha1
kind: Database
name: production-db
namespace: default
clusterName: aws-provider
to: # Target resource
gvk:
group: example.com
version: v1alpha1
kind: Database
name: production-db
namespace: default
clusterName: azure-provider
status:
state: CutoverCompleted # Pending | InitialInProgress | CutoverCompleted | Failed
stage: finalize
id: migration-12345
conditions:
- type: DataCopied
status: "True"
- type: CutoverComplete
status: "True"
Key Fields:
- spec.from: Source resource reference
- spec.to: Target resource reference
- status.state: Migration phase (lifecycle progression)
MigrationConfiguration¶
Purpose
Define HOW to migrate between two resource types (the migration playbook)
Structure:
apiVersion: broker.platform-mesh.io/v1alpha1
kind: MigrationConfiguration
metadata:
name: database-migration-config
spec:
from: # Source GVK
group: example.com
version: v1alpha1
kind: Database
to: # Target GVK
group: azure.example.com
version: v1alpha1
kind: AzureDatabase
stages: # Ordered migration steps
- name: initial-copy
successConditions: # CEL expressions
- "status.copyProgress == 100"
templates: # Kubernetes resources to create
backup-job: |
# Job YAML template
progress: true # Advance to next stage on success
Key Fields:
- spec.from/to: Source and target GVKs
- spec.stages: Ordered migration steps with success criteria
- spec.stages[].templates: Kubernetes resources to create per stage
Integration with Platform Mesh¶
KCP Workspaces¶
KCP APIExport/APIBinding Pattern
Resource Broker leverages KCP's APIExport/APIBinding mechanism for workspace isolation and virtual resource access.
Architecture:
Platform Workspace (root:platform-mesh)
├── AcceptAPI (APIExport) ← Providers bind this to advertise capabilities
├── Certificate (APIExport) ← Consumers bind this to request resources
└── Resource Broker Operator (sees all via Virtual Workspace)
Provider Workspace (root:providers:internal-ca)
├── APIBinding → AcceptAPI (from platform)
├── AcceptAPI CR (declares: "I serve *.internal.corp certs")
└── Compute Cluster (physical cert-manager)
Consumer Workspace (root:orgs:acme:production)
├── APIBinding → Certificate (from platform)
└── Certificate CR (requests: cert for app.internal.corp)
Benefits of this Architecture
- No Direct Access: Consumers don't need provider kubeconfigs or credentials
- Workspace Isolation: Providers can't see other providers or consumers
- Virtual Workspaces: Resource Broker sees all resources through APIExport virtual workspaces
- Multi-Tenancy: Perfect for SaaS platforms with per-customer workspaces
Virtual Workspace Magic¶
Request Flow Through Virtual Workspaces
1. Consumer creates Certificate
├─ Workspace: root:orgs:acme:production
└─ Resource: Certificate(fqdn: app.internal.corp)
2. APIExport Virtual Workspace exposes it
├─ Resource Broker sees it via VirtualWorkspace
└─ All consumers' resources visible in one view
3. Resource Broker matches filters
├─ InternalCA: *.internal.corp MATCH
├─ ExternalCA: *.corp.com NO MATCH
└─ Routes to InternalCA
4. Provider sees resource via APIBinding
├─ InternalCA workspace has APIBinding to Certificate APIExport
└─ Certificate appears in provider workspace
5. Provider fulfills request
├─ KRO creates cert-manager Certificate
├─ cert-manager issues certificate
└─ Stores in Secret
6. Secret syncs back through Virtual Workspace
├─ api-syncagent syncs from provider cluster to KCP
├─ Secret appears in consumer workspace
└─ Consumer sees Certificate status: Ready + Secret reference
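Step 3 of this flow, and the way an FQDN change triggers a provider switch, can be sketched as follows. The `PROVIDERS` table, `select_provider`, and `needs_migration` are hypothetical names; the sketch returns the first match in table order, which is an assumption, since the source does not say how the broker resolves several matching providers.

```python
from typing import Dict, Optional

# Hypothetical routing table: provider name -> accepted FQDN suffix.
PROVIDERS: Dict[str, str] = {
    "InternalCA": ".internal.corp",
    "ExternalCA": ".corp.com",
}


def select_provider(fqdn: str) -> Optional[str]:
    """Return the first provider whose suffix matches, or None if none does."""
    for name, suffix in PROVIDERS.items():
        if fqdn.endswith(suffix):
            return name
    return None  # No match: the request cannot be routed.


def needs_migration(old_fqdn: str, new_fqdn: str) -> bool:
    """A changed FQDN only matters if it maps to a different provider."""
    return select_provider(old_fqdn) != select_provider(new_fqdn)


for fqdn in ("app.internal.corp", "app.corp.com", "app.example.org"):
    print(fqdn, "->", select_provider(fqdn))
```

Renaming `app.internal.corp` to `api.internal.corp` keeps the same provider, so `needs_migration` is false; moving to `app.corp.com` changes the match and would start the lifecycle described earlier.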
Use Cases¶
1. Multi-Tenant SaaS Platforms¶
Scenario: Different Backends Per Customer
Problem: Different customers need different database backends based on compliance, performance, or cost requirements.
Solution with Resource Broker:
# Customer A (GDPR compliance - needs EU data residency)
apiVersion: example.com/v1alpha1
kind: Database
metadata:
name: customer-a-db
labels:
region: eu-central-1
spec:
engine: postgres
storage: 100Gi
# Broker routes to EU-based provider automatically
Providers:
- EU Provider: Accepts region: eu-*
- US Provider: Accepts region: us-*
- On-Prem Provider: Accepts deployment-model: on-premises
Customer Experience: Same API, automatic routing based on labels/fields
2. Disaster Recovery & Migration¶
Scenario: Provider Failover
Problem: Need to migrate databases from failing provider to healthy one with minimal downtime.
Solution:
apiVersion: broker.platform-mesh.io/v1alpha1
kind: Migration
metadata:
name: failover-migration
spec:
from:
name: production-db
clusterName: failing-provider
to:
name: production-db
clusterName: healthy-provider
Timeline:
- 00:00 - Migration started
- 00:15 - Initial data copy complete (15 min)
- 00:16 - Cutover (about 30 seconds in read-only mode)
- 00:20 - Finalize and cleanup
- Total Disruption: about 30 seconds of read-only mode (vs. hours with the traditional approach)
3. Cost Optimization¶
Scenario: Environment-Based Routing
Problem: Want to use cheaper providers for dev/staging, expensive enterprise providers for production.
Solution:
# Dev/Staging: Free Let's Encrypt
kind: AcceptAPI
metadata:
name: letsencrypt-ca
spec:
gvr:
resource: certificates
filters:
- key: metadata.labels.environment
valueIn: ["dev", "staging"]
---
# Production: Paid Enterprise CA with SLA
kind: AcceptAPI
metadata:
name: enterprise-ca
spec:
gvr:
resource: certificates
filters:
- key: metadata.labels.environment
valueIn: ["production"]
Cost Savings: ~70% reduction by routing non-prod to free providers
4. Hybrid Cloud¶
Scenario: Unified API Across Cloud and On-Prem
Problem: Need to support both on-premises PKI and cloud-based certificate authorities.
Solution: Single Certificate API, multiple providers
Certificate API (Generic)
├─ InternalCA (on-prem PKI) → *.internal.corp
├─ Let's Encrypt (cloud) → *.corp.com
└─ DigiCert (enterprise) → *.public.corp.com
Developer Experience: Same kubectl apply -f certificate.yaml regardless of target provider
Technical Details¶
Technology Stack¶
| Component | Technology | Version |
|---|---|---|
| Language | Go | 1.21+ |
| Framework | controller-runtime | v0.22+ |
| API Machinery | k8s.io/apimachinery | v0.35+ |
| KCP Integration | github.com/kcp-dev/kcp | Latest |
| Testing | Ginkgo + Gomega | - |
| Build | Docker + make | - |
Repository Structure¶
resource-broker/
├── api/
│ ├── broker/v1alpha1/ # Core CRD definitions
│ │ ├── acceptapi_types.go
│ │ ├── migration_types.go
│ │ └── migrationconfiguration_types.go
│ └── example/v1alpha1/ # Example resource types
│
├── cmd/ # Operator binary entry points
│ └── main.go
│
├── pkg/ # Core operator logic
│ ├── controllers/ # Reconciliation loops
│ │ ├── acceptapi_controller.go
│ │ └── migration_controller.go
│ └── routing/ # Provider selection logic
│
├── config/ # Kubernetes manifests
│ ├── crd/ # CRD YAML files
│ ├── rbac/ # RBAC permissions
│ └── manager/ # Operator deployment
│
├── examples/
│ └── kcp-certs/ # Certificate brokering example
│ ├── README.md # Step-by-step walkthrough
│ └── setup.sh
│
├── contrib/kcp/ # KCP integration components
├── dist/chart/ # Helm chart for deployment
└── test/ # Test suites
├── e2e/ # End-to-end tests
└── integration/ # Integration tests
Comparison with Alternatives¶
vs. Crossplane¶
| Feature | Resource Broker | Crossplane |
|---|---|---|
| Primary Focus | Dynamic routing & zero-downtime migration | Infrastructure provisioning (IaC) |
| Provider Model | KCP workspaces + APIExport | Provider packages |
| Migration Support | Built-in with staged migrations | Manual (requires custom logic) |
| Multi-Tenancy | Native (KCP integration) | Requires additional setup |
| Abstraction Level | High (generic APIs, filters) | Medium (compositions) |
| Use Case | Multi-tenant SaaS, frequent migrations | GitOps IaC, cloud provisioning |
When to Choose
Resource Broker: Multi-tenant platforms, zero-downtime requirements, dynamic provider selection
Crossplane: Single-tenant, GitOps workflows, cloud infrastructure provisioning
vs. Service Catalog¶
| Feature | Resource Broker | Service Catalog (DEPRECATED) |
|---|---|---|
| Status | Active development | Deprecated (archived) |
| Routing | Dynamic, policy-based | Static broker selection |
| Migration | Automated with stages | Not supported |
| KCP Integration | Native | None |
| Filter Matching | Advanced (suffix, valueIn, boundaries) | Basic plans |
Deployment¶
Prerequisites¶
# Required components
- Kubernetes 1.28+
- KCP v0.30.0+
- kubectl 1.28+
- Helm 3.12+ (optional, for Helm installation)
Helm Installation¶
# Add Helm repository (when available)
helm repo add platform-mesh https://platform-mesh.github.io/helm-charts
helm repo update
# Install resource-broker
helm install resource-broker platform-mesh/resource-broker \
--namespace platform-mesh-system \
--create-namespace \
--set image.tag=latest
# Verify installation
kubectl get deploy -n platform-mesh-system resource-broker
kubectl get crd | grep broker.platform-mesh.io
Manual Installation¶
# Clone repository
git clone https://github.com/platform-mesh/resource-broker.git
cd resource-broker
# Install CRDs
make install
# Deploy operator
make deploy IMG=platform-mesh/resource-broker:latest
# Verify CRDs installed
kubectl get crd | grep broker.platform-mesh.io
# Expected output:
# acceptapis.broker.platform-mesh.io
# migrations.broker.platform-mesh.io
# migrationconfigurations.broker.platform-mesh.io
# Check operator running
kubectl get pods -n resource-broker-system
Configuration¶
Operator Environment Variables
Configure Resource Broker behavior via deployment environment variables:
env:
- name: ENABLE_MIGRATION
value: "true" # Enable migration features
- name: RECONCILE_INTERVAL
value: "30s" # Reconciliation frequency
- name: LEADER_ELECTION
value: "true" # Enable HA leader election
- name: METRICS_ADDR
value: ":8080" # Prometheus metrics endpoint
- name: HEALTH_PROBE_ADDR
value: ":8081" # Health/readiness probes
- name: LOG_LEVEL
value: "info" # Logging level (debug, info, warn, error)
Monitoring & Observability¶
Prometheus Metrics¶
Exposed Metrics
Resource Broker exposes Prometheus metrics on port 8080 by default:
# AcceptAPI availability
resource_broker_acceptapi_available{name="internal-ca", namespace="provider"} 1
# Active migrations
resource_broker_migrations_active{state="CutoverInProgress"} 3
resource_broker_migrations_total{state="Completed"} 45
# Provider selection timing
resource_broker_routing_duration_seconds{provider="internal-ca"} 0.05
# Reconciliation metrics
resource_broker_reconcile_duration_seconds{controller="acceptapi"} 0.12
resource_broker_reconcile_errors_total{controller="migration"} 2
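For quick checks without a full Prometheus setup, samples like these can be parsed with a small script. The parser below is a simplified sketch that handles only the `name{labels} value` line shape shown above (it does not cover escaped quotes, timestamps, or other parts of the exposition format); `parse_metrics` and `SAMPLE` are illustrative names, and the sample values are copied from this page rather than from a live operator.

```python
import re
from typing import Dict, List, Tuple

SAMPLE = """\
resource_broker_acceptapi_available{name="internal-ca", namespace="provider"} 1
resource_broker_migrations_active{state="CutoverInProgress"} 3
resource_broker_migrations_total{state="Completed"} 45
resource_broker_reconcile_errors_total{controller="migration"} 2
"""

# metric name, optional {label block}, whitespace, value
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_metrics(text: str) -> List[Sample]:
    """Parse simple exposition lines into (name, labels, value) tuples."""
    samples: List[Sample] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # Skip blanks and comment lines.
        match = LINE.match(line)
        if match is None:
            continue  # Skip anything this simplified parser does not understand.
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


samples = parse_metrics(SAMPLE)
errors = sum(v for name, _, v in samples if name == "resource_broker_reconcile_errors_total")
print(len(samples), errors)
```

In practice the input would come from the metrics endpoint rather than a string literal, and a non-zero error counter would be the signal to look at the operator logs.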
Health Checks¶
# Liveness probe
curl http://localhost:8081/healthz
# Returns: 200 OK if operator is alive
# Readiness probe
curl http://localhost:8081/readyz
# Returns: 200 OK if operator is ready to serve requests
# Leader election status
kubectl get lease -n platform-mesh-system resource-broker
Logging¶
# View operator logs
kubectl logs -n platform-mesh-system deployment/resource-broker -f
# Filter for specific events
kubectl logs -n platform-mesh-system deployment/resource-broker | grep -i "routing\|migration"
# Enable debug logging
kubectl set env -n platform-mesh-system deployment/resource-broker LOG_LEVEL=debug
Troubleshooting¶
Common Issues¶
Resources Not Routed to Provider¶
Symptom
Consumer creates resource but it never gets fulfilled. Resource stays in Pending state.
Debugging:
# Check AcceptAPI status
kubectl get acceptapi -A
kubectl describe acceptapi internal-ca -n provider-workspace
# Verify filters match resource
kubectl get acceptapi internal-ca -n provider-workspace -o yaml | grep -A 10 filters
kubectl get certificate my-cert -o yaml | grep -A 10 spec
# Check resource-broker routing logs
kubectl logs -n platform-mesh-system deployment/resource-broker \
| grep -i "routing\|filter\|provider selection"
Common Causes:
- Filters don't match resource properties
- AcceptAPI not in Available state
- Provider workspace not bound to AcceptAPI APIExport
- Virtual Workspace not accessible to Resource Broker
Migration Stuck in InitialInProgress¶
Symptom
Migration starts but never progresses past InitialInProgress state.
Debugging:
# Check migration status
kubectl get migration db-migration -o yaml
# Check stage conditions (CEL expressions)
kubectl get migration db-migration -o jsonpath='{.status.conditions}' | jq .
# Check MigrationConfiguration success conditions
kubectl get migrationconfiguration db-config -o yaml | grep -A 5 successConditions
# View migration controller logs
kubectl logs -n platform-mesh-system deployment/resource-broker \
| grep -i "migration\|stage\|success"
Common Causes:
- Success conditions never satisfied (CEL expression always false)
- Backup/copy job failed
- Target provider unavailable
- Insufficient permissions
AcceptAPI Not Available¶
Symptom
AcceptAPI CR shows Available: False in conditions.
Debugging:
# Check provider workspace APIBinding
# (APIBindings are cluster-scoped in KCP: run this with the provider workspace selected; -n has no effect)
kubectl get apibinding
# Verify AcceptAPI APIExport exists in the platform workspace
# (APIExports are cluster-scoped as well: run this with the platform workspace selected)
kubectl get apiexport acceptapi
# Check if Resource Broker can see AcceptAPI (via Virtual Workspace)
kubectl get acceptapi --all-namespaces
# Check provider cluster connectivity
kubectl get clusters -n provider-workspace
Development¶
Building from Source¶
# Clone repository
git clone https://github.com/platform-mesh/resource-broker.git
cd resource-broker
# Build operator binary
make build
# Run unit tests
make test
# Run integration tests
make test-integration
# Run locally (against current kubeconfig)
export KUBECONFIG=~/.kube/config
make run
Running Examples¶
# Run KCP certificate brokering example
cd examples/kcp-certs
# Setup clusters and install components
./setup.sh
# Follow step-by-step walkthrough
cat README.md
# Cleanup
./cleanup.sh
Why Document Resource Broker Now?¶
Architecture Decisions:
- Design CloudAPI with abstraction layers
- Avoid tight coupling to specific cloud providers
- Use resource models compatible with future brokering
API Design:
- Include label/annotation fields for future policies
- Support generic resource specs (not provider-specific)
- Design for eventual provider abstraction
Example - Good CloudAPI Design (Phase 1, but broker-ready):
apiVersion: cloudapi.example.com/v1alpha1
kind: VirtualMachine
metadata:
name: web-server
labels:
environment: production # Future: Route to premium providers
region: eu-central-1 # Future: GDPR-compliant providers only
spec:
# Generic specs, not AWS-specific
cpu: 4
memory: 16Gi
storage: 100Gi
# Provider field optional - can be auto-selected later
provider: aws # Explicit now, automatic in Phase 3
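One way to read "explicit now, automatic later" is as a resolution order: honour `spec.provider` when it is set, otherwise fall back to label-based rules. The sketch below shows that order only; `POLICY`, its rule format, the provider names `eu-premium` and `budget`, and `resolve_provider` are all hypothetical and not part of CloudAPI or Resource Broker.

```python
from typing import List, Optional

# Hypothetical policy rules: label selectors mapped to provider names.
POLICY: List[dict] = [
    {"match": {"environment": "production", "region": "eu-central-1"}, "provider": "eu-premium"},
    {"match": {"environment": "dev"}, "provider": "budget"},
]


def resolve_provider(resource: dict, policy: List[dict]) -> Optional[str]:
    """Prefer the explicit spec.provider; otherwise use the first matching rule."""
    explicit = resource.get("spec", {}).get("provider")
    if explicit:
        return explicit
    labels = resource.get("metadata", {}).get("labels", {})
    for rule in policy:
        if all(labels.get(key) == value for key, value in rule["match"].items()):
            return rule["provider"]
    return None  # No explicit provider and no matching rule.


vm = {
    "metadata": {"labels": {"environment": "production", "region": "eu-central-1"}},
    "spec": {"cpu": 4, "memory": "16Gi", "storage": "100Gi", "provider": "aws"},
}
print(resolve_provider(vm, POLICY))
```

Keeping the labels on the resource from the start means the same manifest keeps working once the explicit `provider` field is dropped and selection is left to policy.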
External Links¶
- Repository: https://github.com/platform-mesh/resource-broker
- KCP Project: https://github.com/kcp-dev/kcp
- Platform Mesh: https://github.com/platform-mesh
- Examples: https://github.com/platform-mesh/resource-broker/tree/main/examples
Project Status¶
Alpha Maturity
Resource Broker is in alpha stage (v1alpha1). The API may change, and production use should be carefully evaluated.
Activity Indicators:
- 360 commits
- 7 open issues
- 6 pull requests
- Active development
- Apache 2.0 License