Kubernetes GraphQL Gateway

Executive Summary

The Kubernetes GraphQL Gateway is a Go-based service that exposes Kubernetes resources through a GraphQL API, making cluster objects queryable and consumable in a developer-friendly manner. It automatically converts Kubernetes OpenAPI specifications into GraphQL schemas, enabling UI applications to fetch exactly the data they need with a single flexible query instead of multiple REST API calls.

GraphQL Basics

What is GraphQL?

GraphQL is a query language for APIs and a runtime for executing those queries. Created by Facebook in 2012 and open-sourced in 2015, it provides a more flexible and efficient alternative to REST APIs.

The Problem with REST APIs

Traditional REST Approach:

# To display a pod with its namespace and node information, you need multiple calls:

# 1. Get pod details
GET /api/v1/namespaces/default/pods/my-pod

# 2. Get namespace details
GET /api/v1/namespaces/default

# 3. Get node details
GET /api/v1/nodes/worker-node-1

# Result: 3 separate HTTP requests, lots of unnecessary data

Problems:

Over-fetching: REST endpoints return all fields, even if you only need a few
Under-fetching: Often need multiple requests to get all related data
API Versioning: Breaking changes require new endpoint versions
Documentation Drift: API documentation can become outdated

GraphQL Solution

GraphQL Approach:

# Single query to get exactly what you need
query {
  v1 {
    Pod(name: "my-pod", namespace: "default") {
      metadata {
        name
        namespace
      }
      spec {
        nodeName
        containers {
          name
          image
        }
      }
      status {
        phase
      }
    }
  }
}

# Result: 1 HTTP request, only the fields you asked for

Benefits:

Precise Data Fetching: Request exactly the fields you need, nothing more
Single Request: Fetch related data in one query
Strongly Typed: Schema defines all types and their relationships
Self-Documenting: Introspection reveals the entire API structure
Backward Compatible: Add new fields without breaking existing clients

Core GraphQL Concepts

Key Terminology

1. Schema

The schema defines the API structure using GraphQL Schema Definition Language (SDL):

# Type definition
type Pod {
  metadata: ObjectMeta!
  spec: PodSpec!
  status: PodStatus
}

# Query definition (read operations)
type Query {
  Pod(name: String!, namespace: String!): Pod
  Pods(namespace: String!): [Pod!]!
}

# Mutation definition (write operations)
type Mutation {
  createPod(input: PodInput!): Pod!
  deletePod(name: String!, namespace: String!): Boolean!
}

# Subscription definition (real-time updates)
type Subscription {
  podUpdated(namespace: String!): Pod!
}

2. Queries

Queries fetch data (equivalent to REST GET):

# Simple query
query {
  Pod(name: "nginx", namespace: "default") {
    metadata {
      name
    }
  }
}

# Query with variables
query GetPod($name: String!, $namespace: String!) {
  Pod(name: $name, namespace: $namespace) {
    metadata {
      name
      labels
    }
    spec {
      containers {
        name
        image
      }
    }
  }
}

# Variables (passed separately)
{
  "name": "nginx",
  "namespace": "default"
}
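On the wire, a GraphQL request over HTTP is usually a JSON object with a `query` string and an optional `variables` object. The following is a minimal Python sketch of building that body for the query above; the helper name `build_payload` is illustrative and not part of the gateway.

```python
import json

GET_POD = """
query GetPod($name: String!, $namespace: String!) {
  Pod(name: $name, namespace: $namespace) {
    metadata { name labels }
    spec { containers { name image } }
  }
}
"""


def build_payload(query, variables=None):
    """Serialize a GraphQL operation into the JSON body of an HTTP POST."""
    body = {"query": query.strip()}
    if variables:
        # Variables travel next to the query text, never interpolated into it.
        body["variables"] = variables
    return json.dumps(body)


payload = build_payload(GET_POD, {"name": "nginx", "namespace": "default"})
```

Keeping variables separate from the query text means the same query string can be reused for any pod without string concatenation.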

3. Mutations

Mutations modify data (equivalent to REST POST/PUT/PATCH/DELETE):

mutation CreatePod($input: PodInput!) {
  createPod(input: $input) {
    metadata {
      name
      uid
    }
    status {
      phase
    }
  }
}

4. Subscriptions

Subscriptions provide real-time updates, typically over a WebSocket connection:

subscription WatchPods($namespace: String!) {
  podUpdated(namespace: $namespace) {
    metadata {
      name
    }
    status {
      phase
    }
  }
}

5. Fragments

Fragments enable reusable field selections:

fragment PodMetadata on Pod {
  metadata {
    name
    namespace
    labels
    annotations
  }
}

query {
  Pod(name: "nginx", namespace: "default") {
    ...PodMetadata
    spec {
      nodeName
    }
  }
}

GraphQL vs REST

| Aspect | REST | GraphQL |
| --- | --- | --- |
| Endpoints | Multiple endpoints per resource | Single endpoint (`/graphql`) |
| Data Fetching | Fixed response structure | Client specifies fields |
| Over-fetching | Common (get all fields) | None (get only requested fields) |
| Under-fetching | Common (multiple requests) | Rare (related data in one query) |
| Versioning | URL versioning (v1, v2) | Schema evolution (additive changes) |
| Type System | Not standardized | Strongly typed with introspection |
| Documentation | External (Swagger, OpenAPI) | Self-documenting (introspection) |
| Caching | HTTP caching (easy) | Requires custom caching |
| Learning Curve | Familiar | Requires learning query language |

What is the Kubernetes GraphQL Gateway?

Purpose and Goals

The Kubernetes GraphQL Gateway provides a generic, reusable way to expose Kubernetes resources from within a cluster using GraphQL. It enables UI developers to consume Kubernetes objects in a developer-friendly manner while leveraging the GraphQL ecosystem.

The Problem It Solves

Challenge: Kubernetes REST API Issues

Verbosity: REST responses include many fields applications don't need
Multiple Requests: Fetching related resources requires multiple API calls
Complexity: Direct Kubernetes API client libraries are complex for UI developers
Over-fetching: Bandwidth wasted on unnecessary data (especially for large lists)
Discovery: Finding available resources and their structure is difficult

Solution: GraphQL Interface

Efficient Queries: Fetch only needed fields, reduce bandwidth
Single Request: Get related resources in one query
Developer Experience: Simple query language, easier than Kubernetes client libraries
Self-Documenting: Introspection reveals all available resources and fields
Real-Time Updates: WebSocket subscriptions for live resource changes

Use Cases

Portal UI Applications:

Display resource dashboards (pods, deployments, services)
Show resource relationships (pod → node, deployment → replicaset → pods)
Real-time status updates via subscriptions
Custom resource views with only relevant fields

CLI Tools:

Query multiple clusters with unified interface
Build custom resource explorers
Monitor resource state changes

Integration Services:

Sync Kubernetes state to external systems
Build automation workflows based on resource state
Aggregate data from multiple clusters

Architecture

Two-Component System

The Kubernetes GraphQL Gateway consists of two cooperating components: the Listener (watches clusters) and the Gateway (serves GraphQL API).

High-Level Architecture

┌──────────────────────────────────────────────────────────────────┐
│                   Kubernetes Clusters                             │
│  ┌────────────────┐  ┌────────────────┐  ┌────────────────┐     │
│  │ Cluster 1      │  │ Cluster 2      │  │ KCP Workspace  │     │
│  │ (prod)         │  │ (staging)      │  │ (org-acme)     │     │
│  │                │  │                │  │                │     │
│  │ - Pods         │  │ - Pods         │  │ - CRDs         │     │
│  │ - Deployments  │  │ - Deployments  │  │ - Custom Res   │     │
│  │ - Services     │  │ - Services     │  │                │     │
│  └────────┬───────┘  └────────┬───────┘  └────────┬───────┘     │
└───────────┼──────────────────┼──────────────────┼───────────────┘
            │                  │                  │
            │ Watch & Extract  │                  │
            │ OpenAPI Spec     │                  │
            │                  │                  │
            ▼                  ▼                  ▼
┌────────────────────────────────────────────────────────────────────┐
│                          Listener                                   │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │ Cluster/Workspace Watcher                                     │  │
│  │  - Monitors cluster API changes                               │  │
│  │  - Extracts OpenAPI specification                             │  │
│  │  - Writes spec files to local directory                       │  │
│  └──────────────────────────────────────────────────────────────┘  │
└─────────────────────────────┬──────────────────────────────────────┘
                              │
                              │ Write OpenAPI specs
                              │
                              ▼
                   ┌──────────────────────┐
                   │   ./bin/definitions   │
                   │   (File System)       │
                   │                       │
                   │  ├─ cluster-1.json    │
                   │  ├─ cluster-2.json    │
                   │  └─ workspace-1.json  │
                   └──────────┬────────────┘
                              │
                              │ Read & Monitor
                              │
                              ▼
┌────────────────────────────────────────────────────────────────────┐
│                           Gateway                                   │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │ File System Watcher                                           │  │
│  │  - Monitors definitions directory                             │  │
│  │  - Detects changes to OpenAPI spec files                      │  │
│  │  - Triggers schema regeneration                               │  │
│  └──────────────────────────────────────────────────────────────┘  │
│                              │                                      │
│                              ▼                                      │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │ Schema Generator                                              │  │
│  │  - Converts OpenAPI spec → GraphQL schema                     │  │
│  │  - Generates types for each Kubernetes resource               │  │
│  │  - Creates queries, mutations, subscriptions                  │  │
│  └──────────────────────────────────────────────────────────────┘  │
│                              │                                      │
│                              ▼                                      │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │ GraphQL Server                                                │  │
│  │  - HTTP Server (port 3000)                                    │  │
│  │  - GraphQL Playground UI                                      │  │
│  │  - Separate endpoint per cluster/workspace                    │  │
│  │                                                                │  │
│  │  Endpoints:                                                   │  │
│  │  - http://localhost:3000/cluster-1/graphql                    │  │
│  │  - http://localhost:3000/cluster-2/graphql                    │  │
│  │  - http://localhost:3000/workspace-1/graphql                  │  │
│  └──────────────────────────────────────────────────────────────┘  │
│                              │                                      │
│                              ▼                                      │
│  ┌──────────────────────────────────────────────────────────────┐  │
│  │ Resolver                                                      │  │
│  │  - Translates GraphQL queries → Kubernetes API calls          │  │
│  │  - Handles authentication (Bearer token forwarding)           │  │
│  │  - Manages subscriptions (WebSocket → K8s Watch API)          │  │
│  └──────────────────────────────────────────────────────────────┘  │
└─────────────────────────────┬──────────────────────────────────────┘
                              │
                              │ Execute K8s API calls
                              │
                              ▼
                   ┌──────────────────────┐
                   │ Kubernetes API Server │
                   │                       │
                   │ - GET /api/v1/pods    │
                   │ - POST /api/v1/pods   │
                   │ - WATCH /api/v1/pods  │
                   └───────────────────────┘

Component Details

Listener

Cluster Monitor

The Listener component watches one or more Kubernetes clusters (or KCP workspaces) and caches their OpenAPI specifications locally.

Responsibilities:

Connect to Clusters/Workspaces: Uses kubeconfig credentials
Extract OpenAPI Spec: Fetches cluster's OpenAPI definition (/openapi/v2 or /openapi/v3)
Monitor Changes: Watches for API changes (new CRDs, version updates)
Write Spec Files: Saves OpenAPI definitions to ./bin/definitions/ directory
Update on Change: Regenerates spec files when cluster API changes

Output:

./bin/definitions/
├── cluster-prod.json       # Production cluster OpenAPI spec
├── cluster-staging.json    # Staging cluster OpenAPI spec
├── root.json              # KCP root workspace spec
└── root:org-acme.json     # KCP organization workspace spec

File Naming:

ClusterAccess mode: Uses cluster name or identifier
KCP mode: Uses workspace path (e.g., root:org-acme)
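How the Listener persists spec files is not described here. One common way to make sure a directory watcher never observes a half-written file is to write to a temporary file and rename it into place. The sketch below assumes that approach; `write_definition` is a hypothetical helper, not a gateway function.

```python
import json
import os
import tempfile


def write_definition(directory, name, spec):
    """Write <directory>/<name>.json atomically and return its path."""
    os.makedirs(directory, exist_ok=True)
    target = os.path.join(directory, f"{name}.json")
    # Create the temp file in the same directory so the rename stays on one
    # file system, which is what makes os.replace atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(spec, handle)
        os.replace(tmp_path, target)
    except BaseException:
        # Clean up the temp file if serialization or the rename failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target
```

A reader of the directory then sees either the previous version of a spec or the complete new one.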

Gateway

GraphQL API Server

The Gateway component reads OpenAPI specification files and dynamically generates GraphQL schemas, serving them via HTTP endpoints.

Responsibilities:

Monitor Definitions Directory: Watch for new/changed OpenAPI spec files
Generate GraphQL Schemas: Convert OpenAPI specs to GraphQL types
Create Endpoints: Generate separate GraphQL endpoint per cluster/workspace
Serve GraphQL Playground: Interactive browser-based query interface
Execute Queries: Translate GraphQL → Kubernetes API calls
Handle Authorization: Forward Bearer tokens to Kubernetes API
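The first responsibility, detecting new or changed spec files, can be approximated by comparing content digests between two scans of the definitions directory. This is only a polling sketch under that assumption; the function names are hypothetical, and the actual gateway may rely on file system notifications instead.

```python
import hashlib
import os


def snapshot(directory):
    """Map each .json file name in the directory to a SHA-256 content digest."""
    digests = {}
    for entry in sorted(os.listdir(directory)):
        if entry.endswith(".json"):
            with open(os.path.join(directory, entry), "rb") as handle:
                digests[entry] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def changed(previous, current):
    """Return (new or modified names, removed names) between two snapshots."""
    updated = sorted(n for n, d in current.items() if previous.get(n) != d)
    removed = sorted(n for n in previous if n not in current)
    return updated, removed
```

Each name in the first list would trigger schema regeneration for that cluster or workspace; names in the second list would have their endpoint removed.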

Endpoint Structure:

http://localhost:3000/<cluster-or-workspace>/graphql

Examples:

http://localhost:3000/cluster-prod/graphql
http://localhost:3000/root/graphql
http://localhost:3000/root:org-acme/graphql
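Judging from the file names and endpoints shown in this document, the endpoint path segment matches the base name of the definition file. A small Python sketch of that mapping follows; `endpoint_for` is a hypothetical helper and the rule itself is inferred from the examples, not confirmed behavior.

```python
from pathlib import PurePosixPath


def endpoint_for(definition_file, base_url="http://localhost:3000"):
    """Derive the GraphQL endpoint URL from a definition file name."""
    name = PurePosixPath(definition_file).name
    if name.endswith(".json"):
        name = name[: -len(".json")]
    return f"{base_url}/{name}/graphql"
```

For example, `endpoint_for("./bin/definitions/root:org-acme.json")` yields the `root:org-acme` endpoint listed above.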

GraphQL Playground:

Open any endpoint in a browser to access the interactive GraphQL playground:

Write and execute queries
Explore schema via documentation explorer
View real-time results
Save and share queries

How It Works

OpenAPI to GraphQL Conversion

The gateway automatically converts Kubernetes OpenAPI specifications into GraphQL schemas, mapping Kubernetes resources to GraphQL types with queries, mutations, and subscriptions.

Schema Generation Process

1. Read OpenAPI Specification

The Listener extracts the cluster's OpenAPI spec:

{
  "swagger": "2.0",
  "info": { "title": "Kubernetes", "version": "v1.28.0" },
  "paths": {
    "/api/v1/namespaces/{namespace}/pods": {
      "get": { ... },
      "post": { ... }
    }
  },
  "definitions": {
    "io.k8s.api.core.v1.Pod": {
      "type": "object",
      "properties": {
        "metadata": { "$ref": "#/definitions/io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta" },
        "spec": { "$ref": "#/definitions/io.k8s.api.core.v1.PodSpec" },
        "status": { "$ref": "#/definitions/io.k8s.api.core.v1.PodStatus" }
      }
    }
  }
}

2. Generate GraphQL Types

Convert OpenAPI definitions to GraphQL types:

# Generated from OpenAPI definition
type v1_Pod {
  apiVersion: String
  kind: String
  metadata: v1_ObjectMeta
  spec: v1_PodSpec
  status: v1_PodStatus
}

type v1_ObjectMeta {
  name: String
  namespace: String
  labels: StringMap
  annotations: StringMap
  creationTimestamp: String
  uid: String
}

type v1_PodSpec {
  containers: [v1_Container!]!
  nodeName: String
  restartPolicy: String
  serviceAccountName: String
}

type v1_Container {
  name: String!
  image: String!
  ports: [v1_ContainerPort!]
  env: [v1_EnvVar!]
}

# StringMap for labels/annotations (handles dotted keys)
scalar StringMap
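To make the conversion step concrete, here is a simplified Python sketch that turns one OpenAPI definition into a GraphQL type. The naming rule (version plus kind, prefixed by the API group for non-core groups) is inferred from type names in this document such as `v1_Pod` and `apps_v1_Deployment`; the actual generator may differ, and nullability, input types, and map-valued fields are ignored here.

```python
SCALARS = {"string": "String", "integer": "Int", "number": "Float", "boolean": "Boolean"}


def graphql_type_name(definition_key):
    """Map e.g. 'io.k8s.api.core.v1.Pod' to 'v1_Pod' (assumed naming rule)."""
    parts = definition_key.split(".")
    kind, version, group = parts[-1], parts[-2], parts[-3]
    if definition_key.startswith("io.k8s.api.") and group != "core":
        return f"{group}_{version}_{kind}"
    return f"{version}_{kind}"


def field_type(schema):
    """Resolve a property schema to a GraphQL type reference."""
    if "$ref" in schema:
        return graphql_type_name(schema["$ref"].rsplit("/", 1)[-1])
    if schema.get("type") == "array":
        return f"[{field_type(schema['items'])}]"
    # Unknown or missing types fall back to String in this sketch.
    return SCALARS.get(schema.get("type"), "String")


def to_sdl(definition_key, definition):
    """Render one OpenAPI object definition as a GraphQL SDL type."""
    lines = [f"type {graphql_type_name(definition_key)} {{"]
    for name, schema in definition.get("properties", {}).items():
        lines.append(f"  {name}: {field_type(schema)}")
    lines.append("}")
    return "\n".join(lines)


POD_DEFINITION = {
    "type": "object",
    "properties": {
        "kind": {"type": "string"},
        "metadata": {"$ref": "#/definitions/io.k8s.apimachinery.pkg.apis.meta.v1.ObjectMeta"},
        "spec": {"$ref": "#/definitions/io.k8s.api.core.v1.PodSpec"},
    },
}

print(to_sdl("io.k8s.api.core.v1.Pod", POD_DEFINITION))
```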

3. Generate Queries

Create GraphQL queries for fetching resources:

type Query {
  # Get single pod
  v1_Pod(name: String!, namespace: String!): v1_Pod

  # List pods in a namespace, optionally filtered by label selector
  v1_Pods(
    namespace: String!
    labelSelector: StringMapInput
  ): [v1_Pod!]!

  # Custom queries
  typeByCategory(name: String!): [TypeInfo!]!
}

4. Generate Mutations

Create GraphQL mutations for modifying resources:

type Mutation {
  # Create pod
  createv1_Pod(
    namespace: String!
    body: v1_PodInput!
  ): v1_Pod!

  # Update pod
  updatev1_Pod(
    name: String!
    namespace: String!
    body: v1_PodInput!
  ): v1_Pod!

  # Delete pod
  deletev1_Pod(
    name: String!
    namespace: String!
  ): Boolean!

  # Patch pod (partial update)
  patchv1_Pod(
    name: String!
    namespace: String!
    body: JSONPatch!
  ): v1_Pod!
}

5. Generate Subscriptions

Create GraphQL subscriptions for real-time updates:

type Subscription {
  # Watch pod changes
  v1_Pod(
    name: String!
    namespace: String!
  ): v1_Pod!

  # Watch all pods in namespace
  v1_Pods(
    namespace: String!
    labelSelector: StringMapInput
  ): v1_Pod!
}

Query Execution Flow

Example Query:

query GetPod {
  v1 {
    Pod(name: "nginx", namespace: "default") {
      metadata {
        name
        namespace
        creationTimestamp
      }
      spec {
        containers {
          name
          image
          ports {
            containerPort
          }
        }
      }
      status {
        phase
        podIP
      }
    }
  }
}

Execution Steps:

Client sends GraphQL query to http://gateway:3000/cluster-prod/graphql
Gateway validates query against generated schema: checks field existence, validates argument types, and verifies authorization

Resolver translates to Kubernetes API call

GET /api/v1/namespaces/default/pods/nginx
Authorization: Bearer <token-from-client>

Kubernetes API returns full Pod object (JSON)
Resolver filters response to include only requested fields

Gateway returns GraphQL response:

{
  "data": {
    "v1": {
      "Pod": {
        "metadata": {
          "name": "nginx",
          "namespace": "default",
          "creationTimestamp": "2026-02-03T10:00:00Z"
        },
        "spec": {
          "containers": [{
            "name": "nginx",
            "image": "nginx:1.21",
            "ports": [{ "containerPort": 80 }]
          }]
        },
        "status": {
          "phase": "Running",
          "podIP": "10.244.1.5"
        }
      }
    }
  }
}
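The translation and filtering steps above can be sketched in Python as two small functions. Both names are hypothetical. In a real GraphQL server the field selection is applied by the executor while it resolves fields, so `project` only approximates the effect.

```python
def rest_path(version, plural, namespace, name, group=""):
    """Build the Kubernetes REST path for a single namespaced object."""
    # The core group lives under /api; named groups live under /apis/<group>.
    prefix = f"/api/{version}" if not group else f"/apis/{group}/{version}"
    return f"{prefix}/namespaces/{namespace}/{plural}/{name}"


def project(value, selection):
    """Keep only the selected fields.

    `selection` is a nested dict mirroring the query; a leaf field maps to None.
    """
    if selection is None or value is None:
        return value
    if isinstance(value, list):
        # Apply the same selection to every list element (e.g. containers).
        return [project(item, selection) for item in value]
    return {field: project(value.get(field), sub) for field, sub in selection.items()}


full_pod = {
    "metadata": {"name": "nginx", "namespace": "default", "uid": "abc123"},
    "spec": {"containers": [{"name": "nginx", "image": "nginx:1.21", "resources": {}}]},
    "status": {"phase": "Running", "hostIP": "10.0.0.1"},
}
selection = {
    "metadata": {"name": None},
    "spec": {"containers": {"name": None, "image": None}},
    "status": {"phase": None},
}
trimmed = project(full_pod, selection)
```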

Subscription Execution Flow

Example Subscription:

subscription WatchPod {
  v1_Pod(name: "nginx", namespace: "default") {
    status {
      phase
    }
  }
}

Execution Steps:

Client establishes WebSocket connection

Gateway opens Kubernetes Watch stream

GET /api/v1/namespaces/default/pods/nginx?watch=true
Authorization: Bearer <token>

Kubernetes sends events as Pod changes:

{"type": "MODIFIED", "object": { ... }}
{"type": "MODIFIED", "object": { ... }}

Gateway filters events and sends GraphQL updates:

{
  "data": {
    "v1_Pod": {
      "status": { "phase": "Running" }
    }
  }
}

Client receives real-time updates via WebSocket
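The event-to-update step can be illustrated with a Python generator that reads watch events (one JSON document per line, as in the example above) and emits GraphQL payloads. This is a sketch under assumptions: the function name is hypothetical, and how the gateway treats DELETED, BOOKMARK, or ERROR events is not stated in this document, so they are simply skipped here.

```python
import json


def subscription_updates(watch_lines, field_name, pick):
    """Yield one GraphQL payload per ADDED or MODIFIED watch event.

    watch_lines: iterable of JSON strings from a Kubernetes watch stream.
    field_name:  name of the subscription field, e.g. "v1_Pod".
    pick:        function that reduces the full object to the selected fields.
    """
    for line in watch_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") not in ("ADDED", "MODIFIED"):
            continue  # other event types are left out of this sketch
        yield {"data": {field_name: pick(event["object"])}}


events = [
    '{"type": "MODIFIED", "object": {"status": {"phase": "Pending", "podIP": null}}}',
    '{"type": "MODIFIED", "object": {"status": {"phase": "Running", "podIP": "10.244.1.5"}}}',
]
updates = list(
    subscription_updates(events, "v1_Pod", lambda pod: {"status": {"phase": pod["status"]["phase"]}})
)
```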

Operation Modes

Two Operation Modes

The gateway supports two distinct modes for different Kubernetes environments: KCP Mode for multi-tenant KCP deployments and ClusterAccess Mode for standard Kubernetes clusters.

KCP Mode

Environment Variable: ENABLE_KCP=true

Use Case: Multi-tenant KCP environments (KCP is a Kubernetes-like control plane) with virtual workspaces

Architecture:

┌─────────────────────────────────────────────────────────────┐
│                      KCP Server                              │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐      │
│  │ root         │  │ root:orgs    │  │ root:orgs:   │      │
│  │ workspace    │  │ workspace    │  │ acme-corp    │      │
│  └──────────────┘  └──────────────┘  └──────────────┘      │
└────────────┬──────────────┬──────────────┬──────────────────┘
             │              │              │
             │ Watch        │              │
             │              │              │
             ▼              ▼              ▼
      ┌──────────────────────────────────────────┐
      │     Listener (KCP Mode)                   │
      │  - Connects to KCP workspaces             │
      │  - Extracts per-workspace OpenAPI specs   │
      │  - Supports virtual workspaces            │
      └──────────────┬───────────────────────────┘
                     │
                     ▼
         ./bin/definitions/
         ├── root.json
         ├── root:orgs.json
         └── root:orgs:acme-corp.json
                     │
                     ▼
      ┌──────────────────────────────────────────┐
      │     Gateway (KCP Mode)                    │
      │  Endpoints:                               │
      │  - /root/graphql                          │
      │  - /root:orgs/graphql                     │
      │  - /root:orgs:acme-corp/graphql          │
      └───────────────────────────────────────────┘

Features:

Workspace Isolation: Each workspace gets separate GraphQL endpoint
Virtual Workspaces: Support for KCP virtual workspace views
Multi-Tenancy: Different schemas per organization/tenant
Dynamic Workspaces: Automatically discovers new workspaces

Configuration:

# Listener config for KCP mode
apiVersion: gateway.platform-mesh.io/v1alpha1
kind: ListenerConfig
spec:
  mode: kcp
  kcpConfig:
    server: https://kcp.platform-mesh.io:6443
    workspaces:
    - root
    - root:orgs
    - root:orgs:acme-corp
    virtualWorkspaces:
      enabled: true

Endpoint Examples:

# Root workspace
curl http://gateway:3000/root/graphql \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ v1 { Pods(namespace: \"default\") { metadata { name } } } }"}'

# Organization workspace
curl http://gateway:3000/root:orgs:acme-corp/graphql \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ accounts_v1alpha1 { Account { metadata { name } } } }"}'
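For programmatic clients, the same request can be assembled with Python's standard library. This is a minimal sketch; `graphql_request` is a hypothetical helper, the token is a placeholder, and the request is only built, not sent.

```python
import json
import urllib.request


def graphql_request(base_url, target, token, query, variables=None):
    """Build (but do not send) a POST request for <base_url>/<target>/graphql."""
    body = {"query": query}
    if variables is not None:
        body["variables"] = variables
    return urllib.request.Request(
        url=f"{base_url}/{target}/graphql",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = graphql_request(
    "http://gateway:3000",
    "root:orgs:acme-corp",
    "example-token",  # placeholder; substitute a real bearer token
    "{ accounts_v1alpha1 { Account { metadata { name } } } }",
)
# To execute: pass `request` to urllib.request.urlopen and json.load the response.
```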

ClusterAccess Mode

Environment Variable: ENABLE_KCP=false (or unset)

Use Case: Standard Kubernetes clusters (multi-cluster, not KCP)

Architecture:

┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│ Cluster Prod  │  │ Cluster Stage │  │ Cluster Dev   │
│ (kubeconfig1) │  │ (kubeconfig2) │  │ (kubeconfig3) │
└───────┬───────┘  └───────┬───────┘  └───────┬───────┘
        │                  │                  │
        │ Watch            │                  │
        │                  │                  │
        ▼                  ▼                  ▼
 ┌──────────────────────────────────────────────────┐
 │     Listener (ClusterAccess Mode)                 │
 │  - Connects to multiple clusters                  │
 │  - Reads cluster kubeconfigs                      │
 │  - Extracts per-cluster OpenAPI specs             │
 └──────────────┬───────────────────────────────────┘
                │
                ▼
    ./bin/definitions/
    ├── cluster-prod.json
    ├── cluster-stage.json
    └── cluster-dev.json
                │
                ▼
 ┌──────────────────────────────────────────────────┐
 │     Gateway (ClusterAccess Mode)                  │
 │  Endpoints:                                       │
 │  - /cluster-prod/graphql                          │
 │  - /cluster-stage/graphql                         │
 │  - /cluster-dev/graphql                           │
 └───────────────────────────────────────────────────┘

Features:

Multi-Cluster: Connect to multiple independent Kubernetes clusters
Standard K8s: Works with any Kubernetes 1.20+ cluster
Simple Setup: Just provide kubeconfigs
Cluster Discovery: Automatically detects configured clusters

Configuration:

# Listener config for ClusterAccess mode
apiVersion: gateway.platform-mesh.io/v1alpha1
kind: ListenerConfig
spec:
  mode: clusterAccess
  clusters:
  - name: prod
    kubeconfig: /configs/prod-kubeconfig.yaml
  - name: staging
    kubeconfig: /configs/staging-kubeconfig.yaml
  - name: dev
    kubeconfig: /configs/dev-kubeconfig.yaml

Endpoint Examples:

# Production cluster
curl http://gateway:3000/cluster-prod/graphql \
  -H "Authorization: Bearer $PROD_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ v1 { Pods(namespace: \"default\") { metadata { name } } } }"}'

# Staging cluster
curl http://gateway:3000/cluster-staging/graphql \
  -H "Authorization: Bearer $STAGE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ v1 { Pods(namespace: \"default\") { metadata { name } } } }"}'

Mode Comparison

| Aspect | KCP Mode | ClusterAccess Mode |
| --- | --- | --- |
| Use Case | Multi-tenant KCP environments | Standard multi-cluster K8s |
| Endpoint Format | `/workspace-path/graphql` | `/cluster-name/graphql` |
| Multi-Tenancy | Native (workspaces) | Manual (separate clusters) |
| Virtual Workspaces | ✅ Supported | ❌ Not applicable |
| Cluster Discovery | Automatic (KCP API) | Manual (kubeconfig list) |
| Authorization | KCP RBAC per workspace | K8s RBAC per cluster |
| CRDs | Workspace-scoped | Cluster-scoped |
| Complexity | Higher (KCP required) | Lower (standard K8s) |

GraphQL Schema & Queries

Query Examples

The gateway automatically generates a GraphQL schema from Kubernetes resources. Here are practical examples of common operations.

Example Schema (Generated)

# Root types. The per-group wrapper type names (v1_Query, apps_v1_Query,
# v1_Mutation, v1_Subscription) are illustrative placeholders.
type Query {
  # Core Kubernetes resources
  v1: v1_Query
  apps_v1: apps_v1_Query

  # Custom queries
  typeByCategory(name: String!): [TypeInfo!]!
}

type v1_Query {
  Pod(name: String!, namespace: String!): v1_Pod
  Pods(namespace: String!, labelSelector: StringMapInput): [v1_Pod!]!

  Service(name: String!, namespace: String!): v1_Service
  Services(namespace: String!): [v1_Service!]!

  ConfigMap(name: String!, namespace: String!): v1_ConfigMap
  ConfigMaps(namespace: String!): [v1_ConfigMap!]!
}

type apps_v1_Query {
  Deployment(name: String!, namespace: String!): apps_v1_Deployment
  Deployments(namespace: String!): [apps_v1_Deployment!]!
}

type Mutation {
  v1: v1_Mutation
}

type v1_Mutation {
  createPod(namespace: String!, body: v1_PodInput!): v1_Pod!
  updatePod(name: String!, namespace: String!, body: v1_PodInput!): v1_Pod!
  deletePod(name: String!, namespace: String!): Boolean!
}

type Subscription {
  v1: v1_Subscription
}

type v1_Subscription {
  Pod(name: String!, namespace: String!): v1_Pod!
  Pods(namespace: String!, labelSelector: StringMapInput): v1_Pod!
}

Query Examples

1. Get Single Pod

query GetNginxPod {
  v1 {
    Pod(name: "nginx", namespace: "default") {
      metadata {
        name
        namespace
        uid
        creationTimestamp
        labels
      }
      spec {
        nodeName
        containers {
          name
          image
          ports {
            containerPort
            protocol
          }
        }
      }
      status {
        phase
        podIP
        conditions {
          type
          status
        }
      }
    }
  }
}

Response:

{
  "data": {
    "v1": {
      "Pod": {
        "metadata": {
          "name": "nginx",
          "namespace": "default",
          "uid": "abc123",
          "creationTimestamp": "2026-02-03T10:00:00Z",
          "labels": {
            "app": "nginx"
          }
        },
        "spec": {
          "nodeName": "worker-1",
          "containers": [{
            "name": "nginx",
            "image": "nginx:1.21",
            "ports": [{
              "containerPort": 80,
              "protocol": "TCP"
            }]
          }]
        },
        "status": {
          "phase": "Running",
          "podIP": "10.244.1.5",
          "conditions": [
            { "type": "Ready", "status": "True" },
            { "type": "ContainersReady", "status": "True" }
          ]
        }
      }
    }
  }
}

2. List Pods with Label Selector

query ListAppPods {
  v1 {
    Pods(
      namespace: "production"
      labelSelector: { app: "backend" }
    ) {
      metadata {
        name
        labels
      }
      status {
        phase
        podIP
      }
    }
  }
}

With Variables:

query ListPods($namespace: String!, $labelSelector: StringMapInput) {
  v1 {
    Pods(namespace: $namespace, labelSelector: $labelSelector) {
      metadata {
        name
      }
      status {
        phase
      }
    }
  }
}

# Variables
{
  "namespace": "production",
  "labelSelector": {
    "app": "backend",
    "tier": "api"
  }
}
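On the Kubernetes side, list requests accept a `labelSelector` query parameter in which equality-based requirements are joined by commas. The following Python sketch shows how a label map like the one above could be turned into that parameter. Both helper names are hypothetical, and whether the gateway performs exactly this conversion is an assumption.

```python
from urllib.parse import urlencode


def to_label_selector(labels):
    """Render a label map as an equality-based selector, e.g. 'app=backend,tier=api'."""
    return ",".join(f"{key}={value}" for key, value in sorted(labels.items()))


def list_pods_path(namespace, labels=None):
    """Build the REST path for listing pods, with an optional label selector."""
    path = f"/api/v1/namespaces/{namespace}/pods"
    if labels:
        # urlencode percent-escapes '=' and ',' inside the selector value.
        path += "?" + urlencode({"labelSelector": to_label_selector(labels)})
    return path
```

Sorting the keys keeps the generated selector stable regardless of the order in which the client supplied the labels.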

3. Get Deployment with ReplicaSets and Pods

query GetDeploymentWithPods {
  apps_v1 {
    Deployment(name: "api-server", namespace: "production") {
      metadata {
        name
      }
      spec {
        replicas
        selector {
          matchLabels
        }
        template {
          spec {
            containers {
              name
              image
            }
          }
        }
      }
      status {
        availableReplicas
        readyReplicas
      }
    }
  }

  v1 {
    Pods(
      namespace: "production"
      labelSelector: { app: "api-server" }
    ) {
      metadata {
        name
      }
      status {
        phase
      }
    }
  }
}

4. Custom Query - Resources by Category

query GetStorageResources {
  typeByCategory(name: "storage") {
    group
    version
    kind
    scope
  }
}

Response:

{
  "data": {
    "typeByCategory": [
      {
        "group": "",
        "version": "v1",
        "kind": "PersistentVolume",
        "scope": "Cluster"
      },
      {
        "group": "",
        "version": "v1",
        "kind": "PersistentVolumeClaim",
        "scope": "Namespaced"
      },
      {
        "group": "storage.k8s.io",
        "version": "v1",
        "kind": "StorageClass",
        "scope": "Cluster"
      }
    ]
  }
}

Mutation Examples

1. Create Pod

mutation CreateNginxPod {
  v1 {
    createPod(
      namespace: "default"
      body: {
        metadata: {
          name: "my-nginx"
          labels: { app: "nginx" }
        }
        spec: {
          containers: [{
            name: "nginx"
            image: "nginx:1.21"
            ports: [{
              containerPort: 80
            }]
          }]
        }
      }
    ) {
      metadata {
        name
        uid
      }
      status {
        phase
      }
    }
  }
}

2. Update ConfigMap

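# ConfigMap keys such as "app.properties" are not valid GraphQL names, so the
# body is passed as a variable. The input type name v1_ConfigMapInput is assumed
# by analogy with v1_PodInput.
mutation UpdateConfigMap($body: v1_ConfigMapInput!) {
  v1 {
    updateConfigMap(
      name: "app-config"
      namespace: "default"
      body: $body
    ) {
      metadata {
        name
      }
      data
    }
  }
}

# Variables
{
  "body": {
    "data": {
      "app.properties": "key=value\nfoo=bar"
    }
  }
}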

3. Delete Pod

mutation DeletePod {
  v1 {
    deletePod(
      name: "my-nginx"
      namespace: "default"
    )
  }
}

Subscription Examples

1. Watch Single Pod

subscription WatchNginxPod {
  v1 {
    Pod(name: "nginx", namespace: "default") {
      metadata {
        name
        resourceVersion
      }
      status {
        phase
        conditions {
          type
          status
        }
      }
    }
  }
}

Updates Stream (WebSocket):

// Initial state
{"data": {"v1": {"Pod": {"status": {"phase": "Pending"}}}}}

// Pod scheduled
{"data": {"v1": {"Pod": {"status": {"phase": "Pending", "conditions": [{"type": "PodScheduled", "status": "True"}]}}}}}

// Pod running
{"data": {"v1": {"Pod": {"status": {"phase": "Running"}}}}}

2. Watch All Pods in Namespace

subscription WatchProductionPods {
  v1 {
    Pods(namespace: "production") {
      metadata {
        name
      }
      status {
        phase
      }
    }
  }
}

Handling Dotted Keys (Labels, Annotations)

GraphQL Limitation

GraphQL field names cannot contain dots (.) or slashes (/), but Kubernetes labels and annotations often use them (e.g., app.kubernetes.io/name). The gateway uses a StringMapInput scalar to handle this.

Problem:

# Kubernetes label (invalid GraphQL)
labels:
  app.kubernetes.io/name: "my-app"
  app.kubernetes.io/version: "1.0"

Solution:

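# Query with dotted keys (passed as a variable, because keys in an inline
# object literal must themselves be valid GraphQL names)
query PodsByLabel($labelSelector: StringMapInput) {
  v1 {
    Pods(
      namespace: "default"
      labelSelector: $labelSelector
    ) {
      metadata {
        name
        labels  # Returns as StringMap (handles dots)
      }
    }
  }
}

# Variables
{
  "labelSelector": {
    "app.kubernetes.io/name": "my-app"
  }
}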

# Response
{
  "data": {
    "v1": {
      "Pods": [{
        "metadata": {
          "name": "my-app-abc123",
          "labels": {
            "app.kubernetes.io/name": "my-app",
            "app.kubernetes.io/version": "1.0"
          }
        }
      }]
    }
  }
}
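The limitation comes from the GraphQL grammar itself: a name must start with a letter or underscore and may contain only letters, digits, and underscores. A short Python check makes it easy to see which label keys cannot be expressed as ordinary field names; the helper names are illustrative.

```python
import re

# Name production from the GraphQL specification.
GRAPHQL_NAME = re.compile(r"[_A-Za-z][_0-9A-Za-z]*")


def is_graphql_name(key):
    """Return True if `key` could be used directly as a GraphQL field name."""
    return GRAPHQL_NAME.fullmatch(key) is not None


def keys_needing_map(labels):
    """List the label keys that only a map-style scalar can carry."""
    return sorted(key for key in labels if not is_graphql_name(key))
```

Note that hyphens are excluded as well, so a key such as `tier-1` runs into the same restriction as dotted keys.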

Platform Mesh Integration

How Platform Mesh Uses the GraphQL Gateway

Platform Mesh Portal integrates the Kubernetes GraphQL Gateway to provide a unified, efficient interface for querying cluster resources across multiple tenants and workspaces.

Architecture Integration

┌────────────────────────────────────────────────────────────────┐
│                  Platform Mesh Portal                           │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │           Luigi Shell (OpenMFP)                           │  │
│  │  - Navigation                                             │  │
│  │  - Authentication (OIDC via Keycloak)                     │  │
│  │  - Global context (user, org, workspace)                  │  │
│  └──────────────────┬───────────────────────────────────────┘  │
│                     │                                           │
│  ┌──────────────────▼───────────────────────────────────────┐  │
│  │           Micro Frontends (iframes)                       │  │
│  │                                                            │  │
│  │  ┌─────────────────┐  ┌─────────────────┐               │  │
│  │  │ Accounts UI     │  │ Workspaces UI   │               │  │
│  │  │ (Angular)       │  │ (Angular)       │               │  │
│  │  │                 │  │                 │               │  │
│  │  │ GraphQL Client  │  │ GraphQL Client  │               │  │
│  │  └────────┬────────┘  └────────┬────────┘               │  │
│  │           │                     │                        │  │
│  └───────────┼─────────────────────┼────────────────────────┘  │
└──────────────┼─────────────────────┼────────────────────────────┘
               │                     │
               │ GraphQL Queries     │
               │ (with Bearer token) │
               │                     │
               ▼                     ▼
┌──────────────────────────────────────────────────────────────────┐
│           Kubernetes GraphQL Gateway                              │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ Gateway (KCP Mode)                                          │  │
│  │                                                              │  │
│  │ Endpoints:                                                   │  │
│  │  - /root/graphql                                            │  │
│  │  - /root:orgs:acme-corp/graphql                             │  │
│  │  - /root:orgs:example-inc/graphql                           │  │
│  └────────────────────────────────────────────────────────────┘  │
└───────────────────────────┬──────────────────────────────────────┘
                            │
                            │ Authenticated K8s API calls
                            │
                            ▼
                ┌───────────────────────────┐
                │       KCP Server          │
                │                           │
                │  ├─ root workspace        │
                │  ├─ root:orgs:acme-corp   │
                │  └─ root:orgs:example-inc │
                └───────────────────────────┘

Use Cases in Platform Mesh¶

1. Account Dashboard¶

Requirement: Display all accounts with their status and metadata

Traditional REST Approach (inefficient):

// Multiple API calls (each response body must be parsed before use)
const accounts = await (await fetch('/api/accounts')).json();
for (const account of accounts) {
  const details = await (await fetch(`/api/accounts/${account.id}`)).json();
  const workspaces = await (await fetch(`/api/accounts/${account.id}/workspaces`)).json();
  const users = await (await fetch(`/api/accounts/${account.id}/users`)).json();
}
// Result: 1 + (3 * N) API calls for N accounts

GraphQL Approach (efficient):

query GetAccountDashboard {
  accounts_v1alpha1 {
    Accounts(namespace: "platform-mesh-system") {
      metadata {
        name
        creationTimestamp
        labels
      }
      spec {
        displayName
        owner
      }
      status {
        phase
        workspaceCount
        userCount
      }
    }
  }
}

Result: Single query, only requested fields, all data at once

2. Workspace Explorer¶

Requirement: Show workspace hierarchy with resources

query GetWorkspaceResources($workspace: String!) {
  # Get workspace info
  workspaces_v1alpha1 {
    Workspace(name: $workspace) {
      metadata {
        name
      }
      spec {
        type
        parent
      }
    }
  }

  # Get all pods in workspace
  v1 {
    Pods(namespace: $workspace) {
      metadata {
        name
      }
      status {
        phase
      }
    }
  }

  # Get all deployments
  apps_v1 {
    Deployments(namespace: $workspace) {
      metadata {
        name
      }
      spec {
        replicas
      }
      status {
        availableReplicas
      }
    }
  }
}

3. Real-Time Resource Monitoring¶

Requirement: Live updates when pods change state

subscription WatchOrgPods($workspace: String!) {
  v1 {
    Pods(namespace: $workspace) {
      metadata {
        name
      }
      status {
        phase
        conditions {
          type
          status
        }
      }
    }
  }
}

UI Implementation:

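// Angular component
import { ChangeDetectorRef, Component, OnDestroy, OnInit } from '@angular/core';
import { Apollo } from 'apollo-angular';
import gql from 'graphql-tag';
import { Subscription } from 'rxjs';

const WATCH_PODS = gql`
  subscription WatchPods($namespace: String!) {
    v1 {
      Pods(namespace: $namespace) {
        metadata { name }
        status { phase }
      }
    }
  }
`;

@Component({ ... })
export class PodsComponent implements OnInit, OnDestroy {
  workspace = '';    // workspace name, set elsewhere (not shown)
  pods: any[] = [];  // latest pod list rendered by the template
  private podUpdates?: Subscription;

  constructor(
    private apollo: Apollo,
    private changeDetector: ChangeDetectorRef
  ) {}

  ngOnInit() {
    // Subscribe to pod updates
    this.podUpdates = this.apollo.subscribe<any>({
      query: WATCH_PODS,
      variables: { namespace: this.workspace }
    }).subscribe({
      next: (result) => {
        // Update UI with new pod state
        this.pods = result.data?.v1?.Pods ?? [];
        this.changeDetector.detectChanges();
      },
      error: (err) => console.error('Pod subscription failed:', err)
    });
  }

  ngOnDestroy() {
    // Close the subscription when the component is destroyed
    this.podUpdates?.unsubscribe();
  }
}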

Benefits for Platform Mesh Portal¶

1. Reduced Network Traffic

- Single query instead of multiple REST calls
- Only fetch fields actually displayed in UI
- Reduced bandwidth usage (important for large clusters)

2. Simplified Frontend Code

- Declarative data fetching (describe what you need, not how to get it)
- No manual REST endpoint management
- Type-safe queries (with GraphQL code generation)

3. Better User Experience

- Faster page loads (single request)
- Real-time updates (subscriptions)
- Responsive UI (efficient data fetching)

4. Multi-Tenant Support

- Separate GraphQL endpoint per workspace
- Workspace-scoped queries
- Authorization handled by gateway

5. Developer Experience

- GraphQL Playground for testing queries
- Self-documenting API (introspection)
- Easy to add new resource types (automatic from CRDs)

Authentication Flow¶

sequenceDiagram
    participant User
    participant Portal as Portal UI
    participant Luigi as Luigi Shell
    participant Keycloak
    participant GraphQL as GraphQL Gateway
    participant KCP as KCP Server

    User->>Portal: Open portal
    Portal->>Keycloak: Redirect to login
    Keycloak->>User: Show login page
    User->>Keycloak: Enter credentials
    Keycloak->>Portal: Redirect with auth code
    Portal->>Keycloak: Exchange code for token
    Keycloak->>Portal: Return JWT access token
    Portal->>Luigi: Store token in global context

    Note over Portal,Luigi: User navigates to Accounts page

    Luigi->>Portal: Load Accounts MFE
    Portal->>GraphQL: GraphQL query<br/>Authorization: Bearer <JWT>
    GraphQL->>KCP: Validate token & execute query
    KCP->>GraphQL: Return results
    GraphQL->>Portal: GraphQL response
    Portal->>User: Render accounts
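
The step in which the portal sends a GraphQL query with the JWT can be sketched as a plain request description. This is an illustrative sketch only: `GatewayRequest` and `buildGatewayRequest` are hypothetical names, and the resulting object would still have to be passed to `fetch` or another HTTP client.

```typescript
// Plain description of an HTTP request to the gateway.
interface GatewayRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Attach the access token as a Bearer credential and serialize the
// GraphQL query and its variables as the JSON request body.
function buildGatewayRequest(
  endpoint: string,
  accessToken: string,
  query: string,
  variables: Record<string, unknown> = {}
): GatewayRequest {
  return {
    url: endpoint,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${accessToken}`,
    },
    body: JSON.stringify({ query, variables }),
  };
}

const accountsRequest = buildGatewayRequest(
  "https://portal.platform-mesh.example.com/graphql/root/graphql",
  "<JWT>", // placeholder, as in the diagram above
  'query { v1 { Pods(namespace: "default") { metadata { name } } } }'
);
```

Keeping the token out of the query text and in a header means the same query document can be reused for every user.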


Configuration in Platform Mesh¶

Gateway Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubernetes-graphql-gateway
  namespace: platform-mesh-system
spec:
  replicas: 3
  selector:
    matchLabels:
      app: graphql-gateway
  template:
    metadata:
      labels:
        app: graphql-gateway
    spec:
      containers:
      - name: listener
        image: ghcr.io/platform-mesh/kubernetes-graphql-gateway-listener:latest
        env:
        - name: ENABLE_KCP
          value: "true"
        volumeMounts:
        - name: definitions
          mountPath: /app/bin/definitions
        - name: kubeconfig
          mountPath: /config

      - name: gateway
        image: ghcr.io/platform-mesh/kubernetes-graphql-gateway-gateway:latest
        ports:
        - containerPort: 3000
          name: graphql
        env:
        - name: ENABLE_KCP
          value: "true"
        - name: LOCAL_DEVELOPMENT
          value: "false"
        - name: GATEWAY_INTROSPECTION_AUTHENTICATION
          value: "true"
        volumeMounts:
        - name: definitions
          mountPath: /app/bin/definitions

      volumes:
      - name: definitions
        emptyDir: {}
      - name: kubeconfig
        secret:
          secretName: kcp-kubeconfig

Service:

apiVersion: v1
kind: Service
metadata:
  name: graphql-gateway
  namespace: platform-mesh-system
spec:
  selector:
    app: graphql-gateway
  ports:
  - port: 3000
    targetPort: 3000
    name: graphql

HTTPRoute (Gateway API):

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: graphql-gateway
  namespace: platform-mesh-system
spec:
  parentRefs:
  - name: platform-mesh-gateway
    namespace: platform-mesh-system
  hostnames:
  - portal.platform-mesh.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /graphql
    backendRefs:
    - name: graphql-gateway
      port: 3000

Access URLs:

# Root workspace
https://portal.platform-mesh.example.com/graphql/root/graphql

# Organization workspaces
https://portal.platform-mesh.example.com/graphql/root:orgs:acme-corp/graphql
https://portal.platform-mesh.example.com/graphql/root:orgs:example-inc/graphql
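
The URLs above follow a regular pattern: base URL, workspace path, then `/graphql`. A minimal sketch of a helper that builds them is shown below; `workspaceEndpoint` is a hypothetical name, and the character set accepted for workspace path segments is an assumption made for this example rather than a documented KCP rule.

```typescript
// Build the GraphQL endpoint for a workspace path such as "root:orgs:acme-corp".
function workspaceEndpoint(baseUrl: string, workspacePath: string): string {
  // Assumed format: colon-separated segments of lowercase letters, digits, and dashes.
  if (!/^[a-z0-9-]+(:[a-z0-9-]+)*$/.test(workspacePath)) {
    throw new Error(`Invalid workspace path: ${workspacePath}`);
  }
  const trimmedBase = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmedBase}/${workspacePath}/graphql`;
}

const acmeEndpoint = workspaceEndpoint(
  "https://portal.platform-mesh.example.com/graphql/",
  "root:orgs:acme-corp"
);
// acmeEndpoint === "https://portal.platform-mesh.example.com/graphql/root:orgs:acme-corp/graphql"
```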

Deployment¶

Installation and Setup

Prerequisites¶

Required:

Kubernetes cluster (1.20+) or KCP instance
kubectl configured with cluster access
Go 1.21+ (for local development)
Task (for running tasks)

Optional:

Docker (for containerized deployment)
Helm (for Helm chart deployment)

Local Development Setup¶

1. Clone Repository

git clone https://github.com/platform-mesh/kubernetes-graphql-gateway.git
cd kubernetes-graphql-gateway

2. Install Dependencies

# Install Taskfile (if not already installed)
# macOS
brew install go-task/tap/go-task

# Linux
sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d -b /usr/local/bin

# Verify installation
task --version

3. Configure Operation Mode

Choose your operation mode (see Operation Modes):

For KCP Mode:

export ENABLE_KCP=true
export LOCAL_DEVELOPMENT=true  # Disables auth for local testing

For ClusterAccess Mode:

export ENABLE_KCP=false
export LOCAL_DEVELOPMENT=true

4. Configure Cluster Access

Create kubeconfig files for your clusters/workspaces:

# For KCP
mkdir -p ~/.kube
cp /path/to/kcp/kubeconfig ~/.kube/kcp-config

# For standard clusters
cp /path/to/cluster/kubeconfig ~/.kube/cluster-config

5. Start the Listener

# Terminal 1: Run listener
task listener

# Listener will:
# - Connect to configured clusters/workspaces
# - Extract OpenAPI specifications
# - Write spec files to ./bin/definitions/
# - Monitor for changes

Output:

INFO[0000] Starting Listener in KCP mode
INFO[0000] Watching workspace: root
INFO[0001] OpenAPI spec extracted: ./bin/definitions/root.json
INFO[0001] Watching workspace: root:orgs:acme-corp
INFO[0002] OpenAPI spec extracted: ./bin/definitions/root:orgs:acme-corp.json
INFO[0002] Listener ready, monitoring for changes...

6. Start the Gateway

# Terminal 2: Run gateway
task gateway

# Gateway will:
# - Monitor ./bin/definitions/ directory
# - Generate GraphQL schemas from OpenAPI specs
# - Start HTTP server on port 3000
# - Serve GraphQL Playground UI

Output:

INFO[0000] Starting Gateway in KCP mode
INFO[0000] Watching definitions directory: ./bin/definitions
INFO[0001] Schema generated for workspace: root
INFO[0001] GraphQL endpoint ready: http://localhost:3000/root/graphql
INFO[0002] Schema generated for workspace: root:orgs:acme-corp
INFO[0002] GraphQL endpoint ready: http://localhost:3000/root:orgs:acme-corp/graphql
INFO[0002] Gateway ready, server listening on :3000
INFO[0002] GraphQL Playground: http://localhost:3000/

7. Test the Gateway

Open GraphQL Playground in browser:

# Open any endpoint
open http://localhost:3000/root/graphql

Try a test query:

query {
  v1 {
    Pods(namespace: "default") {
      metadata {
        name
      }
      status {
        phase
      }
    }
  }
}

Kubernetes Deployment¶

1. Build Container Images

# Build listener image
docker build -f cmd/listener/Dockerfile -t ghcr.io/yourorg/graphql-listener:latest .

# Build gateway image
docker build -f cmd/gateway/Dockerfile -t ghcr.io/yourorg/graphql-gateway:latest .

# Push images
docker push ghcr.io/yourorg/graphql-listener:latest
docker push ghcr.io/yourorg/graphql-gateway:latest

2. Create Kubernetes Resources

Namespace:

apiVersion: v1
kind: Namespace
metadata:
  name: graphql-gateway

ConfigMap (for listener config):

apiVersion: v1
kind: ConfigMap
metadata:
  name: listener-config
  namespace: graphql-gateway
data:
  config.yaml: |
    mode: kcp
    kcpConfig:
      server: https://kcp.example.com:6443
      workspaces:
      - root
      - root:orgs

Secret (kubeconfig):

apiVersion: v1
kind: Secret
metadata:
  name: kcp-kubeconfig
  namespace: graphql-gateway
type: Opaque
data:
  kubeconfig: <base64-encoded-kubeconfig>

Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: graphql-gateway
  namespace: graphql-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: graphql-gateway
  template:
    metadata:
      labels:
        app: graphql-gateway
    spec:
      # Shared volume for definitions
      volumes:
      - name: definitions
        emptyDir: {}
      - name: kubeconfig
        secret:
          secretName: kcp-kubeconfig

      containers:
      # Listener container
      - name: listener
        image: ghcr.io/yourorg/graphql-listener:latest
        env:
        - name: ENABLE_KCP
          value: "true"
        - name: KUBECONFIG
          value: /config/kubeconfig
        volumeMounts:
        - name: definitions
          mountPath: /app/bin/definitions
        - name: kubeconfig
          mountPath: /config
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 500m
            memory: 512Mi

      # Gateway container
      - name: gateway
        image: ghcr.io/yourorg/graphql-gateway:latest
        ports:
        - containerPort: 3000
          name: graphql
        env:
        - name: ENABLE_KCP
          value: "true"
        - name: LOCAL_DEVELOPMENT
          value: "false"
        - name: GATEWAY_INTROSPECTION_AUTHENTICATION
          value: "true"
        volumeMounts:
        - name: definitions
          mountPath: /app/bin/definitions
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 1000m
            memory: 1Gi
        livenessProbe:
          httpGet:
            path: /healthz
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 30
        readinessProbe:
          httpGet:
            path: /readyz
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 10

Service:

apiVersion: v1
kind: Service
metadata:
  name: graphql-gateway
  namespace: graphql-gateway
spec:
  selector:
    app: graphql-gateway
  ports:
  - port: 3000
    targetPort: 3000
    name: graphql
  type: ClusterIP

Ingress:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: graphql-gateway
  namespace: graphql-gateway
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - hosts:
    - graphql.example.com
    secretName: graphql-tls
  rules:
  - host: graphql.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: graphql-gateway
            port:
              number: 3000

3. Apply Resources

# Apply all resources
kubectl apply -f namespace.yaml
kubectl apply -f configmap.yaml
kubectl apply -f secret.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f ingress.yaml

# Verify deployment
kubectl get pods -n graphql-gateway
kubectl logs -n graphql-gateway deployment/graphql-gateway -c listener
kubectl logs -n graphql-gateway deployment/graphql-gateway -c gateway

4. Test Deployment

# Port-forward for testing
kubectl port-forward -n graphql-gateway svc/graphql-gateway 3000:3000

# Access GraphQL Playground
open http://localhost:3000/root/graphql

# Or use curl
curl http://localhost:3000/root/graphql \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ v1 { Pods(namespace: \"default\") { metadata { name } } } }"}'

Configuration¶

Configuration Options

Environment Variables¶

| Variable | Default | Description |
| --- | --- | --- |
| `ENABLE_KCP` | `false` | Enable KCP mode (true) or ClusterAccess mode (false) |
| `LOCAL_DEVELOPMENT` | `false` | Disable authentication for local testing |
| `GATEWAY_INTROSPECTION_AUTHENTICATION` | `false` | Require auth for introspection queries |
| `KUBECONFIG` | `~/.kube/config` | Path to kubeconfig file |
| `DEFINITIONS_DIR` | `./bin/definitions` | Directory for OpenAPI spec files |
| `GATEWAY_PORT` | `3000` | HTTP port for GraphQL server |
| `LOG_LEVEL` | `info` | Logging level (debug, info, warn, error) |
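
To make the defaults concrete, here is a minimal sketch of how tooling around the gateway (for example, a test harness) might resolve these variables. The gateway itself is written in Go and its real parsing rules are not shown in this document; `GatewaySettings`, `loadSettings`, and the rule that only the literal string `"true"` enables a flag are assumptions for illustration.

```typescript
interface GatewaySettings {
  enableKcp: boolean;
  localDevelopment: boolean;
  introspectionAuthentication: boolean;
  kubeconfig: string;
  definitionsDir: string;
  gatewayPort: number;
  logLevel: string;
}

type EnvMap = Record<string, string | undefined>;

// Resolve settings from an environment-like map, applying the documented defaults.
function loadSettings(env: EnvMap): GatewaySettings {
  // Assumption: only the exact string "true" switches a flag on.
  const flag = (name: string): boolean => env[name] === "true";

  const port = Number(env["GATEWAY_PORT"] ?? "3000");
  if (!(port >= 1 && port <= 65535) || Math.floor(port) !== port) {
    throw new Error(`GATEWAY_PORT must be an integer between 1 and 65535`);
  }

  return {
    enableKcp: flag("ENABLE_KCP"),
    localDevelopment: flag("LOCAL_DEVELOPMENT"),
    introspectionAuthentication: flag("GATEWAY_INTROSPECTION_AUTHENTICATION"),
    kubeconfig: env["KUBECONFIG"] ?? "~/.kube/config",
    definitionsDir: env["DEFINITIONS_DIR"] ?? "./bin/definitions",
    gatewayPort: port,
    logLevel: env["LOG_LEVEL"] ?? "info",
  };
}

const defaultSettings = loadSettings({});
const kcpSettings = loadSettings({ ENABLE_KCP: "true", GATEWAY_PORT: "8080" });
```

In Node.js, the same function could be called with `process.env`.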

Listener Configuration¶

ClusterAccess Mode:

# listener-config.yaml
mode: clusterAccess
clusters:
- name: prod
  kubeconfig: /configs/prod-kubeconfig.yaml
- name: staging
  kubeconfig: /configs/staging-kubeconfig.yaml

KCP Mode:

# listener-config.yaml
mode: kcp
kcpConfig:
  server: https://kcp.example.com:6443
  kubeconfig: /configs/kcp-kubeconfig.yaml
  workspaces:
  - root
  - root:orgs
  - root:orgs:acme-corp
  virtualWorkspaces:
    enabled: true

Gateway Configuration¶

# gateway-config.yaml
server:
  port: 3000
  cors:
    enabled: true
    allowedOrigins:
    - https://portal.example.com
    allowedMethods:
    - GET
    - POST
    - OPTIONS

authentication:
  enabled: true
  introspectionAuth: true
  tokenValidation:
    issuer: https://keycloak.example.com/realms/platform-mesh
    audience: graphql-gateway

definitions:
  directory: /app/bin/definitions
  watchInterval: 5s

logging:
  level: info
  format: json
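
The `cors` block above can be read as a simple allow-list. The sketch below shows one plausible interpretation, assuming exact-match origins and case-insensitive methods; `CorsSettings` and `isCorsAllowed` are hypothetical names, and the gateway's actual CORS handling may differ.

```typescript
// Mirrors the server.cors section of the configuration above.
interface CorsSettings {
  enabled: boolean;
  allowedOrigins: string[];
  allowedMethods: string[];
}

// Decide whether a cross-origin request would be permitted by the settings.
function isCorsAllowed(
  cors: CorsSettings,
  requestOrigin: string,
  method: string
): boolean {
  if (!cors.enabled) {
    return false; // no CORS headers are sent, so browsers block the request
  }
  const originAllowed = cors.allowedOrigins.indexOf(requestOrigin) !== -1;
  const methodAllowed = cors.allowedMethods.indexOf(method.toUpperCase()) !== -1;
  return originAllowed && methodAllowed;
}

const portalCors: CorsSettings = {
  enabled: true,
  allowedOrigins: ["https://portal.example.com"],
  allowedMethods: ["GET", "POST", "OPTIONS"],
};
```

With these settings, a `POST` from `https://portal.example.com` passes, while the same request from any other origin does not.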

Authorization & Authentication¶

Security Configuration

Default Authorization¶

Unless LOCAL_DEVELOPMENT is enabled, every data request must include an Authorization header with a valid Bearer token (introspection queries are exempt by default):

curl http://gateway:3000/root/graphql \
  -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIs..." \
  -H "Content-Type: application/json" \
  -d '{"query": "{ v1 { Pods(namespace: \"default\") { metadata { name } } } }"}'

Local Development Mode¶

Disable authentication for local testing:

export LOCAL_DEVELOPMENT=true
task gateway

With LOCAL_DEVELOPMENT=true, no Authorization header is required.

Introspection Authentication¶

Schema Introspection

Introspection queries fetch the GraphQL schema structure (types, fields, etc.). By default, they can be run without authentication.

Default Behavior (introspection allowed without auth):

# This works without Authorization header
query IntrospectionQuery {
  __schema {
    types {
      name
    }
  }
}

Enable Introspection Auth:

export GATEWAY_INTROSPECTION_AUTHENTICATION=true

With this enabled, introspection queries also require a valid Bearer token.
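
The interaction between `LOCAL_DEVELOPMENT` and `GATEWAY_INTROSPECTION_AUTHENTICATION` can be summarized as a small decision function. This is a sketch of the rules as described in this section, not the gateway's code: `requiresToken` and `isIntrospectionQuery` are hypothetical, and the text-based introspection check is a simplification (a real implementation would parse the query document, since a query can mix `__schema` with data fields).

```typescript
interface AuthSettings {
  localDevelopment: boolean;
  introspectionAuthentication: boolean;
}

// Crude check: does the query text mention __schema or __type?
// __typename does not count, because it is a regular meta field.
function isIntrospectionQuery(query: string): boolean {
  return /\b__(schema|type)\b/.test(query);
}

// Decide whether a request needs a Bearer token under the given settings.
function requiresToken(settings: AuthSettings, query: string): boolean {
  if (settings.localDevelopment) {
    return false; // authentication is disabled entirely
  }
  if (isIntrospectionQuery(query)) {
    return settings.introspectionAuthentication;
  }
  return true; // all data requests need a token
}

const productionDefaults: AuthSettings = {
  localDevelopment: false,
  introspectionAuthentication: false,
};
```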

GraphQL Playground Authentication¶

Problem: GraphQL Playground cannot automatically pass Authorization headers when loading the schema.

Workaround:

Open GraphQL Playground
Click "HTTP HEADERS" at bottom

Add Authorization header:

{
  "Authorization": "Bearer YOUR_TOKEN_HERE"
}

Click "Reload Schema" button

Token Forwarding¶

The gateway forwards the Bearer token from the GraphQL request to the underlying Kubernetes API:

User Request:
  POST /root/graphql
  Authorization: Bearer <user-token>
  Body: { "query": "..." }

Gateway → Kubernetes API:
  GET /api/v1/namespaces/default/pods
  Authorization: Bearer <user-token>

Authorization is handled by Kubernetes RBAC:

User's token must have permissions for requested resources
Gateway does not perform authorization checks
Kubernetes API server validates token and enforces RBAC
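
The forwarding step can be illustrated with a small sketch that extracts the Bearer token from an incoming header and reuses it for the outgoing call. The gateway is implemented in Go, so this TypeScript version only illustrates the idea; `extractBearerToken` and `forwardedHeaders` are hypothetical names.

```typescript
// Return the token from an "Authorization: Bearer <token>" header value,
// or null if the header is missing or uses another scheme.
function extractBearerToken(authorizationHeader: string | undefined): string | null {
  if (authorizationHeader === undefined) {
    return null;
  }
  const match = /^Bearer\s+(\S+)$/i.exec(authorizationHeader.trim());
  return match?.[1] ?? null;
}

// Build the headers for the outgoing Kubernetes API request.
// The token is passed through unchanged; no authorization decision is made here.
function forwardedHeaders(authorizationHeader: string | undefined): Record<string, string> {
  const token = extractBearerToken(authorizationHeader);
  if (token === null) {
    return {};
  }
  return { Authorization: `Bearer ${token}` };
}

const outgoing = forwardedHeaders("Bearer abc.def.ghi");
// outgoing.Authorization === "Bearer abc.def.ghi"
```

Because the token is forwarded as-is, what the user may read or change is decided entirely by the API server's RBAC rules.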

Integration with Keycloak (Platform Mesh)¶

Authentication Flow:

User logs in via Keycloak (OIDC)
Portal receives JWT access token
Portal includes token in GraphQL requests
Gateway forwards token to KCP/Kubernetes
KCP validates token against Keycloak
KCP enforces RBAC permissions

Example Token Payload:

{
  "iss": "https://keycloak.example.com/realms/org-acme-corp",
  "sub": "user-123",
  "aud": "platform-mesh-portal",
  "exp": 1706980800,
  "iat": 1706952000,
  "email": "alice@acme-corp.com",
  "groups": ["org-admin", "workspace-editor"],
  "realm_access": {
    "roles": ["user"]
  },
  "resource_access": {
    "graphql-gateway": {
      "roles": ["viewer"]
    }
  }
}
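
Once decoded, a payload like this can drive simple UI decisions, such as hiding admin actions or refreshing the session early. A minimal sketch follows; `TokenClaims`, `isExpired`, and `inGroup` are hypothetical helpers, and checks like these are a convenience only, because the API server remains the authority on token validity and permissions.

```typescript
// Subset of the claims shown in the example payload above.
interface TokenClaims {
  iss: string;
  sub: string;
  aud: string;
  exp: number; // expiry, seconds since the Unix epoch
  iat: number; // issued-at, seconds since the Unix epoch
  email?: string;
  groups?: string[];
}

// True once the current time (in seconds) has reached the expiry claim.
function isExpired(claims: TokenClaims, nowSeconds: number): boolean {
  return claims.exp <= nowSeconds;
}

// True if the groups claim contains the given group name.
function inGroup(claims: TokenClaims, group: string): boolean {
  return (claims.groups ?? []).indexOf(group) !== -1;
}

const exampleClaims: TokenClaims = {
  iss: "https://keycloak.example.com/realms/org-acme-corp",
  sub: "user-123",
  aud: "platform-mesh-portal",
  exp: 1706980800,
  iat: 1706952000,
  email: "alice@acme-corp.com",
  groups: ["org-admin", "workspace-editor"],
};

const lifetimeSeconds = exampleClaims.exp - exampleClaims.iat; // 28800 (8 hours)
```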

Operations¶

Day 2 Operations

Monitoring¶

Health Checks:

# Liveness probe (is process running?)
curl http://gateway:3000/healthz

# Readiness probe (is service ready to serve traffic?)
curl http://gateway:3000/readyz

Prometheus Metrics (if enabled):

curl http://gateway:3000/metrics

Key Metrics:

- graphql_request_duration_seconds: Query execution time
- graphql_requests_total: Total number of requests
- graphql_errors_total: Total number of errors
- graphql_active_subscriptions: Number of active WebSocket subscriptions
- listener_openapi_syncs_total: Number of OpenAPI spec updates
- listener_sync_errors_total: Number of sync failures
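
As a quick health indicator, the error ratio can be derived from the two counters. The sketch below parses Prometheus text exposition output and sums a metric across label sets. The sample values and the `workspace` label are invented for illustration, and `sumMetric` and `errorRatio` are hypothetical helpers; in practice this calculation would normally be a PromQL expression.

```typescript
// Sum all samples of one metric in Prometheus text exposition format.
function sumMetric(exposition: string, metricName: string): number {
  let total = 0;
  for (const rawLine of exposition.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") {
      continue; // skip blank lines and HELP/TYPE comments
    }
    // A sample looks like: name{labels} value   (labels are optional)
    const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)/.exec(line);
    if (match && match[1] === metricName) {
      total += Number(match[3]);
    }
  }
  return total;
}

// Fraction of requests that ended in an error (0 when there were no requests).
function errorRatio(exposition: string): number {
  const requests = sumMetric(exposition, "graphql_requests_total");
  const errors = sumMetric(exposition, "graphql_errors_total");
  return requests === 0 ? 0 : errors / requests;
}

// Illustrative sample only; real output depends on the gateway build.
const sampleExposition = [
  "# TYPE graphql_requests_total counter",
  'graphql_requests_total{workspace="root"} 800',
  'graphql_requests_total{workspace="root:orgs:acme-corp"} 400',
  "# TYPE graphql_errors_total counter",
  "graphql_errors_total 30",
].join("\n");
```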

Logging¶

View Listener Logs:

kubectl logs -n graphql-gateway deployment/graphql-gateway -c listener -f

View Gateway Logs:

kubectl logs -n graphql-gateway deployment/graphql-gateway -c gateway -f

Log Levels:

# Set log level
export LOG_LEVEL=debug  # debug, info, warn, error

# Structured JSON logging
{"level":"info","ts":1706952000,"msg":"Schema generated","workspace":"root"}
{"level":"debug","ts":1706952001,"msg":"Query executed","query":"Pods","duration":"45ms"}
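
Structured JSON logs are easy to post-process. The sketch below filters log lines by minimum level, assuming the field names shown in the sample output above (`level`, `ts`, `msg`); `filterLogs` is a hypothetical helper, and lines that are not JSON objects are skipped.

```typescript
const LEVEL_ORDER = ["debug", "info", "warn", "error"];

interface LogEntry {
  level: string;
  ts: number;
  msg: string;
  [field: string]: unknown; // additional fields such as workspace or duration
}

// Keep only entries whose level is at or above minLevel.
function filterLogs(lines: string[], minLevel: string): LogEntry[] {
  const threshold = LEVEL_ORDER.indexOf(minLevel);
  const kept: LogEntry[] = [];
  for (const line of lines) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch (e) {
      continue; // not JSON, for example a shell prompt or stack trace
    }
    if (typeof parsed !== "object" || parsed === null) {
      continue;
    }
    const entry = parsed as LogEntry;
    if (LEVEL_ORDER.indexOf(entry.level) >= threshold) {
      kept.push(entry);
    }
  }
  return kept;
}

const sampleLogLines = [
  '{"level":"info","ts":1706952000,"msg":"Schema generated","workspace":"root"}',
  '{"level":"debug","ts":1706952001,"msg":"Query executed","query":"Pods","duration":"45ms"}',
];
const infoAndAbove = filterLogs(sampleLogLines, "info");
```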

Scaling¶

Horizontal Scaling:

# Scale gateway replicas
kubectl scale deployment/graphql-gateway --replicas=5 -n graphql-gateway

# Configure HPA
kubectl autoscale deployment/graphql-gateway \
  --cpu-percent=70 \
  --min=2 \
  --max=10 \
  -n graphql-gateway
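
To reason about what the autoscaler will do with these settings, the core of the Kubernetes HPA calculation can be sketched as follows. It computes `ceil(currentReplicas * currentMetric / targetMetric)` and clamps the result to the configured bounds; the built-in tolerance band and stabilization windows are deliberately left out, and `desiredReplicas` is a hypothetical helper.

```typescript
// Simplified HPA calculation for a CPU utilization target.
function desiredReplicas(
  currentReplicas: number,
  currentCpuPercent: number,
  targetCpuPercent: number,
  minReplicas: number,
  maxReplicas: number
): number {
  const raw = Math.ceil(currentReplicas * (currentCpuPercent / targetCpuPercent));
  return Math.min(maxReplicas, Math.max(minReplicas, raw));
}

// With --cpu-percent=70 --min=2 --max=10:
const scaleUp = desiredReplicas(2, 140, 70, 2, 10);   // 4
const scaleDown = desiredReplicas(5, 35, 70, 2, 10);  // 3
const capped = desiredReplicas(2, 700, 70, 2, 10);    // 10 (clamped to max)
```

Note that scaling this Deployment adds listener containers as well, because both containers run in the same pod.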

Vertical Scaling:

# Increase resource limits
spec:
  template:
    spec:
      containers:
      - name: gateway
        resources:
          requests:
            cpu: 500m
            memory: 512Mi
          limits:
            cpu: 2000m
            memory: 2Gi

Troubleshooting¶

Common Issues

Gateway Not Generating Schemas¶

Symptoms: No GraphQL endpoints available

Diagnostic:

# Check listener is running
kubectl logs -n graphql-gateway deployment/graphql-gateway -c listener

# Verify definitions directory has spec files
kubectl exec -n graphql-gateway deployment/graphql-gateway -c gateway -- \
  ls -la /app/bin/definitions/

# Check gateway logs for errors
kubectl logs -n graphql-gateway deployment/graphql-gateway -c gateway

Common Causes:

Listener cannot connect to cluster/workspace (check kubeconfig)
OpenAPI spec extraction failed (check cluster API server)
Definitions directory not shared between containers (check volume mount)

Authentication Failures¶

Symptoms: 401 Unauthorized responses

Diagnostic:

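# Verify token is included (use a data query, because introspection may be allowed without a token)
curl -v http://gateway:3000/root/graphql \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ v1 { Pods(namespace: \"default\") { metadata { name } } } }"}'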

# Check if LOCAL_DEVELOPMENT is enabled
kubectl get deployment/graphql-gateway -n graphql-gateway -o yaml | grep LOCAL_DEVELOPMENT

# Verify Kubernetes accepts the token and grants the permission
kubectl auth can-i list pods --namespace default --token="$TOKEN"

Fixes:

Ensure Authorization: Bearer <token> header is included
Verify token is valid (not expired)
Check token has required Kubernetes RBAC permissions
For local dev, enable LOCAL_DEVELOPMENT=true

Subscription Disconnects¶

Symptoms: WebSocket subscriptions disconnect unexpectedly

Diagnostic:

# Check for connection timeouts
kubectl logs -n graphql-gateway deployment/graphql-gateway -c gateway | grep -i timeout

# Verify WebSocket support in ingress/load balancer
curl -i -N -H "Connection: Upgrade" \
  -H "Upgrade: websocket" \
  -H "Sec-WebSocket-Version: 13" \
  -H "Sec-WebSocket-Key: $(openssl rand -base64 16)" \
  http://gateway:3000/root/graphql

Fixes:

Configure ingress/load balancer for WebSocket support
Increase connection timeout settings
Check network policies allow WebSocket traffic

High Memory Usage¶

Symptoms: Gateway pods OOMKilled or high memory consumption

Causes:

- Large number of active subscriptions
- Complex GraphQL queries
- Large cluster with many resources

Fixes:

# Increase memory limits
resources:
  requests:
    memory: 1Gi
  limits:
    memory: 4Gi

# Implement query complexity limits (future feature)
# Limit max subscription count per client
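
Query complexity limits are listed above as a future feature, so nothing here reflects the gateway's behavior. Until such a feature exists, a client or a proxy could apply a rough guard based on nesting depth. The sketch below counts brace nesting while ignoring braces inside string literals; `queryDepth` is a hypothetical helper, and it is only a heuristic: it also counts input object literals and does not handle comments or block strings.

```typescript
// Return the maximum brace nesting depth of a GraphQL query string.
function queryDepth(query: string): number {
  let depth = 0;
  let maxDepth = 0;
  let inString = false;
  for (let i = 0; i < query.length; i++) {
    const ch = query[i];
    if (inString) {
      if (ch === "\\") {
        i++; // skip the escaped character
      } else if (ch === '"') {
        inString = false;
      }
      continue;
    }
    if (ch === '"') {
      inString = true;
    } else if (ch === "{") {
      depth++;
      maxDepth = Math.max(maxDepth, depth);
    } else if (ch === "}") {
      depth--;
    }
  }
  return maxDepth;
}

const podsDepth = queryDepth(
  'query { v1 { Pods(namespace: "default") { metadata { name } } } }'
); // 4
```

A caller could reject queries whose depth exceeds a chosen threshold before sending them.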

Best Practices¶

Recommendations

Performance¶

1. Use Field Selection Carefully

# Good: Only request needed fields
query {
  v1 {
    Pods(namespace: "default") {
      metadata { name }
      status { phase }
    }
  }
}

# Bad: Request all fields (over-fetching)
query {
  v1 {
    Pods(namespace: "default") {
      metadata { ... }
      spec { ... }
      status { ... }
    }
  }
}

2. Use Label Selectors

# Good: Filter at API level
query {
  v1 {
    Pods(
      namespace: "production"
      labelSelector: { app: "backend", tier: "api" }
    ) {
      metadata { name }
    }
  }
}

# Bad: Fetch all, filter client-side
query {
  v1 {
    Pods(namespace: "production") {
      metadata { name labels }
    }
  }
}
# Then filter in JavaScript

3. Batch Related Queries

# Good: Single query for related data
query {
  deployment: apps_v1 {
    Deployment(name: "api", namespace: "prod") { ... }
  }
  pods: v1 {
    Pods(namespace: "prod", labelSelector: { app: "api" }) { ... }
  }
  service: v1 {
    Service(name: "api", namespace: "prod") { ... }
  }
}

# Bad: Multiple separate queries
query { apps_v1 { Deployment(...) } }
query { v1 { Pods(...) } }
query { v1 { Service(...) } }

Security¶

1. Never Embed Tokens in Code

// Bad
const token = "eyJhbGciOiJSUzI1NiIs..."; // Hardcoded!

// Good
const token = await getTokenFromAuthProvider();

2. Validate Token Expiry Client-Side

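// jwt-decode v4 uses a named export; v3 and earlier use a default export
import { jwtDecode, JwtPayload } from 'jwt-decode';

function isTokenExpired(token: string): boolean {
  const { exp } = jwtDecode<JwtPayload>(token);
  // Treat a token without an exp claim as expired
  return exp === undefined || exp * 1000 < Date.now();
}

// token must be declared with let so it can be replaced
if (isTokenExpired(token)) {
  token = await refreshToken();
}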

3. Use HTTPS in Production

# Always use TLS for GraphQL endpoints
spec:
  tls:
  - hosts:
    - graphql.example.com
    secretName: graphql-tls

Development¶

1. Use GraphQL Code Generation

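# Install the CLI and the TypeScript plugins
npm install -D graphql @graphql-codegen/cli @graphql-codegen/typescript @graphql-codegen/typescript-operations

# codegen.yml (the CLI reads schema, documents, and outputs from a config file)
schema: http://localhost:3000/root/graphql
documents: './src/**/*.graphql'
generates:
  src/generated/graphql.ts:
    plugins: [typescript, typescript-operations]

# Generate TypeScript types from schema
npx graphql-codegen --config codegen.yml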

2. Use Fragments for Reusability

# Define fragments
fragment PodInfo on v1_Pod {
  metadata {
    name
    namespace
    labels
  }
  status {
    phase
    podIP
  }
}

# Reuse in queries
query GetPods {
  v1 {
    Pods(namespace: "default") {
      ...PodInfo
    }
  }
}

3. Use Query Variables

# Good: Parameterized query
query GetPod($name: String!, $namespace: String!) {
  v1 {
    Pod(name: $name, namespace: $namespace) { ... }
  }
}

# Bad: Hardcoded values
query {
  v1 {
    Pod(name: "nginx", namespace: "default") { ... }
  }
}

Monitoring¶

1. Track Query Performance

const start = Date.now();
const result = await apolloClient.query({ query: GET_PODS });
const duration = Date.now() - start;
console.log(`Query took ${duration}ms`);

2. Handle Errors Gracefully

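// errorPolicy 'all' returns partial data together with errors;
// with the default policy, GraphQL errors reject the promise instead
apolloClient.query({ query: GET_PODS, errorPolicy: 'all' })
  .then(result => {
    if (result.errors) {
      // Partial success (some data, some errors)
      console.error('GraphQL errors:', result.errors);
    }
    return result.data;
  })
  .catch(error => {
    // Network or complete failure
    console.error('Request failed:', error);
  });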
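
The distinction between partial and complete failure can be made explicit with a small classifier over the standard GraphQL response shape (`data` plus an optional `errors` array). This is a sketch; `GraphQLResult` and `classifyResult` are hypothetical names and do not depend on Apollo.

```typescript
// Minimal view of a GraphQL response body.
interface GraphQLResult {
  data?: Record<string, unknown> | null;
  errors?: { message: string }[];
}

type Outcome = "success" | "partial" | "failed";

// success: no errors; partial: errors plus some data; failed: errors and no data.
function classifyResult(result: GraphQLResult): Outcome {
  const hasErrors = (result.errors?.length ?? 0) > 0;
  const hasData = result.data !== undefined && result.data !== null;
  if (!hasErrors) {
    return "success";
  }
  return hasData ? "partial" : "failed";
}

const partialOutcome = classifyResult({
  data: { v1: { Pods: [] } },
  errors: [{ message: "forbidden: cannot list deployments" }],
}); // "partial"
```

A UI might render available data for `partial` outcomes and show an error banner only for `failed` ones.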

3. Monitor Subscription Health

subscription.subscribe({
  next: (data) => { /* handle update */ },
  error: (error) => {
    console.error('Subscription error:', error);
    // Reconnect logic
    setTimeout(() => resubscribe(), 5000);
  },
  complete: () => {
    console.log('Subscription closed');
  }
});
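
The fixed 5-second retry above is simple, but it can hammer a gateway that is restarting. One common alternative is exponential backoff with a cap, sketched below. `reconnectDelayMs` is a hypothetical helper, the base and maximum values are arbitrary defaults, and random jitter is omitted to keep the example deterministic.

```typescript
// Delay before reconnect attempt N (0-based): base * 2^N, capped at maxMs.
function reconnectDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  const exponential = baseMs * Math.pow(2, Math.max(0, attempt));
  return Math.min(maxMs, exponential);
}

const firstDelay = reconnectDelayMs(0);   // 1000
const fourthDelay = reconnectDelayMs(3);  // 8000
const cappedDelay = reconnectDelayMs(10); // 30000
```

The attempt counter would be reset to 0 after the subscription delivers its first message again.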

Platform Mesh Portal - How the portal uses GraphQL Gateway
Luigi Framework - Micro frontend orchestration
Platform Mesh Operator - Environment orchestration
Keycloak - Authentication and token management

Further Resources¶

Official Documentation¶

GitHub Repository: github.com/platform-mesh/kubernetes-graphql-gateway
Documentation: github.com/platform-mesh/kubernetes-graphql-gateway/tree/main/docs

GraphQL Resources¶

GraphQL Official: graphql.org
GraphQL Spec: spec.graphql.org
Apollo GraphQL: apollographql.com/docs
How to GraphQL: howtographql.com

Kubernetes Resources¶

Kubernetes API: kubernetes.io/docs/reference/using-api
OpenAPI Spec: kubernetes.io/docs/concepts/overview/kubernetes-api/#openapi-spec
