cloudflare-operator: A Kubernetes Operator for Cloudflare Tunnels and DNS Automation

Reconciliation patterns, CRD design, and DNS automation behind a Go-based Kubernetes Operator for Cloudflare Tunnels

@geomena · Sat Mar 07 2026 · #kubernetes #open-source #side-project · 1,123 views

Exposing Kubernetes services to the internet remains a domain fraught with operational friction. The conventional approach demands public load balancers, TLS certificate management, ingress controllers, and firewall rules, each introducing its own failure surface and configuration overhead. Cloudflare Tunnels eliminate this complexity by establishing outbound-only connections from the cluster to Cloudflare's edge network, but their manual configuration through the cloudflared CLI does not scale when dozens of services require independent routing rules, DNS entries, and lifecycle management.

cloudflare-operator bridges this gap. It is a Kubernetes Operator written in Go that declaratively manages the full lifecycle of Cloudflare Tunnels, DNS records, and cloudflared daemon deployments through Custom Resource Definitions. When a TunnelBinding resource references a Service, the operator automatically updates the tunnel's routing configuration, creates the corresponding CNAME record in Cloudflare DNS, restarts the daemon pods to apply the changes, and registers a finalizer that reverses every modification upon deletion. The entire workflow, from tunnel creation to DNS propagation, collapses into a single kubectl apply.

This document presents the architectural decisions, reconciliation patterns, and CRD design that sustain the operator, with the technical depth that each of these choices warrants.

Operator Architecture

The operator follows the standard Kubernetes controller pattern: it watches Custom Resources, reconciles desired state against actual state, and drives convergence through the Cloudflare v4 API and the Kubernetes API simultaneously. Four CRDs define the declarative surface, and four reconcilers implement the convergence logic.

The following diagram illustrates the final architecture, where the Cloudflare Operator watches Tunnel CRDs, manages Secrets, ConfigMaps, and Deployments within the cluster, and coordinates with Cloudflare's edge network through API calls for tunnel and DNS record lifecycle management:

Operator Architecture

The Tunnel and ClusterTunnel CRDs define the tunnel itself, specifying whether to create a new one or reference an existing tunnel ID, which Cloudflare account and domain to operate against, and how to customize the cloudflared deployment. The TunnelBinding CRD maps Kubernetes Services to tunnel routing rules and DNS entries, following a pattern inspired by Kubernetes RBAC's RoleBinding with subjects and a tunnelRef. The AccessTunnel CRD enables cross-cluster service exposure through Cloudflare's Arbitrary TCP Access, allowing a database in one cluster to be consumed as a standard Kubernetes Service in another.

Custom Resource Definitions

The CRD design underwent a significant evolution from v1alpha1 to v1alpha2, driven by the recognition that prescriptive fields for deployment customization, such as size, image, nodeSelectors, and tolerations, could never anticipate every legitimate configuration need. The v1alpha2 specification replaces these with a single deployPatch field that accepts arbitrary kubectl-style JSON or YAML patches, enabling any deployment modification without requiring CRD schema changes.

ClusterTunnel — v1alpha2 specification
apiVersion: networking.cfargotunnel.com/v1alpha2
kind: ClusterTunnel
metadata:
  name: k3s-cluster-tunnel
spec:
  newTunnel:
    name: my-k8s-tunnel
  cloudflare:
    email: email@example.com
    domain: example.com
    secret: cloudflare-secrets
    accountId: <account-id>
  fallbackTarget: http_status:404
  noTlsVerify: false
  originCaPool: homelab-ca
  deployPatch: |
    spec:
      replicas: 2
      template:
        spec:
          containers:
            - name: cloudflared
              image: cloudflare/cloudflared:2025.4.0

The Tunnel and ClusterTunnel CRDs share an identical specification, differing only in scope: Tunnel is namespace-scoped while ClusterTunnel operates cluster-wide, mirroring the Issuer / ClusterIssuer pattern established by cert-manager. The newTunnel and existingTunnel fields are mutually exclusive: the former instructs the operator to create a tunnel through the Cloudflare API, while the latter references a pre-existing tunnel by ID or name.
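The mutual exclusivity of newTunnel and existingTunnel can be expressed as a small validation check. The following is an illustrative sketch, not the operator's actual code; the type and field names merely mirror the CRD shown above:

```go
package main

import (
	"errors"
	"fmt"
)

// NewTunnel and ExistingTunnel mirror the mutually exclusive spec fields.
type NewTunnel struct{ Name string }
type ExistingTunnel struct{ ID, Name string }

type TunnelSpec struct {
	NewTunnel      *NewTunnel
	ExistingTunnel *ExistingTunnel
}

// validate enforces that exactly one of newTunnel or existingTunnel is set:
// the comparison is true when both are nil or both are non-nil.
func (s TunnelSpec) validate() error {
	if (s.NewTunnel == nil) == (s.ExistingTunnel == nil) {
		return errors.New("spec: exactly one of newTunnel or existingTunnel must be set")
	}
	return nil
}

func main() {
	ok := TunnelSpec{NewTunnel: &NewTunnel{Name: "my-k8s-tunnel"}}
	fmt.Println(ok.validate() == nil) // true

	bad := TunnelSpec{} // neither field set
	fmt.Println(bad.validate() == nil) // false
}
```

In the real operator this check runs at the start of every reconciliation (the Setup phase described later), so a misconfigured resource fails fast with a descriptive status rather than partially converging.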

TunnelBinding — Service-to-Tunnel mapping
apiVersion: networking.cfargotunnel.com/v1alpha1
kind: TunnelBinding
metadata:
  name: svc-binding
subjects:
  - name: whoami-1
  - name: whoami-2
  - name: postgres
    spec:
      fqdn: db.example.com
      protocol: tcp
      target: tcp://postgres.default.svc.cluster.local:5432
tunnelRef:
  kind: ClusterTunnel
  name: k3s-cluster-tunnel
  disableDNSUpdates: false

Each subject references a Kubernetes Service by name and optionally specifies the FQDN, protocol, target URL, and TLS configuration. When omitted, the operator infers sensible defaults: the FQDN derives from the service name concatenated with the tunnel's domain, the protocol defaults to HTTP, and the target resolves to the service's cluster DNS address.
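The defaulting rules above can be sketched as two pure functions. Function names and signatures are illustrative (the real controller derives these values from the Service object and the tunnel's status), but the inference logic matches the description:

```go
package main

import "fmt"

// inferFQDN derives the hostname from the service name and the
// tunnel's domain when the subject omits an explicit fqdn.
func inferFQDN(service, domain string) string {
	return service + "." + domain
}

// inferTarget resolves the backend URL to the service's cluster DNS
// address; protocol defaults to "http" when the subject omits it.
func inferTarget(protocol, service, namespace string, port int32) string {
	return fmt.Sprintf("%s://%s.%s.svc.cluster.local:%d", protocol, service, namespace, port)
}

func main() {
	fmt.Println(inferFQDN("whoami-1", "example.com"))           // whoami-1.example.com
	fmt.Println(inferTarget("http", "whoami-1", "default", 80)) // http://whoami-1.default.svc.cluster.local:80
}
```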

AccessTunnel — Cross-cluster TCP access
apiVersion: networking.cfargotunnel.com/v1alpha1
kind: AccessTunnel
metadata:
  name: postgres
target:
  fqdn: db.example.com
  protocol: tcp
  svc:
    port: 5432

The AccessTunnel CRD deploys a cloudflared access client that connects to a tunnel-exposed service in another cluster, creating a local Kubernetes Service that applications can consume as if the remote resource were cluster-local. The result: postgres.default.svc:5432 in the client cluster transparently routes to the source cluster's database through Cloudflare's network.

Design Evolution

The operator did not begin as the multi-CRD system it is today. The initial approach was far simpler: a controller that watched Ingress resources, read their annotations to determine the target domain and ConfigMap, and modified the cloudflared configuration accordingly. The cloudflared deployment itself was manually provisioned following the official Kubernetes deployment guide.

Initial Architecture

The limitations of this approach became evident quickly: tunnels still had to be created manually through the Cloudflare dashboard, credentials required manual Secret creation, and DNS entries demanded separate configuration. These operational burdens motivated the introduction of a Tunnel Custom Resource paired with its own controller, the Operator pattern, to automate the end-to-end flow. The controller evolved further when it became clear that cloudflared can proxy not only HTTP/S but also TCP and UDP traffic, eliminating the need for an Ingress resource entirely. The controller was modified to operate on Services rather than Ingresses, reducing the required resources to just a Deployment and a Service. The final architecture, shown in the diagram above, reflects the culmination of these iterations.

Reconciliation Engine

The reconciliation logic employs an Adapter pattern and a Generic Reconciler interface to share the vast majority of code between the Tunnel and ClusterTunnel controllers. The TunnelAdapter and ClusterTunnelAdapter wrap their respective CRD types behind a unified Tunnel interface, allowing the generic reconciler to operate without knowledge of the underlying scope.

Tunnel Reconciliation Flow

The tunnel reconciler executes a deterministic sequence upon each reconciliation cycle:

  1. Setup: Validates that exactly one of newTunnel or existingTunnel is specified. For new tunnels, it calls the Cloudflare API to create the tunnel with a cryptographically random 32-byte secret, then registers a finalizer for cleanup. For existing tunnels, it loads credentials from the referenced Kubernetes Secret.

  2. Status Update: Validates all Cloudflare identifiers, including account ID, tunnel ID, and zone ID, through the API, updating the resource's status subresource and applying labels that encode the tunnel name, ID, domain, and cluster scope flag.

  3. Resource Creation: Generates three managed Kubernetes resources: a Secret containing the credentials.json file, a ConfigMap with the cloudflared YAML configuration including ingress rules and origin request settings, and a Deployment that mounts both volumes and runs the daemon with hardened security contexts such as runAsNonRoot, readOnlyRootFilesystem, and RuntimeDefault seccomp profile.

  4. Cleanup: When the CRD is deleted, the finalizer scales the deployment to zero, waits for pod termination, calls the Cloudflare API to delete the tunnel and its connections, and then removes itself to allow garbage collection to proceed.
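The branch between the convergence phases and the finalizer-driven cleanup can be sketched as follows. This is a structural illustration under assumed names (obj, finalizer, the step strings), not the operator's actual reconciler:

```go
package main

import (
	"fmt"
	"slices"
)

const finalizer = "cfargotunnel.com/finalizer" // illustrative finalizer name

// obj models just enough of a Tunnel resource for the branch logic:
// its finalizer list and whether a deletionTimestamp is set.
type obj struct {
	finalizers []string
	deleting   bool
}

// reconcile registers the finalizer on first sight and runs the
// convergence phases in order; on deletion it runs cleanup first and
// releases the finalizer last, so garbage collection cannot proceed
// until the external side effects are reversed.
func reconcile(o *obj) []string {
	if o.deleting {
		steps := []string{"scale-to-zero", "delete-tunnel-via-api"}
		o.finalizers = slices.DeleteFunc(o.finalizers, func(f string) bool { return f == finalizer })
		return append(steps, "remove-finalizer")
	}
	if !slices.Contains(o.finalizers, finalizer) {
		o.finalizers = append(o.finalizers, finalizer)
	}
	return []string{"setup", "status-update", "resource-creation"}
}

func main() {
	o := &obj{}
	fmt.Println(reconcile(o)) // [setup status-update resource-creation]
	o.deleting = true
	fmt.Println(reconcile(o)) // [scale-to-zero delete-tunnel-via-api remove-finalizer]
}
```

The ordering is the important part: the finalizer is removed only after the Cloudflare-side deletion succeeds, which is what makes the cleanup reliable across controller restarts.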

TunnelBinding Reconciliation

The TunnelBinding controller orchestrates three systems in a single reconciliation. The binding controller, originally service_controller.go and since renamed tunnelbinding_controller.go, performs the following operations:

  • Modify the ConfigMap to include the new Service to be tunneled
  • Add a DNS entry on Cloudflare to route traffic to this tunnel
  • Restart the cloudflared Pods to make the configuration change take effect
  • Add a finalizer to delete the DNS entry once the resource is deleted
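The first of those steps, maintaining the ConfigMap, amounts to rendering an ingress section with one rule per subject plus a mandatory catch-all. A simplified sketch of that rendering (the subject type and function are illustrative, not the operator's actual code):

```go
package main

import "fmt"

// subject mirrors one TunnelBinding subject after defaulting.
type subject struct {
	FQDN   string
	Target string
}

// buildConfig renders the ingress section of a cloudflared config.
// cloudflared requires the final rule to have no hostname: it is the
// catch-all that handles traffic matching no earlier rule.
func buildConfig(tunnelID string, subjects []subject, fallback string) string {
	cfg := fmt.Sprintf("tunnel: %s\ningress:\n", tunnelID)
	for _, s := range subjects {
		cfg += fmt.Sprintf("  - hostname: %s\n    service: %s\n", s.FQDN, s.Target)
	}
	cfg += fmt.Sprintf("  - service: %s\n", fallback) // catch-all rule must come last
	return cfg
}

func main() {
	fmt.Print(buildConfig("tun-42", []subject{
		{"whoami-1.example.com", "http://whoami-1.default.svc.cluster.local:80"},
		{"db.example.com", "tcp://postgres.default.svc.cluster.local:5432"},
	}, "http_status:404"))
}
```

The fallbackTarget field from the ClusterTunnel spec (http_status:404 in the earlier example) feeds the catch-all rule.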

DNS Ownership Tracking

A subtle but critical design decision governs DNS record management. The operator creates a companion TXT record with a _managed. prefix for every CNAME it manages. This TXT record contains a JSON payload with the DNS record ID, tunnel ID, and tunnel name, serving as an ownership marker that prevents the operator from accidentally deleting DNS records it did not create. The --overwrite-unmanaged-dns flag controls whether the operator may take ownership of pre-existing records, and is disabled by default to prevent destructive surprises.

Cloudflare API Client

The internal API client wraps the official cloudflare-go SDK with operator-specific logic for tunnel lifecycle management, credential generation, and DNS record orchestration.

Deployment Architecture

The operator generates a cloudflared Deployment with a security-hardened pod specification. Each container runs as non-root with a read-only root filesystem, mounts up to three volumes (the credentials Secret, the configuration ConfigMap, and optionally a CA certificate Secret for origin TLS verification), and exposes a metrics endpoint on port 2000. The v1alpha2 deployPatch field enables arbitrary customization through strategic merge patches, supporting replicas, node selectors, tolerations, resource limits, and image overrides without requiring CRD schema modifications.
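To make the patch semantics concrete, here is a minimal sketch of plain JSON merge-patch behavior (RFC 7386 style): objects merge recursively, everything else is replaced, and null deletes a key. Note this is a simplification; the operator uses Kubernetes strategic merge patches, which additionally merge lists such as containers by name rather than replacing them wholesale:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergePatch applies simplified merge-patch semantics: nested objects
// merge recursively, nil deletes a key, and all other values replace
// the target value outright.
func mergePatch(target, patch map[string]any) map[string]any {
	for k, v := range patch {
		if v == nil {
			delete(target, k)
			continue
		}
		if pm, ok := v.(map[string]any); ok {
			if tm, ok := target[k].(map[string]any); ok {
				target[k] = mergePatch(tm, pm)
				continue
			}
		}
		target[k] = v
	}
	return target
}

func main() {
	base := map[string]any{"spec": map[string]any{"replicas": 1.0, "paused": false}}
	patch := map[string]any{"spec": map[string]any{"replicas": 2.0}}
	out, _ := json.Marshal(mergePatch(base, patch))
	fmt.Println(string(out)) // {"spec":{"paused":false,"replicas":2}}
}
```

The practical consequence of the strategic (rather than plain) merge is visible in the deployPatch example earlier: patching the cloudflared container's image by name does not wipe out the rest of the container spec.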

Managed Resource | Purpose | Source
Secret | credentials.json for tunnel authentication | Generated by operator or loaded from existing
ConfigMap | config.yaml with ingress rules, metrics, and origin request settings | Maintained by Tunnel + TunnelBinding controllers
Deployment | cloudflared daemon pods with security contexts and volume mounts | Created by Tunnel controller, patched by deployPatch
CNAME Record | DNS entry pointing FQDN to {tunnelId}.cfargotunnel.com | Created by TunnelBinding controller via Cloudflare API
TXT Record | _managed. ownership marker with JSON metadata | Created alongside CNAME for provenance tracking

Routing Patterns

The operator supports two distinct routing architectures, each suited to different operational requirements.

Direct routing configures cloudflared itself as the reverse proxy, with each TunnelBinding subject receiving its own ingress rule in the daemon's configuration. This approach minimizes moving parts and works well for straightforward service exposure.

Reverse proxy routing points all tunnel traffic to an existing ingress controller, such as ingress-nginx or Traefik, using a wildcard FQDN binding, delegating routing decisions to the ingress layer. This pattern preserves existing ingress configurations and enables capabilities that cloudflared does not natively support, such as custom SSO integration with Authelia or advanced path-based routing.

Both patterns remain fully compatible with traditional Ingress resources for local cluster access, and tools like external-dns can coexist without additional configuration.
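The difference between the two patterns comes down to the rule list cloudflared evaluates top to bottom, using the first match. A simplified first-match router (the hostnames, backends, and wildcard handling are illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// rule pairs a hostname pattern with a backend service; an empty
// hostname is the catch-all that must appear last.
type rule struct {
	Hostname string
	Service  string
}

// route returns the backend of the first matching rule, mirroring
// cloudflared's top-to-bottom, first-match evaluation (simplified).
func route(rules []rule, host string) string {
	for _, r := range rules {
		if r.Hostname == host {
			return r.Service
		}
		if suffix, ok := strings.CutPrefix(r.Hostname, "*."); ok && strings.HasSuffix(host, "."+suffix) {
			return r.Service
		}
		if r.Hostname == "" { // catch-all
			return r.Service
		}
	}
	return ""
}

func main() {
	// Direct routing: one rule per TunnelBinding subject.
	direct := []rule{
		{"whoami-1.example.com", "http://whoami-1.default.svc.cluster.local:80"},
		{"", "http_status:404"},
	}
	// Reverse proxy routing: a single wildcard rule to the ingress controller.
	proxy := []rule{
		{"*.example.com", "https://ingress-nginx-controller.ingress-nginx.svc.cluster.local:443"},
		{"", "http_status:404"},
	}
	fmt.Println(route(direct, "whoami-1.example.com"))
	fmt.Println(route(proxy, "grafana.example.com"))
	fmt.Println(route(direct, "unknown.example.com")) // falls through to http_status:404
}
```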

A significant value proposition of the tunnel-based approach is its integration with Cloudflare for Teams. Since all traffic traverses Cloudflare's network, administrators can define granular access control policies through the Zero Trust dashboard, restricting which users or groups may reach specific services without modifying any cluster-side configuration. This capability transforms the operator from a mere connectivity tool into a component of a comprehensive Zero Trust architecture.

Technology Stack

Component | Technology | Role
Language | Go 1.24 | Operator implementation
Framework | controller-runtime v0.20 | Kubernetes reconciliation primitives
Scaffolding | Kubebuilder v4 | Project structure and code generation
API Client | cloudflare-go v0.115 | Official Cloudflare v4 API SDK
Logging | zap | Structured, leveled logging
CRD Versions | v1alpha1, v1alpha2 | Conversion webhooks for seamless migration
Deployment | Kustomize | Declarative installation manifests
Container | distroless/static:nonroot | Minimal, security-hardened runtime image
CI/CD | GitHub Actions | Test matrix across platforms

Open Source and Community

cloudflare-operator is an open-source project published under the Apache 2.0 license, reflecting the conviction that infrastructure tooling must be transparent, auditable, and composable without vendor lock-in. The project is not affiliated with, endorsed by, or officially supported by Cloudflare Inc.; it is an independent engineering effort that leverages Cloudflare's public APIs and the cloudflared daemon to automate what would otherwise require repetitive manual configuration.

The value this project contributes to the Kubernetes community extends beyond tunnel management. The codebase demonstrates how to architect a production-grade Kubernetes Operator in Go with proper separation of concerns: the Adapter pattern for CRD polymorphism, the Generic Reconciler interface for shared reconciliation logic, finalizers for reliable cleanup, and strategic merge patches for deployment customization without CRD proliferation. The DNS ownership tracking system, with its _managed. TXT records and provenance-aware deletion logic, addresses a problem that many operators handle poorly or not at all: ensuring that automated systems can safely reverse their side effects without destroying resources they did not create.

The repository is available at github.com/geo-mena/cloudflare-operator, and the documentation at cf-operator-docs.tofi.pro, where every architectural decision documented in this article can be verified directly in the source code. Contributions, whether in the form of new CRD capabilities, additional cloud provider integrations, webhook enhancements, or documentation improvements, are welcome and represent exactly the kind of collaboration that strengthens the ecosystem of Kubernetes-native infrastructure tooling.