Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Introduction

Main Branch CI/CD PR CI Integration Tests codecov

Bindy is a high-performance Kubernetes controller written in Rust that manages BIND9 DNS infrastructure through Custom Resource Definitions (CRDs). It enables you to manage DNS zones and records as native Kubernetes resources, bringing the declarative Kubernetes paradigm to DNS management.

What is Bindy?

Bindy watches for DNS-related Custom Resources in your Kubernetes cluster and automatically generates and manages BIND9 zone configurations. It replaces traditional manual DNS management with a declarative, GitOps-friendly approach.

Key Features

  • High Performance - Native Rust implementation with async/await and zero-copy operations
  • RNDC Protocol - Native BIND9 management via Remote Name Daemon Control (RNDC) with TSIG authentication
  • Label Selectors - Target specific BIND9 instances using Kubernetes label selectors
  • Dynamic Zone Management - Automatically create and manage DNS zones using RNDC commands
  • Multi-Record Types - Support for A, AAAA, CNAME, MX, TXT, NS, SRV, and CAA records
  • Declarative DNS - Manage DNS as Kubernetes resources with full GitOps support
  • Security First - TSIG-authenticated RNDC communication, non-root containers, RBAC-ready
  • Status Tracking - Complete status subresources for all resources
  • Primary/Secondary Support - Built-in support for primary and secondary DNS architectures with zone transfers

Why Bindy?

Traditional DNS management involves:

  • Manual editing of zone files
  • SSH access to DNS servers
  • No audit trail or version control
  • Difficult disaster recovery
  • Complex multi-region setups

Bindy transforms this by:

  • Managing DNS as Kubernetes resources
  • Full GitOps workflow support
  • Native RNDC protocol for direct BIND9 control
  • Built-in audit trail via Kubernetes events
  • Simple disaster recovery (backup your CRDs)
  • Seamless multi-region DNS distribution with zone transfers

Who Should Use Bindy?

Bindy is ideal for:

  • Platform Engineers building internal DNS infrastructure
  • DevOps Teams managing DNS alongside their Kubernetes workloads
  • SREs requiring automated, auditable DNS management
  • Organizations running self-hosted BIND9 DNS servers
  • Multi-region Deployments needing distributed DNS infrastructure

Quick Example

Here’s how simple it is to create a DNS zone with records:

# Create a DNS zone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
spec:
  zoneName: example.com
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin@example.com
    serial: 2024010101
  ttl: 3600

---
# Add an A record
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
spec:
  zone: example-com
  name: www
  ipv4Address: "192.0.2.1"
  ttl: 300

Apply it to your cluster:

kubectl apply -f dns-config.yaml

Bindy automatically:

  1. Finds matching BIND9 instances using pod discovery
  2. Connects to BIND9 via RNDC protocol (port 953)
  3. Creates zones and records using native RNDC commands
  4. Tracks status and conditions in real-time

Next Steps

Performance Characteristics

  • Startup Time: <1 second
  • Memory Usage: ~50MB baseline
  • Zone Creation Latency: <500ms per zone (via RNDC)
  • Record Addition Latency: <200ms per record (via RNDC)
  • RNDC Command Execution: <100ms typical
  • Controller Overhead: Negligible CPU when idle

Project Status

Bindy is actively developed and used in production environments. The project follows semantic versioning and maintains backward compatibility within major versions.

Current version: v0.1.0

Support & Community

License

Bindy is open-source software licensed under the MIT License.

Installation

This section guides you through installing Bindy in your Kubernetes cluster.

Overview

Installing Bindy involves these steps:

  1. Prerequisites - Ensure your environment meets the requirements
  2. Install CRDs - Deploy Custom Resource Definitions
  3. Create RBAC - Set up service accounts and permissions
  4. Deploy Controller - Install the Bindy controller
  5. Create BIND9 Instances - Deploy your DNS servers

Installation Methods

Standard Installation

The standard installation uses kubectl to apply YAML manifests:

# Create namespace
kubectl create namespace dns-system

# Install CRDs (use kubectl create to avoid annotation size limits)
kubectl create -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/crds/

# Install RBAC
kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/rbac/

# Deploy controller
kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/controller/deployment.yaml

Development Installation

For development or testing, you can build and deploy from source:

# Clone the repository
git clone https://github.com/firestoned/bindy.git
cd bindy

# Build the controller
cargo build --release

# Build Docker image
docker build -t bindy:dev .

# Deploy with your custom image
kubectl apply -f deploy/

Verification

After installation, verify that all components are running:

# Check CRDs are installed
kubectl get crd | grep bindy.firestoned.io

# Check controller is running
kubectl get pods -n dns-system

# Check controller logs
kubectl logs -n dns-system -l app=bind9-controller

You should see output similar to:

NAME                                READY   STATUS    RESTARTS   AGE
bind9-controller-7d4b8c4f9b-x7k2m   1/1     Running   0          1m

Next Steps

Prerequisites

Before installing Bindy, ensure your environment meets these requirements.

Kubernetes Cluster

  • Kubernetes Version: 1.24 or later
  • Access Level: Cluster admin access (for CRD and RBAC installation)
  • Namespace: Ability to create namespaces (recommended: dns-system)

Supported Kubernetes Distributions

Bindy has been tested on:

  • Kubernetes (vanilla)
  • k0s
  • MKE
  • k0RDENT
  • Amazon EKS
  • Google GKE
  • Azure AKS
  • Red Hat OpenShift
  • k3s
  • kind (for development/testing)

Client Tools

Required

Optional (for development)

Cluster Resources

Minimum Requirements

  • CPU: 100m per controller pod
  • Memory: 128Mi per controller pod
  • Storage:
    • Minimal for controller (configuration only)
    • StorageClass: Required for persistent zone data (optional but recommended)
  • CPU: 500m per controller pod (2 replicas)
  • Memory: 512Mi per controller pod
  • High Availability: 3 controller replicas across different nodes

BIND9 Infrastructure

Bindy manages existing BIND9 servers. You’ll need:

  • BIND9 version 9.16 or later (9.18+ recommended)
  • Network connectivity from Bindy controller to BIND9 pods
  • Shared volume for zone files (ConfigMap, PVC, or similar)

Network Requirements

Controller to API Server

  • Outbound HTTPS (443) to Kubernetes API server
  • Required for watching resources and updating status

Controller to BIND9 Pods

  • Access to BIND9 configuration volumes
  • Typical setup uses Kubernetes ConfigMaps or PersistentVolumes

BIND9 to Network

  • UDP/TCP port 53 for DNS queries
  • Port 953 for RNDC (if using remote name daemon control)
  • Zone transfer ports (configured in BIND9)

Permissions

Cluster-Level Permissions Required

The person installing Bindy needs:

# Ability to create CRDs
- apiGroups: ["apiextensions.k8s.io"]
  resources: ["customresourcedefinitions"]
  verbs: ["create", "get", "list"]

# Ability to create ClusterRoles and ClusterRoleBindings
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["clusterroles", "clusterrolebindings"]
  verbs: ["create", "get", "list"]

Namespace Permissions Required

For the DNS system namespace:

  • Create ServiceAccounts
  • Create Deployments
  • Create ConfigMaps
  • Create Services

Storage Provisioner

For persistent zone data storage across pod restarts, you need a StorageClass configured in your cluster.

Production Environments

Use your cloud provider’s StorageClass:

  • AWS: EBS (gp3 or gp2)
  • GCP: Persistent Disk (pd-standard or pd-ssd)
  • Azure: Azure Disk (managed-premium or managed)
  • On-Premises: NFS, Ceph, or other storage solutions

Verify a default StorageClass exists:

kubectl get storageclass

Development/Testing (Kind, k3s, local clusters)

For local development, install the local-path provisioner:

# Install local-path provisioner
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.28/deploy/local-path-storage.yaml

# Wait for provisioner to be ready
kubectl wait --for=condition=available --timeout=60s \
  deployment/local-path-provisioner -n local-path-storage

# Check if local-path StorageClass was created
if kubectl get storageclass local-path &>/dev/null; then
  # Set local-path as default if no default exists
  kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
else
  # Create a default StorageClass using local-path provisioner
  cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: default
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
EOF
fi

# Verify installation
kubectl get storageclass

Expected output (either local-path or default will be marked as default):

NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  1m

Or:

NAME                PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
default (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  1m

Note: The local-path provisioner stores data on the node’s local disk. It’s not suitable for production but works well for development and testing.

Optional Components

For Production Deployments

  • Monitoring: Prometheus for metrics collection
  • Logging: Elasticsearch/Loki for log aggregation
  • GitOps: ArgoCD or Flux for declarative management
  • Backup: Velero for disaster recovery

For Development

  • kind: Local Kubernetes for testing
  • tilt: For rapid development cycles
  • k9s: Terminal UI for Kubernetes

Verification

Check your cluster meets the requirements:

# Check Kubernetes version
kubectl version --short

# Check you have cluster-admin access
kubectl auth can-i create customresourcedefinitions

# Check available resources
kubectl top nodes

# Verify connectivity
kubectl cluster-info

Expected output:

Client Version: v1.28.0
Server Version: v1.27.3
yes

Next Steps

Once your environment meets these prerequisites:

  1. Install CRDs
  2. Deploy the Controller
  3. Quick Start Guide

Quick Start

Get Bindy running in 5 minutes with this quick start guide.

Step 1: Install Storage Provisioner (Optional)

For persistent zone data storage, install a storage provisioner. For Kind clusters or local development:

# Install local-path provisioner
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.28/deploy/local-path-storage.yaml

# Wait for provisioner to be ready
kubectl wait --for=condition=available --timeout=60s \
  deployment/local-path-provisioner -n local-path-storage

# Set as default StorageClass (or create one if it doesn't exist)
if kubectl get storageclass local-path &>/dev/null; then
  kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
else
  # Create default StorageClass if local-path wasn't created
  cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: default
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
EOF
fi

# Verify StorageClass is available
kubectl get storageclass

Note: For production clusters, use your cloud provider’s StorageClass (AWS EBS, GCP PD, Azure Disk, etc.)

Step 2: Install Bindy

# Create namespace
kubectl create namespace dns-system

# Install CRDs (use kubectl create to avoid annotation size limits)
kubectl create -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/crds/

# Install RBAC
kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/rbac/

# Deploy controller
kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/controller/deployment.yaml

# Wait for controller to be ready
kubectl wait --for=condition=available --timeout=300s \
  deployment/bind9-controller -n dns-system

Step 3: Create a BIND9 Cluster

First, create a cluster configuration that defines shared settings:

Create a file bind9-cluster.yaml:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"

⚠️ Warning: There are NO defaults for allowQuery and allowTransfer. If you don’t specify these fields, BIND9’s default behavior applies (no queries or transfers allowed). Always explicitly configure these fields for your security requirements.

Apply it:

kubectl apply -f bind9-cluster.yaml

Optional: Add Persistent Storage

To persist zone data across pod restarts, you can add PersistentVolumeClaims to your Bind9Cluster or Bind9Instance.

First, create a PVC for zone data storage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bind9-zones-pvc
  namespace: dns-system
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  # Uses default StorageClass if not specified
  # storageClassName: local-path

Then update your Bind9Cluster to use the PVC:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
  # Add persistent storage for zones
  volumes:
    - name: zones
      persistentVolumeClaim:
        claimName: bind9-zones-pvc
  volumeMounts:
    - name: zones
      mountPath: /var/cache/bind

Or add storage to a specific Bind9Instance:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: production-dns
  role: primary  # Required: primary or secondary
  replicas: 1
  # Instance-specific storage (overrides cluster-level)
  volumes:
    - name: zones
      persistentVolumeClaim:
        claimName: bind9-primary-zones-pvc
  volumeMounts:
    - name: zones
      mountPath: /var/cache/bind

Note: When using PVCs with accessMode: ReadWriteOnce, each replica needs its own PVC since the volume can only be mounted by one pod at a time. For multi-replica setups, use ReadWriteMany if your storage class supports it, or create separate PVCs per instance.

Step 4: Create a BIND9 Instance

Now create an instance that references the cluster:

Create a file bind9-instance.yaml:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: production-dns  # References the Bind9Cluster
  role: primary  # Required: primary or secondary
  replicas: 1

Apply it:

kubectl apply -f bind9-instance.yaml

Step 5: Create a DNS Zone

Create a file dns-zone.yaml:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: production-dns  # References the Bind9Cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.  # Note: @ replaced with .
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

Apply it:

kubectl apply -f dns-zone.yaml

Step 6: Add DNS Records

Create a file dns-records.yaml:

Note: DNS records reference zones using zoneRef, which is the Kubernetes resource name of the DNSZone (e.g., example-com for a DNSZone named example-com).

# Web server A record
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: www
  ipv4Address: "192.0.2.1"
  ttl: 300

---
# Blog CNAME record
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: blog-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: blog
  target: www.example.com.
  ttl: 300

---
# Mail server MX record
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail.example.com.
  ttl: 3600

---
# SPF TXT record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: "@"
  text:
    - "v=spf1 include:_spf.example.com ~all"
  ttl: 3600

Apply them:

kubectl apply -f dns-records.yaml

Step 7: Verify Your DNS Configuration

Check the status of your resources:

# Check BIND9 cluster
kubectl get bind9clusters -n dns-system

# Check BIND9 instance
kubectl get bind9instances -n dns-system

# Check DNS zone
kubectl get dnszones -n dns-system

# Check DNS records
kubectl get arecords,cnamerecords,mxrecords,txtrecords -n dns-system

# View detailed status
kubectl describe dnszone example-com -n dns-system

You should see output like:

NAME          ZONE          STATUS   AGE
example-com   example.com   Ready    1m

Step 8: Test DNS Resolution

If your BIND9 instance is exposed (via LoadBalancer or NodePort):

# Get the BIND9 service IP
kubectl get svc -n dns-system

# Test DNS query (replace <BIND9-IP> with actual IP)
dig @<BIND9-IP> www.example.com
dig @<BIND9-IP> blog.example.com
dig @<BIND9-IP> example.com MX
dig @<BIND9-IP> example.com TXT

What’s Next?

You’ve successfully deployed Bindy and created your first DNS zone with records!

Learn More

Common Next Steps

  1. Add Secondary DNS Instances for high availability
  2. Configure Zone Transfers between primary and secondary
  3. Set up Monitoring to track DNS performance
  4. Integrate with GitOps for automated deployments
  5. Configure DNSSEC for enhanced security

Production Checklist

Before going to production:

  • Deploy multiple controller replicas for HA
  • Set up primary and secondary DNS instances
  • Configure resource limits and requests
  • Enable monitoring and alerting
  • Set up backup for CRD definitions
  • Configure RBAC properly
  • Review security settings
  • Test disaster recovery procedures

Troubleshooting

If something doesn’t work:

  1. Check controller logs:

    kubectl logs -n dns-system -l app=bind9-controller -f
    
  2. Check resource status:

    kubectl describe dnszone example-com -n dns-system
    
  3. Verify CRDs are installed:

    kubectl get crd | grep bindy.firestoned.io
    

See the Troubleshooting Guide for more help.

Installing CRDs

Custom Resource Definitions (CRDs) extend Kubernetes with new resource types for DNS management.

What are CRDs?

CRDs define the schema for custom resources in Kubernetes. Bindy uses CRDs to represent:

  • BIND9 clusters (cluster-level configuration)
  • BIND9 instances (individual DNS server deployments)
  • DNS zones
  • DNS records (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)

Installation

Install all Bindy CRDs:

kubectl create -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/crds/

Or install from local files:

cd bindy
kubectl create -f deploy/crds/

Important: Use kubectl create instead of kubectl apply to avoid the 256KB annotation size limit that can occur with large CRDs like Bind9Instance.

Updating Existing CRDs

To update CRDs that are already installed:

kubectl replace --force -f deploy/crds/

The --force flag deletes and recreates the CRDs, which is necessary to avoid annotation size limits.

Verify Installation

Check that all CRDs are installed:

kubectl get crd | grep bindy.firestoned.io

Expected output:

aaaarecords.bindy.firestoned.io         2024-01-01T00:00:00Z
arecords.bindy.firestoned.io            2024-01-01T00:00:00Z
bind9clusters.bindy.firestoned.io       2024-01-01T00:00:00Z
bind9instances.bindy.firestoned.io      2024-01-01T00:00:00Z
caarecords.bindy.firestoned.io          2024-01-01T00:00:00Z
cnamerecords.bindy.firestoned.io        2024-01-01T00:00:00Z
dnszones.bindy.firestoned.io            2024-01-01T00:00:00Z
mxrecords.bindy.firestoned.io           2024-01-01T00:00:00Z
nsrecords.bindy.firestoned.io           2024-01-01T00:00:00Z
srvrecords.bindy.firestoned.io          2024-01-01T00:00:00Z
txtrecords.bindy.firestoned.io          2024-01-01T00:00:00Z

CRD Details

For detailed specifications of each CRD, see:

Next Steps

Deploying the Controller

The Bindy controller watches for DNS resources and manages BIND9 configurations.

Prerequisites

Before deploying the controller:

  1. CRDs must be installed
  2. RBAC must be configured
  3. Namespace must exist (dns-system recommended)

Installation

Create Namespace

kubectl create namespace dns-system

Install RBAC

kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/rbac/

This creates:

  • ServiceAccount for the controller
  • ClusterRole with required permissions
  • ClusterRoleBinding to bind them together

Deploy Controller

kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/controller/deployment.yaml

Wait for Readiness

kubectl wait --for=condition=available --timeout=300s \
  deployment/bind9-controller -n dns-system

Verify Deployment

Check controller pod status:

kubectl get pods -n dns-system -l app=bind9-controller

Expected output:

NAME                                READY   STATUS    RESTARTS   AGE
bind9-controller-7d4b8c4f9b-x7k2m   1/1     Running   0          1m

Check controller logs:

kubectl logs -n dns-system -l app=bind9-controller -f

You should see:

{"timestamp":"2024-01-01T00:00:00Z","level":"INFO","message":"Starting Bindy controller"}
{"timestamp":"2024-01-01T00:00:01Z","level":"INFO","message":"Watching DNSZone resources"}
{"timestamp":"2024-01-01T00:00:01Z","level":"INFO","message":"Watching DNS record resources"}

Configuration

Environment Variables

Configure the controller via environment variables:

VariableDefaultDescription
RUST_LOGinfoLog level (error, warn, info, debug, trace)
BIND9_ZONES_DIR/etc/bind/zonesDirectory for zone files
RECONCILE_INTERVAL300Reconciliation interval in seconds

Edit the deployment to customize:

env:
  - name: RUST_LOG
    value: "debug"
  - name: BIND9_ZONES_DIR
    value: "/var/lib/bind/zones"

Resource Limits

For production, set appropriate resource limits:

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

High Availability

Run multiple replicas with leader election:

spec:
  replicas: 3

Troubleshooting

Controller Not Starting

  1. Check pod events:

    kubectl describe pod -n dns-system -l app=bind9-controller
    
  2. Check if CRDs are installed:

    kubectl get crd | grep bindy.firestoned.io
    
  3. Check RBAC permissions:

    kubectl auth can-i list dnszones --as=system:serviceaccount:dns-system:bind9-controller
    

High Memory Usage

If the controller uses excessive memory:

  1. Reduce log level: RUST_LOG=warn
  2. Increase resource limits
  3. Check for memory leaks in logs

Next Steps

Basic Concepts

This section introduces the core concepts behind Bindy and how it manages DNS infrastructure in Kubernetes.

The Kubernetes Way

Bindy follows Kubernetes patterns and idioms:

  • Declarative Configuration - You declare what DNS records should exist, Bindy makes it happen
  • Custom Resources - DNS zones and records are Kubernetes resources
  • Controllers - Bindy watches resources and reconciles state
  • Labels and Selectors - Target specific BIND9 instances using labels
  • Status Subresources - Track the health and state of DNS resources

Core Resources

Bindy introduces these Custom Resource Definitions (CRDs):

Infrastructure Resources

  • Bind9Cluster - Cluster-level configuration (version, shared config, TSIG keys, ACLs)
  • Bind9Instance - Individual BIND9 DNS server deployment (inherits from cluster)

DNS Resources

  • DNSZone - Defines a DNS zone with SOA record (references a cluster)
  • DNS Records - Individual DNS record types:
    • ARecord (IPv4)
    • AAAARecord (IPv6)
    • CNAMERecord (Canonical Name)
    • MXRecord (Mail Exchange)
    • TXTRecord (Text)
    • NSRecord (Name Server)
    • SRVRecord (Service)
    • CAARecord (Certificate Authority Authorization)

How It Works

graph TB
    subgraph k8s["Kubernetes API"]
        zone["DNSZone"]
        arecord["ARecord"]
        mx["MXRecord"]
        txt["TXTRecord"]
        more["..."]
    end

    controller["Bindy Controller<br/>• Watches CRDs<br/>• Reconciles state<br/>• RNDC client<br/>• TSIG authentication"]

    bind9["BIND9 Instances<br/>• rndc daemon (port 953)<br/>• Primary servers<br/>• Secondary servers<br/>• Dynamic zones<br/>• DNS queries (port 53)"]

    zone --> controller
    arecord --> controller
    mx --> controller
    txt --> controller
    more --> controller

    controller -->|"RNDC Protocol<br/>(Port 953/TCP)<br/>TSIG/HMAC-SHA256"| bind9

    style k8s fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style controller fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style bind9 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px

Reconciliation Loop

  1. Watch - Controller watches for changes to DNS resources
  2. Discover - Finds BIND9 instance pods via Kubernetes API
  3. Authenticate - Loads RNDC key from Kubernetes Secret
  4. Execute - Sends RNDC commands to BIND9 (addzone, reload, etc.)
  5. Verify - BIND9 executes command and returns success/error
  6. Status - Reports success or failure via status conditions

RNDC Protocol

Bindy uses the native BIND9 Remote Name Daemon Control (RNDC) protocol for managing DNS zones and servers. This provides:

  • Direct Control - Native BIND9 management without intermediate files
  • Real-time Operations - Immediate feedback on success or failure
  • Atomic Commands - Operations succeed or fail atomically
  • Secure Communication - TSIG authentication with HMAC-SHA256

RNDC Commands

Common RNDC operations used by Bindy:

  • addzone <zone> - Dynamically add a new zone
  • delzone <zone> - Remove a zone
  • reload <zone> - Reload zone data
  • notify <zone> - Trigger zone transfer to secondaries
  • zonestatus <zone> - Query zone status
  • retransfer <zone> - Force zone transfer from primary

TSIG Authentication

All RNDC communication is secured using TSIG (Transaction Signature):

  • Authentication - Verifies command source is authorized
  • Integrity - Prevents command tampering
  • Replay Protection - Timestamp validation prevents replay attacks
  • Key Storage - RNDC keys stored in Kubernetes Secrets
  • Per-Instance Keys - Each BIND9 instance has unique HMAC-SHA256 key

Cluster References

Instead of label selectors, zones now reference a specific BIND9 cluster:

# DNS Zone references a cluster
spec:
  zoneName: example.com
  clusterRef: my-dns-cluster  # References Bind9Instance name

This simplifies:

  • Zone placement - Direct reference to cluster
  • Pod discovery - Find instances by cluster name
  • RNDC key lookup - Keys named {clusterRef}-rndc-key

Resource Relationships

graph BT
    records["DNS Records<br/>(A, CNAME, MX, etc.)"]
    zone["DNSZone<br/>(has clusterRef)"]
    instance["Bind9Instance<br/>(has clusterRef)"]
    cluster["Bind9Cluster<br/>(cluster config)"]

    records -->|"references<br/>zone field"| zone
    zone -->|"references<br/>clusterRef"| instance
    instance -->|"references<br/>clusterRef"| cluster

    style records fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style zone fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style instance fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style cluster fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px

Three-Tier Hierarchy

  1. Bind9Cluster - Cluster-level configuration

    • Shared BIND9 version
    • Common config (recursion, DNSSEC, forwarders)
    • TSIG keys for zone transfers
    • ACL definitions
  2. Bind9Instance - Instance deployment

    • References a Bind9Cluster via clusterRef
    • Can override cluster config
    • Has RNDC key for management
    • Manages pods and services
  3. DNSZone - DNS zone definition

    • References a Bind9Instance via clusterRef
    • Contains SOA record
    • Applied to instance via RNDC
  4. DNS Records - Individual records

    • Reference a DNSZone by name
    • Added to zone via RNDC (planned: nsupdate)

RNDC Key Secret Relationship

graph TD
    instance["Bind9Instance:<br/>my-dns-instance"]
    secret["Secret:<br/>my-dns-instance-rndc-key"]
    data["data:<br/>key-name: my-dns-instance<br/>algorithm: hmac-sha256<br/>secret: base64-encoded-key"]

    instance -->|creates/expects| secret
    secret --> data

    style instance fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style secret fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style data fill:#fce4ec,stroke:#880e4f,stroke-width:2px

The controller uses this Secret to authenticate RNDC commands to the BIND9 instance.

Status and Conditions

All resources report their status:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Synchronized
      message: Zone created for 2 instances
      lastTransitionTime: 2024-01-01T00:00:00Z
  observedGeneration: 1
  matchedInstances: 2

Status conditions follow Kubernetes conventions:

  • Type - What aspect (Ready, Synced, etc.)
  • Status - True, False, or Unknown
  • Reason - Machine-readable reason code
  • Message - Human-readable description

Next Steps

Architecture Overview

This page provides a detailed overview of Bindy’s architecture and design principles.

High-Level Architecture

graph TB
    subgraph k8s["Kubernetes Cluster"]
        subgraph crds["Custom Resource Definitions"]
            crd1["Bind9Instance"]
            crd2["DNSZone"]
            crd3["ARecord, MXRecord, ..."]
        end

        subgraph controller["Bindy Controller (Rust)"]
            reconciler1["Instance<br/>Reconciler"]
            reconciler2["Zone<br/>Reconciler"]
            reconciler3["Records<br/>Reconciler"]
            zonegen["Zone File Generator"]
        end

        subgraph bind9["BIND9 Instances"]
            primary["Primary DNS<br/>(us-east)"]
            secondary1["Secondary DNS<br/>(us-west)"]
            secondary2["Secondary DNS<br/>(eu)"]
        end
    end

    clients["Clients<br/>• Apps<br/>• Services<br/>• External"]

    crds -->|watches| controller
    controller -->|configures| bind9
    primary -->|AXFR| secondary1
    secondary1 -->|AXFR| secondary2
    bind9 -->|"DNS queries<br/>(UDP/TCP 53)"| clients

    style k8s fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style crds fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style controller fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style bind9 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style clients fill:#fce4ec,stroke:#880e4f,stroke-width:2px

Components

Bindy Controller

The controller is written in Rust using the kube-rs library. It consists of:

1. Reconcilers

Each reconciler handles a specific resource type:

  • Bind9Instance Reconciler - Manages BIND9 instance lifecycle

    • Creates StatefulSets for BIND9 pods
    • Configures services and networking
    • Updates instance status
  • Bind9Cluster Reconciler - Manages cluster-level configuration

    • Manages finalizers for cascade deletion
    • Creates and reconciles managed instances
    • Propagates global configuration to instances
    • Tracks cluster-wide status
  • DNSZone Reconciler - Manages DNS zones

    • Evaluates label selectors
    • Generates zone files
    • Updates zone configuration
    • Reports matched instances
  • Record Reconcilers - Manage individual DNS records

    • One reconciler per record type (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)
    • Validates record specifications
    • Appends records to zone files
    • Updates record status

2. Zone File Generator

Generates BIND9-compatible zone files from Kubernetes resources:

#![allow(unused)]
fn main() {
// Simplified example
pub fn generate_zone_file(zone: &DNSZone, records: Vec<DNSRecord>) -> String {
    let mut zone_file = String::new();

    // SOA record
    zone_file.push_str(&format_soa_record(&zone.spec.soa_record));

    // NS records
    for ns in &zone.spec.name_servers {
        zone_file.push_str(&format_ns_record(ns));
    }

    // Individual records
    for record in records {
        zone_file.push_str(&format_record(record));
    }

    zone_file
}
}

Custom Resource Definitions (CRDs)

CRDs define the schema for DNS resources:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: dnszones.bindy.firestoned.io
spec:
  group: bindy.firestoned.io
  names:
    kind: DNSZone
    plural: dnszones
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true

BIND9 Instances

BIND9 servers managed by Bindy:

  • Deployed as Kubernetes StatefulSets
  • Configuration via ConfigMaps
  • Zone files mounted from ConfigMaps or PVCs
  • Support for primary and secondary architectures

Data Flow

Zone Creation Flow

  1. User creates DNSZone resource

    kubectl apply -f dnszone.yaml
    
  2. Controller watches and receives event

    #![allow(unused)]
    fn main() {
    // Watch stream receives create event
    stream.next().await
    }
  3. DNSZone reconciler evaluates selector

    #![allow(unused)]
    fn main() {
    // Find matching Bind9Instances
    let instances = find_matching_instances(&zone.spec.instance_selector).await?;
    }
  4. Generate zone file for each instance

    #![allow(unused)]
    fn main() {
    // Create zone configuration
    let zone_file = generate_zone_file(&zone, &records)?;
    }
  5. Update BIND9 configuration

    #![allow(unused)]
    fn main() {
    // Apply ConfigMap with zone file
    update_bind9_config(&instance, &zone_file).await?;
    }
  6. Update DNSZone status

    #![allow(unused)]
    fn main() {
    // Report success
    update_status(&zone, conditions, matched_instances).await?;
    }

Managed Instance Creation Flow

When a Bind9Cluster specifies replica counts, the controller automatically creates instances:

flowchart TD
    A[Bind9Cluster Created] --> B{Has primary.replicas?}
    B -->|Yes| C[Create primary-0, primary-1, ...]
    B -->|No| D{Has secondary.replicas?}
    C --> D
    D -->|Yes| E[Create secondary-0, secondary-1, ...]
    D -->|No| F[No instances created]
    E --> G[Add management labels]
    G --> H[Instances inherit cluster config]
  1. User creates Bind9Cluster with replicas

    apiVersion: bindy.firestoned.io/v1alpha1
    kind: Bind9Cluster
    metadata:
      name: production-dns
    spec:
      primary:
        replicas: 2
      secondary:
        replicas: 3
    
  2. Bind9Cluster reconciler evaluates replica counts

    #![allow(unused)]
    fn main() {
    let primary_replicas = cluster.spec.primary.as_ref()
        .and_then(|p| p.replicas).unwrap_or(0);
    }
  3. Create missing instances with management labels

    #![allow(unused)]
    fn main() {
    let mut labels = BTreeMap::new();
    labels.insert("bindy.firestoned.io/managed-by", "Bind9Cluster");
    labels.insert("bindy.firestoned.io/cluster", &cluster_name);
    labels.insert("bindy.firestoned.io/role", "primary");
    }
  4. Instances inherit cluster configuration

    #![allow(unused)]
    fn main() {
    let instance_spec = Bind9InstanceSpec {
        cluster_ref: cluster_name.clone(),
        version: cluster.spec.version.clone(),
        config: None,  // Inherit from cluster
        // ...
    };
    }
  5. Self-healing: Recreate deleted instances

    • Controller detects missing managed instances
    • Automatically recreates them with same configuration

Cascade Deletion Flow

When a Bind9Cluster is deleted, all its instances are automatically cleaned up:

flowchart TD
    A[kubectl delete bind9cluster] --> B[Deletion timestamp set]
    B --> C{Finalizer present?}
    C -->|Yes| D[Controller detects deletion]
    D --> E[Find all instances with clusterRef]
    E --> F[Delete each instance]
    F --> G{All deleted?}
    G -->|Yes| H[Remove finalizer]
    G -->|No| I[Retry deletion]
    H --> J[Cluster deleted]
    I --> F
  1. User deletes Bind9Cluster

    kubectl delete bind9cluster production-dns
    
  2. Finalizer prevents immediate deletion

    #![allow(unused)]
    fn main() {
    if cluster.metadata.deletion_timestamp.is_some() {
        // Cleanup before allowing deletion
        delete_cluster_instances(&client, &namespace, &name).await?;
    }
    }
  3. Find and delete all referencing instances

    #![allow(unused)]
    fn main() {
    let instances: Vec<_> = all_instances.into_iter()
        .filter(|i| i.spec.cluster_ref == cluster_name)
        .collect();
    
    for instance in instances {
        api.delete(&instance_name, &DeleteParams::default()).await?;
    }
    }
  4. Remove finalizer once cleanup complete

    #![allow(unused)]
    fn main() {
    let mut finalizers = cluster.metadata.finalizers.unwrap_or_default();
    finalizers.retain(|f| f != FINALIZER_NAME);
    }

Record Addition Flow

  1. User creates DNS record resource
  2. Controller receives event
  3. Record reconciler validates zone reference
  4. Append record to existing zone file
  5. Reload BIND9 configuration
  6. Update record status

Zone Transfer Configuration Flow

For primary/secondary DNS architectures, zones must be configured with zone transfer settings:

flowchart TD
    A[DNSZone Reconciliation] --> B[Discover Secondary Pods]
    B --> C{Secondary IPs Found?}
    C -->|Yes| D[Configure zone with<br/>also-notify & allow-transfer]
    C -->|No| E[Configure zone<br/>without transfers]
    D --> F[Store IPs in<br/>DNSZone.status.secondaryIps]
    E --> F
    F --> G[Next Reconciliation]
    G --> H[Compare Current vs Stored IPs]
    H --> I{IPs Changed?}
    I -->|Yes| J[Delete & Recreate Zones]
    I -->|No| K[No Action]
    J --> B
    K --> G

Implementation Details:

  1. Secondary Discovery - On every reconciliation:

    #![allow(unused)]
    fn main() {
    // Find all Bind9Instance resources with role=secondary for this cluster
    let instance_api: Api<Bind9Instance> = Api::namespaced(client.clone(), namespace);
    let lp = ListParams::default().labels(&format!("cluster={cluster_name},role=secondary"));
    let instances = instance_api.list(&lp).await?;
    
    // Collect IPs from running pods
    for instance in instances {
        let pod_ips = get_pod_ips(&client, namespace, &instance).await?;
        secondary_ips.extend(pod_ips);
    }
    }
  2. Zone Transfer Configuration - Pass secondary IPs to zone creation:

    #![allow(unused)]
    fn main() {
    let zone_config = ZoneConfig {
        // ... other fields ...
        also_notify: Some(secondary_ips.clone()),
        allow_transfer: Some(secondary_ips.clone()),
    };
    }
  3. Change Detection - Compare IPs on each reconciliation:

    #![allow(unused)]
    fn main() {
    // Get stored IPs from status
    let stored_ips = dnszone.status.as_ref()
        .and_then(|s| s.secondary_ips.as_ref());
    
    // Compare sorted lists
    let secondaries_changed = match stored_ips {
        Some(stored) => {
            let mut stored = stored.clone();
            let mut current = current_secondary_ips.clone();
            stored.sort();
            current.sort();
            stored != current
        }
        None => !current_secondary_ips.is_empty(),
    };
    
    // Recreate zones if IPs changed
    if secondaries_changed {
        delete_dnszone(client.clone(), dnszone.clone(), zone_manager).await?;
        add_dnszone(client.clone(), dnszone.clone(), zone_manager).await?;
    }
    }
  4. Status Tracking - Store current IPs for future comparison:

    #![allow(unused)]
    fn main() {
    let new_status = DNSZoneStatus {
        conditions: vec![ready_condition],
        observed_generation: dnszone.metadata.generation,
        record_count: Some(total_records),
        secondary_ips: Some(current_secondary_ips),  // Store for next reconciliation
    };
    }

Why This Matters:

  • Self-healing: When secondary pods are rescheduled/restarted and get new IPs, zones automatically update
  • No manual intervention: Primary zones always have correct secondary IPs for zone transfers
  • Automatic recovery: Zone transfers resume within one reconciliation period (~5-10 minutes) after IP changes
  • Minimal overhead: Leverages existing reconciliation loop, no additional watchers needed

Concurrency Model

Bindy uses Rust’s async/await with Tokio runtime:

#[tokio::main]
async fn main() -> Result<()> {
    // Spawn multiple reconcilers concurrently
    tokio::try_join!(
        run_bind9instance_controller(),
        run_dnszone_controller(),
        run_record_controllers(),
    )?;
    Ok(())
}

Benefits:

  • Concurrent reconciliation - Multiple resources reconciled simultaneously
  • Non-blocking I/O - Efficient API server communication
  • Low memory footprint - Async tasks use minimal memory
  • High throughput - Handle thousands of DNS records efficiently

Resource Watching

The controller uses Kubernetes watch API with reflector caching:

#![allow(unused)]
fn main() {
let api: Api<DNSZone> = Api::all(client);
let watcher = watcher(api, ListParams::default());

// Reflector caches resources locally
let store = reflector::store::Writer::default();
let reader = store.as_reader();
let reflector = reflector(store, watcher);

// Process events
while let Some(event) = stream.try_next().await? {
    match event {
        Applied(zone) => reconcile_zone(zone).await?,
        Deleted(zone) => cleanup_zone(zone).await?,
        Restarted(_) => refresh_all().await?,
    }
}
}

Error Handling

Multi-layer error handling strategy:

  1. Validation Errors - Caught early, reported in status
  2. Reconciliation Errors - Retried with exponential backoff
  3. Fatal Errors - Logged and cause controller restart
  4. Status Reporting - All errors visible in resource status
#![allow(unused)]
fn main() {
match reconcile_zone(&zone).await {
    Ok(_) => update_status(Ready, "Synchronized"),
    Err(e) => {
        log::error!("Failed to reconcile zone: {}", e);
        update_status(NotReady, e.to_string());
        // Requeue for retry
        Err(e)
    }
}
}

Performance Optimizations

1. Incremental Updates

Only regenerate zone files when records change, not on every reconciliation.

2. Caching

Local cache of BIND9 instances to avoid repeated API calls.

3. Batch Processing

Group related updates to minimize BIND9 reloads.

4. Zero-Copy Operations

Use string slicing and references to avoid unnecessary allocations.

5. Compiled Binary

Rust compilation produces optimized native code with no runtime overhead.

Security Architecture

RBAC

Controller uses least-privilege service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: bind9-controller
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bind9-controller
rules:
  - apiGroups: ["bindy.firestoned.io"]
    resources: ["dnszones", "arecords", ...]
    verbs: ["get", "list", "watch", "update"]

Non-Root Containers

Controller runs as non-root user:

USER 65532:65532

Network Policies

Limit controller network access:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bind9-controller
spec:
  podSelector:
    matchLabels:
      app: bind9-controller
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 443  # API server only

Scalability

Horizontal Scaling - Operator Leader Election

Multiple controller replicas use Kubernetes Lease-based leader election for high availability:

sequenceDiagram
    participant O1 as Operator Instance 1
    participant O2 as Operator Instance 2
    participant L as Kubernetes Lease
    participant K as Kubernetes API

    O1->>L: Acquire lease
    L-->>O1: Lease granted
    O1->>K: Start reconciliation
    O2->>L: Try acquire lease
    L-->>O2: Lease already held
    O2->>O2: Wait in standby

    Note over O1: Instance fails
    O2->>L: Acquire lease
    L-->>O2: Lease granted
    O2->>K: Start reconciliation

Implementation:

#![allow(unused)]
fn main() {
// Create lease manager with configuration
let lease_manager = LeaseManagerBuilder::new(client.clone(), &lease_name)
    .with_namespace(&lease_namespace)
    .with_identity(&identity)
    .with_duration(Duration::from_secs(15))
    .with_grace(Duration::from_secs(2))
    .build()
    .await?;

// Watch leadership status
let (leader_rx, lease_handle) = lease_manager.watch().await;

// Run controllers with leader monitoring
tokio::select! {
    result = monitor_leadership(leader_rx) => {
        warn!("Leadership lost! Stopping all controllers...");
    }
    result = run_all_controllers() => {
        // Normal controller execution
    }
}
}

Failover characteristics:

  • Lease duration: 15 seconds (configurable)
  • Automatic failover: ~15 seconds if leader fails
  • Zero data loss: New leader resumes from Kubernetes state
  • Multiple replicas: Support for 2-5+ operator instances

Resource Limits

Recommended production configuration:

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

Can handle:

  • 1000+ DNS zones
  • 10,000+ DNS records
  • <100ms average reconciliation time

Next Steps

Technical Architecture

System Overview

graph TB
    subgraph k8s["Kubernetes Cluster"]
        subgraph namespace["DNS System Namespace (dns-system)"]
            subgraph controller["Rust Controller Pod"]
                subgraph eventloop["Main Event Loop<br/>(runs concurrently via Tokio)"]
                    dnszone_ctrl["DNSZone Controller"]
                    arecord_ctrl["ARecord Controller"]
                    txt_ctrl["TXTRecord Controller"]
                    cname_ctrl["CNAMERecord Controller"]
                end

                subgraph reconcilers["Reconcilers"]
                    rec_dnszone["reconcile_dnszone()"]
                    rec_a["reconcile_a_record()"]
                    rec_txt["reconcile_txt_record()"]
                    rec_cname["reconcile_cname_record()"]
                end

                subgraph manager["BIND9 Manager"]
                    create_zone["create_zone_file()"]
                    add_a["add_a_record()"]
                    add_txt["add_txt_record()"]
                    delete_zone["delete_zone()"]
                end
            end

            subgraph bind9["BIND9 Instance Pods (scaled)"]
                zones["/etc/bind/zones/db.example.com<br/>/etc/bind/zones/db.internal.local<br/>..."]
            end
        end

        subgraph etcd["Custom Resources (in etcd)"]
            instances["• Bind9Instance (primary-dns, secondary-dns)"]
            dnszones["• DNSZone (example-com, internal-local)"]
            arecords["• ARecord (www, api, db, ...)"]
            txtrecords["• TXTRecord (spf, dmarc, ...)"]
            cnamerecords["• CNAMERecord (blog, cache, ...)"]
        end
    end

    eventloop --> reconcilers
    reconcilers --> manager
    manager --> bind9

    style k8s fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style namespace fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style controller fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style eventloop fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style reconcilers fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    style manager fill:#ffe0b2,stroke:#e65100,stroke-width:2px
    style bind9 fill:#f1f8e9,stroke:#33691e,stroke-width:2px
    style etcd fill:#e0f2f1,stroke:#004d40,stroke-width:2px

Control Flow

1. DNSZone Creation Flow

User creates DNSZone
    ↓
Kubernetes API Server stores in etcd
    ↓
Watch event triggered
    ↓
Controller receives event (via kube-rs runtime)
    ↓
reconcile_dnszone_wrapper() called
    ↓
reconcile_dnszone() logic:
  1. Extract DNSZone spec
  2. Evaluate instanceSelector against Bind9Instance labels
  3. Find matching instances (e.g., 2 matching)
  4. Call zone_manager.create_zone_file()
  5. Zone file created in /etc/bind/zones/db.example.com
  6. Update DNSZone status with "Ready" condition
    ↓
Status Update (via API)
    ↓
Done, requeue after 5 minutes

2. Record Creation Flow

User creates ARecord
    ↓
Kubernetes API Server stores in etcd
    ↓
Watch event triggered
    ↓
Controller receives event
    ↓
reconcile_a_record_wrapper() called
    ↓
reconcile_a_record() logic:
  1. Extract ARecord spec (zone, name, ip, ttl)
  2. Call zone_manager.add_a_record()
  3. Record appended to zone file
  4. Update ARecord status with "Ready" condition
    ↓
Status Update (via API)
    ↓
Done, requeue after 5 minutes

Concurrency Model

graph TB
    subgraph runtime["Main Tokio Runtime"]
        dnszone_task["DNSZone Controller Task<br/>(watches DNSZone resources)"]
        arecord_task["ARecord Controller Task<br/>(concurrent)"]
        txt_task["TXTRecord Controller Task<br/>(concurrent)"]
        cname_task["CNAME Controller Task<br/>(concurrent)"]

        dnszone_task --> arecord_task
        arecord_task --> txt_task
        txt_task --> cname_task
    end

    note["All tasks run concurrently via Tokio's<br/>thread pool without blocking each other."]

    style runtime fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style dnszone_task fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style arecord_task fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style txt_task fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style cname_task fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    style note fill:#fffde7,stroke:#f57f17,stroke-width:1px

Data Structures

CRD Type Hierarchy

trait CustomResource (from kube-derive)
    │
    ├─→ Bind9Instance
    │       └─ spec: Bind9InstanceSpec
    │       └─ status: Bind9InstanceStatus
    │
    ├─→ DNSZone
    │       └─ spec: DNSZoneSpec
    │       │        ├─ zone_name: String
    │       │        ├─ instance_selector: LabelSelector
    │       │        └─ soa_record: SOARecord
    │       └─ status: DNSZoneStatus
    │
    ├─→ ARecord
    │       └─ spec: ARecordSpec
    │       └─ status: RecordStatus
    │
    ├─→ TXTRecord
    │       └─ spec: TXTRecordSpec
    │       └─ status: RecordStatus
    │
    └─→ CNAMERecord
            └─ spec: CNAMERecordSpec
            └─ status: RecordStatus

Label Selector

LabelSelector
    ├─ match_labels: Option<BTreeMap<String, String>>
    │       └─ "dns-role": "primary"
    │       └─ "environment": "production"
    │
    └─ match_expressions: Option<Vec<LabelSelectorRequirement>>
            ├─ key: "dns-role"
            │  operator: "In"
            │  values: ["primary", "secondary"]
            │
            └─ key: "environment"
               operator: "In"
               values: ["production", "staging"]

Zone File Generation

Input: DNSZone resource
    │
    ├─ zone_name: "example.com"
    ├─ soa_record:
    │   ├─ primary_ns: "ns1.example.com."
    │   ├─ admin_email: "admin@example.com"
    │   ├─ serial: 2024010101
    │   ├─ refresh: 3600
    │   ├─ retry: 600
    │   ├─ expire: 604800
    │   └─ negative_ttl: 86400
    │
    └─ ttl: 3600

Processing:
    1. Create file: /etc/bind/zones/db.example.com
    2. Write SOA record header
    3. Add NS record for primary
    4. Set default TTL

Output: /etc/bind/zones/db.example.com
    │
    ├─ $TTL 3600
    ├─ @ IN SOA ns1.example.com. admin.example.com. (
    │       2024010101  ; serial
    │       3600        ; refresh
    │       600         ; retry
    │       604800      ; expire
    │       86400 )     ; minimum
    ├─ @ IN NS ns1.example.com.
    │
    └─ (waiting for record additions)

Then for each ARecord, TXTRecord, etc:
    Append:
    www 300 IN A 192.0.2.1
    @ 3600 IN TXT "v=spf1 include:_spf.example.com ~all"
    blog 300 IN CNAME www.example.com.

Error Handling Strategy

Reconciliation Error
    │
    ├─→ Log error with context
    ├─→ Update resource status with error condition
    ├─→ Return error to controller
    │
    └─→ Error Policy Handler:
        ├─ If transient (file not found, etc.)
        │   └─ Requeue after 30 seconds (exponential backoff possible)
        │
        └─ If persistent (validation error, etc.)
            └─ Log and skip (manual intervention needed)

Dependencies Flow

main.rs
    ├─→ crd.rs (type definitions)
    │   ├─ Bind9Instance
    │   ├─ DNSZone
    │   ├─ ARecord
    │   ├─ TXTRecord
    │   ├─ CNAMERecord
    │   └─ LabelSelector
    │
    ├─→ bind9.rs (zone management)
    │   └─ Bind9Manager
    │
    ├─→ reconcilers/
    │   ├─ dnszone.rs
    │   │   ├─ reconcile_dnszone()
    │   │   ├─ delete_dnszone()
    │   │   └─ update_status()
    │   │
    │   └─ records.rs
    │       ├─ reconcile_a_record()
    │       ├─ reconcile_txt_record()
    │       └─ reconcile_cname_record()
    │
    └─→ Tokio (async runtime)
        └─ kube-rs (Kubernetes client)

Performance Characteristics

Memory Layout

Rust Controller (typical): ~50MB
    ├─ Binary loaded: ~20MB
    ├─ Tokio runtime: ~10MB
    ├─ In-flight reconciliations: ~5MB
    ├─ Caches/buffers: ~5MB
    └─ Misc overhead: ~10MB

vs Python Operator: ~250MB+
    ├─ Python interpreter: ~50MB
    ├─ Dependencies: ~100MB
    ├─ Kopf framework: ~50MB
    └─ Runtime data: ~50MB+

Latency Profile

Operation                    Rust         Python
─────────────────────────────────────────────────
Create DNSZone              <100ms       500-1000ms
Add A Record                <50ms        200-500ms
Evaluate label selector     <20ms        100-300ms
Update status              <30ms        150-300ms
Controller startup         <1s          5-10s
Full zone reconciliation   <500ms       2-5s

Scalability

With Rust Controller:
    • 10 zones: <1s reconciliation
    • 100 zones: <5s reconciliation
    • 1000 records: <10s total reconciliation
    • Handles hundreds of events/sec

vs Python Operator:
    • 10 zones: 5-10s reconciliation
    • 100 zones: 50-100s reconciliation
    • 1000 records: 30-60s total reconciliation
    • Struggles with >10 events/sec

RBAC Requirements

cluster-role: bind9-controller
    │
    ├─ [get, list, watch] on dnszones
    ├─ [get, list, watch] on arecords
    ├─ [get, list, watch] on txtrecords
    ├─ [get, list, watch] on cnamerecords
    ├─ [get, list, watch] on bind9instances
    │
    └─ [update, patch] on [*/status]
        └─ (for updating status subresources)

State Management

Kubernetes etcd (Source of Truth)
    │
    ├─→ Store DNSZone resources
    ├─→ Store Record resources
    ├─→ Store status conditions
    │
    └─→ Controller watches via kube-rs
        │
        ├─→ Detects changes
        ├─→ Triggers reconciliation
        ├─→ Generates zone files
        │
        └─→ BIND9 pod reads zone files
            ├─→ Loads into memory
            └─→ Serves DNS queries

Extension Points

Current Implementation:
    • DNSZone → Zone file creation
    • ARecord → A record addition
    • TXTRecord → TXT record addition
    • CNAMERecord → CNAME record addition

Future Extensions (easy to add):
    • AAAARecord → IPv6 support
    • MXRecord → Mail record support
    • NSRecord → Nameserver support
    • SRVRecord → Service record support
    • Health endpoints → Liveness/readiness
    • Metrics → Prometheus integration
    • Webhooks → Custom validation
    • Finalizers → Graceful cleanup

This architecture provides a clean, performant, and extensible foundation for managing DNS infrastructure in Kubernetes.

HTTP API Sidecar Architecture

This page provides a detailed overview of Bindy’s architecture that uses an HTTP API sidecar (bindcar) to manage BIND9 instances. The sidecar executes RNDC commands locally within the pod, providing a modern RESTful interface for DNS management.

High-Level Architecture

graph TB
    subgraph k8s["Kubernetes Cluster"]
        subgraph crds["Custom Resource Definitions (CRDs)"]
            cluster["Bind9Cluster<br/>(cluster config)"]
            instance["Bind9Instance"]
            zone["DNSZone"]
            records["ARecord, AAAARecord,<br/>TXTRecord, MXRecord, etc."]

            cluster --> instance
            instance --> zone
            zone --> records
        end

        subgraph controller["Bindy Controller (Rust)"]
            rec1["Bind9Cluster<br/>Reconciler"]
            rec2["Bind9Instance<br/>Reconciler"]
            rec3["DNSZone<br/>Reconciler"]
            rec4["DNS Record<br/>Reconcilers"]
            manager["Bind9Manager (RNDC Client)<br/>• add_zone() • reload_zone()<br/>• delete_zone() • notify_zone()<br/>• zone_status() • freeze_zone()"]
        end

        subgraph bind9["BIND9 Instances (Pods)"]
            subgraph primary_pod["Primary Pod (bind9-primary)"]
                primary_bind["BIND9 Container<br/>• rndc daemon (localhost:953)<br/>• DNS (port 53)<br/>Dynamic zones:<br/>- example.com<br/>- internal.local"]
                primary_api["Bindcar API Sidecar<br/>• HTTP API (port 80→8080)<br/>• ServiceAccount auth<br/>• Local RNDC client<br/>• Zone file management"]

                primary_api -->|"rndc localhost:953"| primary_bind
            end

            subgraph secondary_pod["Secondary Pod (bind9-secondary)"]
                secondary_bind["BIND9 Container<br/>• rndc daemon (localhost:953)<br/>• DNS (port 53)<br/>Transferred zones:<br/>- example.com<br/>- internal.local"]
                secondary_api["Bindcar API Sidecar<br/>• HTTP API (port 80→8080)<br/>• ServiceAccount auth<br/>• Local RNDC client"]

                secondary_api -->|"rndc localhost:953"| secondary_bind
            end
        end

        secrets["RNDC Keys (Secrets)<br/>• bind9-primary-rndc-key<br/>• bind9-secondary-rndc-key<br/>(HMAC-SHA256)"]
        volumes["Shared Volumes<br/>• /var/cache/bind (zone files)<br/>• /etc/bind/keys (RNDC keys, read-only for API)"]
    end

    clients["DNS Clients<br/>• Applications<br/>• Services<br/>• External users"]

    crds -->|"watches<br/>(Kubernetes API)"| controller
    controller -->|"HTTP API<br/>(REST/JSON)<br/>Port 80/TCP"| bind9
    volumes -.->|"mounts"| primary_pod
    volumes -.->|"mounts"| secondary_pod
    primary_bind -->|"AXFR/IXFR"| secondary_bind
    secondary_bind -.->|"IXFR"| primary_bind
    bind9 -->|"DNS Queries<br/>(UDP/TCP 53)"| clients
    secrets -.->|"authenticates"| primary_api
    secrets -.->|"authenticates"| secondary_api

    style k8s fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style crds fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style controller fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style bind9 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style secrets fill:#ffe0b2,stroke:#e65100,stroke-width:2px
    style clients fill:#fce4ec,stroke:#880e4f,stroke-width:2px

Key Architectural Changes from File-Based Approach

Old Architecture (File-Based)

  • Controller generated zone files
  • Files written to ConfigMaps
  • ConfigMaps mounted into BIND9 pods
  • Manual rndc reload triggered after file changes
  • Complex synchronization between ConfigMaps and BIND9 state

New Architecture (RNDC Protocol + Cluster Hierarchy)

  • Three-tier resource model: Bind9Cluster → Bind9Instance → DNSZone
  • Controller uses native RNDC protocol
  • Direct communication with BIND9 via port 953
  • Commands executed in real-time: addzone, delzone, reload
  • No file manipulation or ConfigMap management
  • BIND9 manages zone files internally with dynamic updates
  • Atomic operations with immediate feedback
  • Cluster-level config sharing (version, TSIG keys, ACLs)

Three-Tier Resource Model

1. Bind9Cluster (Cluster Configuration)

Defines shared configuration for a logical group of BIND9 instances:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
spec:
  version: "9.18"
  config:
    recursion: false
    dnssec:
      enabled: true
      validation: true
    allowQuery:
      - any
    allowTransfer:
      - 10.0.0.0/8
  rndcSecretRefs:
    - name: transfer-key
      algorithm: hmac-sha256
      secret: base64-encoded-key
  acls:
    internal:
      - 10.0.0.0/8
      - 172.16.0.0/12

2. Bind9Instance (Instance Deployment)

References a cluster and deploys BIND9 pods:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dns-primary
spec:
  clusterRef: production-dns  # References Bind9Cluster
  role: primary
  replicas: 2

The instance inherits configuration from the cluster but can override specific settings.

3. DNSZone (Zone Definition)

References an instance and creates zones via RNDC:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
spec:
  zoneName: example.com
  clusterRef: dns-primary  # References Bind9Instance
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.
    serial: 2024010101

RNDC Protocol Communication

┌──────────────────────┐                 ┌──────────────────────┐
│  Bindy Controller    │                 │   BIND9 Instance     │
│                      │                 │   (Primary)          │
│  ┌────────────────┐  │                 │                      │
│  │ Bind9Manager   │  │                 │  ┌────────────────┐  │
│  │                │  │   TCP Port 953  │  │  rndc daemon   │  │
│  │ RndcClient     │──┼────────────────▶│  │                │  │
│  │  • Server URL  │  │  TSIG Auth      │  │  Validates:    │  │
│  │  • Algorithm   │  │  HMAC-SHA256    │  │  • Key name    │  │
│  │  • Secret Key  │  │                 │  │  • Signature   │  │
│  │                │  │                 │  │  • Timestamp   │  │
│  └────────────────┘  │                 │  └────────────────┘  │
│         │            │                 │         │            │
│         │ Commands:  │                 │         │            │
│         │            │                 │         ▼            │
│    addzone zone {    │                 │  ┌────────────────┐  │
│      type master;    │                 │  │ BIND9 named    │  │
│      file "x.zone";  │────────────────▶│  │                │  │
│    };                │                 │  │ • Creates zone │  │
│                      │◀────────────────│  │ • Loads into   │  │
│    Success/Error     │    Response     │  │   memory       │  │
│                      │                 │  │ • Writes file  │  │
│                      │                 │  └────────────────┘  │
└──────────────────────┘                 └──────────────────────┘

RNDC Authentication Flow

┌────────────────────────────────────────────────────────────────┐
│  1. Controller Retrieves RNDC Key from Kubernetes Secret      │
│                                                                │
│  Secret: bind9-primary-rndc-key                              │
│    data:                                                      │
│      key-name: "bind9-primary"                               │
│      algorithm: "hmac-sha256"                                │
│      secret: "base64-encoded-256-bit-key"                    │
└────────────────────────────────────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────────┐
│  2. Create RndcClient Instance                                │
│                                                                │
│  let client = RndcClient::new(                                │
│      "bind9-primary.dns-system.svc.cluster.local:953",       │
│      "hmac-sha256",                                           │
│      "base64-secret-key"                                      │
│  );                                                           │
└────────────────────────────────────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────────┐
│  3. Execute RNDC Command with TSIG Authentication             │
│                                                                │
│  TSIG Signature = HMAC-SHA256(                                │
│      key: secret,                                             │
│      data: command + timestamp + nonce                        │
│  )                                                            │
│                                                                │
│  Request packet:                                              │
│    • Command: "addzone example.com { type master; ... }"     │
│    • TSIG record with signature                              │
│    • Timestamp                                                │
└────────────────────────────────────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────────┐
│  4. BIND9 Validates Request                                   │
│                                                                │
│  • Looks up key "bind9-primary" in rndc.key file             │
│  • Verifies HMAC-SHA256 signature matches                    │
│  • Checks timestamp is within acceptable window              │
│  • Executes command if valid                                 │
│  • Returns success/error with TSIG-signed response           │
└────────────────────────────────────────────────────────────────┘

Data Flow: Zone Creation

User creates DNSZone resource
    │
    │ kubectl apply -f dnszone.yaml
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Kubernetes API Server stores DNSZone in etcd            │
└─────────────────────────────────────────────────────────┘
    │
    │ Watch event
    ▼
┌─────────────────────────────────────────────────────────┐
│ Bindy Controller receives event                         │
│   • DNSZone watcher triggers                            │
│   • Event: Applied(dnszone)                             │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ reconcile_dnszone() called                              │
│   1. Extract namespace and name                         │
│   2. Get zone spec (zone_name, cluster_ref, etc.)      │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Find PRIMARY pod for cluster                            │
│   • List pods with labels:                              │
│     app=bind9, instance={cluster_ref}                   │
│   • Select first running pod                            │
│   • Build server address:                               │
│     "{cluster_ref}.{namespace}.svc.cluster.local:953"   │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Load RNDC key from Secret                               │
│   • Secret name: "{cluster_ref}-rndc-key"              │
│   • Parse key-name, algorithm, secret                   │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Execute RNDC addzone command                            │
│   zone_manager.add_zone(                                │
│       zone_name: "example.com",                         │
│       zone_type: "master",                              │
│       zone_file: "/var/lib/bind/example.com.zone",     │
│       server: "bind9-primary...:953",                   │
│       key_data: RndcKeyData { ... }                     │
│   )                                                     │
└─────────────────────────────────────────────────────────┘
    │
    │ RNDC Protocol (Port 953)
    ▼
┌─────────────────────────────────────────────────────────┐
│ BIND9 Instance executes command                         │
│   • Creates zone configuration                          │
│   • Allocates memory for zone                           │
│   • Creates zone file /var/lib/bind/example.com.zone   │
│   • Loads zone into memory                              │
│   • Starts serving DNS queries for zone                 │
│   • Returns success response                            │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Update DNSZone status                                   │
│   status:                                               │
│     conditions:                                         │
│       - type: Ready                                     │
│         status: "True"                                  │
│         message: "Zone created for cluster: ..."        │
└─────────────────────────────────────────────────────────┘

Data Flow: Record Addition

User creates ARecord resource
    │
    │ kubectl apply -f arecord.yaml
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Kubernetes API Server stores ARecord in etcd            │
└─────────────────────────────────────────────────────────┘
    │
    │ Watch event
    ▼
┌─────────────────────────────────────────────────────────┐
│ Bindy Controller receives event                         │
│   • ARecord watcher triggers                            │
│   • Event: Applied(arecord)                             │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ reconcile_a_record() called                             │
│   1. Extract namespace and name                         │
│   2. Get spec (zone, name, ipv4_address, ttl)          │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Find cluster from zone                                   │
│   • List DNSZone resources in namespace                 │
│   • Find zone matching spec.zone                        │
│   • Extract zone.spec.cluster_ref                       │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Load RNDC key and build server address                  │
│   • Load "{cluster_ref}-rndc-key" Secret               │
│   • Server: "{cluster_ref}.{namespace}.svc:953"        │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Add record via RNDC (PLACEHOLDER - Future nsupdate)     │
│   zone_manager.add_a_record(                            │
│       zone: "example.com",                              │
│       name: "www",                                      │
│       ipv4: "192.0.2.1",                               │
│       ttl: Some(300),                                   │
│       server: "bind9-primary...:953",                   │
│       key_data: RndcKeyData { ... }                     │
│   )                                                     │
│                                                         │
│ NOTE: Currently logs intent. Full implementation will   │
│       use nsupdate protocol for dynamic DNS updates.    │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Update ARecord status                                   │
│   status:                                               │
│     conditions:                                         │
│       - type: Ready                                     │
│         status: "True"                                  │
│         message: "A record created"                     │
└─────────────────────────────────────────────────────────┘

RNDC Commands Supported

The Bind9Manager provides the following RNDC operations:

Zone Management

┌────────────────────┬─────────────────────────────────────────┐
│ Operation          │ RNDC Command                            │
├────────────────────┼─────────────────────────────────────────┤
│ add_zone()         │ addzone <zone> { type <type>;           │
│                    │                   file "<file>"; };     │
│                    │                                         │
│ delete_zone()      │ delzone <zone>                          │
│                    │                                         │
│ reload_zone()      │ reload <zone>                           │
│                    │                                         │
│ reload_all_zones() │ reload                                  │
│                    │                                         │
│ retransfer_zone()  │ retransfer <zone>                       │
│                    │                                         │
│ notify_zone()      │ notify <zone>                           │
│                    │                                         │
│ freeze_zone()      │ freeze <zone>                           │
│                    │                                         │
│ thaw_zone()        │ thaw <zone>                             │
│                    │                                         │
│ zone_status()      │ zonestatus <zone>                       │
│                    │                                         │
│ server_status()    │ status                                  │
└────────────────────┴─────────────────────────────────────────┘

Record Management (Planned)

Currently implemented as placeholders:
  • add_a_record()      (will use nsupdate protocol)
  • add_aaaa_record()   (will use nsupdate protocol)
  • add_txt_record()    (will use nsupdate protocol)
  • add_cname_record()  (will use nsupdate protocol)
  • add_mx_record()     (will use nsupdate protocol)
  • add_ns_record()     (will use nsupdate protocol)
  • add_srv_record()    (will use nsupdate protocol)
  • add_caa_record()    (will use nsupdate protocol)

Note: RNDC protocol doesn't support individual record operations.
These will be implemented using the nsupdate protocol for dynamic
DNS updates, or via zone file manipulation + reload.

Pod Discovery and Networking

┌────────────────────────────────────────────────────────────┐
│ Controller discovers BIND9 pods using labels:              │
│                                                            │
│   Pod labels:                                             │
│     app: bind9                                            │
│     instance: {cluster_ref}                               │
│                                                            │
│   Controller searches:                                    │
│     List pods where app=bind9 AND instance={cluster_ref}  │
│                                                            │
│   Service DNS:                                            │
│     {cluster_ref}.{namespace}.svc.cluster.local:953      │
│                                                            │
│   Example:                                                │
│     bind9-primary.dns-system.svc.cluster.local:953       │
└────────────────────────────────────────────────────────────┘

Zone Transfers (AXFR/IXFR)

Primary Instance                    Secondary Instance
┌─────────────────┐                ┌─────────────────┐
│ example.com     │                │                 │
│ Serial: 2024010│                │                 │
│                 │   1. NOTIFY    │                 │
│                 │───────────────▶│                 │
│                 │                │                 │
│                 │   2. SOA Query │                 │
│                 │◀───────────────│  Checks serial  │
│                 │                │                 │
│                 │   3. AXFR/IXFR │                 │
│                 │◀───────────────│  Serial outdated│
│                 │                │                 │
│  Sends full     │   Zone data    │                 │
│  zone (AXFR) or │───────────────▶│  Updates zone   │
│  delta (IXFR)   │                │  Serial: 2024010│
│                 │                │                 │
└─────────────────┘                └─────────────────┘

Triggered by:
  • zone_manager.notify_zone()
  • zone_manager.retransfer_zone()
  • BIND9 automatic refresh timers (SOA refresh value)

Components Deep Dive

1. Bind9Manager

Rust struct that wraps the rndc crate for BIND9 management:

#![allow(unused)]
fn main() {
pub struct Bind9Manager;

impl Bind9Manager {
    pub fn new() -> Self { Self }

    // RNDC key generation
    pub fn generate_rndc_key() -> RndcKeyData { ... }
    pub fn create_rndc_secret_data(key_data: &RndcKeyData) -> BTreeMap<String, String> { ... }
    pub fn parse_rndc_secret_data(data: &BTreeMap<String, Vec<u8>>) -> Result<RndcKeyData> { ... }

    // Core RNDC operations
    async fn exec_rndc_command(&self, server: &str, key_data: &RndcKeyData, command: &str) -> Result<String> { ... }

    // Zone management
    pub async fn add_zone(&self, zone_name: &str, zone_type: &str, zone_file: &str, server: &str, key_data: &RndcKeyData) -> Result<()> { ... }
    pub async fn delete_zone(&self, zone_name: &str, server: &str, key_data: &RndcKeyData) -> Result<()> { ... }
    pub async fn reload_zone(&self, zone_name: &str, server: &str, key_data: &RndcKeyData) -> Result<()> { ... }
    pub async fn notify_zone(&self, zone_name: &str, server: &str, key_data: &RndcKeyData) -> Result<()> { ... }
}
}

2. RndcKeyData

Struct for RNDC authentication:

#![allow(unused)]
fn main() {
pub struct RndcKeyData {
    pub name: String,      // Key name (e.g., "bind9-primary")
    pub algorithm: String, // HMAC algorithm (e.g., "hmac-sha256")
    pub secret: String,    // Base64-encoded secret key
}
}

3. Reconcilers

Zone reconciler using RNDC:

#![allow(unused)]
fn main() {
pub async fn reconcile_dnszone(
    client: Client,
    dnszone: DNSZone,
    zone_manager: &Bind9Manager,
) -> Result<()> {
    // 1. Find PRIMARY pod
    let primary_pod = find_primary_pod(&client, &namespace, &cluster_ref).await?;

    // 2. Load RNDC key
    let key_data = load_rndc_key(&client, &namespace, &cluster_ref).await?;

    // 3. Build server address
    let server = format!("{}.{}.svc.cluster.local:953", cluster_ref, namespace);

    // 4. Add zone via RNDC
    zone_manager.add_zone(&zone_name, "master", &zone_file, &server, &key_data).await?;

    // 5. Update status
    update_status(&client, &dnszone, "Ready", "True", "Zone created").await?;

    Ok(())
}
}

Security Architecture

TSIG Authentication

┌────────────────────────────────────────────────────────────┐
│ TSIG (Transaction Signature) provides:                     │
│                                                            │
│  1. Authentication - Verifies command source               │
│  2. Integrity - Prevents command tampering                 │
│  3. Replay protection - Timestamp validation               │
│                                                            │
│ Algorithm: HMAC-SHA256 (256-bit keys)                     │
│ Key Storage: Kubernetes Secrets (base64-encoded)          │
│ Key Generation: Random 256-bit keys per instance          │
└────────────────────────────────────────────────────────────┘

Network Security

┌────────────────────────────────────────────────────────────┐
│ • RNDC traffic on port 953/TCP (not exposed externally)   │
│ • DNS queries on port 53/UDP+TCP (exposed via Service)    │
│ • All RNDC communication within cluster network           │
│ • No external RNDC access (ClusterIP services only)       │
│ • NetworkPolicies can restrict RNDC access to controller  │
└────────────────────────────────────────────────────────────┘

RBAC Requirements

# Controller needs access to:
- Secrets (get, list) - for RNDC keys
- Pods (get, list) - for pod discovery
- Services (get, list) - for DNS resolution
- DNSZone, ARecord, etc. (get, list, watch, update status)

Performance Characteristics

Latency

Operation                    Old (File-based)    New (RNDC)
─────────────────────────────────────────────────────────────
Create DNSZone              2-5 seconds          <500ms
Add DNS Record              1-3 seconds          <200ms
Delete DNSZone              2-4 seconds          <500ms
Zone reload                 1-2 seconds          <300ms
Status check                N/A                  <100ms

Benefits of RNDC Protocol

✓ Atomic operations - Commands succeed or fail atomically
✓ Real-time feedback - Immediate success/error responses
✓ No ConfigMap overhead - No intermediate Kubernetes resources
✓ Direct control - Native BIND9 management interface
✓ Better error messages - BIND9 provides detailed errors
✓ Zone status queries - Can check zone state anytime
✓ Freeze/thaw support - Control dynamic updates precisely
✓ Notify support - Trigger zone transfers on demand

Future Enhancements

1. nsupdate Protocol Integration

Implement dynamic DNS updates for individual records:
  • Use nsupdate protocol alongside RNDC
  • Add/update/delete individual A, AAAA, TXT, etc. records
  • No full zone reload needed for record changes
  • Even lower latency for record operations

2. Zone Transfer Monitoring

Monitor AXFR/IXFR operations:
  • Track transfer status
  • Report transfer errors
  • Automatic retry on failures

3. Health Checks

Periodic health checks using RNDC:
  • server_status() - overall server health
  • zone_status() - per-zone health
  • Update CRD status with health information

Next Steps

Architecture Diagrams

Comprehensive visual diagrams showing Bindy’s architecture, components, and data flows.

System Architecture

graph TB
    subgraph "Kubernetes Cluster"
        subgraph "Custom Resources"
            BC[Bind9Cluster]
            BI[Bind9Instance]
            DZ[DNSZone]
            AR[ARecord]
            CR[CNAMERecord]
            MR[MXRecord]
            TR[TXTRecord]
        end

        subgraph "Bindy Controller (Rust)"
            WA[Watch API<br/>kube-rs]

            subgraph "Reconcilers"
                BCR[Bind9Cluster<br/>Reconciler]
                BIR[Bind9Instance<br/>Reconciler]
                DZR[DNSZone<br/>Reconciler]
                RR[Record<br/>Reconcilers]
            end

            subgraph "Core Components"
                BM[Bind9Manager<br/>RNDC Client]
                RES[Resource<br/>Builders]
            end
        end

        subgraph "Kubernetes Resources"
            DEP[Deployments]
            CM[ConfigMaps]
            SEC[Secrets]
            SVC[Services]
        end

        subgraph "BIND9 Pods"
            P1[Primary DNS<br/>us-east]
            P2[Secondary DNS<br/>us-west]
            P3[Secondary DNS<br/>eu]
        end
    end

    subgraph "External"
        CLI[DNS Clients]
    end

    %% Custom Resource relationships
    BC -.inherits.-> BI
    BI -.references.-> DZ
    DZ -.contains.-> AR
    DZ -.contains.-> CR
    DZ -.contains.-> MR
    DZ -.contains.-> TR

    %% Watch relationships
    BC --> WA
    BI --> WA
    DZ --> WA
    AR --> WA
    CR --> WA
    MR --> WA
    TR --> WA

    %% Reconciler routing
    WA --> BCR
    WA --> BIR
    WA --> DZR
    WA --> RR

    %% Component interactions
    BCR --> RES
    BIR --> RES
    DZR --> BM
    RR --> BM

    %% K8s resource creation
    RES --> DEP
    RES --> CM
    RES --> SEC
    RES --> SVC

    %% RNDC communication
    BM -.RNDC:953.-> P1
    BM -.RNDC:953.-> P2
    BM -.RNDC:953.-> P3

    %% DNS deployment
    DEP --> P1
    DEP --> P2
    DEP --> P3
    CM --> P1
    CM --> P2
    CM --> P3
    SEC --> P1

    %% Zone transfers
    P1 -.AXFR/IXFR.-> P2
    P1 -.AXFR/IXFR.-> P3

    %% DNS queries
    CLI -.DNS:53.-> P1
    CLI -.DNS:53.-> P2
    CLI -.DNS:53.-> P3

    style BC fill:#e1f5ff
    style BI fill:#e1f5ff
    style DZ fill:#e1f5ff
    style AR fill:#fff4e1
    style CR fill:#fff4e1
    style MR fill:#fff4e1
    style TR fill:#fff4e1
    style WA fill:#f0f0f0
    style BCR fill:#d4e8d4
    style BIR fill:#d4e8d4
    style DZR fill:#d4e8d4
    style RR fill:#d4e8d4
    style BM fill:#ffd4d4
    style RES fill:#ffd4d4

Rust Component Architecture

graph TB
    subgraph "Main Process"
        MAIN[main.rs<br/>Tokio Runtime]
    end

    subgraph "CRD Definitions (src/crd.rs)"
        CRD_BC[Bind9Cluster]
        CRD_BI[Bind9Instance]
        CRD_DZ[DNSZone]
        CRD_REC[Record Types<br/>A, AAAA, CNAME,<br/>MX, NS, TXT,<br/>SRV, CAA]
    end

    subgraph "Reconcilers (src/reconcilers/)"
        RECON_BC[bind9cluster.rs]
        RECON_BI[bind9instance.rs]
        RECON_DZ[dnszone.rs]
        RECON_REC[records.rs]
    end

    subgraph "BIND9 Management (src/bind9/)"
        BM_MGR[Bind9Manager]
        BM_KEY[RndcKeyData]
        BM_CMD[Zone Operations<br/>HTTP API & RNDC<br/>addzone, delzone,<br/>reload, freeze,<br/>thaw, notify]
    end

    subgraph "Resource Builders (src/bind9_resources.rs)"
        RB_DEP[build_deployment]
        RB_CM[build_configmap]
        RB_SVC[build_service]
        RB_VOL[build_volumes]
        RB_POD[build_podspec]
    end

    subgraph "External Dependencies"
        KUBE[kube-rs<br/>Kubernetes Client]
        RNDC[rndc-rs<br/>RNDC Protocol]
        TOKIO[tokio<br/>Async Runtime]
        SERDE[serde<br/>Serialization]
    end

    %% Main process spawns reconcilers
    MAIN --> RECON_BC
    MAIN --> RECON_BI
    MAIN --> RECON_DZ
    MAIN --> RECON_REC

    %% Reconcilers use CRD types
    RECON_BC -.uses.-> CRD_BC
    RECON_BI -.uses.-> CRD_BI
    RECON_DZ -.uses.-> CRD_DZ
    RECON_REC -.uses.-> CRD_REC

    %% Reconcilers call managers
    RECON_BI --> RB_DEP
    RECON_BI --> RB_CM
    RECON_BI --> RB_SVC
    RECON_DZ --> BM_MGR
    RECON_REC --> BM_MGR

    %% Resource builders use components
    RB_DEP --> RB_POD
    RB_DEP --> RB_VOL
    RB_CM --> RB_VOL

    %% BIND9 manager components
    BM_MGR --> BM_KEY
    BM_MGR --> BM_CMD

    %% External dependencies
    MAIN --> TOKIO
    RECON_BC --> KUBE
    RECON_BI --> KUBE
    RECON_DZ --> KUBE
    RECON_REC --> KUBE
    BM_CMD --> RNDC
    CRD_BC --> SERDE
    CRD_BI --> SERDE
    CRD_DZ --> SERDE
    CRD_REC --> SERDE

    style MAIN fill:#e1f5ff
    style CRD_BC fill:#d4e8d4
    style CRD_BI fill:#d4e8d4
    style CRD_DZ fill:#d4e8d4
    style CRD_REC fill:#d4e8d4
    style RECON_BC fill:#fff4e1
    style RECON_BI fill:#fff4e1
    style RECON_DZ fill:#fff4e1
    style RECON_REC fill:#fff4e1
    style BM_MGR fill:#ffd4d4
    style BM_KEY fill:#ffd4d4
    style BM_CMD fill:#ffd4d4
    style RB_DEP fill:#e8d4f8
    style RB_CM fill:#e8d4f8
    style RB_SVC fill:#e8d4f8
    style RB_VOL fill:#e8d4f8
    style RB_POD fill:#e8d4f8

DNS Record Creation Data Flow

sequenceDiagram
    participant User
    participant K8sAPI as Kubernetes API
    participant Watch as Watch Stream
    participant RecRec as Record Reconciler
    participant ZoneRec as DNSZone Reconciler
    participant BindMgr as Bind9Manager
    participant Primary as Primary BIND9
    participant Secondary as Secondary BIND9
    participant Client as DNS Client

    Note over User,Client: Record Creation Flow

    User->>K8sAPI: kubectl apply -f arecord.yaml
    K8sAPI->>K8sAPI: Validate CRD schema
    K8sAPI->>K8sAPI: Store in etcd
    K8sAPI-->>User: ARecord created

    K8sAPI->>Watch: Event: ARecord Added
    Watch->>RecRec: Trigger reconciliation

    RecRec->>K8sAPI: Get referenced DNSZone
    K8sAPI-->>RecRec: DNSZone details

    RecRec->>K8sAPI: Get Bind9Instance (via clusterRef)
    K8sAPI-->>RecRec: Bind9Instance details

    RecRec->>K8sAPI: Get RNDC Secret
    K8sAPI-->>RecRec: RNDC key data

    RecRec->>BindMgr: Call add_a_record()
    Note over BindMgr: Currently placeholder<br/>Will use nsupdate
    BindMgr-->>RecRec: Ok(())

    RecRec->>BindMgr: Call reload_zone(zone_name)
    BindMgr->>Primary: RNDC reload zone
    activate Primary
    Primary->>Primary: Reload zone file
    Primary-->>BindMgr: Success
    deactivate Primary
    BindMgr-->>RecRec: Zone reloaded

    RecRec->>K8sAPI: Update ARecord status
    K8sAPI-->>RecRec: Status updated

    Note over Primary,Secondary: Zone Transfer (AXFR/IXFR)

    Primary->>Secondary: NOTIFY (zone updated)
    activate Secondary
    Secondary->>Primary: SOA query (check serial)
    Primary-->>Secondary: SOA record

    alt Serial increased
        Secondary->>Primary: IXFR/AXFR request
        Primary-->>Secondary: Zone transfer
        Secondary->>Secondary: Update zone
    else Serial unchanged
        Secondary->>Secondary: No update needed
    end
    deactivate Secondary

    Note over Client,Secondary: DNS Query

    Client->>Secondary: DNS query (www.example.com A?)
    activate Secondary
    Secondary->>Secondary: Lookup in zone
    Secondary-->>Client: Answer: 192.0.2.1
    deactivate Secondary

Zone Creation and Synchronization Flow

stateDiagram-v2
    [*] --> ZoneCreated: User creates DNSZone

    ZoneCreated --> Validating: Controller watches event

    Validating --> ValidatingInstance: Validate zone spec
    ValidatingInstance --> ValidatingCluster: Find Bind9Instance
    ValidatingCluster --> GeneratingConfig: Find Bind9Cluster

    GeneratingConfig --> CreatingRNDCKey: Generate zone config
    CreatingRNDCKey --> StoringSecret: Generate RNDC key
    StoringSecret --> AddingZone: Store in Secret

    AddingZone --> ConnectingRNDC: Call rndc addzone
    ConnectingRNDC --> ExecutingCommand: Connect via port 953
    ExecutingCommand --> VerifyingZone: Execute addzone command

    VerifyingZone --> Ready: Verify zone exists
    Ready --> [*]: Update status to Ready

    ValidatingInstance --> Failed: Instance not found
    ValidatingCluster --> Failed: Cluster not found
    AddingZone --> Failed: RNDC command failed
    ConnectingRNDC --> Failed: Connection failed

    Failed --> [*]: Update status conditions

    note right of GeneratingConfig
        Creates zone with:
        - SOA record
        - Default TTL
        - Zone file path
    end note

    note right of AddingZone
        Uses RNDC protocol:
        addzone example.com
        '{ type master;
           file "zones/example.com"; }'
    end note

Primary to Secondary Zone Transfer Flow

sequenceDiagram
    participant Ctl as Bindy Controller
    participant Pri as Primary BIND9<br/>(us-east)
    participant Sec1 as Secondary BIND9<br/>(us-west)
    participant Sec2 as Secondary BIND9<br/>(eu)

    Note over Ctl,Sec2: Initial Zone Setup

    Ctl->>Pri: RNDC addzone example.com
    activate Pri
    Pri->>Pri: Create zone file
    Pri-->>Ctl: Zone added
    deactivate Pri

    Ctl->>Sec1: RNDC addzone example.com (type secondary)
    activate Sec1
    Sec1->>Sec1: Configure as secondary
    Sec1-->>Ctl: Zone added as secondary
    deactivate Sec1

    Ctl->>Sec2: RNDC addzone example.com (type secondary)
    activate Sec2
    Sec2->>Sec2: Configure as secondary
    Sec2-->>Ctl: Zone added as secondary
    deactivate Sec2

    Note over Pri,Sec2: Initial Zone Transfer

    Sec1->>Pri: SOA query (get serial)
    Pri-->>Sec1: SOA serial=2024010101
    Sec1->>Pri: AXFR request (full transfer)
    Pri-->>Sec1: Complete zone data
    Sec1->>Sec1: Write zone file

    Sec2->>Pri: SOA query (get serial)
    Pri-->>Sec2: SOA serial=2024010101
    Sec2->>Pri: AXFR request (full transfer)
    Pri-->>Sec2: Complete zone data
    Sec2->>Sec2: Write zone file

    Note over Ctl,Sec2: Record Update

    Ctl->>Ctl: User adds new ARecord
    Ctl->>Pri: Update zone + reload
    activate Pri
    Pri->>Pri: Update zone file
    Pri->>Pri: Increment serial to 2024010102
    Pri-->>Ctl: Zone reloaded
    deactivate Pri

    Note over Pri,Sec2: NOTIFY and Incremental Transfer

    Pri->>Sec1: NOTIFY (zone updated)
    Pri->>Sec2: NOTIFY (zone updated)

    activate Sec1
    Sec1->>Pri: SOA query (check serial)
    Pri-->>Sec1: SOA serial=2024010102
    Sec1->>Sec1: Compare: 2024010102 > 2024010101
    Sec1->>Pri: IXFR request (incremental)
    Pri-->>Sec1: Only changed records
    Sec1->>Sec1: Apply changes
    Sec1-->>Pri: ACK
    deactivate Sec1

    activate Sec2
    Sec2->>Pri: SOA query (check serial)
    Pri-->>Sec2: SOA serial=2024010102
    Sec2->>Sec2: Compare: 2024010102 > 2024010101
    Sec2->>Pri: IXFR request (incremental)
    Pri-->>Sec2: Only changed records
    Sec2->>Sec2: Apply changes
    Sec2-->>Pri: ACK
    deactivate Sec2

    Note over Pri,Sec2: All zones synchronized

Reconciliation Loop

flowchart TD
    Start([Watch Event Received]) --> CheckType{Event Type?}

    CheckType -->|Added/Modified| GetResource[Get Resource from API]
    CheckType -->|Deleted| Cleanup[Run Cleanup Logic]
    CheckType -->|Restarted| RefreshAll[Refresh All Resources]

    GetResource --> CheckGen{observedGeneration<br/>== metadata.generation?}
    CheckGen -->|Yes| SkipRecon[Skip: Already reconciled]
    CheckGen -->|No| ValidateSpec[Validate Spec]

    ValidateSpec --> CheckValid{Valid?}
    CheckValid -->|No| UpdateFailed[Update Status: Failed]
    CheckValid -->|Yes| Reconcile[Execute Reconciliation]

    Reconcile --> ReconcileResult{Success?}
    ReconcileResult -->|Yes| UpdateReady[Update Status: Ready]
    ReconcileResult -->|No| CheckRetry{Retryable?}

    CheckRetry -->|Yes| Requeue[Requeue with backoff]
    CheckRetry -->|No| UpdateError[Update Status: Error]

    UpdateReady --> UpdateGen[Update observedGeneration]
    UpdateError --> Requeue
    UpdateFailed --> End

    UpdateGen --> End([Done])
    Cleanup --> End
    RefreshAll --> End
    SkipRecon --> End
    Requeue --> End

    style Start fill:#e1f5ff
    style End fill:#e1f5ff
    style Reconcile fill:#d4e8d4
    style UpdateReady fill:#d4f8d4
    style UpdateError fill:#f8d4d4
    style UpdateFailed fill:#f8d4d4
    style CheckType fill:#fff4e1
    style CheckGen fill:#fff4e1
    style CheckValid fill:#fff4e1
    style ReconcileResult fill:#fff4e1
    style CheckRetry fill:#fff4e1

RNDC Protocol Communication

sequenceDiagram
    participant BM as Bind9Manager<br/>(Rust)
    participant RC as RNDC Client<br/>(rndc-rs)
    participant Net as TCP Socket<br/>:953
    participant BIND as BIND9 Server<br/>(rndc daemon)

    Note over BM,BIND: RNDC Key Setup (One-time)

    BM->>BM: generate_rndc_key()
    BM->>BM: Create HMAC-SHA256 key
    BM->>BM: Store in K8s Secret

    Note over BM,BIND: RNDC Command Execution

    BM->>RC: new(server, algorithm, secret)
    RC->>RC: Parse RNDC key
    RC->>RC: Prepare TSIG signature

    BM->>RC: rndc_command("reload zone")
    RC->>Net: Connect to server:953
    Net->>BIND: TCP handshake

    RC->>RC: Create RNDC message
    RC->>RC: Sign with HMAC-SHA256
    RC->>Net: Send signed message
    Net->>BIND: Forward RNDC message

    activate BIND
    BIND->>BIND: Verify TSIG signature
    BIND->>BIND: Execute: reload zone
    BIND->>BIND: Reload zone file
    BIND->>Net: Response + TSIG
    deactivate BIND

    Net->>RC: Receive response
    RC->>RC: Verify response TSIG
    RC->>RC: Parse result
    RC-->>BM: Ok(result.text)

    alt Authentication Failed
        BIND-->>Net: Error: TSIG verification failed
        Net-->>RC: Error response
        RC-->>BM: Err("RNDC authentication failed")
    end

    alt Command Failed
        BIND-->>Net: Error: Zone not found
        Net-->>RC: Error response
        RC-->>BM: Err("Zone not found")
    end

Multi-Cluster Deployment

graph TB
    subgraph "Cluster: us-east-1"
        BC1[Bind9Cluster:<br/>production-dns]
        BI1[Bind9Instance:<br/>primary-dns]
        DZ1[DNSZone:<br/>example.com]
        P1[Primary BIND9<br/>172.16.1.10]

        BC1 -.-> BI1
        BI1 -.-> DZ1
        DZ1 --> P1
    end

    subgraph "Cluster: us-west-2"
        BC2[Bind9Cluster:<br/>production-dns]
        BI2[Bind9Instance:<br/>secondary-dns-west]
        DZ2[DNSZone:<br/>example.com]
        S1[Secondary BIND9<br/>172.16.2.10]

        BC2 -.-> BI2
        BI2 -.-> DZ2
        DZ2 --> S1
    end

    subgraph "Cluster: eu-central-1"
        BC3[Bind9Cluster:<br/>production-dns]
        BI3[Bind9Instance:<br/>secondary-dns-eu]
        DZ3[DNSZone:<br/>example.com]
        S2[Secondary BIND9<br/>172.16.3.10]

        BC3 -.-> BI3
        BI3 -.-> DZ3
        DZ3 --> S2
    end

    P1 -.AXFR/IXFR.-> S1
    P1 -.AXFR/IXFR.-> S2

    LB[Global Load Balancer<br/>GeoDNS]

    LB -.US Traffic.-> P1
    LB -.US Traffic.-> S1
    LB -.EU Traffic.-> S2

    style BC1 fill:#e1f5ff
    style BC2 fill:#e1f5ff
    style BC3 fill:#e1f5ff
    style BI1 fill:#d4e8d4
    style BI2 fill:#d4e8d4
    style BI3 fill:#d4e8d4
    style P1 fill:#ffd4d4
    style S1 fill:#fff4e1
    style S2 fill:#fff4e1
    style LB fill:#f0f0f0

Custom Resource Definitions

Bindy extends Kubernetes with these Custom Resource Definitions (CRDs).

Infrastructure CRDs

Bind9Cluster

Represents cluster-level configuration shared across multiple BIND9 instances.

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
    dnssec:
      enabled: true
  rndcSecretRefs:
    - name: transfer-key
      algorithm: hmac-sha256
      secret: "base64-encoded-secret"

Learn more: Bind9Cluster concept documentation

Bind9Instance

Represents a BIND9 DNS server instance that references a Bind9Cluster.

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: production-dns  # References Bind9Cluster
  replicas: 2

Learn more about Bind9Instance

DNS CRDs

DNSZone

Defines a DNS zone with SOA record and references a Bind9Instance.

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: primary-dns  # References Bind9Instance
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.  # Note: @ replaced with .
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

Learn more about DNSZone

DNS Record Types

Bindy supports all common DNS record types:

  • ARecord - IPv4 addresses
  • AAAARecord - IPv6 addresses
  • CNAMERecord - Canonical name aliases
  • MXRecord - Mail exchange
  • TXTRecord - Text records (SPF, DKIM, etc.)
  • NSRecord - Nameserver delegation
  • SRVRecord - Service discovery
  • CAARecord - Certificate authority authorization

Learn more about DNS Records

Resource Hierarchy

The three-tier resource model:

Bind9Cluster (cluster config)
    ↑
    │ referenced by clusterRef
    │
Bind9Instance (instance deployment)
    ↑
    │ referenced by clusterRef
    │
DNSZone (zone definition)
    ↑
    │ referenced by zone field
    │
DNS Records (A, CNAME, MX, etc.)

Common Fields

All Bindy CRDs share these common fields:

Metadata

metadata:
  name: resource-name
  namespace: dns-system
  labels:
    key: value
  annotations:
    key: value

Status Subresource

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Synchronized
      message: Resource is synchronized
      lastTransitionTime: "2024-01-01T00:00:00Z"
  observedGeneration: 1

API Group and Versions

All Bindy CRDs belong to the bindy.firestoned.io API group:

  • Current version: v1alpha1
  • API stability: Alpha (subject to breaking changes)

Next Steps

Bind9Cluster

The Bind9Cluster resource represents a logical DNS cluster - a collection of related BIND9 instances with shared configuration.

Overview

A Bind9Cluster defines cluster-level configuration that can be inherited by multiple Bind9Instance resources:

  • Shared BIND9 version and container image
  • Common configuration (recursion, ACLs, etc.)
  • Custom ConfigMap references for BIND9 configuration files
  • TSIG keys for authenticated zone transfers
  • Access Control Lists (ACLs)

Example

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
  rndcSecretRefs:
    - name: transfer-key
      algorithm: hmac-sha256
      secret: "base64-encoded-secret"
  acls:
    internal:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
    external:
      - "0.0.0.0/0"
status:
  conditions:
    - type: Ready
      status: "True"
      reason: ClusterConfigured
      message: "Cluster configured successfully"
  instanceCount: 4
  readyInstances: 4

Specification

Optional Fields

  • spec.version - BIND9 version for all instances in the cluster
  • spec.image - Container image configuration for all instances
    • image - Full container image reference (registry/repo:tag)
    • imagePullPolicy - Image pull policy (Always, IfNotPresent, Never)
    • imagePullSecrets - List of secret names for private registries
  • spec.configMapRefs - Custom ConfigMap references for BIND9 configuration
    • namedConf - Name of ConfigMap containing named.conf
    • namedConfOptions - Name of ConfigMap containing named.conf.options
  • spec.global - Shared BIND9 configuration
    • recursion - Enable/disable recursion globally
    • allowQuery - List of CIDR ranges allowed to query
    • allowTransfer - List of CIDR ranges allowed zone transfers
    • dnssec - DNSSEC configuration
    • forwarders - DNS forwarders
    • listenOn - IPv4 addresses to listen on
    • listenOnV6 - IPv6 addresses to listen on
  • spec.primary - Primary instance configuration
    • replicas - Number of primary instances to create (managed instances)
  • spec.secondary - Secondary instance configuration
    • replicas - Number of secondary instances to create (managed instances)
  • spec.tsigKeys - TSIG keys for authenticated zone transfers
    • name - Key name
    • algorithm - HMAC algorithm (hmac-sha256, hmac-sha512, etc.)
    • secret - Base64-encoded shared secret
  • spec.acls - Named ACL definitions that instances can reference

Cluster vs Instance

The relationship between Bind9Cluster and Bind9Instance:

# Cluster defines shared configuration
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: prod-cluster
spec:
  version: "9.18"
  global:
    recursion: false
  acls:
    internal:
      - "10.0.0.0/8"

---
# Instance references the cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  labels:
    cluster: prod-cluster
    dns-role: primary
spec:
  clusterRef: prod-cluster
  role: primary
  replicas: 2
  # Instance-specific config can override cluster defaults
  config:
    allowQuery:
      - acl:internal  # Reference the cluster's ACL

TSIG Keys

TSIG (Transaction SIGnature) keys provide authenticated zone transfers:

spec:
  rndcSecretRefs:
    - name: primary-secondary-key
      algorithm: hmac-sha256
      secret: "K8x...base64...=="
    - name: backup-key
      algorithm: hmac-sha512
      secret: "L9y...base64...=="

These keys are used by:

  • Primary instances for authenticated zone transfers to secondaries
  • Secondary instances to authenticate when requesting zone transfers
  • Dynamic DNS updates (if enabled)

Access Control Lists (ACLs)

ACLs define reusable network access policies:

spec:
  acls:
    # Internal networks
    internal:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
      - "192.168.0.0/16"

    # External clients
    external:
      - "0.0.0.0/0"

    # Secondary DNS servers
    secondaries:
      - "10.0.1.10"
      - "10.0.2.10"
      - "10.0.3.10"

Instances can then reference these ACLs:

# In Bind9Instance spec
config:
  allowQuery:
    - acl:external
  allowTransfer:
    - acl:secondaries

Status

The controller updates status to reflect cluster state:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ClusterConfigured
      message: "Cluster configured with 4 instances"
  instanceCount: 4      # Total instances in cluster
  readyInstances: 4     # Instances reporting ready
  observedGeneration: 1

Managed Instances

Bind9Cluster can automatically create and manage Bind9Instance resources based on the spec.primary.replicas and spec.secondary.replicas fields.

Automatic Scaling

The operator automatically scales instances up and down based on the replica counts in the cluster spec:

Scale-Up: When you increase replica counts, the operator creates missing instances Scale-Down: When you decrease replica counts, the operator deletes excess instances (highest-indexed first)

When you specify replica counts in the cluster spec, the operator automatically creates the corresponding instances:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  primary:
    replicas: 2  # Creates 2 primary instances
  secondary:
    replicas: 3  # Creates 3 secondary instances
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

This cluster definition will automatically create 5 Bind9Instance resources:

  • production-dns-primary-0
  • production-dns-primary-1
  • production-dns-secondary-0
  • production-dns-secondary-1
  • production-dns-secondary-2

Management Labels

All managed instances are labeled with:

  • bindy.firestoned.io/managed-by: "Bind9Cluster" - Identifies cluster-managed instances
  • bindy.firestoned.io/cluster: "<cluster-name>" - Links instance to parent cluster
  • bindy.firestoned.io/role: "primary"|"secondary" - Indicates instance role

And annotated with:

  • bindy.firestoned.io/instance-index: "<index>" - Sequential index for the instance

Example of a managed instance:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: production-dns-primary-0
  namespace: dns-system
  labels:
    bindy.firestoned.io/managed-by: "Bind9Cluster"
    bindy.firestoned.io/cluster: "production-dns"
    bindy.firestoned.io/role: "primary"
  annotations:
    bindy.firestoned.io/instance-index: "0"
spec:
  clusterRef: production-dns
  role: Primary
  replicas: 1
  version: "9.18"
  # Configuration inherited from cluster's spec.global

Configuration Inheritance

Managed instances automatically inherit configuration from the cluster:

  • BIND9 version (spec.version)
  • Container image (spec.image)
  • ConfigMap references (spec.configMapRefs)
  • Volumes and volume mounts
  • Global configuration (spec.global)

Self-Healing

The Bind9Cluster controller provides comprehensive self-healing for managed instances:

Instance-Level Self-Healing:

  • If a managed instance (Bind9Instance CRD) is deleted (manually or accidentally), the controller automatically recreates it during the next reconciliation cycle

Resource-Level Self-Healing:

  • If any child resource is deleted, the controller automatically triggers recreation:
    • ConfigMap - BIND9 configuration files
    • Secret - RNDC key for remote control
    • Service - DNS traffic routing (TCP/UDP port 53)
    • Deployment - BIND9 pods

This ensures complete desired state is maintained even if individual Kubernetes resources are manually deleted or corrupted.

Example self-healing scenario:

# Manually delete a ConfigMap
kubectl delete configmap production-dns-primary-0-config -n dns-system

# During next reconciliation (~10 seconds), the controller:
# 1. Detects missing ConfigMap
# 2. Triggers Bind9Instance reconciliation
# 3. Recreates ConfigMap with correct configuration
# 4. BIND9 pod automatically remounts updated ConfigMap

Example scaling scenario:

# Initial cluster with 2 primary instances
kubectl apply -f - <<EOF
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  primary:
    replicas: 2
EOF

# Controller creates: production-dns-primary-0, production-dns-primary-1

# Scale up to 4 primaries
kubectl patch bind9cluster production-dns -n dns-system --type=merge -p '{"spec":{"primary":{"replicas":4}}}'

# Controller creates: production-dns-primary-2, production-dns-primary-3

# Scale down to 3 primaries
kubectl patch bind9cluster production-dns -n dns-system --type=merge -p '{"spec":{"primary":{"replicas":3}}}'

# Controller deletes: production-dns-primary-3 (highest index first)

Manual vs Managed Instances

You can mix managed and manual instances:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: mixed-cluster
spec:
  version: "9.18"
  primary:
    replicas: 2  # Managed instances
  # No secondary replicas - create manually
---
# Manual instance with custom configuration
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: custom-secondary
spec:
  clusterRef: mixed-cluster
  role: Secondary
  replicas: 1
  # Custom configuration overrides
  config:
    allowQuery:
      - "192.168.1.0/24"

Lifecycle Management

Cascade Deletion

When a Bind9Cluster is deleted, the operator automatically deletes all instances that reference it via spec.clusterRef. This ensures clean removal of all cluster resources.

Finalizer: bindy.firestoned.io/bind9cluster-finalizer

The cluster resource uses a finalizer to ensure proper cleanup before deletion:

# Delete the cluster
kubectl delete bind9cluster production-dns

# The operator will:
# 1. Detect deletion timestamp
# 2. Find all instances with clusterRef: production-dns
# 3. Delete each instance
# 4. Remove finalizer
# 5. Allow cluster deletion to complete

Example deletion logs:

INFO Deleting Bind9Cluster production-dns
INFO Found 5 instances to delete
INFO Deleted instance production-dns-primary-0
INFO Deleted instance production-dns-primary-1
INFO Deleted instance production-dns-secondary-0
INFO Deleted instance production-dns-secondary-1
INFO Deleted instance production-dns-secondary-2
INFO Removed finalizer from cluster
INFO Cluster deletion complete

Important Warnings

⚠️ Deleting a Bind9Cluster will delete ALL instances that reference it, including:

  • Managed instances (created by spec.primary.replicas and spec.secondary.replicas)
  • Manual instances (created separately but referencing the cluster via spec.clusterRef)

To preserve instances during cluster deletion, remove the spec.clusterRef field from instances first:

# Remove clusterRef from an instance to preserve it
kubectl patch bind9instance my-instance --type=json -p='[{"op": "remove", "path": "/spec/clusterRef"}]'

# Now safe to delete the cluster without affecting this instance
kubectl delete bind9cluster production-dns

Troubleshooting Stuck Deletions

If a cluster is stuck in Terminating state:

# Check for finalizers
kubectl get bind9cluster production-dns -o jsonpath='{.metadata.finalizers}'

# Check operator logs
kubectl logs -n dns-system deployment/bindy -f

# If operator is not running, manually remove finalizer (last resort)
kubectl patch bind9cluster production-dns -p '{"metadata":{"finalizers":null}}' --type=merge

Use Cases

Multi-Region DNS Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: global-dns
spec:
  version: "9.18"
  global:
    recursion: false
    dnssec:
      enabled: true
      validation: true
  rndcSecretRefs:
    - name: region-sync-key
      algorithm: hmac-sha256
      secret: "..."
  acls:
    us-east:
      - "10.1.0.0/16"
    us-west:
      - "10.2.0.0/16"
    eu-west:
      - "10.3.0.0/16"

Development Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: true  # Allow recursion for dev
    allowQuery:
      - "0.0.0.0/0"
    forwarders:
      - "8.8.8.8"
      - "8.8.4.4"
  acls:
    dev-team:
      - "192.168.1.0/24"

Custom Image Cluster

Use a custom container image across all instances:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: custom-image-cluster
  namespace: dns-system
spec:
  version: "9.18"
  # Custom image with organization-specific patches
  image:
    image: "my-registry.example.com/bind9:9.18-custom"
    imagePullPolicy: "IfNotPresent"
    imagePullSecrets:
      - docker-registry-secret
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

All Bind9Instances referencing this cluster will inherit the custom image configuration unless they override it.

Custom ConfigMap Cluster

Share custom BIND9 configuration files across all instances:

apiVersion: v1
kind: ConfigMap
metadata:
  name: shared-bind9-options
  namespace: dns-system
data:
  named.conf.options: |
    options {
      directory "/var/cache/bind";
      recursion no;
      allow-query { any; };
      allow-transfer { 10.0.2.0/24; };
      dnssec-validation auto;

      # Custom logging
      querylog yes;

      # Rate limiting
      rate-limit {
        responses-per-second 10;
        window 5;
      };
    };
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: custom-config-cluster
  namespace: dns-system
spec:
  version: "9.18"
  configMapRefs:
    namedConfOptions: "shared-bind9-options"

All instances in this cluster will use the custom configuration, while named.conf is auto-generated.

Best Practices

  1. One cluster per environment - Separate clusters for production, staging, development
  2. Consistent TSIG keys - Use the same keys across all instances in a cluster
  3. Version pinning - Specify exact BIND9 versions to avoid unexpected updates
  4. ACL organization - Define ACLs at cluster level for consistency
  5. DNSSEC - Enable DNSSEC at the cluster level for all zones
  6. Image management - Define container images at cluster level for consistency; override at instance level only for canary testing
  7. ConfigMap strategy - Use cluster-level ConfigMaps for shared configuration; use instance-level ConfigMaps for instance-specific customizations
  8. Image pull secrets - Configure imagePullSecrets at cluster level to avoid duplicating secrets across instances

Next Steps

Bind9GlobalCluster

The Bind9GlobalCluster CRD defines a cluster-scoped logical grouping of BIND9 DNS server instances for platform-managed infrastructure.

Overview

Bind9GlobalCluster is a cluster-scoped resource (no namespace) designed for platform teams to provide shared DNS infrastructure accessible from all namespaces in the cluster.

Key Characteristics

  • Cluster-Scoped: No namespace - visible cluster-wide
  • Platform-Managed: Typically managed by platform/infrastructure teams
  • Shared Infrastructure: DNSZones in any namespace can reference it
  • High Availability: Designed for production workloads
  • RBAC: Requires ClusterRole + ClusterRoleBinding

Relationship with Bind9Cluster

Bindy provides two cluster types:

FeatureBind9ClusterBind9GlobalCluster
ScopeNamespace-scopedCluster-scoped
Managed ByDevelopment teamsPlatform teams
VisibilitySingle namespaceAll namespaces
RBACRole + RoleBindingClusterRole + ClusterRoleBinding
Zone ReferenceclusterRefglobalClusterRef
Use CaseDev/test, team isolationProduction, shared infrastructure

Shared Configuration: Both cluster types use the same Bind9ClusterCommonSpec for configuration, ensuring consistency.

Spec Structure

The Bind9GlobalClusterSpec uses the same configuration fields as Bind9Cluster through a shared spec:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
  # No namespace - cluster-scoped
spec:
  # BIND9 version
  version: "9.18"

  # Primary instance configuration
  primary:
    replicas: 3
    service:
      type: LoadBalancer
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

  # Secondary instance configuration
  secondary:
    replicas: 2

  # Global BIND9 configuration
  global:
    options:
      - "recursion no"
      - "allow-transfer { none; }"
      - "notify yes"

  # Access control lists
  acls:
    trusted:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
    secondaries:
      - "10.10.1.0/24"

  # Volumes for persistent storage
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: dns-zone-storage

  volumeMounts:
    - name: zone-data
      mountPath: /var/cache/bind

For detailed field descriptions, see the Bind9Cluster Spec Reference - all fields are identical.

Status

The status subresource tracks the overall health of the global cluster:

status:
  # Cluster-level conditions
  conditions:
    - type: Ready
      status: "True"
      reason: AllReady
      message: "All 5 instances are ready"
      lastTransitionTime: "2025-01-10T12:00:00Z"

  # Instance tracking (namespace/name format for global clusters)
  instances:
    - "production/primary-dns-0"
    - "production/primary-dns-1"
    - "production/primary-dns-2"
    - "staging/secondary-dns-0"
    - "staging/secondary-dns-1"

  # Generation tracking
  observedGeneration: 3

  # Instance counts
  instanceCount: 5
  readyInstances: 5

Key Difference from Bind9Cluster: Instance names include namespace prefix (namespace/name) since instances can be in any namespace.

Usage Patterns

Pattern 1: Platform-Managed Production DNS

Scenario: Platform team provides shared DNS for all production workloads.

# Platform team creates global cluster (ClusterRole required)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: shared-production-dns
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 3
  global:
    options:
      - "recursion no"
      - "allow-transfer { none; }"
---
# Application team references global cluster (Role in their namespace)
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone
  namespace: api-service  # Application namespace
spec:
  zoneName: api.example.com
  globalClusterRef: shared-production-dns  # References cluster-scoped cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: dns-admin.example.com.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
---
# Different application, same global cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: web-zone
  namespace: web-frontend  # Different namespace
spec:
  zoneName: www.example.com
  globalClusterRef: shared-production-dns  # Same global cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: dns-admin.example.com.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

Pattern 2: Multi-Region Global Clusters

Scenario: Geo-distributed DNS with regional global clusters.

# US East region
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-us-east
  labels:
    region: us-east-1
    tier: production
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
  secondary:
    replicas: 2
  acls:
    region-networks:
      - "10.0.0.0/8"
---
# EU West region
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-eu-west
  labels:
    region: eu-west-1
    tier: production
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 2
  acls:
    region-networks:
      - "10.128.0.0/9"
---
# Application chooses regional cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone-us
  namespace: api-service
spec:
  zoneName: api.us.example.com
  globalClusterRef: dns-us-east  # US region
  soaRecord: { /* ... */ }
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone-eu
  namespace: api-service
spec:
  zoneName: api.eu.example.com
  globalClusterRef: dns-eu-west  # EU region
  soaRecord: { /* ... */ }

Pattern 3: Tiered DNS Service

Scenario: Platform offers different DNS service tiers.

# Premium tier - high availability
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-premium
  labels:
    tier: premium
    sla: "99.99"
spec:
  version: "9.18"
  primary:
    replicas: 5
    service:
      type: LoadBalancer
  secondary:
    replicas: 5
  global:
    options:
      - "minimal-responses yes"
      - "recursive-clients 10000"
---
# Standard tier - balanced cost/availability
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-standard
  labels:
    tier: standard
    sla: "99.9"
spec:
  version: "9.18"
  primary:
    replicas: 3
  secondary:
    replicas: 2
---
# Economy tier - minimal resources
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-economy
  labels:
    tier: economy
    sla: "99.0"
spec:
  version: "9.18"
  primary:
    replicas: 2
  secondary:
    replicas: 1

RBAC Requirements

Platform Team (ClusterRole)

Platform teams need ClusterRole to manage global clusters:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: platform-dns-admin
rules:
# Manage global clusters
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9globalclusters"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View global cluster status
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9globalclusters/status"]
  verbs: ["get", "list", "watch"]

# Manage instances across namespaces (for global clusters)
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9instances"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: platform-team-dns
subjects:
- kind: Group
  name: platform-team
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: platform-dns-admin
  apiGroup: rbac.authorization.k8s.io

Application Teams (Role)

Application teams only need namespace-scoped permissions:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dns-zone-admin
  namespace: api-service
rules:
# Manage DNS zones and records in this namespace
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones"
    - "arecords"
    - "mxrecords"
    - "txtrecords"
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View resource status
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones/status"
    - "arecords/status"
  verbs: ["get", "list", "watch"]

# Note: No permissions for Bind9GlobalCluster needed
# Application teams only manage DNSZones, not the cluster itself
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: api-team-dns
  namespace: api-service
subjects:
- kind: Group
  name: api-team
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dns-zone-admin
  apiGroup: rbac.authorization.k8s.io

Instance Management

Creating Instances for Global Clusters

Instances can be created in any namespace and reference the global cluster:

# Instance in production namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns-0
  namespace: production
spec:
  cluster_ref: shared-production-dns  # References global cluster
  role: primary
  replicas: 1
---
# Instance in staging namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns-0
  namespace: staging
spec:
  cluster_ref: shared-production-dns  # Same global cluster
  role: secondary
  replicas: 1

Status Tracking: The global cluster status includes instances from all namespaces:

status:
  instances:
    - "production/primary-dns-0"  # namespace/name format
    - "staging/secondary-dns-0"
  instanceCount: 2
  readyInstances: 2

Configuration Inheritance

How Configuration Flows to Deployments

When you update a Bind9GlobalCluster, the configuration automatically propagates down to all managed Deployment resources. This ensures consistency across your entire DNS infrastructure.

Configuration Precedence

Configuration is resolved with the following precedence (highest to lowest):

  1. Bind9Instance - Instance-specific overrides
  2. Bind9Cluster - Namespace-scoped cluster defaults
  3. Bind9GlobalCluster - Cluster-scoped global defaults
  4. System defaults - Built-in fallback values

Example:

# Bind9GlobalCluster defines global defaults
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
spec:
  version: "9.18"  # Global default version
  image:
    image: "internetsystemsconsortium/bind9:9.18"  # Global default image
  global:
    bindcarConfig:
      image: "ghcr.io/company/bindcar:v1.2.0"  # Global bindcar image

---
# Bind9Instance can override specific fields
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-0
  namespace: production
spec:
  clusterRef: production-dns
  role: Primary
  # version: "9.20"  # Would override global version if specified
  # Uses global version "9.18" and global bindcar image

Propagation Flow

When you update Bind9GlobalCluster.spec.common.global.bindcarConfig.image, the change propagates automatically:

sequenceDiagram
    participant User
    participant GC as Bind9GlobalCluster<br/>Reconciler
    participant BC as Bind9Cluster<br/>Reconciler
    participant BI as Bind9Instance<br/>Reconciler
    participant Deploy as Deployment

    User->>GC: Update bindcarConfig.image
    Note over GC: metadata.generation increments
    GC->>GC: Detect spec change
    GC->>BC: PATCH Bind9Cluster with new spec
    Note over BC: metadata.generation increments
    BC->>BC: Detect spec change
    BC->>BI: PATCH Bind9Instance with new spec
    Note over BI: metadata.generation increments
    BI->>BI: Detect spec change
    BI->>BI: Fetch Bind9GlobalCluster config
    BI->>BI: resolve_deployment_config():<br/>instance > cluster > global_cluster
    BI->>Deploy: UPDATE Deployment with new image
    Deploy->>Deploy: Rolling update pods

Inherited Configuration Fields

The following fields are inherited from Bind9GlobalCluster to Deployment:

FieldExampleDescription
imagespec.common.imageContainer image configuration
versionspec.common.versionBIND9 version tag
volumesspec.common.volumesPod volumes (PVCs, ConfigMaps, etc.)
volumeMountsspec.common.volumeMountsContainer volume mounts
bindcarConfigspec.common.global.bindcarConfigAPI sidecar configuration
configMapRefsspec.common.configMapRefsCustom ConfigMap references

Complete Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
spec:
  version: "9.18"

  # Image configuration - inherited by all instances
  image:
    image: "ghcr.io/mycompany/bind9:9.18-custom"
    imagePullPolicy: Always
    imagePullSecrets:
      - name: ghcr-credentials

  # API sidecar configuration - inherited by all instances
  global:
    bindcarConfig:
      image: "ghcr.io/mycompany/bindcar:v1.2.0"
      port: 8080

  # Volumes - inherited by all instances
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: dns-zones-pvc
    - name: custom-config
      configMap:
        name: bind9-custom-config

  volumeMounts:
    - name: zone-data
      mountPath: /var/cache/bind
    - name: custom-config
      mountPath: /etc/bind/custom

All instances referencing this global cluster will inherit these configurations in their Deployment resources.

Verifying Configuration Propagation

To verify configuration is inherited correctly:

# 1. Check Bind9GlobalCluster spec
kubectl get bind9globalcluster production-dns -o yaml | grep -A 5 bindcarConfig

# 2. Check Bind9Instance spec (should be empty if using global config)
kubectl get bind9instance primary-0 -n production -o yaml | grep -A 5 bindcarConfig

# 3. Check Deployment - should show global cluster's bindcar image
kubectl get deployment primary-0 -n production -o yaml | grep "image:" | grep bindcar

Expected Output:

# Deployment should use global cluster's bindcar image
containers:
  - name: bindcar
    image: ghcr.io/mycompany/bindcar:v1.2.0  # From Bind9GlobalCluster

Reconciliation

Controller Behavior

The Bind9GlobalCluster reconciler:

  1. Lists instances across ALL namespaces

    #![allow(unused)]
    fn main() {
    let instances_api: Api<Bind9Instance> = Api::all(client.clone());
    let all_instances = instances_api.list(&lp).await?;
    }
  2. Filters instances by cluster_ref matching the global cluster name

    #![allow(unused)]
    fn main() {
    let instances: Vec<_> = all_instances
        .items
        .into_iter()
        .filter(|inst| inst.spec.cluster_ref == global_cluster_name)
        .collect();
    }
  3. Calculates cluster status

    • Counts total and ready instances
    • Aggregates health conditions
    • Formats instance names as namespace/name
  4. Updates status

    • Sets observedGeneration
    • Updates Ready condition
    • Lists all instances with namespace prefix

Generation Tracking

The reconciler uses standard Kubernetes generation tracking:

metadata:
  generation: 5  # Incremented on spec changes

status:
  observedGeneration: 5  # Updated after reconciliation

Reconciliation occurs only when metadata.generation != status.observedGeneration (spec changed).

Comparison with Bind9Cluster

Similarities

  • ✓ Identical configuration fields (Bind9ClusterCommonSpec)
  • ✓ Same reconciliation logic for health tracking
  • ✓ Status subresource with conditions
  • ✓ Generation-based reconciliation
  • ✓ Finalizer-based cleanup

Differences

AspectBind9ClusterBind9GlobalCluster
ScopeNamespace-scopedCluster-scoped (no namespace)
API UsedApi::namespaced()Api::all()
Instance ListingSame namespace onlyAll namespaces
Instance Namesnamenamespace/name
RBACRole + RoleBindingClusterRole + ClusterRoleBinding
Zone Reference Fieldspec.clusterRefspec.globalClusterRef
Kubectl Getkubectl get bind9cluster -n <namespace>kubectl get bind9globalcluster

Best Practices

1. Use for Production Workloads

Global clusters are ideal for production:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
  labels:
    environment: production
    managed-by: platform-team
spec:
  version: "9.18"
  primary:
    replicas: 3  # High availability
    service:
      type: LoadBalancer
  secondary:
    replicas: 3

2. Separate Global Clusters by Environment

# Production cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-production
  labels:
    environment: production
spec: { /* production config */ }
---
# Staging cluster (also global, but separate)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-staging
  labels:
    environment: staging
spec: { /* staging config */ }

3. Label for Organization

Use labels to categorize global clusters:

metadata:
  name: dns-us-east-prod
  labels:
    region: us-east-1
    environment: production
    tier: premium
    team: platform
    cost-center: infrastructure

4. Monitor Status Across Namespaces

# View global cluster status
kubectl get bind9globalcluster dns-production

# See instances across all namespaces
kubectl get bind9globalcluster dns-production -o jsonpath='{.status.instances}'

# Check instance distribution
kubectl get bind9instance -A -l cluster=dns-production

5. Use with DNSZone Namespace Isolation

Remember: DNSZones are always namespace-scoped, even when referencing global clusters:

# DNSZone in namespace-a
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: zone-a
  namespace: namespace-a
spec:
  zoneName: app-a.example.com
  globalClusterRef: shared-dns
  # Records in namespace-a can ONLY reference this zone
---
# DNSZone in namespace-b
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: zone-b
  namespace: namespace-b
spec:
  zoneName: app-b.example.com
  globalClusterRef: shared-dns
  # Records in namespace-b can ONLY reference this zone

Troubleshooting

Viewing Global Clusters

# List all global clusters
kubectl get bind9globalclusters

# Describe a specific global cluster
kubectl describe bind9globalcluster production-dns

# View status
kubectl get bind9globalcluster production-dns -o yaml

Common Issues

Issue: Application team cannot create global cluster

Solution: Check RBAC - requires ClusterRole, not Role

kubectl auth can-i create bind9globalclusters --as=user@example.com

Issue: Instances not showing in status

Solution: Verify instance cluster_ref matches global cluster name

kubectl get bind9instance -A -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}: {.spec.cluster_ref}{"\n"}{end}'

Issue: DNSZone cannot find global cluster

Solution: Check globalClusterRef field (not clusterRef)

spec:
  globalClusterRef: production-dns  # ✓ Correct
  # clusterRef: production-dns      # ✗ Wrong - for namespace-scoped

Next Steps

Bind9Instance

The Bind9Instance resource represents a BIND9 DNS server deployment in Kubernetes.

Overview

A Bind9Instance defines:

  • Number of replicas
  • BIND9 version and container image
  • Configuration options (or custom ConfigMap references)
  • Network settings
  • Labels for targeting
  • Optional cluster reference for inheriting shared configuration

Example

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
  labels:
    dns-role: primary
    environment: production
    datacenter: us-east
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
status:
  conditions:
    - type: Ready
      status: "True"
      reason: Running
      message: "2 replicas running"
  readyReplicas: 2
  currentVersion: "9.18"

Specification

Optional Fields

All fields are optional. If no clusterRef is specified, default values are used.

  • spec.clusterRef - Reference to a Bind9Cluster for inheriting shared configuration
  • spec.replicas - Number of BIND9 pods (default: 1)
  • spec.version - BIND9 version to deploy (default: “9.18”, or inherit from cluster)
  • spec.image - Container image configuration (inherits from cluster if not specified)
    • image - Full container image reference
    • imagePullPolicy - Image pull policy (Always, IfNotPresent, Never)
    • imagePullSecrets - List of secret names for private registries
  • spec.configMapRefs - Custom ConfigMap references (inherits from cluster if not specified)
    • namedConf - ConfigMap name containing named.conf
    • namedConfOptions - ConfigMap name containing named.conf.options
  • spec.config - BIND9 configuration options (inherits from cluster if not specified)
    • recursion - Enable/disable recursion (default: false)
    • allowQuery - List of CIDR ranges allowed to query
    • allowTransfer - List of CIDR ranges allowed to transfer zones
    • dnssec - DNSSEC configuration
    • forwarders - DNS forwarders
    • listenOn - IPv4 addresses to listen on
    • listenOnV6 - IPv6 addresses to listen on

Configuration Inheritance

When a Bind9Instance references a Bind9Cluster via clusterRef:

  1. Instance-level settings take precedence
  2. If not specified at instance level, cluster settings are used
  3. If not specified at cluster level, defaults are used

Labels and Selectors

Labels on Bind9Instance resources are used by DNSZone resources to target specific instances:

# Instance with labels
metadata:
  labels:
    dns-role: primary
    region: us-east
    environment: production

# Zone selecting this instance
spec:
  instanceSelector:
    matchLabels:
      dns-role: primary
      region: us-east

Status

The controller updates status to reflect the instance state:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Running
  readyReplicas: 2
  currentVersion: "9.18"

Use Cases

Primary DNS Instance

metadata:
  labels:
    dns-role: primary
spec:
  replicas: 2
  config:
    allowTransfer:
      - "10.0.0.0/8"  # Allow secondaries to transfer

Secondary DNS Instance

metadata:
  labels:
    dns-role: secondary
spec:
  replicas: 2
  config:
    allowTransfer: []  # No transfers from secondary

Instance with Custom Image

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: custom-image-dns
  namespace: dns-system
spec:
  replicas: 2
  image:
    image: "my-registry.example.com/bind9:9.18-patched"
    imagePullPolicy: "Always"
    imagePullSecrets:
      - my-registry-secret
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

Instance with Custom ConfigMaps

apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-dns-config
  namespace: dns-system
data:
  named.conf.options: |
    options {
      directory "/var/cache/bind";
      recursion no;
      allow-query { any; };

      # Custom rate limiting
      rate-limit {
        responses-per-second 10;
      };
    };
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: custom-config-dns
  namespace: dns-system
spec:
  replicas: 2
  configMapRefs:
    namedConfOptions: "custom-dns-config"

Instance Inheriting from Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: prod-cluster
  namespace: dns-system
spec:
  version: "9.18"
  image:
    image: "internetsystemsconsortium/bind9:9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: prod-instance-1
  namespace: dns-system
spec:
  clusterRef: prod-cluster
  replicas: 2
  # Inherits version, image, and config from cluster

Canary Instance with Override

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: canary-instance
  namespace: dns-system
spec:
  clusterRef: prod-cluster  # Inherits most settings from cluster
  replicas: 1
  # Override image for canary testing
  image:
    image: "internetsystemsconsortium/bind9:9.19-beta"
    imagePullPolicy: "Always"

Next Steps

DNSZone

The DNSZone resource defines a DNS zone with its SOA record and references a specific BIND9 cluster.

Overview

A DNSZone represents:

  • Zone name (e.g., example.com)
  • SOA (Start of Authority) record
  • Cluster reference to a Bind9Instance
  • Default TTL for records

The zone is created on the referenced BIND9 cluster using the RNDC protocol.

Example

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: my-dns-cluster  # References Bind9Instance name
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.  # Note: @ replaced with .
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600
status:
  conditions:
    - type: Ready
      status: "True"
      reason: Synchronized
      message: "Zone created for cluster: my-dns-cluster"
  observedGeneration: 1

Specification

Required Fields

  • spec.zoneName - The DNS zone name (e.g., example.com)
  • spec.clusterRef - Name of the Bind9Instance to host this zone
  • spec.soaRecord - Start of Authority record configuration

SOA Record Fields

  • primaryNs - Primary nameserver (must end with .)
  • adminEmail - Zone administrator email (@ replaced with ., must end with .)
  • serial - Zone serial number (typically YYYYMMDDNN format)
  • refresh - Refresh interval in seconds (how often secondaries check for updates)
  • retry - Retry interval in seconds (retry delay after failed refresh)
  • expire - Expiry time in seconds (when to stop serving if primary unreachable)
  • negativeTtl - Negative caching TTL (cache duration for NXDOMAIN responses)

Optional Fields

  • spec.ttl - Default TTL for records in seconds (default: 3600)

How Zones Are Created

When you create a DNSZone resource:

  1. Controller discovers pods - Finds BIND9 pods with label instance={clusterRef}
  2. Loads RNDC key - Retrieves Secret named {clusterRef}-rndc-key
  3. Connects via RNDC - Establishes connection to {clusterRef}.{namespace}.svc.cluster.local:953
  4. Executes addzone - Runs rndc addzone command with zone configuration
  5. BIND9 creates zone - BIND9 creates the zone file and starts serving the zone
  6. Updates status - Controller updates DNSZone status to Ready

Cluster References

Zones reference a specific BIND9 cluster by name:

spec:
  clusterRef: my-dns-cluster

This references a Bind9Instance resource:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: my-dns-cluster  # Referenced by DNSZone
  namespace: dns-system
spec:
  role: primary
  replicas: 2

RNDC Key Discovery

The controller automatically finds the RNDC key using the cluster reference:

DNSZone.spec.clusterRef = "my-dns-cluster"
    ↓
Secret name = "my-dns-cluster-rndc-key"
    ↓
RNDC authentication to: my-dns-cluster.dns-system.svc.cluster.local:953

Status

The controller reports zone status with granular condition types that provide real-time visibility into the reconciliation process.

Status During Reconciliation

# Phase 1: Configuring primary instances
status:
  conditions:
    - type: Progressing
      status: "True"
      reason: PrimaryReconciling
      message: "Configuring zone on primary instances"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1

# Phase 2: Primary success, configuring secondaries
status:
  conditions:
    - type: Progressing
      status: "True"
      reason: SecondaryReconciling
      message: "Configured on 2 primary server(s), now configuring secondaries"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"

Status After Successful Reconciliation

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Configured on 2 primary server(s) and 3 secondary server(s)"
      lastTransitionTime: "2024-11-26T10:00:02Z"
  observedGeneration: 1
  recordCount: 5
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"
    - "10.42.0.7"

Status After Partial Failure (Degraded)

status:
  conditions:
    - type: Degraded
      status: "True"
      reason: SecondaryFailed
      message: "Configured on 2 primary server(s), but secondary configuration failed: connection timeout"
      lastTransitionTime: "2024-11-26T10:00:02Z"
  observedGeneration: 1
  recordCount: 5
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"

Condition Types

DNSZone uses the following condition types:

  • Progressing - Zone is being configured

    • PrimaryReconciling: Configuring on primary instances
    • PrimaryReconciled: Primary configuration successful
    • SecondaryReconciling: Configuring on secondary instances
    • SecondaryReconciled: Secondary configuration successful
  • Ready - Zone fully configured and operational

    • ReconcileSucceeded: All primaries and secondaries configured successfully
  • Degraded - Partial or complete failure

    • PrimaryFailed: Primary configuration failed (zone not functional)
    • SecondaryFailed: Secondary configuration failed (primaries work, but secondaries unavailable)

Benefits of Granular Status

  1. Real-time visibility - See which reconciliation phase is running
  2. Better debugging - Know exactly which phase failed (primary vs secondary)
  3. Graceful degradation - Secondary failures don’t break the zone (primaries still work)
  4. Accurate counts - Status shows exact number of configured servers

Use Cases

Simple Zone

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: simple-com
spec:
  zoneName: simple.com
  clusterRef: primary-dns
  soaRecord:
    primaryNs: ns1.simple.com.
    adminEmail: admin.simple.com.
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

Production Zone with Custom TTL

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-example-com
spec:
  zoneName: api.example.com
  clusterRef: production-dns
  ttl: 300  # 5 minute default TTL for faster updates
  soaRecord:
    primaryNs: ns1.api.example.com.
    adminEmail: ops.example.com.
    serial: 2024010101
    refresh: 1800   # Check every 30 minutes
    retry: 300      # Retry after 5 minutes
    expire: 604800
    negativeTtl: 300  # Short negative cache

Next Steps

DNS Records

Bindy supports all common DNS record types as Custom Resources.

Supported Record Types

  • ARecord - IPv4 address mapping
  • AAAARecord - IPv6 address mapping
  • CNAMERecord - Canonical name (alias)
  • MXRecord - Mail exchange
  • TXTRecord - Text data
  • NSRecord - Nameserver delegation
  • SRVRecord - Service location
  • CAARecord - Certificate authority authorization

Common Fields

All DNS record types share these fields:

metadata:
  name: record-name
  namespace: dns-system
spec:
  # Zone reference (use ONE of these):
  zone: example.com          # Match against DNSZone spec.zoneName
  # OR
  zoneRef: example-com       # Direct reference to DNSZone metadata.name

  name: record-name          # DNS name (@ for zone apex)
  ttl: 300                   # Time to live (optional)

Zone Referencing

DNS records can reference their parent zone using two different methods:

  1. zone field - Searches for a DNSZone by matching spec.zoneName

    • Value: The actual DNS zone name (e.g., example.com)
    • The controller searches all DNSZones in the namespace for matching spec.zoneName
    • More intuitive but requires a list operation
  2. zoneRef field - Direct reference to a DNSZone resource

    • Value: The Kubernetes resource name (e.g., example-com)
    • The controller directly retrieves the DNSZone by metadata.name
    • More efficient (no search required)
    • Recommended for production use

Important: You must specify exactly one of zone or zoneRef (not both).

Example: Zone vs ZoneRef

Given this DNSZone:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com        # Kubernetes resource name
  namespace: dns-system
spec:
  zoneName: example.com    # Actual DNS zone name
  clusterRef: primary-dns
  # ... soa_record, etc.

You can reference it using either method:

Method 1: Using zone (matches spec.zoneName)

spec:
  zone: example.com  # Matches DNSZone spec.zoneName
  name: www

Method 2: Using zoneRef (matches metadata.name)

spec:
  zoneRef: example-com  # Matches DNSZone metadata.name
  name: www

ARecord (IPv4)

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
spec:
  zone: example-com
  name: www
  ipv4Address: "192.0.2.1"
  ttl: 300

Learn more about A Records

AAAARecord (IPv6)

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-example-ipv6
spec:
  zone: example-com
  name: www
  ipv6Address: "2001:db8::1"
  ttl: 300

Learn more about AAAA Records

CNAMERecord (Alias)

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: blog-example
spec:
  zone: example-com
  name: blog
  target: www.example.com.
  ttl: 300

Learn more about CNAME Records

MXRecord (Mail Exchange)

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-example
spec:
  zone: example-com
  name: "@"
  priority: 10
  mailServer: mail.example.com.
  ttl: 3600

Learn more about MX Records

TXTRecord (Text)

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf-example
spec:
  zone: example-com
  name: "@"
  text:
    - "v=spf1 include:_spf.example.com ~all"
  ttl: 3600

Learn more about TXT Records

NSRecord (Nameserver)

apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: delegate-subdomain
spec:
  zone: example-com
  name: subdomain
  nameserver: ns1.subdomain.example.com.
  ttl: 3600

Learn more about NS Records

SRVRecord (Service)

apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-service
spec:
  zone: example-com
  name: _sip._tcp
  priority: 10
  weight: 60
  port: 5060
  target: sipserver.example.com.
  ttl: 3600

Learn more about SRV Records

CAARecord (Certificate Authority)

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: letsencrypt-caa
spec:
  zone: example-com
  name: "@"
  flags: 0
  tag: issue
  value: letsencrypt.org
  ttl: 3600

Learn more about CAA Records

Record Status

All DNS record types use granular status conditions to provide real-time visibility into the record configuration process.

Status During Configuration

status:
  conditions:
    - type: Progressing
      status: "True"
      reason: RecordReconciling
      message: "Configuring A record on zone endpoints"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1

Status After Successful Configuration

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

Status After Failure

status:
  conditions:
    - type: Degraded
      status: "True"
      reason: RecordFailed
      message: "Failed to configure record: Zone not found on primary servers"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

Condition Types

All DNS record types use the following condition types:

  • Progressing - Record is being configured

    • RecordReconciling: Before adding record to zone endpoints
  • Ready - Record successfully configured

    • ReconcileSucceeded: Record configured on all endpoints (message includes endpoint count)
  • Degraded - Configuration failure

    • RecordFailed: Failed to configure record (includes error details)

Benefits

  1. Real-time progress - See when records are being configured
  2. Better debugging - Know immediately if/why a record failed
  3. Accurate reporting - Status shows exact number of endpoints configured
  4. Consistent across types - All 8 record types use the same status pattern

Record Management

Referencing Zones

All records reference a DNSZone via the zone field:

spec:
  zone: example-com  # Must match DNSZone metadata.name

Zone Apex Records

Use @ for zone apex records:

spec:
  name: "@"  # Represents the zone itself

Subdomain Records

Use the subdomain name:

spec:
  name: www        # www.example.com
  name: api.v2     # api.v2.example.com

Next Steps

Architecture Overview

This guide explains the Bindy architecture, focusing on the dual-cluster model that enables multi-tenancy and flexible deployment patterns.

Table of Contents

Architecture Principles

Bindy follows Kubernetes controller pattern best practices:

  1. Declarative Configuration: Users declare desired state via CRDs, controllers reconcile to match
  2. Level-Based Reconciliation: Controllers continuously ensure actual state matches desired state
  3. Status Subresources: All CRDs expose status for observability
  4. Finalizers: Proper cleanup of dependent resources before deletion
  5. Generation Tracking: Reconcile only when spec changes (using metadata.generation)

Cluster Models

Bindy provides two cluster models to support different organizational patterns:

Namespace-Scoped Clusters (Bind9Cluster)

Use Case: Development teams manage their own DNS infrastructure within their namespace.

graph TB
    subgraph "Namespace: dev-team-alpha"
        Cluster[Bind9Cluster<br/>dev-team-dns]
        Zone1[DNSZone<br/>app.example.com]
        Zone2[DNSZone<br/>test.local]
        Record1[ARecord<br/>www]
        Record2[MXRecord<br/>mail]

        Cluster --> Zone1
        Cluster --> Zone2
        Zone1 --> Record1
        Zone2 --> Record2
    end

    style Cluster fill:#e1f5ff
    style Zone1 fill:#fff4e1
    style Zone2 fill:#fff4e1
    style Record1 fill:#f0f0f0
    style Record2 fill:#f0f0f0

Characteristics:

  • Isolated to a single namespace
  • Teams manage their own DNS independently
  • RBAC scoped to namespace (Role/RoleBinding)
  • Cannot be referenced from other namespaces

YAML Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-team-dns
  namespace: dev-team-alpha
spec:
  version: "9.18"
  primary:
    replicas: 1
  secondary:
    replicas: 1

Cluster-Scoped Clusters (Bind9GlobalCluster)

Use Case: Platform teams provide shared DNS infrastructure accessible from all namespaces.

graph TB
    subgraph "Cluster-Scoped (no namespace)"
        GlobalCluster[Bind9GlobalCluster<br/>shared-production-dns]
    end

    subgraph "Namespace: production"
        Zone1[DNSZone<br/>api.example.com]
        Record1[ARecord<br/>api]
    end

    subgraph "Namespace: staging"
        Zone2[DNSZone<br/>staging.example.com]
        Record2[ARecord<br/>app]
    end

    GlobalCluster -.-> Zone1
    GlobalCluster -.-> Zone2
    Zone1 --> Record1
    Zone2 --> Record2

    style GlobalCluster fill:#c8e6c9
    style Zone1 fill:#fff4e1
    style Zone2 fill:#fff4e1
    style Record1 fill:#f0f0f0
    style Record2 fill:#f0f0f0

Characteristics:

  • Cluster-wide visibility (no namespace)
  • Platform team manages centralized DNS
  • RBAC requires ClusterRole/ClusterRoleBinding
  • DNSZones in any namespace can reference it

YAML Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: shared-production-dns
  # No namespace - cluster-scoped resource
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 2

Resource Hierarchy

The complete resource hierarchy shows how components relate:

graph TD
    subgraph "Cluster-Scoped Resources"
        GlobalCluster[Bind9GlobalCluster]
    end

    subgraph "Namespace-Scoped Resources"
        Cluster[Bind9Cluster]
        Zone[DNSZone]
        Instance[Bind9Instance]
        Records[DNS Records<br/>A, AAAA, CNAME, MX, etc.]
    end

    GlobalCluster -.globalClusterRef.-> Zone
    Cluster --clusterRef--> Zone

    Cluster --cluster_ref--> Instance
    GlobalCluster -.cluster_ref.-> Instance

    Zone --> Records

    style GlobalCluster fill:#c8e6c9
    style Cluster fill:#e1f5ff
    style Zone fill:#fff4e1
    style Instance fill:#ffe1e1
    style Records fill:#f0f0f0

Key Relationships

  1. DNSZone → Cluster References:

    • spec.clusterRef: References namespace-scoped Bind9Cluster (same namespace)
    • spec.globalClusterRef: References cluster-scoped Bind9GlobalCluster
    • Mutual Exclusivity: Exactly one must be specified
  2. Bind9Instance → Cluster Reference:

    • spec.cluster_ref: Can reference either Bind9Cluster or Bind9GlobalCluster
    • Controller auto-detects cluster type
  3. DNS Records → Zone Reference:

    • spec.zone: Zone name lookup (searches in same namespace)
    • spec.zoneRef: Direct DNSZone resource name (same namespace)
    • Namespace Isolation: Records can ONLY reference zones in their own namespace

Reconciliation Flow

DNSZone Reconciliation

sequenceDiagram
    participant K8s as Kubernetes API
    participant Controller as DNSZone Controller
    participant Cluster as Bind9Cluster/GlobalCluster
    participant Instances as Bind9Instances
    participant BIND9 as BIND9 Pods

    K8s->>Controller: DNSZone created/updated
    Controller->>Controller: Check metadata.generation vs status.observedGeneration
    alt Spec unchanged
        Controller->>K8s: Skip reconciliation (status-only update)
    else Spec changed
        Controller->>Controller: Validate clusterRef XOR globalClusterRef
        Controller->>Cluster: Get cluster by clusterRef or globalClusterRef
        Controller->>Instances: List instances by cluster reference
        Controller->>BIND9: Update zone files via Bindcar API
        Controller->>K8s: Update status (observedGeneration, conditions)
    end

Bind9GlobalCluster Reconciliation

sequenceDiagram
    participant K8s as Kubernetes API
    participant Controller as GlobalCluster Controller
    participant Instances as Bind9Instances (all namespaces)

    K8s->>Controller: Bind9GlobalCluster created/updated
    Controller->>Controller: Check generation changed
    Controller->>Instances: List all instances across all namespaces
    Controller->>Controller: Filter instances by cluster_ref
    Controller->>Controller: Calculate cluster status
    Note over Controller: - Count ready instances<br/>- Aggregate conditions<br/>- Format instance names as namespace/name
    Controller->>K8s: Update status with aggregated health

Multi-Tenancy Model

Bindy supports multi-tenancy through two organizational patterns:

Platform Team Pattern

Platform teams manage cluster-wide DNS infrastructure:

graph TB
    subgraph "Platform Team (ClusterRole)"
        PlatformAdmin[Platform Admin]
    end

    subgraph "Cluster-Scoped"
        GlobalCluster[Bind9GlobalCluster<br/>production-dns]
    end

    subgraph "Namespace: app-a"
        Zone1[DNSZone<br/>app-a.example.com]
        Instance1[Bind9Instance<br/>primary-app-a]
    end

    subgraph "Namespace: app-b"
        Zone2[DNSZone<br/>app-b.example.com]
        Instance2[Bind9Instance<br/>primary-app-b]
    end

    PlatformAdmin -->|manages| GlobalCluster
    GlobalCluster -.->|referenced by| Zone1
    GlobalCluster -.->|referenced by| Zone2
    GlobalCluster -->|references| Instance1
    GlobalCluster -->|references| Instance2

    style PlatformAdmin fill:#ff9800
    style GlobalCluster fill:#c8e6c9
    style Zone1 fill:#fff4e1
    style Zone2 fill:#fff4e1

RBAC Setup:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: platform-dns-admin
rules:
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9globalclusters"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: platform-team-dns
subjects:
- kind: Group
  name: platform-team
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: platform-dns-admin
  apiGroup: rbac.authorization.k8s.io

Development Team Pattern

Development teams manage namespace-scoped DNS:

graph TB
    subgraph "Namespace: dev-team-alpha (Role)"
        DevAdmin[Dev Team Admin]
        Cluster[Bind9Cluster<br/>dev-dns]
        Zone[DNSZone<br/>dev.example.com]
        Records[DNS Records]
        Instance[Bind9Instance]
    end

    DevAdmin -->|manages| Cluster
    DevAdmin -->|manages| Zone
    DevAdmin -->|manages| Records
    Cluster --> Instance
    Cluster --> Zone
    Zone --> Records

    style DevAdmin fill:#2196f3
    style Cluster fill:#e1f5ff
    style Zone fill:#fff4e1

RBAC Setup:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dns-admin
  namespace: dev-team-alpha
rules:
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9clusters", "dnszones", "arecords", "cnamerecords", "mxrecords", "txtrecords"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-dns
  namespace: dev-team-alpha
subjects:
- kind: Group
  name: dev-team-alpha
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dns-admin
  apiGroup: rbac.authorization.k8s.io

Namespace Isolation

Security Principle: DNSZones and records are always namespace-scoped, even when referencing cluster-scoped resources.

graph TB
    subgraph "Cluster-Scoped"
        GlobalCluster[Bind9GlobalCluster<br/>shared-dns]
    end

    subgraph "Namespace: team-a"
        ZoneA[DNSZone<br/>team-a.example.com]
        RecordA[ARecord<br/>www]
    end

    subgraph "Namespace: team-b"
        ZoneB[DNSZone<br/>team-b.example.com]
        RecordB[ARecord<br/>api]
    end

    GlobalCluster -.-> ZoneA
    GlobalCluster -.-> ZoneB
    ZoneA --> RecordA
    ZoneB --> RecordB

    RecordA -.X|blocked|ZoneB
    RecordB -.X|blocked|ZoneA

    style GlobalCluster fill:#c8e6c9
    style ZoneA fill:#fff4e1
    style ZoneB fill:#fff4e1

Isolation Rules:

  1. Records can ONLY reference zones in their own namespace

    • Controller uses Api::namespaced() to enforce this
    • Cross-namespace references are impossible
  2. DNSZones are namespace-scoped

    • Even when referencing Bind9GlobalCluster
    • Each team manages their own zones
  3. RBAC controls zone management

    • Platform team: ClusterRole for Bind9GlobalCluster
    • Dev teams: Role for DNSZone and records in their namespace

Example - Record Isolation:

# team-a namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www
  namespace: team-a
spec:
  zoneRef: team-a-zone  # ✅ References zone in same namespace
  name: www
  ipv4Address: "192.0.2.1"
---
# This would FAIL - cannot reference zone in another namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www
  namespace: team-a
spec:
  zoneRef: team-b-zone  # ❌ References zone in team-b namespace - BLOCKED
  name: www
  ipv4Address: "192.0.2.1"

Decision Tree: Choosing a Cluster Model

Use this decision tree to determine which cluster model fits your use case:

graph TD
    Start{Who manages<br/>DNS infrastructure?}
    Start -->|Platform Team| PlatformCheck{Shared across<br/>namespaces?}
    Start -->|Development Team| DevCheck{Isolated to<br/>namespace?}

    PlatformCheck -->|Yes| Global[Use Bind9GlobalCluster<br/>cluster-scoped]
    PlatformCheck -->|No| Cluster[Use Bind9Cluster<br/>namespace-scoped]

    DevCheck -->|Yes| Cluster
    DevCheck -->|No| Global

    Global --> GlobalDetails[✓ ClusterRole required<br/>✓ Accessible from all namespaces<br/>✓ Centralized management<br/>✓ Production workloads]

    Cluster --> ClusterDetails[✓ Role required namespace<br/>✓ Isolated to namespace<br/>✓ Team autonomy<br/>✓ Dev/test workloads]

    style Global fill:#c8e6c9
    style Cluster fill:#e1f5ff
    style GlobalDetails fill:#e8f5e9
    style ClusterDetails fill:#e3f2fd

Next Steps

Multi-Tenancy Guide

This guide explains how to set up multi-tenancy with Bindy using the dual-cluster model, RBAC configuration, and namespace isolation.

Table of Contents

Overview

Bindy supports multi-tenancy through two complementary approaches:

  1. Platform-Managed DNS: Centralized DNS infrastructure managed by platform teams
  2. Tenant-Managed DNS: Isolated DNS infrastructure managed by development teams

Both can coexist in the same cluster, providing flexibility for different organizational needs.

Key Principles

  • Namespace Isolation: DNSZones and records are always namespace-scoped
  • RBAC-Based Access: Kubernetes RBAC controls who can manage DNS resources
  • Cluster Model Flexibility: Choose namespace-scoped or cluster-scoped clusters based on needs
  • No Cross-Namespace Access: Records cannot reference zones in other namespaces

Tenancy Models

Model 1: Platform-Managed DNS

Use Case: Platform team provides shared DNS infrastructure for all applications.

graph TB
    subgraph "Platform Team ClusterRole"
        PlatformAdmin[Platform Admin]
    end

    subgraph "Cluster-Scoped Resources"
        GlobalCluster[Bind9GlobalCluster<br/>production-dns]
    end

    subgraph "Application Team A Namespace"
        ZoneA[DNSZone<br/>app-a.example.com]
        RecordsA[DNS Records]
    end

    subgraph "Application Team B Namespace"
        ZoneB[DNSZone<br/>app-b.example.com]
        RecordsB[DNS Records]
    end

    PlatformAdmin -->|manages| GlobalCluster
    GlobalCluster -.globalClusterRef.-> ZoneA
    GlobalCluster -.globalClusterRef.-> ZoneB
    ZoneA --> RecordsA
    ZoneB --> RecordsB

    style PlatformAdmin fill:#ff9800
    style GlobalCluster fill:#c8e6c9
    style ZoneA fill:#fff4e1
    style ZoneB fill:#fff4e1

Characteristics:

  • Platform team manages Bind9GlobalCluster (requires ClusterRole)
  • Application teams manage DNSZone and records in their namespace (requires Role)
  • Shared DNS infrastructure, distributed zone management
  • Suitable for production workloads

Model 2: Tenant-Managed DNS

Use Case: Development teams run isolated DNS infrastructure for testing/dev.

graph TB
    subgraph "Team A Namespace + Role"
        AdminA[Team A Admin]
        ClusterA[Bind9Cluster<br/>team-a-dns]
        ZoneA[DNSZone<br/>dev-a.local]
        RecordsA[DNS Records]
    end

    subgraph "Team B Namespace + Role"
        AdminB[Team B Admin]
        ClusterB[Bind9Cluster<br/>team-b-dns]
        ZoneB[DNSZone<br/>dev-b.local]
        RecordsB[DNS Records]
    end

    AdminA -->|manages| ClusterA
    AdminA -->|manages| ZoneA
    AdminA -->|manages| RecordsA
    ClusterA --> ZoneA
    ZoneA --> RecordsA

    AdminB -->|manages| ClusterB
    AdminB -->|manages| ZoneB
    AdminB -->|manages| RecordsB
    ClusterB --> ZoneB
    ZoneB --> RecordsB

    style AdminA fill:#2196f3
    style AdminB fill:#2196f3
    style ClusterA fill:#e1f5ff
    style ClusterB fill:#e1f5ff

Characteristics:

  • Each team manages their own Bind9Cluster (namespace-scoped Role)
  • Complete isolation between teams
  • Teams have full autonomy over DNS configuration
  • Suitable for development/testing environments

Platform Team Setup

Step 1: Create ClusterRole for Platform DNS Management

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: platform-dns-admin
rules:
# Manage cluster-scoped global clusters
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9globalclusters"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View global cluster status
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9globalclusters/status"]
  verbs: ["get", "list", "watch"]

# Manage bind9 instances across all namespaces (for global clusters)
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9instances"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View instance status
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9instances/status"]
  verbs: ["get", "list", "watch"]

Step 2: Bind ClusterRole to Platform Team

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: platform-team-dns-admin
subjects:
- kind: Group
  name: platform-team  # Your IdP/OIDC group name
  apiGroup: rbac.authorization.k8s.io
# Alternative: Bind to specific users
# - kind: User
#   name: alice@example.com
#   apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: platform-dns-admin
  apiGroup: rbac.authorization.k8s.io

Step 3: Create Bind9GlobalCluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: shared-production-dns
  # No namespace - cluster-scoped
spec:
  version: "9.18"

  # Primary instances configuration
  primary:
    replicas: 3
    service:
      type: LoadBalancer
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

  # Secondary instances configuration
  secondary:
    replicas: 2

  # Global BIND9 configuration
  global:
    options:
      - "recursion no"
      - "allow-transfer { none; }"
      - "notify yes"

  # Access control lists
  acls:
    trusted:
      - "10.0.0.0/8"
      - "172.16.0.0/12"

Step 4: Grant Application Teams DNS Zone Management

Create a Role in each application namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dns-zone-admin
  namespace: app-team-a
rules:
# Manage DNS zones and records
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones"
    - "arecords"
    - "aaaarecords"
    - "cnamerecords"
    - "mxrecords"
    - "txtrecords"
    - "nsrecords"
    - "srvrecords"
    - "caarecords"
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View resource status
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones/status"
    - "arecords/status"
    - "cnamerecords/status"
    - "mxrecords/status"
    - "txtrecords/status"
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-team-a-dns
  namespace: app-team-a
subjects:
- kind: Group
  name: app-team-a
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dns-zone-admin
  apiGroup: rbac.authorization.k8s.io

Step 5: Application Teams Create DNSZones

Application teams can now create zones in their namespace:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: app-a-zone
  namespace: app-team-a
spec:
  zoneName: app-a.example.com
  globalClusterRef: shared-production-dns  # References platform cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: dns-admin.example.com.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

Development Team Setup

Step 1: Create Namespace for Team

apiVersion: v1
kind: Namespace
metadata:
  name: dev-team-alpha
  labels:
    team: dev-team-alpha
    environment: development

Step 2: Create Role for Full DNS Management

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dns-full-admin
  namespace: dev-team-alpha
rules:
# Manage namespace-scoped clusters
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9clusters"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# Manage instances
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9instances"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# Manage zones and records
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones"
    - "arecords"
    - "aaaarecords"
    - "cnamerecords"
    - "mxrecords"
    - "txtrecords"
    - "nsrecords"
    - "srvrecords"
    - "caarecords"
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View status for all resources
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "bind9clusters/status"
    - "bind9instances/status"
    - "dnszones/status"
    - "arecords/status"
  verbs: ["get", "list", "watch"]

Step 3: Bind Role to Development Team

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-alpha-dns
  namespace: dev-team-alpha
subjects:
- kind: Group
  name: dev-team-alpha
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dns-full-admin
  apiGroup: rbac.authorization.k8s.io

Step 4: Development Team Creates Infrastructure

# Namespace-scoped cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-dns
  namespace: dev-team-alpha
spec:
  version: "9.18"
  primary:
    replicas: 1
  secondary:
    replicas: 1
---
# DNS zone referencing namespace-scoped cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: dev-zone
  namespace: dev-team-alpha
spec:
  zoneName: dev.local
  clusterRef: dev-dns  # References namespace-scoped cluster
  soaRecord:
    primaryNs: ns1.dev.local.
    adminEmail: admin.dev.local.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 300
  ttl: 300
---
# DNS record
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: test-server
  namespace: dev-team-alpha
spec:
  zoneRef: dev-zone
  name: test-server
  ipv4Address: "10.244.1.100"
  ttl: 60

RBAC Configuration

ClusterRole vs Role Decision Matrix

ResourceScopeRBAC TypeWho Gets It
Bind9GlobalClusterCluster-scopedClusterRole + ClusterRoleBindingPlatform team
Bind9ClusterNamespace-scopedRole + RoleBindingDevelopment teams
Bind9InstanceNamespace-scopedRole + RoleBindingTeams managing instances
DNSZoneNamespace-scopedRole + RoleBindingApplication teams
DNS RecordsNamespace-scopedRole + RoleBindingApplication teams

Example RBAC Hierarchy

graph TD
    subgraph "Cluster-Level RBAC"
        CR1[ClusterRole:<br/>platform-dns-admin]
        CRB1[ClusterRoleBinding:<br/>platform-team]
    end

    subgraph "Namespace-Level RBAC"
        R1[Role: dns-full-admin<br/>namespace: dev-team-alpha]
        RB1[RoleBinding: dev-team-alpha-dns]

        R2[Role: dns-zone-admin<br/>namespace: app-team-a]
        RB2[RoleBinding: app-team-a-dns]
    end

    CR1 --> CRB1
    R1 --> RB1
    R2 --> RB2

    CRB1 -.->|grants| PlatformTeam[platform-team group]
    RB1 -.->|grants| DevTeam[dev-team-alpha group]
    RB2 -.->|grants| AppTeam[app-team-a group]

    style CR1 fill:#ffccbc
    style R1 fill:#c5e1a5
    style R2 fill:#c5e1a5

Minimal Permissions for Application Teams

If application teams only need to manage DNS records (not clusters):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dns-record-editor
  namespace: app-team-a
rules:
# Only manage DNS zones and records
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones"
    - "arecords"
    - "cnamerecords"
    - "mxrecords"
    - "txtrecords"
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# Read-only access to status
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones/status"
    - "arecords/status"
  verbs: ["get", "list", "watch"]

Security Best Practices

1. Namespace Isolation

Enforce strict namespace boundaries:

  • Records cannot reference zones in other namespaces
  • This is enforced by the controller using Api::namespaced()
  • No configuration needed - isolation is automatic
# team-a namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www
  namespace: team-a
spec:
  zoneRef: team-a-zone  # ✅ Same namespace
  name: www
  ipv4Address: "192.0.2.1"
---
# This FAILS - cross-namespace reference blocked
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www
  namespace: team-a
spec:
  zoneRef: team-b-zone  # ❌ Different namespace - BLOCKED
  name: www
  ipv4Address: "192.0.2.1"

2. Least Privilege RBAC

Grant minimum necessary permissions:

# ✅ GOOD - Specific permissions
rules:
- apiGroups: ["bindy.firestoned.io"]
  resources: ["dnszones", "arecords"]
  verbs: ["get", "list", "create", "update"]

# ❌ BAD - Overly broad permissions
rules:
- apiGroups: ["bindy.firestoned.io"]
  resources: ["*"]
  verbs: ["*"]

3. Separate Platform and Tenant Roles

Keep platform and tenant permissions separate:

Role TypeManagesScope
Platform DNS AdminBind9GlobalClusterCluster-wide
Tenant Cluster AdminBind9Cluster, Bind9InstanceNamespace
Tenant Zone AdminDNSZone, RecordsNamespace
Tenant Record EditorRecords onlyNamespace

4. Audit and Monitoring

Enable audit logging for DNS changes:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log all changes to Bindy resources
- level: RequestResponse
  resources:
  - group: bindy.firestoned.io
    resources:
    - bind9globalclusters
    - bind9clusters
    - dnszones
    - arecords
    - mxrecords
  verbs: ["create", "update", "patch", "delete"]

5. NetworkPolicies for BIND9 Pods

Restrict network access to DNS pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bind9-network-policy
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app: bind9
  policyTypes:
  - Ingress
  ingress:
  # Allow DNS queries on port 53
  - from:
    - podSelector: {}  # All pods in namespace
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow Bindcar API access (internal only)
  - from:
    - podSelector:
        matchLabels:
          app: bindy-controller
    ports:
    - protocol: TCP
      port: 8080

Example Scenarios

Scenario 1: Multi-Region Production DNS

Requirement: Platform team manages production DNS across multiple regions.

# Platform creates global cluster per region
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns-us-east
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 3
  acls:
    trusted:
      - "10.0.0.0/8"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns-eu-west
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 3
  acls:
    trusted:
      - "10.128.0.0/9"
---
# App teams create zones in their namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone-us
  namespace: api-service
spec:
  zoneName: api.example.com
  globalClusterRef: production-dns-us-east
  soaRecord: { /* ... */ }
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone-eu
  namespace: api-service
spec:
  zoneName: api.eu.example.com
  globalClusterRef: production-dns-eu-west
  soaRecord: { /* ... */ }

Scenario 2: Development Team Sandboxes

Requirement: Each dev team has isolated DNS for testing.

# Dev Team Alpha namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: alpha-dns
  namespace: dev-alpha
spec:
  version: "9.18"
  primary:
    replicas: 1
  secondary:
    replicas: 1
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: alpha-zone
  namespace: dev-alpha
spec:
  zoneName: alpha.test.local
  clusterRef: alpha-dns
  soaRecord: { /* ... */ }
---
# Dev Team Beta namespace (completely isolated)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: beta-dns
  namespace: dev-beta
spec:
  version: "9.18"
  primary:
    replicas: 1
  secondary:
    replicas: 1
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: beta-zone
  namespace: dev-beta
spec:
  zoneName: beta.test.local
  clusterRef: beta-dns
  soaRecord: { /* ... */ }

Scenario 3: Hybrid - Platform + Tenant DNS

Requirement: Production uses platform DNS, dev teams use their own.

# Platform manages production global cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 2
---
# Production app references global cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: app-prod
  namespace: production
spec:
  zoneName: app.example.com
  globalClusterRef: production-dns  # Platform-managed
  soaRecord: { /* ... */ }
---
# Dev team manages their own cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-dns
  namespace: development
spec:
  version: "9.18"
  primary:
    replicas: 1
---
# Dev app references namespace-scoped cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: app-dev
  namespace: development
spec:
  zoneName: app.dev.local
  clusterRef: dev-dns  # Team-managed
  soaRecord: { /* ... */ }

Next Steps

Choosing a Cluster Type

This guide helps you decide between Bind9Cluster (namespace-scoped) and Bind9GlobalCluster (cluster-scoped) for your DNS infrastructure.

Quick Decision Matrix

FactorBind9ClusterBind9GlobalCluster
ScopeSingle namespaceCluster-wide
Who ManagesDevelopment teamsPlatform teams
RBACRole + RoleBindingClusterRole + ClusterRoleBinding
VisibilityNamespace-onlyAll namespaces
Use CaseDev/test environmentsProduction infrastructure
Zone ReferencesclusterRefglobalClusterRef
IsolationComplete isolation between teamsShared infrastructure
CostHigher (per-namespace overhead)Lower (shared resources)

Decision Tree

graph TD
    Start[Need DNS Infrastructure?]
    Start --> Q1{Who should<br/>manage it?}

    Q1 -->|Platform Team| Q2{Shared across<br/>multiple namespaces?}
    Q1 -->|Development Team| Q3{Need isolation<br/>from other teams?}

    Q2 -->|Yes| Global[Bind9GlobalCluster]
    Q2 -->|No| Cluster[Bind9Cluster]

    Q3 -->|Yes| Cluster
    Q3 -->|No| Global

    Global --> GlobalUse[Platform-managed<br/>Production DNS<br/>Shared infrastructure]
    Cluster --> ClusterUse[Team-managed<br/>Dev/Test DNS<br/>Isolated infrastructure]

    style Global fill:#c8e6c9
    style Cluster fill:#e1f5ff
    style GlobalUse fill:#a5d6a7
    style ClusterUse fill:#90caf9

When to Use Bind9Cluster (Namespace-Scoped)

Ideal For:

Development and Testing Environments

  • Teams need isolated DNS for development
  • Frequent DNS configuration changes
  • Short-lived environments

Multi-Tenant Platforms

  • Each tenant gets their own namespace
  • Complete isolation between tenants
  • Teams manage their own DNS independently

Team Autonomy

  • Development teams need full control
  • No dependency on platform team
  • Self-service DNS management

Learning and Experimentation

  • Safe environment to learn BIND9
  • Can delete and recreate easily
  • No impact on other teams

Example Use Cases:

1. Development Team Sandbox

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-dns
  namespace: dev-team-alpha
spec:
  version: "9.18"
  primary:
    replicas: 1  # Minimal resources for dev
  secondary:
    replicas: 1
  global:
    options:
      - "recursion yes"  # Allow recursion for dev
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: test-zone
  namespace: dev-team-alpha
spec:
  zoneName: test.local
  clusterRef: dev-dns  # Namespace-scoped reference
  soaRecord:
    primaryNs: ns1.test.local.
    adminEmail: admin.test.local.
    serial: 2025010101
    refresh: 300  # Fast refresh for dev
    retry: 60
    expire: 3600
    negativeTtl: 60
  ttl: 60  # Low TTL for frequent changes

2. CI/CD Ephemeral Environments

# Each PR creates isolated DNS infrastructure
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: pr-{{PR_NUMBER}}-dns
  namespace: ci-pr-{{PR_NUMBER}}
  labels:
    pr-number: "{{PR_NUMBER}}"
    environment: ephemeral
spec:
  version: "9.18"
  primary:
    replicas: 1
  # Minimal config for short-lived environment

3. Multi-Tenant SaaS Platform

# Each customer gets isolated DNS in their namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: customer-dns
  namespace: customer-{{CUSTOMER_ID}}
  labels:
    customer-id: "{{CUSTOMER_ID}}"
spec:
  version: "9.18"
  primary:
    replicas: 1
  secondary:
    replicas: 1
  # Customer-specific ACLs and configuration
  acls:
    customer-networks:
      - "{{CUSTOMER_CIDR}}"

Characteristics:

✓ Pros:

  • Complete isolation between teams
  • No cross-namespace dependencies
  • Teams have full autonomy
  • Easy to delete and recreate
  • No ClusterRole permissions needed

✗ Cons:

  • Higher resource overhead (per-namespace clusters)
  • Cannot share DNS infrastructure across namespaces
  • Each team must manage their own BIND9 instances
  • Duplication of configuration

When to Use Bind9GlobalCluster (Cluster-Scoped)

Ideal For:

Production Infrastructure

  • Centralized DNS for production workloads
  • High availability requirements
  • Shared across multiple applications

Platform Team Management

  • Platform team provides DNS as a service
  • Centralized governance and compliance
  • Consistent configuration across environments

Resource Efficiency

  • Share DNS infrastructure across namespaces
  • Reduce operational overhead
  • Lower total cost of ownership

Enterprise Requirements

  • Audit logging and compliance
  • Centralized monitoring and alerting
  • Disaster recovery and backups

Example Use Cases:

1. Production DNS Infrastructure

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
  # No namespace - cluster-scoped
spec:
  version: "9.18"

  # High availability configuration
  primary:
    replicas: 3
    service:
      type: LoadBalancer
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
        service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"

  secondary:
    replicas: 3

  # Production-grade configuration
  global:
    options:
      - "recursion no"
      - "allow-transfer { none; }"
      - "notify yes"
      - "minimal-responses yes"

  # Access control
  acls:
    trusted:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
    secondaries:
      - "10.10.1.0/24"

  # Persistent storage for zone files
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: dns-zone-storage
  volumeMounts:
    - name: zone-data
      mountPath: /var/cache/bind

Application teams reference the global cluster:

# Application in any namespace can use the global cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone
  namespace: api-service  # Different namespace
spec:
  zoneName: api.example.com
  globalClusterRef: production-dns  # References cluster-scoped cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: dns-admin.example.com.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600
---
# Another application in a different namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: web-zone
  namespace: web-frontend  # Different namespace
spec:
  zoneName: www.example.com
  globalClusterRef: production-dns  # Same global cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: dns-admin.example.com.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

2. Multi-Region DNS

# Regional global clusters for geo-distributed DNS
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-us-east
  labels:
    region: us-east-1
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 2
  acls:
    region-networks:
      - "10.0.0.0/8"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-eu-west
  labels:
    region: eu-west-1
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 2
  acls:
    region-networks:
      - "10.128.0.0/9"

3. Platform DNS as a Service

# Platform team provides multiple tiers of DNS service
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-premium
  labels:
    tier: premium
    sla: "99.99"
spec:
  version: "9.18"
  primary:
    replicas: 5  # High availability
    service:
      type: LoadBalancer
  secondary:
    replicas: 5
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-standard
  labels:
    tier: standard
    sla: "99.9"
spec:
  version: "9.18"
  primary:
    replicas: 3
  secondary:
    replicas: 2

Characteristics:

✓ Pros:

  • Shared infrastructure across namespaces
  • Lower total resource usage
  • Centralized management and governance
  • Consistent configuration
  • Platform team controls DNS

✗ Cons:

  • Requires ClusterRole permissions
  • Platform team must manage it
  • Less autonomy for application teams
  • Single point of management (not necessarily failure)

Hybrid Approach

You can use both cluster types in the same Kubernetes cluster:

graph TB
    subgraph "Production Workloads"
        GlobalCluster[Bind9GlobalCluster<br/>production-dns]
        ProdZone1[DNSZone: api.example.com<br/>namespace: api-prod]
        ProdZone2[DNSZone: www.example.com<br/>namespace: web-prod]
    end

    subgraph "Development Namespace A"
        ClusterA[Bind9Cluster<br/>dev-dns-a]
        DevZoneA[DNSZone: dev-a.local<br/>namespace: dev-a]
    end

    subgraph "Development Namespace B"
        ClusterB[Bind9Cluster<br/>dev-dns-b]
        DevZoneB[DNSZone: dev-b.local<br/>namespace: dev-b]
    end

    GlobalCluster -.globalClusterRef.-> ProdZone1
    GlobalCluster -.globalClusterRef.-> ProdZone2
    ClusterA --> DevZoneA
    ClusterB --> DevZoneB

    style GlobalCluster fill:#c8e6c9
    style ClusterA fill:#e1f5ff
    style ClusterB fill:#e1f5ff

Example Configuration:

# Platform team manages production DNS globally
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
spec:
  version: "9.18"
  primary:
    replicas: 3
---
# Dev teams manage their own DNS per namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-dns
  namespace: dev-team-a
spec:
  version: "9.18"
  primary:
    replicas: 1
---
# Production app references global cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: prod-zone
  namespace: production
spec:
  zoneName: app.example.com
  globalClusterRef: production-dns
  soaRecord: { /* ... */ }
---
# Dev app references namespace-scoped cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: dev-zone
  namespace: dev-team-a
spec:
  zoneName: app.dev.local
  clusterRef: dev-dns
  soaRecord: { /* ... */ }

Common Scenarios

Scenario 1: Startup/Small Team

Recommendation: Start with Bind9Cluster (namespace-scoped)

Why:

  • Simpler RBAC (no ClusterRole needed)
  • Faster iteration and experimentation
  • Easy to recreate if configuration is wrong
  • Lower learning curve

Migration Path: When you grow, migrate production to Bind9GlobalCluster while keeping dev on Bind9Cluster.

Scenario 2: Enterprise with Platform Team

Recommendation: Use Bind9GlobalCluster for production, Bind9Cluster for dev

Why:

  • Platform team provides production DNS as a service
  • Development teams have autonomy in their namespaces
  • Clear separation of responsibilities
  • Resource efficiency at scale

Scenario 3: Multi-Tenant SaaS

Recommendation: Use Bind9Cluster per tenant namespace

Why:

  • Complete isolation between customers
  • Tenant-specific configuration
  • Easier to delete customer data (namespace deletion)
  • No risk of cross-tenant data leaks

Scenario 4: CI/CD with Ephemeral Environments

Recommendation: Use Bind9Cluster per environment

Why:

  • Isolated DNS per PR/branch
  • Easy cleanup when PR closes
  • No impact on other environments
  • Fast provisioning

Migration Between Cluster Types

From Bind9Cluster to Bind9GlobalCluster

Steps:

  1. Create Bind9GlobalCluster:

    apiVersion: bindy.firestoned.io/v1alpha1
    kind: Bind9GlobalCluster
    metadata:
      name: shared-dns
    spec:
      # Copy configuration from Bind9Cluster
      version: "9.18"
      primary:
        replicas: 3
      secondary:
        replicas: 2
    
  2. Update DNSZone References:

    # Before
    apiVersion: bindy.firestoned.io/v1alpha1
    kind: DNSZone
    metadata:
      name: my-zone
      namespace: my-namespace
    spec:
      zoneName: example.com
      clusterRef: my-cluster  # namespace-scoped
    
    # After
    apiVersion: bindy.firestoned.io/v1alpha1
    kind: DNSZone
    metadata:
      name: my-zone
      namespace: my-namespace
    spec:
      zoneName: example.com
      globalClusterRef: shared-dns  # cluster-scoped
    
  3. Update RBAC (if needed):

    • Application teams no longer need permissions for bind9clusters
    • Only need permissions for dnszones and records
  4. Delete Old Bind9Cluster:

    kubectl delete bind9cluster my-cluster -n my-namespace
    

From Bind9GlobalCluster to Bind9Cluster

Steps:

  1. Create Bind9Cluster in Target Namespace:

    apiVersion: bindy.firestoned.io/v1alpha1
    kind: Bind9Cluster
    metadata:
      name: team-dns
      namespace: my-namespace
    spec:
      # Copy configuration from global cluster
      version: "9.18"
      primary:
        replicas: 2
    
  2. Update DNSZone References:

    # Before
    spec:
      globalClusterRef: shared-dns
    
    # After
    spec:
      clusterRef: team-dns
    
  3. Update RBAC (if needed):

    • Team needs permissions for bind9clusters in their namespace

Summary

Choose ThisIf You Need
Bind9ClusterTeam autonomy, complete isolation, dev/test environments
Bind9GlobalClusterShared infrastructure, platform management, production DNS
Both (Hybrid)Production on global, dev on namespace-scoped

Key Takeaway: There’s no “wrong” choice - select based on your organizational structure and requirements. Many organizations use both cluster types for different purposes.

Next Steps

Creating DNS Infrastructure

This section guides you through setting up your DNS infrastructure using Bindy. A typical DNS setup consists of:

  • Primary DNS Instances: Authoritative DNS servers that host the master copies of your zones
  • Secondary DNS Instances: Replica servers that receive zone transfers from primaries
  • Multi-Region Setup: Geographically distributed DNS servers for redundancy

Overview

Bindy uses Kubernetes Custom Resources to define DNS infrastructure. The Bind9Instance resource creates and manages BIND9 DNS server deployments, including:

  • BIND9 Deployment pods
  • ConfigMaps for BIND9 configuration
  • Services for DNS traffic (TCP/UDP port 53)

Infrastructure Components

Bind9Instance

A Bind9Instance represents a single BIND9 DNS server deployment. You can create multiple instances for:

  • High availability - Multiple replicas of the same instance
  • Role separation - Separate primary and secondary instances
  • Geographic distribution - Instances in different regions or availability zones

Planning Your Infrastructure

Before creating instances, consider:

  1. Zone Hosting Strategy

    • Which zones will be primary vs. secondary?
    • How will zones be distributed across instances?
  2. Redundancy Requirements

    • How many replicas per instance?
    • How many geographic locations?
  3. Label Strategy

    • How will you select instances for zones?
    • Common labels: dns-role, region, environment

Next Steps

Primary DNS Instances

Primary DNS instances are authoritative DNS servers that host the master copies of your DNS zones. They are the source of truth for DNS data and handle zone updates.

Creating a Primary Instance

Here’s a basic example of a primary DNS instance:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
  labels:
    dns-role: primary
    environment: production
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"  # Allow zone transfers to secondary servers
    dnssec:
      enabled: true
      validation: true

Apply it with:

kubectl apply -f primary-instance.yaml

Configuration Options

Replicas

The replicas field controls how many BIND9 pods to run:

spec:
  replicas: 2  # Run 2 pods for high availability

BIND9 Version

Specify the BIND9 version to use:

spec:
  version: "9.18"  # Use BIND 9.18

Query Access Control

Control who can query your DNS server:

spec:
  config:
    allowQuery:
      - "0.0.0.0/0"      # Allow queries from anywhere
      - "10.0.0.0/8"     # Or restrict to specific networks

Zone Transfer Control

Restrict zone transfers to authorized servers (typically secondaries):

spec:
  config:
    allowTransfer:
      - "10.0.0.0/8"     # Allow transfers to secondary network
      - "192.168.1.0/24" # Or specific secondary server network

DNSSEC Configuration

Enable DNSSEC signing and validation:

spec:
  config:
    dnssec:
      enabled: true      # Enable DNSSEC signing
      validation: true   # Enable DNSSEC validation

Recursion

Primary authoritative servers should disable recursion:

spec:
  config:
    recursion: false  # Disable recursion for authoritative servers

Labels

Use labels to organize and select instances:

metadata:
  labels:
    dns-role: primary        # Indicates this is a primary server
    environment: production  # Environment designation
    region: us-east-1       # Geographic location

These labels are used by DNSZone resources to select which instances should host their zones.

Verifying Deployment

Check the instance status:

kubectl get bind9instances -n dns-system
kubectl describe bind9instance primary-dns -n dns-system

Check the created resources:

# View the deployment
kubectl get deployment -n dns-system -l instance=primary-dns

# View the pods
kubectl get pods -n dns-system -l instance=primary-dns

# View the service
kubectl get service -n dns-system -l instance=primary-dns

Testing DNS Resolution

Once deployed, test DNS queries:

# Get the service IP
SERVICE_IP=$(kubectl get svc -n dns-system primary-dns -o jsonpath='{.spec.clusterIP}')

# Test DNS query
dig @$SERVICE_IP example.com

Next Steps

Secondary DNS Instances

Secondary DNS instances receive zone data from primary servers via zone transfers (AXFR/IXFR). They provide redundancy and load distribution for DNS queries.

Creating a Secondary Instance

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  namespace: dns-system
  labels:
    dns-role: secondary
    environment: production
spec:
  replicas: 1
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

Apply with:

kubectl apply -f secondary-instance.yaml

Key Differences from Primary

No Zone Transfers Allowed

Secondary servers typically don’t allow zone transfers:

spec:
  config:
    allowTransfer: []  # Empty or omitted - no transfers from secondary

Read-Only Zones

Secondaries receive zone data from primaries and cannot be updated directly. All zone modifications must be made on the primary server.

Label for Selection

Use the role: secondary label to enable automatic zone transfer configuration:

metadata:
  labels:
    role: secondary      # Required for automatic discovery
    cluster: production  # Required - must match cluster name

Important: The role: secondary label is required for Bindy to automatically discover secondary instances and configure zone transfers on primary zones.

Configuring Secondary Zones

When creating a DNSZone resource for secondary zones, use the secondary type and specify primary servers:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
  namespace: dns-system
spec:
  zoneName: example.com
  type: secondary
  instanceSelector:
    matchLabels:
      dns-role: secondary
  secondaryConfig:
    primaryServers:
      - "10.0.1.10"  # IP of primary DNS server
      - "10.0.1.11"  # Additional primary for redundancy

Automatic Zone Transfer Configuration

New in v0.1.0: Bindy automatically configures zone transfers from primaries to secondaries!

When you create primary DNSZone resources, Bindy automatically:

  1. Discovers secondary instances using the role=secondary label
  2. Configures zone transfers on primary zones with also-notify and allow-transfer
  3. Tracks secondary IPs in DNSZone.status.secondaryIps
  4. Detects IP changes when secondary pods restart or are rescheduled
  5. Auto-updates zones when secondary IPs change (within 5-10 minutes)

Example:

# Create secondary instance with proper labels
cat <<EOF | kubectl apply -f -
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  namespace: dns-system
  labels:
    role: secondary          # Required for discovery
    cluster: production      # Must match cluster name
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
EOF

# Create primary zone - zone transfers auto-configured!
cat <<EOF | kubectl apply -f -
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: production  # Matches cluster label
  # ... other config ...
EOF

# Verify automatic configuration
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.secondaryIps}'
# Output: ["10.244.1.5","10.244.2.8"]

Self-Healing: When secondary pods restart and get new IPs:

  • Bindy detects the change within one reconciliation cycle (~5-10 minutes)
  • Primary zones are automatically updated with new secondary IPs
  • Zone transfers resume automatically with no manual intervention

Verifying Zone Transfers

Check that zones are being transferred:

# Check zone files on secondary
kubectl exec -n dns-system deployment/secondary-dns -- ls -la /var/lib/bind/zones/

# Check BIND9 logs for transfer messages
kubectl logs -n dns-system -l instance=secondary-dns | grep "transfer of"

# Verify secondary IPs are configured on primary zones
kubectl get dnszone -n dns-system -o yaml | yq '.items[].status.secondaryIps'

Best Practices

Use Multiple Secondaries

Deploy secondary instances in different locations:

# Secondary in different AZ/region
metadata:
  labels:
    dns-role: secondary
    region: us-west-1

Configure NOTIFY

Primary servers send NOTIFY messages to secondaries when zones change. Ensure network connectivity allows these notifications.

Monitor Transfer Status

Watch for failed transfers in logs:

kubectl logs -n dns-system -l instance=secondary-dns --tail=100 | grep -i transfer

Network Requirements

Secondaries must be able to:

  1. Receive zone transfers from primaries (TCP port 53)
  2. Receive NOTIFY messages from primaries (UDP port 53)
  3. Respond to DNS queries from clients (UDP/TCP port 53)

Ensure Kubernetes network policies and firewall rules allow this traffic.

Next Steps

Multi-Region Setup

Distribute your DNS infrastructure across multiple regions or availability zones for maximum availability and performance.

Architecture Overview

A multi-region DNS setup typically includes:

  • Primary instances in one or more regions
  • Secondary instances in multiple geographic locations
  • Zone distribution across all instances using label selectors

Creating Regional Instances

Primary in Region 1

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-us-east
  namespace: dns-system
  labels:
    dns-role: primary
    region: us-east-1
    environment: production
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
    dnssec:
      enabled: true

Secondary in Region 2

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-us-west
  namespace: dns-system
  labels:
    dns-role: secondary
    region: us-west-2
    environment: production
spec:
  replicas: 1
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

Secondary in Region 3

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-eu-west
  namespace: dns-system
  labels:
    dns-role: secondary
    region: eu-west-1
    environment: production
spec:
  replicas: 1
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

Distributing Zones Across Regions

Create zones that target all regions:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  type: primary
  instanceSelector:
    matchExpressions:
      - key: environment
        operator: In
        values:
          - production
      - key: dns-role
        operator: In
        values:
          - primary
          - secondary
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin@example.com
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

This zone will be deployed to all instances matching the selector (all production primary and secondary instances).

Deployment Strategy

Option 1: Primary-Secondary Model

  • One region hosts primary instances
  • All other regions host secondary instances
  • Zone transfers flow from primary to secondaries
graph LR
    region1["Region 1 (us-east-1)<br/>Primary Instances<br/>(Master zones)"]
    region2["Region 2 (us-west-2)<br/>Secondary Instances<br/>(Slave zones)"]
    region3["Region 3 (eu-west-1)<br/>Secondary Instances<br/>(Slave zones)"]

    region1 -->|Zone Transfer| region2
    region2 -->|Zone Transfer| region3

    style region1 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style region2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style region3 fill:#e1f5ff,stroke:#01579b,stroke-width:2px

Option 2: Multi-Primary Model

  • Multiple regions host primary instances
  • Different zones can have primaries in different regions
  • Use careful labeling to route zones to appropriate primaries

Network Considerations

Zone Transfer Network

Ensure network connectivity for zone transfers:

  • Primaries must reach secondaries on TCP port 53
  • Use VPN, peering, or allow public transfer with IP restrictions

Client Query Routing

Use one of:

  • GeoDNS - Route clients to nearest regional instance
  • Anycast - Same IP announced from multiple locations
  • Load Balancer - Distribute across regional endpoints

Failover Strategy

Automatic Failover

Kubernetes handles pod-level failures automatically:

spec:
  replicas: 2  # Multiple replicas for pod-level HA

Regional Failover

For regional failures:

  1. Clients automatically query secondary instances in other regions
  2. Zone data remains available via zone transfers
  3. Updates queue until primary region recovers

Manual Failover

To manually promote a secondary to primary:

  1. Update DNSZone to change primary servers
  2. Update instance labels if needed
  3. Verify zone transfers are working correctly

Monitoring Multi-Region Setup

Check instance distribution:

# View all instances and their regions
kubectl get bind9instances -n dns-system -L region

# Check zone distribution
kubectl describe dnszone example-com -n dns-system

Monitor zone transfers:

# Check transfer logs on secondaries
kubectl logs -n dns-system -l dns-role=secondary | grep "transfer of"

Best Practices

  1. Use Odd Number of Regions: 3 or 5 regions for better quorum
  2. Distribute Replicas: Spread replicas across availability zones
  3. Monitor Latency: Watch zone transfer times between regions
  4. Test Failover: Regularly test regional failover scenarios
  5. Automate Updates: Use GitOps for consistent multi-region deployments

Next Steps

Managing DNS Zones

DNS zones are the containers for DNS records. In Bindy, zones are defined using the DNSZone custom resource.

Zone Types

Primary Zones

Primary (master) zones contain the authoritative data:

  • Zone data is created and managed on the primary
  • Changes are made by creating/updating DNS record resources
  • Can be transferred to secondary servers

Secondary Zones

Secondary (slave) zones receive data from primary servers:

  • Zone data is received via AXFR/IXFR transfers
  • Read-only - cannot be modified directly
  • Automatically updated when primary changes

Zone Lifecycle

  1. Create Bind9Instance resources to host zones
  2. Create DNSZone resource with instance selector
  3. Add DNS records (A, CNAME, MX, etc.)
  4. Monitor status to ensure zone is active

Instance Selection

Zones are deployed to Bind9Instances using label selectors:

spec:
  instanceSelector:
    matchLabels:
      dns-role: primary
      environment: production

This deploys the zone to all instances matching both labels.

SOA Record

Every primary zone requires an SOA (Start of Authority) record:

spec:
  soaRecord:
    primaryNs: ns1.example.com.      # Primary nameserver
    adminEmail: admin@example.com    # Admin email (@ becomes .)
    serial: 2024010101               # Zone serial number
    refresh: 3600                    # Refresh interval
    retry: 600                       # Retry interval
    expire: 604800                   # Expiration time
    negativeTtl: 86400              # Negative caching TTL

Zone Configuration

TTL (Time To Live)

Set the default TTL for records in the zone:

spec:
  ttl: 3600  # 1 hour default TTL

Individual records can override this with their own TTL values.

Zone Status

Check zone status:

kubectl get dnszone -n dns-system
kubectl describe dnszone example-com -n dns-system

Status conditions indicate:

  • Whether the zone is ready
  • Which instances are hosting the zone
  • Any errors or warnings

Common Operations

Listing Zones

# List all zones
kubectl get dnszones -n dns-system

# Show zones with custom columns
kubectl get dnszones -n dns-system -o custom-columns=NAME:.metadata.name,ZONE:.spec.zoneName,TYPE:.spec.type

Viewing Zone Details

kubectl describe dnszone example-com -n dns-system

Updating Zones

Edit the zone configuration:

kubectl edit dnszone example-com -n dns-system

Or apply an updated YAML file:

kubectl apply -f zone.yaml

Deleting Zones

kubectl delete dnszone example-com -n dns-system

This removes the zone from all instances but doesn’t delete the instance itself.

Next Steps

Creating Zones

Learn how to create DNS zones in Bindy using the RNDC protocol.

Zone Architecture

Zones in Bindy follow a three-tier model:

  1. Bind9Cluster - Cluster-level configuration (version, shared config, TSIG keys)
  2. Bind9Instance - Individual BIND9 server deployment (references a cluster)
  3. DNSZone - DNS zone (references an instance via clusterRef)

Prerequisites

Before creating a zone, ensure you have:

  1. A Bind9Cluster resource deployed
  2. A Bind9Instance resource deployed (referencing the cluster)
  3. The instance is ready and running

Creating a Primary Zone

First, ensure you have a cluster and instance:

# Step 1: Create a Bind9Cluster (if not already created)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"

---
# Step 2: Create a Bind9Instance (if not already created)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: production-dns  # References the Bind9Cluster above
  role: primary
  replicas: 1

---
# Step 3: Create the DNSZone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: primary-dns  # References the Bind9Instance above
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.  # Note: @ replaced with .
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

How It Works

When you create a DNSZone:

  1. Controller discovers pods - Finds BIND9 pods with label instance=primary-dns
  2. Loads RNDC key - Retrieves Secret named primary-dns-rndc-key
  3. Connects via RNDC - Establishes connection to primary-dns.dns-system.svc.cluster.local:953
  4. Executes addzone - Runs rndc addzone example.com command
  5. BIND9 creates zone - BIND9 creates the zone and starts serving it
  6. Updates status - Controller updates DNSZone status to Ready

Verifying Zone Creation

Check the zone status:

kubectl get dnszones -n dns-system
kubectl describe dnszone example-com -n dns-system

Expected output:

Name:         example-com
Namespace:    dns-system
Labels:       <none>
Annotations:  <none>
API Version:  bindy.firestoned.io/v1alpha1
Kind:         DNSZone
Spec:
  Cluster Ref:  primary-dns
  Zone Name:    example.com
Status:
  Conditions:
    Type:    Ready
    Status:  True
    Reason:  Synchronized
    Message: Zone created for cluster: primary-dns

Next Steps

Cluster References

Bindy uses direct cluster references instead of label selectors for targeting DNS zones to BIND9 instances.

Overview

In Bindy’s three-tier architecture, resources reference each other directly by name:

Bind9Cluster ← clusterRef ← Bind9Instance
       ↑
   clusterRef ← DNSZone ← zoneRef ← DNS Records

This provides:

  • Explicit targeting - Clear, direct references instead of label matching
  • Simpler configuration - No complex selector logic
  • Better validation - References can be validated at admission time
  • Easier troubleshooting - Direct relationships are easier to understand

Cluster Reference Model

Bind9Cluster (Top-Level)

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false

Bind9Instance References Bind9Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: production-dns  # Direct reference to Bind9Cluster name
  role: primary  # Required: primary or secondary
  replicas: 2

DNSZone References Bind9Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: production-dns  # Direct reference to Bind9Cluster name
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.

How References Work

When you create a DNSZone with clusterRef: production-dns:

  1. Controller finds the Bind9Cluster - Looks up Bind9Cluster named production-dns
  2. Discovers instances - Finds all Bind9Instance resources referencing this cluster
  3. Identifies primaries - Selects instances with role: primary
  4. Loads RNDC keys - Retrieves RNDC keys from cluster configuration
  5. Connects via RNDC - Connects to primary instance pods via RNDC
  6. Creates zone - Executes rndc addzone command on primary instances

Example: Multi-Region Setup

East Region

# East Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dns-cluster-east
  namespace: dns-system
spec:
  version: "9.18"

---
# East Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dns-east
  namespace: dns-system
spec:
  clusterRef: dns-cluster-east
  role: primary  # Required: primary or secondary
  replicas: 2

---
# Zone on East Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-east
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: dns-cluster-east  # Targets east cluster

West Region

# West Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dns-cluster-west
  namespace: dns-system
spec:
  version: "9.18"

---
# West Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dns-west
  namespace: dns-system
spec:
  clusterRef: dns-cluster-west
  role: primary  # Required: primary or secondary
  replicas: 2

---
# Zone on West Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-west
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: dns-cluster-west  # Targets west cluster

Benefits Over Label Selectors

Simpler Configuration

Old approach (label selectors):

# Had to set labels on instance
labels:
  dns-role: primary
  region: us-east

# Had to use selector in zone
instanceSelector:
  matchLabels:
    dns-role: primary
    region: us-east

New approach (cluster references):

# Just reference by name
clusterRef: primary-dns

Better Validation

  • References can be validated at admission time
  • Typos are caught immediately
  • No ambiguity about which instance will host the zone

Clearer Relationships

# See exactly which instance hosts a zone
kubectl get dnszone example-com -o jsonpath='{.spec.clusterRef}'

# See which cluster an instance belongs to
kubectl get bind9instance primary-dns -o jsonpath='{.spec.clusterRef}'

Migrating from Label Selectors

If you have old DNSZone resources using instanceSelector, migrate them:

Before:

spec:
  zoneName: example.com
  instanceSelector:
    matchLabels:
      dns-role: primary

After:

spec:
  zoneName: example.com
  clusterRef: production-dns  # Direct reference to cluster name

Next Steps

Zone Configuration

Advanced zone configuration options.

Default TTL

Set the default TTL for all records in the zone:

spec:
  ttl: 3600  # 1 hour

SOA Record Details

spec:
  soaRecord:
    primaryNs: ns1.example.com.    # Primary nameserver FQDN (must end with .)
    adminEmail: admin@example.com  # Admin email (@ replaced with . in zone file)
    serial: 2024010101             # Serial number (YYYYMMDDnn format recommended)
    refresh: 3600                  # How often secondaries check for updates (seconds)
    retry: 600                     # How long to wait before retry after failed refresh
    expire: 604800                 # When to stop answering if no refresh (1 week)
    negativeTtl: 86400             # TTL for negative responses (NXDOMAIN)

Secondary Zone Configuration

For secondary zones, specify primary servers:

spec:
  type: secondary
  secondaryConfig:
    primaryServers:
      - "10.0.1.10"
      - "10.0.1.11"

Managing DNS Records

DNS records are the actual data in your zones - IP addresses, mail servers, text data, etc.

Record Types

Bindy supports all common DNS record types:

  • A Records - IPv4 addresses
  • AAAA Records - IPv6 addresses
  • CNAME Records - Canonical name (alias)
  • MX Records - Mail exchange servers
  • TXT Records - Text data (SPF, DKIM, DMARC, verification)
  • NS Records - Nameserver delegation
  • SRV Records - Service location
  • CAA Records - Certificate authority authorization

Record Structure

All records share common fields:

apiVersion: bindy.firestoned.io/v1alpha1
kind: <RecordType>
metadata:
  name: <unique-name>
  namespace: dns-system
spec:
  # Zone reference - use ONE of these:
  zone: <zone-name>            # Match against DNSZone spec.zoneName
  # OR
  zoneRef: <zone-resource-name> # Direct reference to DNSZone metadata.name

  name: <record-name>          # Name within the zone
  ttl: <optional-ttl>          # Override zone default TTL
  # ... record-specific fields

Referencing DNS Zones

DNS records must reference an existing DNSZone. There are two ways to reference a zone:

Method 1: Using zone Field (Zone Name Lookup)

The zone field searches for a DNSZone by matching its spec.zoneName:

spec:
  zone: example.com  # Matches DNSZone with spec.zoneName: example.com
  name: www

How it works:

  • The controller lists all DNSZones in the namespace
  • Searches for one with spec.zoneName matching the provided value
  • More intuitive - you specify the actual DNS zone name

When to use:

  • Quick testing and development
  • When you’re not sure of the resource name
  • When readability is more important than performance

Method 2: Using zoneRef Field (Direct Reference)

The zoneRef field directly references a DNSZone by its Kubernetes resource name:

spec:
  zoneRef: example-com  # Matches DNSZone with metadata.name: example-com
  name: www

How it works:

  • The controller directly retrieves the DNSZone by metadata.name
  • No search required - single API call
  • More efficient

When to use:

  • Production environments (recommended)
  • Large namespaces with many zones
  • When performance matters
  • Infrastructure-as-code with known resource names

Choosing Between zone and zoneRef

CriteriazonezoneRef
PerformanceSlower (list + search)Faster (direct get)
ReadabilityMore intuitiveLess obvious
Use CaseDevelopment/testingProduction
API CallsMultipleSingle
Best ForHumans writing YAMLAutomation/templates

Important: You must specify exactly one of zone or zoneRef - not both, not neither.

Example: Same Record, Two Methods

Given this DNSZone:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com        # Kubernetes resource name
  namespace: dns-system
spec:
  zoneName: example.com    # Actual DNS zone name
  clusterRef: primary-dns
  # ...

Create an A record using either method:

Using zone (matches spec.zoneName):

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zone: example.com     # ← Actual zone name
  name: www
  ipv4Address: "192.0.2.1"

Using zoneRef (matches metadata.name):

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zoneRef: example-com  # ← Resource name
  name: www
  ipv4Address: "192.0.2.1"

Both create the same DNS record: www.example.com → 192.0.2.1

Creating Records

After choosing your zone reference method, specify the record details:

spec:
  zoneRef: example-com  # Recommended for production
  name: www             # Creates www.example.com
  ipv4Address: "192.0.2.1"
  ttl: 300             # Optional - overrides zone default

Next Steps

A Records (IPv4)

A records map domain names to IPv4 addresses.

Creating an A Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: www
  ipv4Address: "192.0.2.1"
  ttl: 300

This creates www.example.com -> 192.0.2.1.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details on choosing between zone and zoneRef.

Root Record

For the zone apex (example.com):

spec:
  zoneRef: example-com
  name: "@"
  ipv4Address: "192.0.2.1"

Multiple A Records

Create multiple records for the same name for load balancing:

kubectl apply -f - <<EOF
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-1
spec:
  zoneRef: example-com
  name: www
  ipv4Address: "192.0.2.1"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-2
spec:
  zoneRef: example-com
  name: www
  ipv4Address: "192.0.2.2"

AAAA Records (IPv6)

AAAA records map domain names to IPv6 addresses. They are the IPv6 equivalent of A records.

Creating an AAAA Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-example-ipv6
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: www
  ipv6Address: "2001:db8::1"
  ttl: 300

This creates www.example.com -> 2001:db8::1.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details on choosing between zone and zoneRef.

Root Record

For the zone apex (example.com):

spec:
  zoneRef: example-com
  name: "@"
  ipv6Address: "2001:db8::1"

Multiple AAAA Records

Create multiple records for the same name for load balancing:

kubectl apply -f - <<EOF
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-ipv6-1
spec:
  zoneRef: example-com
  name: www
  ipv6Address: "2001:db8::1"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-ipv6-2
spec:
  zoneRef: example-com
  name: www
  ipv6Address: "2001:db8::2"
EOF

DNS clients will receive both addresses (round-robin load balancing).

Dual-Stack Configuration

For dual-stack (IPv4 + IPv6) configuration, create both A and AAAA records:

# IPv4
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-ipv4
spec:
  zoneRef: example-com
  name: www
  ipv4Address: "192.0.2.1"
  ttl: 300
---
# IPv6
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-ipv6
spec:
  zoneRef: example-com
  name: www
  ipv6Address: "2001:db8::1"
  ttl: 300

Clients will use IPv6 if available, falling back to IPv4 otherwise.

IPv6 Address Formats

IPv6 addresses support various formats:

# Full format
ipv6Address: "2001:0db8:0000:0000:0000:0000:0000:0001"

# Compressed format (recommended)
ipv6Address: "2001:db8::1"

# Link-local address
ipv6Address: "fe80::1"

# Loopback
ipv6Address: "::1"

# IPv4-mapped IPv6
ipv6Address: "::ffff:192.0.2.1"

Common Use Cases

Web Server

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: web-ipv6
spec:
  zoneRef: example-com
  name: www
  ipv6Address: "2001:db8:1::443"
  ttl: 300

API Endpoint

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: api-ipv6
spec:
  zoneRef: example-com
  name: api
  ipv6Address: "2001:db8:2::443"
  ttl: 60  # Short TTL for faster updates

Mail Server

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: mail-ipv6
spec:
  zoneRef: example-com
  name: mail
  ipv6Address: "2001:db8:3::25"
  ttl: 3600

Best Practices

  1. Use compressed format - 2001:db8::1 instead of 2001:0db8:0000:0000:0000:0000:0000:0001
  2. Dual-stack when possible - Provide both A and AAAA records for compatibility
  3. Match TTLs - Use the same TTL for A and AAAA records of the same name
  4. Test IPv6 connectivity - Ensure your infrastructure supports IPv6 before advertising AAAA records

Status Monitoring

Check the status of your AAAA record:

kubectl get aaaarecord www-ipv6 -o yaml

Look for the status.conditions field:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

Troubleshooting

Record not resolving

  1. Check record status:

    kubectl get aaaarecord www-ipv6 -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
    
  2. Verify zone exists:

    kubectl get dnszone example-com
    
  3. Test DNS resolution:

    dig AAAA www.example.com @<dns-server-ip>
    

Invalid IPv6 address

The controller validates IPv6 addresses. Ensure your address is in valid format:

  • Use compressed notation: 2001:db8::1
  • Do not mix uppercase/lowercase unnecessarily
  • Ensure all segments are valid hexadecimal

Next Steps

CNAME Records

CNAME (Canonical Name) records create aliases to other domain names.

Creating a CNAME Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: blog-example-com
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: blog
  target: www.example.com.  # Must end with a dot
  ttl: 300

This creates blog.example.com -> www.example.com.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details on choosing between zone and zoneRef.

Important CNAME Rules

Target Must Be Fully Qualified

The target field must be a fully qualified domain name (FQDN) ending with a dot:

# ✅ Correct
target: www.example.com.

# ❌ Incorrect - missing trailing dot
target: www.example.com

No CNAME at Zone Apex

CNAME records cannot be created at the zone apex (@):

# ❌ Not allowed - RFC 1034/1035 violation
spec:
  zoneRef: example-com
  name: "@"
  target: www.example.com.

For the zone apex, use A Records or AAAA Records instead.

No Other Records for Same Name

If a CNAME exists for a name, no other record types can exist for that same name (RFC 1034):

# ❌ Not allowed - www already has a CNAME
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: www-alias
spec:
  zoneRef: example-com
  name: www
  target: server.example.com.
---
# ❌ This will conflict with the CNAME above
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-a-record
spec:
  zoneRef: example-com
  name: www  # Same name as CNAME - not allowed
  ipv4Address: "192.0.2.1"

Common Use Cases

Aliasing to External Services

Point to external services like CDNs or cloud providers:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cdn-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: cdn
  target: d111111abcdef8.cloudfront.net.
  ttl: 3600

Subdomain Aliases

Create aliases for subdomains:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: shop-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: shop
  target: www.example.com.
  ttl: 300

This creates shop.example.com -> www.example.com.

Internal Service Discovery

Point to internal Kubernetes services:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cache-internal
  namespace: dns-system
spec:
  zoneRef: internal-local
  name: cache
  target: db.internal.local.
  ttl: 300

www to Non-www Redirect

Create a www alias:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: www
  target: example.com.
  ttl: 300

Note: This only works if example.com has an A or AAAA record, not another CNAME.

Field Reference

FieldTypeRequiredDescription
zonestringEither zone or zoneRefDNS zone name (e.g., “example.com”)
zoneRefstringEither zone or zoneRefReference to DNSZone metadata.name
namestringYesRecord name within the zone (cannot be “@”)
targetstringYesTarget FQDN ending with a dot
ttlintegerNoTime To Live in seconds (default: zone TTL)

TTL Behavior

If ttl is not specified, the zone’s default TTL is used:

# Uses zone default TTL
spec:
  zoneRef: example-com
  name: blog
  target: www.example.com.
# Explicit TTL override
spec:
  zoneRef: example-com
  name: blog
  target: www.example.com.
  ttl: 600  # 10 minutes

Troubleshooting

CNAME Loop Detection

Avoid creating CNAME loops:

# ❌ Creates a loop
# a.example.com -> b.example.com
# b.example.com -> a.example.com
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cname-a
spec:
  zoneRef: example-com
  name: a
  target: b.example.com.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cname-b
spec:
  zoneRef: example-com
  name: b
  target: a.example.com.  # ❌ Loop!

Missing Trailing Dot

If your CNAME doesn’t resolve correctly, check for the trailing dot:

# Check the BIND9 zone file
kubectl exec -n dns-system bindy-primary-0 -- cat /etc/bind/zones/example.com.zone

# Should show:
# blog.example.com.  300  IN  CNAME  www.example.com.

If you see relative names, the target is missing the trailing dot:

# ❌ Wrong - becomes blog.example.com -> www.example.com.example.com
blog.example.com.  300  IN  CNAME  www.example.com

See Also

MX Records (Mail Exchange)

MX records specify the mail servers responsible for accepting email on behalf of a domain. Each MX record includes a priority value that determines the order in which mail servers are contacted.

Creating an MX Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-example
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: "@"             # Zone apex - mail for @example.com
  priority: 10
  mailServer: mail.example.com.  # Must end with a dot (FQDN)
  ttl: 3600

This configures mail delivery for example.com to mail.example.com with priority 10.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details on choosing between zone and zoneRef.

FQDN Requirement

CRITICAL: The mailServer field MUST end with a dot (.) to indicate a fully qualified domain name (FQDN).

# ✅ CORRECT
mailServer: mail.example.com.

# ❌ WRONG - will be treated as relative to zone
mailServer: mail.example.com

Priority Values

Lower priority values are preferred. Mail servers with the lowest priority are contacted first.

Single Mail Server

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-primary
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail.example.com.

Multiple Mail Servers (Failover)

# Primary mail server (lowest priority)
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-primary
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail1.example.com.
  ttl: 3600
---
# Backup mail server
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-backup
spec:
  zoneRef: example-com
  name: "@"
  priority: 20
  mailServer: mail2.example.com.
  ttl: 3600

Sending servers will try mail1.example.com first (priority 10), falling back to mail2.example.com (priority 20) if the primary is unavailable.

Load Balancing

Equal priority values enable round-robin load balancing:

# Server 1
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-1
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail1.example.com.
---
# Server 2 (same priority)
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-2
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail2.example.com.

Both servers share the load equally.

Subdomain Mail

Configure mail for a subdomain:

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: support-mail
spec:
  zoneRef: example-com
  name: support  # Email: user@support.example.com
  priority: 10
  mailServer: mail-support.example.com.

Common Configurations

Google Workspace (formerly G Suite)

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-google-1
spec:
  zoneRef: example-com
  name: "@"
  priority: 1
  mailServer: aspmx.l.google.com.
  ttl: 3600
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-google-2
spec:
  zoneRef: example-com
  name: "@"
  priority: 5
  mailServer: alt1.aspmx.l.google.com.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-google-3
spec:
  zoneRef: example-com
  name: "@"
  priority: 5
  mailServer: alt2.aspmx.l.google.com.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-google-4
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: alt3.aspmx.l.google.com.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-google-5
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: alt4.aspmx.l.google.com.

Microsoft 365

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-microsoft
spec:
  zoneRef: example-com
  name: "@"
  priority: 0
  mailServer: example-com.mail.protection.outlook.com.  # Replace 'example-com' with your domain
  ttl: 3600

Self-Hosted Mail Server

# Primary MX
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-primary
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail.example.com.
---
# Corresponding A record for mail server
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: mail-server
spec:
  zoneRef: example-com
  name: mail
  ipv4Address: "203.0.113.10"

Best Practices

  1. Always use FQDNs - End mailServer values with a dot (.)
  2. Set appropriate TTLs - Use longer TTLs (3600-86400) for stable mail configurations
  3. Configure backups - Use multiple MX records with different priorities for redundancy
  4. Test mail delivery - Verify mail flow after DNS changes
  5. Coordinate with SPF/DKIM - Update TXT records when adding mail servers

Required Supporting Records

MX records need corresponding A/AAAA records for the mail servers:

# MX record points to mail.example.com
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-main
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail.example.com.
---
# A record for mail.example.com
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: mail-server-ipv4
spec:
  zoneRef: example-com
  name: mail
  ipv4Address: "203.0.113.10"
---
# AAAA record for IPv6
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: mail-server-ipv6
spec:
  zoneRef: example-com
  name: mail
  ipv6Address: "2001:db8::10"

Status Monitoring

Check the status of your MX record:

kubectl get mxrecord mx-primary -o yaml

Look for the status.conditions field:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

Troubleshooting

Mail not being delivered

  1. Check MX record status:

    kubectl get mxrecord mx-primary -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
    
  2. Verify DNS propagation:

    dig MX example.com @<dns-server-ip>
    
  3. Test from external servers:

    nslookup -type=MX example.com 8.8.8.8
    
  4. Check mail server A/AAAA records exist:

    dig A mail.example.com
    

Common Mistakes

  • Missing trailing dot - mail.example.com instead of mail.example.com.
  • No A/AAAA record - MX points to a hostname that doesn’t resolve
  • Wrong priority - Higher priority when you meant lower (remember: lower = preferred)
  • Relative vs absolute - Without trailing dot, name is treated as relative to zone

Testing Mail Configuration

Test MX lookup

# Query MX records
dig MX example.com

# Expected output shows priority and mail server
;; ANSWER SECTION:
example.com.  3600  IN  MX  10 mail.example.com.
example.com.  3600  IN  MX  20 mail2.example.com.

Test mail server connectivity

# Test SMTP connection
telnet mail.example.com 25

# Or using openssl for TLS
openssl s_client -starttls smtp -connect mail.example.com:25

Next Steps

TXT Records (Text)

TXT records store arbitrary text data in DNS. They’re commonly used for domain verification, email security (SPF, DKIM, DMARC), and other service configurations.

Creating a TXT Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: verification-txt
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: "@"
  text: "v=spf1 include:_spf.example.com ~all"
  ttl: 3600

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details.

Common Use Cases

SPF (Sender Policy Framework)

Authorize mail servers to send email on behalf of your domain:

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf-record
spec:
  zoneRef: example-com
  name: "@"
  text: "v=spf1 mx include:_spf.google.com ~all"
  ttl: 3600

Common SPF mechanisms:

  • mx - Allow servers in MX records
  • a - Allow A/AAAA records of domain
  • ip4:192.0.2.0/24 - Allow specific IPv4 range
  • include:domain.com - Include another domain’s SPF policy
  • ~all - Soft fail (recommended)
  • -all - Hard fail (strict)

DKIM (Domain Keys Identified Mail)

Publish DKIM public keys:

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dkim-selector
spec:
  zoneRef: example-com
  name: default._domainkey  # selector._domainkey format
  text: "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBA..."
  ttl: 3600

DMARC (Domain-based Message Authentication)

Set email authentication policy:

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dmarc-policy
spec:
  zoneRef: example-com
  name: _dmarc
  text: "v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com"
  ttl: 3600

DMARC policies:

  • p=none - Monitor only (recommended for testing)
  • p=quarantine - Treat failures as spam
  • p=reject - Reject failures outright

Domain Verification

Verify domain ownership for services:

# Google verification
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: google-verification
spec:
  zoneRef: example-com
  name: "@"
  text: "google-site-verification=1234567890abcdef"
---
# Microsoft verification
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: ms-verification
spec:
  zoneRef: example-com
  name: "@"
  text: "MS=ms12345678"

Service-Specific Records

Atlassian Domain Verification

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: atlassian-verify
spec:
  zoneRef: example-com
  name: "@"
  text: "atlassian-domain-verification=abc123"

Stripe Domain Verification

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: stripe-verify
spec:
  zoneRef: example-com
  name: "_stripe-verification"
  text: "stripe-verification=xyz789"

Multiple TXT Values

Some records require multiple TXT strings. Create separate records:

# SPF record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: txt-spf
spec:
  zoneRef: example-com
  name: "@"
  text: "v=spf1 include:_spf.google.com ~all"
---
# Domain verification (same name, different value)
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: txt-verify
spec:
  zoneRef: example-com
  name: "@"
  text: "google-site-verification=abc123"

Both records will exist under the same DNS name.

String Formatting

Long Strings

DNS TXT records have a 255-character limit per string. For longer values, the DNS server automatically splits them:

spec:
  text: "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQC..."  # Can be long

Special Characters

Quote strings containing spaces or special characters:

spec:
  text: "This string contains spaces"
  text: "key=value; another-key=another value"

Best Practices

  1. Keep TTLs moderate - 3600 (1 hour) is typical for TXT records
  2. Test before deploying - Verify SPF/DKIM/DMARC records with online tools
  3. Monitor DMARC reports - Set up rua and ruf addresses to receive reports
  4. Start with soft policies - Use ~all for SPF and p=none for DMARC initially
  5. Document record purposes - Use clear resource names

Status Monitoring

kubectl get txtrecord spf-record -o yaml
status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
  observedGeneration: 1

Troubleshooting

Test TXT record

# Query TXT records
dig TXT example.com

# Test SPF
dig TXT example.com | grep spf

# Test DKIM
dig TXT default._domainkey.example.com

# Test DMARC
dig TXT _dmarc.example.com

Online Validation Tools

Common Issues

  • SPF too long - Limit DNS lookups to 10 (use include wisely)
  • DKIM not found - Verify selector name matches mail server configuration
  • DMARC syntax error - Validate with online tools before deploying

Next Steps

NS Records (Name Server)

NS records delegate a subdomain to a different set of nameservers. This is essential for subdomain delegation and zone distribution.

Creating an NS Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: subdomain-ns
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: sub              # Subdomain to delegate
  nameserver: ns1.subdomain-host.com.  # Must end with dot (FQDN)
  ttl: 3600

This delegates sub.example.com to ns1.subdomain-host.com.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details.

Subdomain Delegation

Delegate a subdomain to external nameservers:

# Primary nameserver
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: dev-ns1
spec:
  zoneRef: example-com
  name: dev
  nameserver: ns1.hosting-provider.com.
---
# Secondary nameserver
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: dev-ns2
spec:
  zoneRef: example-com
  name: dev
  nameserver: ns2.hosting-provider.com.

Now dev.example.com is managed by the hosting provider’s DNS servers.

Common Use Cases

Multi-Cloud Delegation

# Delegate subdomain to AWS Route 53
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: aws-ns1
spec:
  zoneRef: example-com
  name: aws
  nameserver: ns-123.awsdns-12.com.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: aws-ns2
spec:
  zoneRef: example-com
  name: aws
  nameserver: ns-456.awsdns-45.net.

Environment Separation

# Production environment
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: prod-ns1
spec:
  zoneRef: example-com
  name: prod
  nameserver: ns-prod1.example.com.
---
# Staging environment
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: staging-ns1
spec:
  zoneRef: example-com
  name: staging
  nameserver: ns-staging1.example.com.

FQDN Requirement

CRITICAL: The nameserver field MUST end with a dot (.):

# ✅ CORRECT
nameserver: ns1.example.com.

# ❌ WRONG
nameserver: ns1.example.com

Glue Records

When delegating to nameservers within the delegated zone, you need glue records (A/AAAA):

# NS delegation
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: sub-ns
spec:
  zoneRef: example-com
  name: sub
  nameserver: ns1.sub.example.com.  # Nameserver is within delegated zone
---
# Glue record (A record for the nameserver)
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: sub-ns-glue
spec:
  zoneRef: example-com
  name: ns1.sub
  ipv4Address: "203.0.113.10"

Best Practices

  1. Use multiple NS records - Always specify at least 2 nameservers for redundancy
  2. FQDNs only - Always end nameserver values with a dot
  3. Match TTLs - Use consistent TTLs across NS records for the same subdomain
  4. Glue records - Provide A/AAAA records when NS points within delegated zone
  5. Test delegation - Verify subdomain resolution after delegation

Status Monitoring

kubectl get nsrecord subdomain-ns -o yaml
status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
  observedGeneration: 1

Troubleshooting

Test NS delegation

# Query NS records
dig NS sub.example.com

# Test resolution through delegated nameservers
dig @ns1.subdomain-host.com www.sub.example.com

Common Issues

  • Missing glue records - Circular dependency if NS points within delegated zone
  • Wrong FQDN - Missing trailing dot causes relative name
  • Single nameserver - No redundancy if one server fails

Next Steps

SRV Records (Service Location)

SRV records specify the location of services, including hostname and port number. They’re used for service discovery in protocols like SIP, XMPP, LDAP, and Minecraft.

Creating an SRV Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: xmpp-server
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  service: xmpp-client  # Service name (without leading underscore)
  proto: tcp            # Protocol: tcp or udp
  name: "@"             # Domain (use @ for zone apex)
  priority: 10
  weight: 50
  port: 5222
  target: xmpp.example.com.  # Must end with dot (FQDN)
  ttl: 3600

This creates _xmpp-client._tcp.example.com pointing to xmpp.example.com:5222.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details.

SRV Record Format

The DNS name format is: _service._proto.name.domain

  • service: Service name (e.g., xmpp-client, sip, ldap)
  • proto: Protocol (tcp or udp)
  • name: Subdomain or @ for zone apex
  • priority: Lower values are preferred (like MX records)
  • weight: For load balancing among equal priorities (0-65535)
  • port: Service port number
  • target: Hostname providing the service (FQDN with trailing dot)

Common Services

XMPP (Jabber)

# Client connections
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: xmpp-client
spec:
  zoneRef: example-com
  service: xmpp-client
  proto: tcp
  name: "@"
  priority: 5
  weight: 0
  port: 5222
  target: xmpp.example.com.
---
# Server-to-server
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: xmpp-server
spec:
  zoneRef: example-com
  service: xmpp-server
  proto: tcp
  name: "@"
  priority: 5
  weight: 0
  port: 5269
  target: xmpp.example.com.

SIP (VoIP)

# SIP over TCP
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-tcp
spec:
  zoneRef: example-com
  service: sip
  proto: tcp
  name: "@"
  priority: 10
  weight: 50
  port: 5060
  target: sip.example.com.
---
# SIP over UDP
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-udp
spec:
  zoneRef: example-com
  service: sip
  proto: udp
  name: "@"
  priority: 10
  weight: 50
  port: 5060
  target: sip.example.com.

LDAP

apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: ldap-service
spec:
  zoneRef: example-com
  service: ldap
  proto: tcp
  name: "@"
  priority: 0
  weight: 100
  port: 389
  target: ldap.example.com.

Minecraft Server

apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: minecraft
spec:
  zoneRef: example-com
  service: minecraft
  proto: tcp
  name: "@"
  priority: 0
  weight: 5
  port: 25565
  target: mc.example.com.

Priority and Weight

Failover with Priority

# Primary server (priority 10)
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-primary
spec:
  zoneRef: example-com
  service: sip
  proto: tcp
  name: "@"
  priority: 10
  weight: 0
  port: 5060
  target: sip1.example.com.
---
# Backup server (priority 20)
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-backup
spec:
  zoneRef: example-com
  service: sip
  proto: tcp
  name: "@"
  priority: 20
  weight: 0
  port: 5060
  target: sip2.example.com.

Load Balancing with Weight

# Server 1 (weight 70 = 70% of traffic)
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-1
spec:
  zoneRef: example-com
  service: xmpp-client
  proto: tcp
  name: "@"
  priority: 10
  weight: 70
  port: 5222
  target: xmpp1.example.com.
---
# Server 2 (weight 30 = 30% of traffic)
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-2
spec:
  zoneRef: example-com
  service: xmpp-client
  proto: tcp
  name: "@"
  priority: 10
  weight: 30
  port: 5222
  target: xmpp2.example.com.

FQDN Requirement

CRITICAL: The target field MUST end with a dot (.):

# ✅ CORRECT
target: server.example.com.

# ❌ WRONG
target: server.example.com

Required Supporting Records

SRV records need corresponding A/AAAA records for targets:

# SRV record
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: service-srv
spec:
  zoneRef: example-com
  service: myservice
  proto: tcp
  name: "@"
  priority: 10
  weight: 0
  port: 8080
  target: server.example.com.
---
# A record for target
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: server
spec:
  zoneRef: example-com
  name: server
  ipv4Address: "203.0.113.50"

Best Practices

  1. Always use FQDNs - End target values with a dot
  2. Multiple servers - Use priority/weight for redundancy and load balancing
  3. Match protocols - Create both TCP and UDP records if service supports both
  4. Test clients - Verify client applications can discover services via SRV
  5. Document services - Clearly name resources for maintainability

Status Monitoring

kubectl get srvrecord xmpp-server -o yaml
status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
  observedGeneration: 1

Troubleshooting

Test SRV record

# Query SRV record
dig SRV _xmpp-client._tcp.example.com

# Expected output shows priority, weight, port, and target
;; ANSWER SECTION:
_xmpp-client._tcp.example.com. 3600 IN SRV 5 0 5222 xmpp.example.com.

Common Issues

  • Service not auto-discovered - Verify client supports SRV lookups
  • Missing A/AAAA for target - Target hostname must resolve
  • Wrong service/proto names - Must match what client expects (check docs)

Next Steps

CAA Records (Certificate Authority Authorization)

CAA records specify which Certificate Authorities (CAs) are authorized to issue SSL/TLS certificates for your domain. This helps prevent unauthorized certificate issuance.

Creating a CAA Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: letsencrypt-caa
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: "@"             # Apply to entire domain
  flags: 0              # Typically 0 (non-critical)
  tag: issue            # Tag: issue, issuewild, or iodef
  value: letsencrypt.org
  ttl: 3600

This authorizes Let’s Encrypt to issue certificates for example.com.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details.

CAA Tags

issue

Authorizes a CA to issue certificates for the domain:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-issue
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: letsencrypt.org  # Authorize Let's Encrypt

issuewild

Authorizes a CA to issue wildcard certificates:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-wildcard
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issuewild
  value: letsencrypt.org  # Allow wildcard certificates

iodef

Specifies URL/email for reporting policy violations:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-iodef-email
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: iodef
  value: mailto:security@example.com
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-iodef-url
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: iodef
  value: https://example.com/caa-report

Common Configurations

Let’s Encrypt

# Standard certificates
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-le-issue
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: letsencrypt.org
---
# Wildcard certificates
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-le-wildcard
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issuewild
  value: letsencrypt.org

DigiCert

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-digicert
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: digicert.com

AWS Certificate Manager

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-aws
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: amazon.com
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-aws-wildcard
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issuewild
  value: amazon.com

Multiple CAs

Authorize multiple Certificate Authorities:

# Let's Encrypt
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-letsencrypt
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: letsencrypt.org
---
# DigiCert
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-digicert
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: digicert.com

Deny All Issuance

Prevent any CA from issuing certificates:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-deny-all
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: ";"  # Semicolon means no CA is authorized

Flags

  • 0 - Non-critical (default, recommended)
  • 128 - Critical - CA MUST understand all CAA properties or refuse issuance

Most deployments use flags: 0.

Subdomain CAA Records

Apply CAA policy to specific subdomains:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-staging
spec:
  zoneRef: example-com
  name: staging  # staging.example.com
  flags: 0
  tag: issue
  value: letsencrypt.org  # Only Let's Encrypt for staging

Best Practices

  1. Start with permissive policies - Allow your current CA before enforcing restrictions
  2. Test thoroughly - Verify certificate renewal works after adding CAA
  3. Use iodef - Configure reporting to catch unauthorized issuance attempts
  4. Document authorized CAs - Maintain list of approved CAs in your security policy
  5. Regular audits - Review CAA records periodically

Certificate Authority Values

Common CA values for the issue and issuewild tags:

  • Let’s Encrypt: letsencrypt.org
  • DigiCert: digicert.com
  • AWS ACM: amazon.com
  • GlobalSign: globalsign.com
  • Sectigo (Comodo): sectigo.com
  • GoDaddy: godaddy.com
  • Google Trust Services: pki.goog

Check your CA’s documentation for the correct value.

Status Monitoring

kubectl get caarecord letsencrypt-caa -o yaml
status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
  observedGeneration: 1

Troubleshooting

Test CAA records

# Query CAA records
dig CAA example.com

# Expected output
;; ANSWER SECTION:
example.com. 3600 IN CAA 0 issue "letsencrypt.org"
example.com. 3600 IN CAA 0 issuewild "letsencrypt.org"

Certificate Issuance Failures

If certificate issuance fails after adding CAA:

  1. Verify CA is authorized:

    dig CAA example.com
    
  2. Check for typos in CA value

  3. Ensure both issue and issuewild are configured if using wildcards

  4. Test with online tools:

Common Mistakes

  • Wrong CA value - Each CA has a specific value (check their docs)
  • Missing issuewild - Wildcard certificates need separate authorization
  • Critical flag - Using flags: 128 can cause issues if CA doesn’t understand all tags

Security Benefits

  1. Prevent unauthorized issuance - CAs must check CAA before issuing
  2. Incident detection - iodef tag provides violation notifications
  3. Defense in depth - Additional layer beyond domain validation
  4. Compliance - Many security standards recommend CAA records

Next Steps

Configuration

Configure the Bindy DNS operator and BIND9 instances for your environment.

Controller Configuration

The Bindy controller is configured through environment variables set in the deployment.

See Environment Variables for details on all available configuration options.

BIND9 Instance Configuration

Configure BIND9 instances through the Bind9Instance custom resource:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: my-cluster
  role: primary
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
    dnssec:
      enabled: true
      validation: true

Configuration Options

Container Image Configuration

Customize the BIND9 container image and pull configuration:

spec:
  # At instance level (overrides cluster)
  image:
    image: "my-registry.example.com/bind9:custom"
    imagePullPolicy: "Always"
    imagePullSecrets:
      - my-registry-secret

Or configure at the cluster level for all instances:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: my-cluster
spec:
  # Default image configuration for all instances
  image:
    image: "internetsystemsconsortium/bind9:9.18"
    imagePullPolicy: "IfNotPresent"
    imagePullSecrets:
      - shared-pull-secret

Fields:

  • image: Full container image reference (e.g., registry/image:tag)
  • imagePullPolicy: Always, IfNotPresent, or Never
  • imagePullSecrets: List of secret names for private registries

Custom Configuration Files

Use custom ConfigMaps for BIND9 configuration:

spec:
  # Reference custom ConfigMaps
  configMapRefs:
    namedConf: "my-custom-named-conf"
    namedConfOptions: "my-custom-options"
    namedConfZones: "my-custom-zones"  # Optional: for zone definitions

Create your custom ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-named-conf
  namespace: dns-system
data:
  named.conf: |
    // Custom BIND9 configuration
    include "/etc/bind/named.conf.options";
    include "/etc/bind/zones/named.conf.zones";

    logging {
      channel custom_log {
        file "/var/log/named/queries.log" versions 3 size 5m;
        severity info;
      };
      category queries { custom_log; };
    };

Zones Configuration File:

If you need to provide a custom zones file (e.g., for pre-configured zones), create a ConfigMap with named.conf.zones:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-zones
  namespace: dns-system
data:
  named.conf.zones: |
    // Zone definitions
    zone "example.com" {
      type primary;
      file "/etc/bind/zones/example.com.zone";
    };

    zone "internal.local" {
      type primary;
      file "/etc/bind/zones/internal.local.zone";
    };

Then reference it in your Bind9Instance:

spec:
  configMapRefs:
    namedConfZones: "my-custom-zones"

Default Behavior:

  • If configMapRefs is not specified, Bindy auto-generates configuration from the config block
  • If custom ConfigMaps are provided, they take precedence
  • The namedConfZones ConfigMap is optional - only include it if you need to pre-configure zones
  • If no namedConfZones is provided, no zones file will be included (zones can be added dynamically via RNDC)

Recursion

Control whether the DNS server performs recursive queries:

spec:
  config:
    recursion: false  # Disable for authoritative servers

For authoritative DNS servers, recursion should be disabled.

Query Access Control

Specify which networks can query the DNS server:

spec:
  config:
    allowQuery:
      - "0.0.0.0/0"        # Allow from anywhere (public DNS)
      - "10.0.0.0/8"       # Private network only
      - "192.168.1.0/24"   # Specific subnet

Zone Transfer Access Control

Restrict zone transfers to authorized servers:

spec:
  config:
    allowTransfer:
      - "10.0.1.0/24"      # Secondary DNS network
      - "192.168.100.5"    # Specific secondary server

DNSSEC Configuration

Enable DNSSEC signing and validation:

spec:
  config:
    dnssec:
      enabled: true        # Enable DNSSEC signing
      validation: true     # Enable DNSSEC validation

RBAC Configuration

Configure Role-Based Access Control for the operator.

See RBAC for detailed RBAC setup.

Resource Limits

Set CPU and memory limits for BIND9 pods.

See Resource Limits for resource configuration.

Configuration Best Practices

  1. Separate Primary and Secondary - Use different instances for primary and secondary roles
  2. Limit Zone Transfers - Only allow transfers to known secondaries
  3. Enable DNSSEC - Use DNSSEC for production zones
  4. Set Appropriate Replicas - Use 2+ replicas for high availability
  5. Use Labels - Organize instances with meaningful labels

Next Steps

Environment Variables

Configure the Bindy controller using environment variables.

Controller Environment Variables

RUST_LOG

Control logging level:

env:
  - name: RUST_LOG
    value: "info"  # Options: error, warn, info, debug, trace

Levels:

  • error - Only errors
  • warn - Warnings and errors
  • info - Informational messages (default)
  • debug - Detailed debugging
  • trace - Very detailed tracing

RUST_LOG_FORMAT

Control logging output format:

env:
  - name: RUST_LOG_FORMAT
    value: "text"  # Options: text, json

Formats:

  • text - Human-readable compact text format (default)
  • json - Structured JSON format for log aggregation tools

Use JSON format for:

  • Kubernetes production deployments
  • Log aggregation systems (Loki, ELK, Splunk)
  • Centralized logging and monitoring
  • Automated log parsing and analysis

Example JSON output:

{
  "timestamp": "2025-11-30T10:00:00.123456Z",
  "level": "INFO",
  "message": "Starting BIND9 DNS Controller",
  "file": "main.rs",
  "line": 80,
  "threadName": "bindy-controller"
}

RECONCILE_INTERVAL

Set how often to reconcile resources (in seconds):

env:
  - name: RECONCILE_INTERVAL
    value: "300"  # 5 minutes

NAMESPACE

Limit operator to specific namespace:

env:
  - name: NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace

Omit to watch all namespaces (requires ClusterRole).

Example Deployment Configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: bindy
  namespace: dns-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bindy
  template:
    metadata:
      labels:
        app: bindy
    spec:
      serviceAccountName: bindy
      containers:
      - name: controller
        image: ghcr.io/firestoned/bindy:latest
        env:
        - name: RUST_LOG
          value: "info"
        - name: RUST_LOG_FORMAT
          value: "json"
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace

Best Practices

  1. Use info level in production - Balance between visibility and noise
  2. Enable debug for troubleshooting - Temporarily increase to debug level
  3. Use JSON format in production - Enable structured logging for better log aggregation
  4. Use text format for development - More readable for local debugging
  5. Set reconcile interval appropriately - Don’t set too low to avoid API pressure
  6. Use namespace scoping - Scope to specific namespace if not managing cluster-wide DNS

RBAC (Role-Based Access Control)

Configure Kubernetes RBAC for the Bindy controller.

Required Permissions

The Bindy controller needs permissions to:

  • Manage Bind9Instance, DNSZone, and DNS record resources
  • Create and manage Deployments, Services, ConfigMaps, and ServiceAccounts
  • Update resource status fields
  • Create events for logging

ClusterRole

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bindy-role
rules:
  # Bindy CRDs
  - apiGroups: ["bindy.firestoned.io"]
    resources:
      - "bind9instances"
      - "bind9instances/status"
      - "dnszones"
      - "dnszones/status"
      - "arecords"
      - "arecords/status"
      - "aaaarecords"
      - "aaaarecords/status"
      - "cnamerecords"
      - "cnamerecords/status"
      - "mxrecords"
      - "mxrecords/status"
      - "txtrecords"
      - "txtrecords/status"
      - "nsrecords"
      - "nsrecords/status"
      - "srvrecords"
      - "srvrecords/status"
      - "caarecords"
      - "caarecords/status"
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  
  # Kubernetes resources
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  - apiGroups: [""]
    resources: ["services", "configmaps", "serviceaccounts"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch"]

ServiceAccount

apiVersion: v1
kind: ServiceAccount
metadata:
  name: bindy
  namespace: dns-system

ClusterRoleBinding

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bindy-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: bindy-role
subjects:
- kind: ServiceAccount
  name: bindy
  namespace: dns-system

Namespace-Scoped RBAC

For namespace-scoped deployments, use Role instead of ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: bindy-role
  namespace: dns-system
rules:
  # Same rules as ClusterRole
  - apiGroups: ["bindy.firestoned.io"]
    resources: ["bind9instances", "dnszones", "*records"]
    verbs: ["*"]
  
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["*"]
  
  - apiGroups: [""]
    resources: ["services", "configmaps"]
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bindy-rolebinding
  namespace: dns-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: bindy-role
subjects:
- kind: ServiceAccount
  name: bindy
  namespace: dns-system

Applying RBAC

# Apply all RBAC resources
kubectl apply -f deploy/rbac/

# Verify ServiceAccount
kubectl get serviceaccount bindy -n dns-system

# Verify ClusterRole
kubectl get clusterrole bindy-role

# Verify ClusterRoleBinding
kubectl get clusterrolebinding bindy-rolebinding

Security Best Practices

  1. Least Privilege - Only grant necessary permissions
  2. Namespace Scoping - Use namespace-scoped roles when possible
  3. Separate ServiceAccounts - Don’t reuse default ServiceAccount
  4. Audit Regularly - Review permissions periodically
  5. Use Pod Security Policies - Restrict pod capabilities

Troubleshooting RBAC

Check if controller has required permissions:

# Check what the ServiceAccount can do
kubectl auth can-i list dnszones \
  --as=system:serviceaccount:dns-system:bindy

# Describe the ClusterRoleBinding
kubectl describe clusterrolebinding bindy-rolebinding

# Check controller logs for permission errors
kubectl logs -n dns-system deployment/bindy | grep -i forbidden

Resource Limits

Configure CPU and memory limits for BIND9 pods.

Setting Resource Limits

Configure resources in the Bind9Instance spec:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
spec:
  replicas: 2
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"

Small Deployment (Few zones)

resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"

Medium Deployment (Multiple zones)

resources:
  requests:
    cpu: "200m"
    memory: "256Mi"
  limits:
    cpu: "1000m"
    memory: "1Gi"

Large Deployment (Many zones, high traffic)

resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "2000m"
    memory: "2Gi"

Best Practices

  1. Set both requests and limits - Ensures predictable performance
  2. Start conservative - Begin with lower values and adjust based on monitoring
  3. Monitor usage - Use metrics to right-size resources
  4. Leave headroom - Don’t max out limits
  5. Consider query volume - High-traffic DNS needs more resources

Monitoring Resource Usage

# View pod resource usage
kubectl top pods -n dns-system -l app=bind9

# Describe pod to see limits
kubectl describe pod -n dns-system <pod-name>

Monitoring

Monitor the health and performance of your Bindy DNS infrastructure.

Status Conditions

All Bindy resources report their status using standardized conditions:

# Check Bind9Instance status
kubectl get bind9instance primary-dns -n dns-system -o jsonpath='{.status.conditions}'

# Check DNSZone status
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.conditions}'

See Status Conditions for detailed condition types.

Logging

View controller and BIND9 logs:

# Controller logs
kubectl logs -n dns-system deployment/bindy

# BIND9 instance logs
kubectl logs -n dns-system -l instance=primary-dns

# Follow logs
kubectl logs -n dns-system deployment/bindy -f

See Logging for log configuration.

Metrics

Monitor resource usage and performance:

# Pod resource usage
kubectl top pods -n dns-system

# Node resource usage
kubectl top nodes

See Metrics for detailed metrics.

Health Checks

BIND9 pods include liveness and readiness probes:

livenessProbe:
  exec:
    command: ["dig", "@localhost", "version.bind", "txt", "chaos"]
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  exec:
    command: ["dig", "@localhost", "version.bind", "txt", "chaos"]
  initialDelaySeconds: 5
  periodSeconds: 5

Check probe status:

kubectl describe pod -n dns-system <bind9-pod-name>

Monitoring Tools

Prometheus

Scrape metrics from BIND9 using bind_exporter:

# Add exporter sidecar to Bind9Instance
# (Future enhancement)

Grafana

Create dashboards for:

  • Query rate and latency
  • Zone transfer status
  • Resource usage
  • Error rates

Alerts

Set up alerts for:

  1. Pod crashes or restarts
  2. Failed zone transfers
  3. High query latency
  4. Resource exhaustion
  5. DNSSEC validation failures

Next Steps

Status Conditions

This document describes the standardized status conditions used across all Bindy CRDs.

Condition Types

All Bindy custom resources (Bind9Instance, DNSZone, and all DNS record types) use the following standardized condition types:

Ready

  • Description: Indicates whether the resource is fully operational and ready to serve its intended purpose
  • Common Use: Primary condition type used by all reconcilers
  • Status Values:
    • True: Resource is ready and operational
    • False: Resource is not ready (error or in progress)
    • Unknown: Status cannot be determined

Available

  • Description: Indicates whether the resource is available for use
  • Common Use: Used to distinguish between “ready” and “available” when resources may be ready but not yet serving traffic
  • Status Values:
    • True: Resource is available
    • False: Resource is not available
    • Unknown: Availability cannot be determined

Progressing

  • Description: Indicates whether the resource is currently being worked on
  • Common Use: During initial creation or updates
  • Status Values:
    • True: Resource is being created or updated
    • False: Resource is not currently progressing
    • Unknown: Progress status cannot be determined

Degraded

  • Description: Indicates that the resource is functioning but in a degraded state
  • Common Use: When some replicas are down but service continues, or when non-critical features are unavailable
  • Status Values:
    • True: Resource is degraded
    • False: Resource is not degraded
    • Unknown: Degradation status cannot be determined

Failed

  • Description: Indicates that the resource has failed and cannot fulfill its purpose
  • Common Use: Permanent failures that require intervention
  • Status Values:
    • True: Resource has failed
    • False: Resource has not failed
    • Unknown: Failure status cannot be determined

Condition Structure

All conditions follow this structure:

status:
  conditions:
    - type: Ready              # One of: Ready, Available, Progressing, Degraded, Failed
      status: "True"           # One of: "True", "False", "Unknown"
      reason: Ready            # Machine-readable reason (typically same as type)
      message: "Bind9Instance configured with 2 replicas"  # Human-readable message
      lastTransitionTime: "2024-11-26T10:00:00Z"          # RFC3339 timestamp
  observedGeneration: 1        # Generation last observed by controller
  # Resource-specific fields (replicas, recordCount, etc.)

Current Usage

Bind9Instance

  • Uses Ready condition type
  • Status True when Deployment, Service, and ConfigMap are successfully created
  • Status False when resource creation fails
  • Additional status fields:
    • replicas: Total number of replicas
    • readyReplicas: Number of ready replicas

DNSZone

  • Uses Ready condition type
  • Status True when zone file is created and instances are matched
  • Status False when zone creation fails
  • Additional status fields:
    • recordCount: Number of records in the zone
    • observedGeneration: Last observed generation

DNS Records (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)

  • All use Ready condition type
  • Status True when record is successfully added to zone
  • Status False when record creation fails
  • Additional status fields:
    • observedGeneration: Last observed generation

Best Practices

  1. Always set the condition type: Use one of the five standardized types
  2. Include timestamps: Set lastTransitionTime when condition status changes
  3. Provide clear messages: The message field should be human-readable and actionable
  4. Use appropriate reasons: The reason field should be machine-readable and consistent
  5. Update observedGeneration: Always update to match the resource’s current generation
  6. Multiple conditions: Resources can have multiple conditions simultaneously (e.g., Ready: True and Degraded: True)

Examples

Successful Bind9Instance

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Ready
      message: "Bind9Instance configured with 2 replicas"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  replicas: 2
  readyReplicas: 2

Failed DNSZone

status:
  conditions:
    - type: Ready
      status: "False"
      reason: Failed
      message: "No Bind9Instances matched selector"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  recordCount: 0

Progressing Deployment

status:
  conditions:
    - type: Progressing
      status: "True"
      reason: Progressing
      message: "Deployment is rolling out"
      lastTransitionTime: "2024-11-26T10:00:00Z"
    - type: Ready
      status: "False"
      reason: Progressing
      message: "Waiting for deployment to complete"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 2
  replicas: 2
  readyReplicas: 1

Validation

All condition types are enforced via CRD validation. Attempting to use a condition type not in the enum will result in a validation error:

$ kubectl apply -f invalid-condition.yaml
Error from server (Invalid): error when creating "invalid-condition.yaml":
Bind9Instance.bindy.firestoned.io "test" is invalid:
status.conditions[0].type: Unsupported value: "CustomType":
supported values: "Ready", "Available", "Progressing", "Degraded", "Failed"

Logging

Configure and analyze logs from the Bindy controller and BIND9 instances.

Controller Logging

Log Levels

Set log level via RUST_LOG environment variable:

env:
  - name: RUST_LOG
    value: "info"  # error, warn, info, debug, trace

Log Format

Set log output format via RUST_LOG_FORMAT environment variable:

env:
  - name: RUST_LOG_FORMAT
    value: "json"  # text or json (default: text)

Text format (default):

  • Human-readable compact format
  • Ideal for development and local debugging
  • Includes timestamps, file locations, and line numbers

JSON format:

  • Structured JSON output
  • Recommended for production Kubernetes deployments
  • Easy integration with log aggregation tools (Loki, ELK, Splunk)
  • Enables programmatic log parsing and analysis

Viewing Controller Logs

# View recent logs
kubectl logs -n dns-system deployment/bindy --tail=100

# Follow logs in real-time
kubectl logs -n dns-system deployment/bindy -f

# Filter by log level
kubectl logs -n dns-system deployment/bindy | grep ERROR

# Search for specific resource
kubectl logs -n dns-system deployment/bindy | grep "example-com"

BIND9 Instance Logging

BIND9 instances are configured by default to log to stderr, making logs available through standard Kubernetes logging commands.

Default Logging Configuration

Bindy automatically configures BIND9 with the following logging channels:

  • stderr_log: All logs directed to stderr for container-native logging
  • Severity: Info level by default (configurable)
  • Categories: Default, queries, security, zone transfers (xfer-in/xfer-out)
  • Format: Includes timestamps, categories, and severity levels

Viewing BIND9 Logs

# Logs from all BIND9 pods
kubectl logs -n dns-system -l app=bind9

# Logs from specific instance
kubectl logs -n dns-system -l instance=primary-dns

# Follow logs
kubectl logs -n dns-system -l instance=primary-dns -f --tail=50

Common Log Messages

Successful Zone Load:

zone example.com/IN: loaded serial 2024010101

Zone Transfer:

transfer of 'example.com/IN' from 10.0.1.10#53: Transfer completed

Query Logging (if enabled):

client @0x7f... 192.0.2.1#53210: query: www.example.com IN A

Log Aggregation

Using Fluentd/Fluent Bit

Collect logs to centralized logging:

# Example Fluent Bit DaemonSet configuration
# Automatically collects pod logs

Using Loki

Store and query logs with Grafana Loki:

# Query logs for DNS zone
{namespace="dns-system", app="bind9"} |= "example.com"

# Query for errors
{namespace="dns-system"} |= "ERROR"

Structured Logging

JSON Format

Enable JSON logging with RUST_LOG_FORMAT=json:

env:
  - name: RUST_LOG_FORMAT
    value: "json"

Example JSON output:

{
  "timestamp": "2025-11-30T10:00:00.123456Z",
  "level": "INFO",
  "message": "Reconciling DNSZone: dns-system/example-com",
  "file": "dnszone.rs",
  "line": 142,
  "threadName": "bindy-controller"
}

Text Format

Default human-readable format (RUST_LOG_FORMAT=text or unset):

2025-11-30T10:00:00.123456Z dnszone.rs:142 INFO bindy-controller Reconciling DNSZone: dns-system/example-com

Log Retention

Configure log retention based on your needs:

  • Development: 7 days
  • Production: 30-90 days
  • Compliance: As required by regulations

Troubleshooting with Logs

Find Failed Reconciliations

kubectl logs -n dns-system deployment/bindy | grep "ERROR\|Failed"

Track Zone Transfer Issues

kubectl logs -n dns-system -l dns-role=secondary | grep "transfer"

Monitor Resource Creation

kubectl logs -n dns-system deployment/bindy | grep "Creating\|Updating"

Best Practices

  1. Use appropriate log levels - info for production, debug for troubleshooting
  2. Use JSON format in production - Enable structured logging for better integration with log aggregation tools
  3. Use text format for development - More readable for local debugging and development
  4. Centralize logs - Use log aggregation for easier analysis
  5. Set up log rotation - Prevent disk space issues
  6. Create alerts - Alert on ERROR level logs
  7. Regular review - Periodically review logs for issues

Example Production Configuration

env:
  - name: RUST_LOG
    value: "info"
  - name: RUST_LOG_FORMAT
    value: "json"

Example Development Configuration

env:
  - name: RUST_LOG
    value: "debug"
  - name: RUST_LOG_FORMAT
    value: "text"

Changing Log Levels at Runtime

This guide explains how to change the controller’s log level without modifying code or redeploying the application.


Overview

The Bindy controller’s log level is configured via a ConfigMap (bindy-config), which allows runtime changes without code modifications. This is especially useful for:

  • Troubleshooting: Temporarily enable debug logging to investigate issues
  • Performance: Reduce log verbosity in production (info or warn)
  • Compliance: Meet PCI-DSS 3.4 requirements (no sensitive data in production logs)

Default Log Levels

EnvironmentLog LevelLog FormatRationale
ProductioninfojsonPCI-DSS compliant, structured logging for SIEM
StaginginfojsonProduction-like logging
DevelopmentdebugtextHuman-readable, detailed logging

Changing Log Level

# Change log level to debug
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-level": "debug"}}'

# Restart controller pods to apply changes
kubectl rollout restart deployment/bindy -n dns-system

# Verify new log level
kubectl logs -n dns-system -l app=bindy --tail=20

Available Log Levels:

  • error - Only errors (critical issues)
  • warn - Warnings and errors
  • info - Normal operations (default for production)
  • debug - Detailed reconciliation steps (troubleshooting)
  • trace - Extremely verbose (rarely needed)

Method 2: Direct Deployment Patch (Temporary)

For temporary debugging without ConfigMap changes:

# Enable debug logging (overrides ConfigMap)
kubectl set env deployment/bindy RUST_LOG=debug -n dns-system

# Revert to ConfigMap value
kubectl set env deployment/bindy RUST_LOG- -n dns-system

Warning: This method bypasses the ConfigMap and is lost on next deployment. Use for quick debugging only.


Changing Log Format

# Change to JSON format (production)
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-format": "json"}}'

# Change to text format (development)
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-format": "text"}}'

# Restart to apply
kubectl rollout restart deployment/bindy -n dns-system

Log Formats:

  • json - Structured JSON logs (recommended for production, SIEM integration)
  • text - Human-readable logs (recommended for development)

Verifying Log Level Changes

# Check current ConfigMap values
kubectl get configmap bindy-config -n dns-system -o yaml

# Check environment variables in running pod
kubectl exec -n dns-system deployment/bindy -- printenv | grep RUST_LOG

# View recent logs to confirm verbosity
kubectl logs -n dns-system -l app=bindy --tail=100

Production Log Level Best Practices

✅ DO:

  • Use info level in production - Balances visibility with performance
  • Use json format in production - Enables structured logging and SIEM integration
  • Temporarily enable debug for troubleshooting - Use ConfigMap, document in incident log
  • Revert to info after troubleshooting - Debug logs impact performance

❌ DON’T:

  • Leave debug enabled in production - Performance impact, log volume explosion
  • Use trace level - Extremely verbose, only for deep troubleshooting
  • Hardcode log levels in deployment - Use ConfigMap for runtime changes

Audit Debug Logs for Sensitive Data

Before enabling debug logging in production, verify no sensitive data is logged:

# Audit debug logs for secrets, passwords, keys
kubectl logs -n dns-system -l app=bindy --tail=1000 | \
  grep -iE '(password|secret|key|token|credential)'

# If sensitive data found, fix in code before enabling debug

PCI-DSS 3.4 Requirement: Mask or remove PAN (Primary Account Number) from all logs.

Bindy Compliance: Controller does not handle payment card data directly, but RNDC keys and DNS zone data are considered sensitive.


Troubleshooting Scenarios

Scenario 1: Controller Not Reconciling Zones

# Enable debug logging
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-level": "debug"}}'

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

# Watch logs for reconciliation details
kubectl logs -n dns-system -l app=bindy --follow

# Look for errors in reconciliation loop
kubectl logs -n dns-system -l app=bindy | grep -i error

Scenario 2: High Log Volume (Performance Issue)

# Reduce log level to warn
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-level": "warn"}}'

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

# Verify reduced log volume
kubectl logs -n dns-system -l app=bindy --tail=100

Scenario 3: SIEM Integration (Structured Logging)

# Ensure JSON format for SIEM
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-format": "json"}}'

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

# Verify JSON output
kubectl logs -n dns-system -l app=bindy --tail=10 | jq .

Log Level Change Procedures (Compliance)

For compliance audits (SOX 404, PCI-DSS), document log level changes:

Change Request Template

# Log Level Change Request

**Date:** 2025-12-18
**Requester:** [Your Name]
**Approver:** [Security Team Lead]
**Environment:** Production

**Current State:**
- Log Level: info
- Log Format: json

**Requested Change:**
- Log Level: debug
- Log Format: json
- Duration: 2 hours (for troubleshooting)

**Justification:**
Investigating slow DNS zone reconciliation (Incident INC-12345)

**Rollback Plan:**
Revert to info level after 2 hours or when issue is resolved

**Approved by:** [Security Team Lead Signature]

See Also

Metrics

Monitor performance and health metrics for Bindy DNS infrastructure.

Operator Metrics

Bindy exposes Prometheus-compatible metrics on port 8080 at /metrics. These metrics provide comprehensive observability into the operator’s behavior and resource management.

Accessing Metrics

The metrics endpoint is exposed on all operator pods:

# Port forward to the operator
kubectl port-forward -n dns-system deployment/bindy-controller 8080:8080

# View metrics
curl http://localhost:8080/metrics

Available Metrics

All metrics use the namespace prefix bindy_firestoned_io_.

Reconciliation Metrics

bindy_firestoned_io_reconciliations_total (Counter) Total number of reconciliation attempts by resource type and outcome.

Labels:

  • resource_type: Kind of resource (Bind9Cluster, Bind9Instance, DNSZone, ARecord, AAAARecord, TXTRecord, CNAMERecord, MXRecord, NSRecord, SRVRecord, CAARecord)
  • status: Outcome (success, error, requeue)
# Reconciliation success rate
rate(bindy_firestoned_io_reconciliations_total{status="success"}[5m])

# Error rate by resource type
rate(bindy_firestoned_io_reconciliations_total{status="error"}[5m])

bindy_firestoned_io_reconciliation_duration_seconds (Histogram) Duration of reconciliation operations in seconds.

Labels:

  • resource_type: Kind of resource

Buckets: 0.001, 0.01, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 30.0, 60.0

# Average reconciliation duration
rate(bindy_firestoned_io_reconciliation_duration_seconds_sum[5m])
/ rate(bindy_firestoned_io_reconciliation_duration_seconds_count[5m])

# 95th percentile latency
histogram_quantile(0.95, bindy_firestoned_io_reconciliation_duration_seconds_bucket)

bindy_firestoned_io_requeues_total (Counter) Total number of requeue operations.

Labels:

  • resource_type: Kind of resource
  • reason: Reason for requeue (error, rate_limit, dependency_wait)
# Requeue rate by reason
rate(bindy_firestoned_io_requeues_total[5m])

Resource Lifecycle Metrics

bindy_firestoned_io_resources_created_total (Counter) Total number of resources created.

Labels:

  • resource_type: Kind of resource

bindy_firestoned_io_resources_updated_total (Counter) Total number of resources updated.

Labels:

  • resource_type: Kind of resource

bindy_firestoned_io_resources_deleted_total (Counter) Total number of resources deleted.

Labels:

  • resource_type: Kind of resource

bindy_firestoned_io_resources_active (Gauge) Currently active resources being tracked.

Labels:

  • resource_type: Kind of resource
# Resource creation rate
rate(bindy_firestoned_io_resources_created_total[5m])

# Active resources by type
bindy_firestoned_io_resources_active

Error Metrics

bindy_firestoned_io_errors_total (Counter) Total number of errors by resource type and category.

Labels:

  • resource_type: Kind of resource
  • error_type: Category (api_error, validation_error, network_error, timeout, reconcile_error)
# Error rate by type
rate(bindy_firestoned_io_errors_total[5m])

# Errors by resource type
sum(rate(bindy_firestoned_io_errors_total[5m])) by (resource_type)

Leader Election Metrics

bindy_firestoned_io_leader_elections_total (Counter) Total number of leader election events.

Labels:

  • status: Event type (acquired, lost, renewed)

bindy_firestoned_io_leader_status (Gauge) Current leader election status (1 = leader, 0 = follower).

Labels:

  • pod_name: Name of the pod
# Current leader
bindy_firestoned_io_leader_status == 1

# Leader election rate
rate(bindy_firestoned_io_leader_elections_total[5m])

Performance Metrics

bindy_firestoned_io_generation_observation_lag_seconds (Histogram) Lag between resource spec generation change and controller observation.

Labels:

  • resource_type: Kind of resource

Buckets: 0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 30.0, 60.0, 120.0

# Average observation lag
rate(bindy_firestoned_io_generation_observation_lag_seconds_sum[5m])
/ rate(bindy_firestoned_io_generation_observation_lag_seconds_count[5m])

Prometheus Configuration

The operator deployment includes Prometheus scrape annotations:

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8080"
  prometheus.io/path: "/metrics"

Prometheus will automatically discover and scrape these metrics if configured with Kubernetes service discovery.

Example Queries

# Reconciliation success rate (last 5 minutes)
sum(rate(bindy_firestoned_io_reconciliations_total{status="success"}[5m]))
/ sum(rate(bindy_firestoned_io_reconciliations_total[5m]))

# DNSZone reconciliation p95 latency
histogram_quantile(0.95,
  sum(rate(bindy_firestoned_io_reconciliation_duration_seconds_bucket{resource_type="DNSZone"}[5m])) by (le)
)

# Error rate by resource type (last hour)
topk(10,
  sum(rate(bindy_firestoned_io_errors_total[1h])) by (resource_type)
)

# Active resources per type
sum(bindy_firestoned_io_resources_active) by (resource_type)

# Requeue backlog
sum(rate(bindy_firestoned_io_requeues_total[5m])) by (resource_type, reason)

Grafana Dashboard

Import the Bindy operator dashboard (coming soon) or create custom panels using the queries above.

Recommended panels:

  1. Reconciliation Rate - Total reconciliations/sec by resource type
  2. Reconciliation Latency - P50, P95, P99 latencies
  3. Error Rate - Errors/sec by resource type and error category
  4. Active Resources - Gauge showing current active resources
  5. Leader Status - Current leader pod and election events
  6. Resource Lifecycle - Created/Updated/Deleted rates

Resource Metrics

Pod Metrics

View CPU and memory usage:

# All DNS pods
kubectl top pods -n dns-system

# Specific instance
kubectl top pods -n dns-system -l instance=primary-dns

# Sort by CPU
kubectl top pods -n dns-system --sort-by=cpu

# Sort by memory
kubectl top pods -n dns-system --sort-by=memory

Node Metrics

# Node resource usage
kubectl top nodes

# Detailed node info
kubectl describe node <node-name>

DNS Query Metrics

Using BIND9 Statistics

Enable BIND9 statistics channel (future enhancement):

spec:
  config:
    statisticsChannels:
      - address: "127.0.0.1"
        port: 8053

Query Counters

Monitor query rate and types:

  • Total queries received
  • Queries by record type (A, AAAA, MX, etc.)
  • Successful vs failed queries
  • NXDOMAIN responses

Performance Metrics

Query Latency

Measure DNS query response time:

# Test query latency
time dig @<dns-server-ip> example.com

# Multiple queries for average
for i in {1..10}; do time dig @<dns-server-ip> example.com +short; done

Zone Transfer Metrics

Monitor zone transfer performance:

  • Transfer duration
  • Transfer size
  • Transfer failures
  • Lag between primary and secondary

Kubernetes Metrics

Resource Utilization

# View resource requests vs limits
kubectl describe pod -n dns-system <pod-name> | grep -A5 "Limits:\|Requests:"

Pod Health

# Pod status and restarts
kubectl get pods -n dns-system -o wide

# Events
kubectl get events -n dns-system --sort-by='.lastTimestamp'

Prometheus Integration

BIND9 Exporter

Deploy bind_exporter as sidecar (future enhancement):

containers:
- name: bind-exporter
  image: prometheuscommunity/bind-exporter:latest
  args:
    - "--bind.stats-url=http://localhost:8053"
  ports:
    - name: metrics
      containerPort: 9119

Service Monitor

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: bindy-metrics
spec:
  selector:
    matchLabels:
      app: bind9
  endpoints:
  - port: metrics
    interval: 30s

Key Metrics to Monitor

  1. Query Rate - Queries per second
  2. Query Latency - Response time
  3. Error Rate - Failed queries percentage
  4. Cache Hit Ratio - Cache effectiveness
  5. Zone Transfer Status - Success/failure of transfers
  6. Resource Usage - CPU and memory utilization
  7. Pod Health - Running vs desired replicas

Grafana Dashboards

Create dashboards for:

DNS Overview

  • Total query rate
  • Average latency
  • Error rate
  • Top queried domains

Instance Health

  • Pod status
  • CPU/memory usage
  • Restart count
  • Network I/O

Zone Management

  • Zones count
  • Records per zone
  • Zone transfer status
  • Serial numbers

Alerting Thresholds

Recommended alert thresholds:

MetricWarningCritical
CPU Usage> 70%> 90%
Memory Usage> 70%> 90%
Query Latency> 100ms> 500ms
Error Rate> 1%> 5%
Pod Restarts> 3/hour> 10/hour

Best Practices

  1. Baseline metrics - Establish normal operating ranges
  2. Set appropriate alerts - Avoid alert fatigue
  3. Monitor trends - Look for gradual degradation
  4. Capacity planning - Use metrics to plan scaling
  5. Regular review - Review dashboards weekly

Troubleshooting

Diagnose and resolve common issues with Bindy DNS operator.

Quick Diagnosis

Check Overall Health

# Check all resources
kubectl get all -n dns-system

# Check CRDs
kubectl get bind9instances,dnszones,arecords -A

# Check events
kubectl get events -n dns-system --sort-by='.lastTimestamp' | tail -20

View Status Conditions

# Bind9Instance status
kubectl get bind9instance primary-dns -n dns-system -o yaml | yq '.status'

# DNSZone status
kubectl get dnszone example-com -n dns-system -o yaml | yq '.status'

Common Issues

See Common Issues for frequently encountered problems and solutions.

DNS Record Zone Reference Issues

If you’re seeing “DNSZone not found” errors:

  • Records can use zone (matches DNSZone.spec.zoneName) or zoneRef (matches DNSZone.metadata.name)
  • Common mistake: Using zone: internal-local when the zone name is internal.local
  • See DNS Record Issues - DNSZone Not Found for detailed troubleshooting

Debugging Steps

See Debugging Guide for detailed debugging procedures.

FAQ

See FAQ for answers to frequently asked questions.

Getting Help

Check Logs

# Controller logs
kubectl logs -n dns-system deployment/bindy --tail=100

# BIND9 instance logs
kubectl logs -n dns-system -l instance=primary-dns

Describe Resources

# Describe Bind9Instance
kubectl describe bind9instance primary-dns -n dns-system

# Describe pods
kubectl describe pod -n dns-system <pod-name>

Check Resource Status

# Get detailed status
kubectl get bind9instance primary-dns -n dns-system -o jsonpath='{.status}' | jq

Escalation

If issues persist:

  1. Check Common Issues
  2. Review Debugging Guide
  3. Check FAQ
  4. Search GitHub issues: https://github.com/firestoned/bindy/issues
  5. Create a new issue with:
    • Kubernetes version
    • Bindy version
    • Resource YAMLs
    • Controller logs
    • Error messages

Next Steps

Error Handling and Retry Logic

Bindy implements robust error handling for DNS record reconciliation, ensuring the operator never crashes when encountering failures. Instead, it updates status conditions, creates Kubernetes Events, and automatically retries with configurable intervals.

Overview

When reconciling DNS records, several failure scenarios can occur:

  • DNSZone not found: No matching DNSZone resource exists
  • RNDC key loading fails: Cannot load the RNDC authentication Secret
  • BIND9 connection fails: Unable to connect to the BIND9 server
  • Record operation fails: BIND9 rejects the record operation

Bindy handles all these scenarios gracefully with:

  • ✅ Status condition updates following Kubernetes conventions
  • ✅ Kubernetes Events for visibility
  • ✅ Automatic retry with exponential backoff
  • ✅ Configurable retry intervals
  • ✅ Idempotent operations safe for multiple retries

Configuration

Retry Interval

Control how long to wait before retrying failed DNS record operations:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: bindy-operator
  namespace: bindy-system
spec:
  template:
    spec:
      containers:
      - name: bindy
        image: ghcr.io/firestoned/bindy:latest
        env:
        - name: BINDY_RECORD_RETRY_SECONDS
          value: "60"  # Default: 30 seconds

Recommendations:

  • Development: 10-15 seconds for faster iteration
  • Production: 30-60 seconds to avoid overwhelming the API server
  • High-load environments: 60-120 seconds to reduce reconciliation pressure

Error Scenarios

1. DNSZone Not Found

Scenario: DNS record references a zone that doesn’t exist

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zone: example.com  # No DNSZone with zoneName: example.com exists
  name: www
  ipv4Address: 192.0.2.1

Status:

status:
  conditions:
  - type: Ready
    status: "False"
    reason: ZoneNotFound
    message: "No DNSZone found for zone example.com in namespace dns-system"
    lastTransitionTime: "2025-11-29T23:45:00Z"
  observedGeneration: 1

Event:

Type     Reason         Message
Warning  ZoneNotFound   No DNSZone found for zone example.com in namespace dns-system

Resolution:

  1. Create the DNSZone resource:
    apiVersion: bindy.firestoned.io/v1alpha1
    kind: DNSZone
    metadata:
      name: example-com
      namespace: dns-system
    spec:
      zoneName: example.com
      clusterRef: bind9-primary
    
  2. Or fix the zone reference in the record if it’s a typo

2. RNDC Key Load Failed

Scenario: Cannot load the RNDC authentication Secret

Status:

status:
  conditions:
  - type: Ready
    status: "False"
    reason: RndcKeyLoadFailed
    message: "Failed to load RNDC key for cluster bind9-primary: Secret bind9-primary-rndc-key not found"
    lastTransitionTime: "2025-11-29T23:45:00Z"

Event:

Type     Reason              Message
Warning  RndcKeyLoadFailed   Failed to load RNDC key for cluster bind9-primary

Resolution:

  1. Check if the Secret exists:
    kubectl get secret -n dns-system bind9-primary-rndc-key
    
  2. Verify the Bind9Instance is running and has created its Secret:
    kubectl get bind9instance -n dns-system bind9-primary -o yaml
    
  3. If missing, the Bind9Instance reconciler should create it automatically

3. BIND9 Connection Failed

Scenario: Cannot connect to the BIND9 server (network issue, pod not ready, etc.)

Status:

status:
  conditions:
  - type: Ready
    status: "False"
    reason: RecordAddFailed
    message: "Cannot connect to BIND9 server at bind9-primary.dns-system.svc.cluster.local:953: connection refused. Will retry in 30s"
    lastTransitionTime: "2025-11-29T23:45:00Z"

Event:

Type     Reason           Message
Warning  RecordAddFailed  Cannot connect to BIND9 server at bind9-primary.dns-system.svc.cluster.local:953

Resolution:

  1. Check BIND9 pod status:
    kubectl get pods -n dns-system -l app=bind9-primary
    
  2. Check BIND9 logs:
    kubectl logs -n dns-system -l app=bind9-primary --tail=50
    
  3. Verify network connectivity:
    kubectl run -it --rm debug --image=nicolaka/netshoot --restart=Never -- \
      nc -zv bind9-primary.dns-system.svc.cluster.local 953
    
  4. The operator will automatically retry after the configured interval

4. Record Created Successfully

Scenario: DNS record successfully created in BIND9

Status:

status:
  conditions:
  - type: Ready
    status: "True"
    reason: RecordCreated
    message: "A record www.example.com created successfully"
    lastTransitionTime: "2025-11-29T23:45:00Z"
  observedGeneration: 1

Event:

Type    Reason         Message
Normal  RecordCreated  A record www.example.com created successfully

Monitoring

View Record Status

# List all DNS records with status
kubectl get arecords,aaaarecords,cnamerecords,mxrecords,txtrecords -A

# Check specific record status
kubectl get arecord www-example -n dns-system -o jsonpath='{.status.conditions[0]}' | jq .

# Find failing records
kubectl get arecords -A -o json | \
  jq -r '.items[] | select(.status.conditions[0].status == "False") |
  "\(.metadata.namespace)/\(.metadata.name): \(.status.conditions[0].reason) - \(.status.conditions[0].message)"'

View Events

# Recent events in namespace
kubectl get events -n dns-system --sort-by='.lastTimestamp' | tail -20

# Watch events in real-time
kubectl get events -n dns-system --watch

# Filter for DNS record events
kubectl get events -n dns-system --field-selector involvedObject.kind=ARecord

Prometheus Metrics

Bindy exposes reconciliation metrics (if enabled):

# Reconciliation errors by reason
bindy_reconcile_errors_total{resource="ARecord", reason="ZoneNotFound"}

# Reconciliation duration
histogram_quantile(0.95, bindy_reconcile_duration_seconds_bucket{resource="ARecord"})

Status Reason Codes

ReasonStatusMeaningAction Required
RecordCreatedReady=TrueDNS record successfully created in BIND9None - record is operational
ZoneNotFoundReady=FalseNo matching DNSZone resource existsCreate DNSZone or fix zone reference
RndcKeyLoadFailedReady=FalseCannot load RNDC key SecretVerify Bind9Instance is running and Secret exists
RecordAddFailedReady=FalseFailed to communicate with BIND9 or add recordCheck BIND9 pod status and network connectivity

Idempotent Operations

All BIND9 operations are idempotent, making them safe for controller retries:

add_zones / add_primary_zone / add_secondary_zone

  • add_zones: Centralized dispatcher that routes to add_primary_zone or add_secondary_zone based on zone type
  • add_primary_zone: Checks if zone exists before attempting to add primary zone
  • add_secondary_zone: Checks if zone exists before attempting to add secondary zone
  • All functions return success if zone already exists
  • Safe to call multiple times (idempotent)

reload_zone

  • Returns clear error if zone doesn’t exist
  • Otherwise performs reload operation
  • Safe to call multiple times

Record Operations

  • All record add/update operations are idempotent
  • Retrying a failed operation won’t create duplicates
  • Controller can safely requeue failed reconciliations

Best Practices

1. Monitor Status Conditions

Always check status conditions when debugging DNS record issues:

kubectl describe arecord www-example -n dns-system

Look for the Status section showing current conditions.

2. Use Events for Troubleshooting

Events provide a timeline of what happened:

kubectl get events -n dns-system --field-selector involvedObject.name=www-example

3. Adjust Retry Interval for Your Needs

  • Fast feedback during development: BINDY_RECORD_RETRY_SECONDS=10
  • Production stability: BINDY_RECORD_RETRY_SECONDS=60
  • High-load clusters: BINDY_RECORD_RETRY_SECONDS=120

4. Create DNSZones Before Records

To avoid ZoneNotFound errors, always create DNSZone resources before creating DNS records:

# 1. Create DNSZone
kubectl apply -f dnszone.yaml

# 2. Wait for it to be ready
kubectl wait --for=condition=Ready dnszone/example-com -n dns-system --timeout=60s

# 3. Create DNS records
kubectl apply -f records/

5. Use Labels for Organization

Tag related resources for easier monitoring:

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
  labels:
    app: web-frontend
    environment: production
spec:
  zone: example.com
  name: www
  ipv4Address: 192.0.2.1

Then filter:

kubectl get arecords -n dns-system -l environment=production

Troubleshooting Guide

Record Stuck in “ZoneNotFound”

  1. Verify DNSZone exists:
    kubectl get dnszones -A
    
  2. Check zone name matches:
    kubectl get dnszone example-com -n dns-system -o jsonpath='{.spec.zoneName}'
    
  3. Ensure they’re in the same namespace

Record Stuck in “RndcKeyLoadFailed”

  1. Check Secret exists:
    kubectl get secret -n dns-system {cluster-name}-rndc-key
    
  2. Verify Bind9Instance is Ready:
    kubectl get bind9instance -n dns-system
    
  3. Check Bind9Instance logs:
    kubectl logs -n bindy-system -l app=bindy-operator
    

Record Stuck in “RecordAddFailed”

  1. Check BIND9 pod is running:
    kubectl get pods -n dns-system -l app={cluster-name}
    
  2. Test network connectivity:
    kubectl run -it --rm debug --image=nicolaka/netshoot -- \
      nc -zv {cluster-name}.dns-system.svc.cluster.local 953
    
  3. Check BIND9 logs for errors:
    kubectl logs -n dns-system -l app={cluster-name} | grep -i error
    
  4. Verify RNDC is listening on port 953:
    kubectl exec -n dns-system {bind9-pod} -- ss -tlnp | grep 953
    

See Also

Common Issues

Solutions to frequently encountered problems.

Bind9Instance Issues

Pods Not Starting

Symptom: Bind9Instance created but pods not running

Diagnosis:

kubectl get pods -n dns-system -l instance=primary-dns
kubectl describe pod -n dns-system <pod-name>

Common Causes:

  1. Image pull errors - Check image name and registry access
  2. Resource limits - Insufficient CPU/memory on nodes
  3. RBAC issues - ServiceAccount lacks permissions

Solution:

# Check events
kubectl get events -n dns-system

# Fix resource limits
kubectl edit bind9instance primary-dns -n dns-system
# Increase resources.requests and resources.limits

# Verify RBAC
kubectl auth can-i create deployments \
  --as=system:serviceaccount:dns-system:bindy

ConfigMap Not Created

Symptom: ConfigMap missing for Bind9Instance

Diagnosis:

kubectl get configmap -n dns-system
kubectl logs -n dns-system deployment/bindy | grep ConfigMap

Solution:

# Check controller logs for errors
kubectl logs -n dns-system deployment/bindy --tail=50

# Delete and recreate instance
kubectl delete bind9instance primary-dns -n dns-system
kubectl apply -f instance.yaml

DNSZone Issues

No Instances Match Selector

Symptom: DNSZone status shows “No Bind9Instances matched selector”

Diagnosis:

kubectl get bind9instances -n dns-system --show-labels
kubectl get dnszone example-com -n dns-system -o yaml | yq '.spec.instanceSelector'

Solution:

# Verify labels on instances
kubectl label bind9instance primary-dns dns-role=primary -n dns-system

# Or update zone selector
kubectl edit dnszone example-com -n dns-system

Zone File Not Created

Symptom: Zone exists but no zone file in BIND9

Diagnosis:

kubectl exec -n dns-system deployment/primary-dns -- ls -la /var/lib/bind/zones/
kubectl logs -n dns-system deployment/bindy | grep "example-com"

Solution:

# Check if zone reconciliation succeeded
kubectl describe dnszone example-com -n dns-system

# Trigger reconciliation by updating zone
kubectl annotate dnszone example-com reconcile=true -n dns-system

DNS Record Issues

DNSZone Not Found

Symptom: Controller logs show “DNSZone not found” errors for a zone that exists

Example Error:

ERROR Failed to find DNSZone for zone 'internal-local' in namespace 'dns-system'

Root Cause: Mismatch between how the record references the zone and the actual DNSZone fields.

Diagnosis:

# Check what the record is trying to reference
kubectl get arecord www-example -n dns-system -o yaml | grep -A2 spec:

# Check available DNSZones
kubectl get dnszones -n dns-system

# Check the DNSZone details
kubectl get dnszone example-com -n dns-system -o yaml

Understanding the Problem:

DNS records can reference zones using two different fields:

  1. zone field - Matches against DNSZone.spec.zoneName (the actual DNS zone name like example.com)
  2. zoneRef field - Matches against DNSZone.metadata.name (the Kubernetes resource name like example-com)

Common mistakes:

  • Using zone: internal-local when spec.zoneName: internal.local (dots vs dashes)
  • Using zone: example-com when it should be zone: example.com
  • Using zoneRef: example.com when it should be zoneRef: example-com

Solution:

Option 1: Use zone field with the actual DNS zone name

spec:
  zone: example.com  # Must match DNSZone spec.zoneName
  name: www

Option 2: Use zoneRef field with the resource name (recommended)

spec:
  zoneRef: example-com  # Must match DNSZone metadata.name
  name: www

Example Fix:

Given this DNSZone:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: internal-local      # ← Resource name
  namespace: dns-system
spec:
  zoneName: internal.local  # ← Actual zone name

Wrong:

spec:
  zone: internal-local  # ✗ This looks for spec.zoneName = "internal-local"

Correct:

# Method 1: Use actual zone name
spec:
  zone: internal.local  # ✓ Matches spec.zoneName

# Method 2: Use resource name (more efficient)
spec:
  zoneRef: internal-local  # ✓ Matches metadata.name

Verification:

# After fixing, check the record reconciles
kubectl describe arecord www-example -n dns-system

# Should see no errors in events
kubectl get events -n dns-system --sort-by='.lastTimestamp' | tail -10

See Records Guide - Referencing DNS Zones for more details.

Record Not Appearing in Zone

Symptom: ARecord created but not in zone file

Diagnosis:

# Check record status
kubectl get arecord www-example -n dns-system -o yaml

# Check zone file
kubectl exec -n dns-system deployment/primary-dns -- cat /var/lib/bind/zones/example.com.zone

Solution:

# Verify zone reference is correct (use zone or zoneRef)
kubectl get arecord www-example -n dns-system -o yaml | grep -E 'zone:|zoneRef:'

# Check available DNSZones
kubectl get dnszones -n dns-system

# Update if incorrect - use zone (matches spec.zoneName) or zoneRef (matches metadata.name)
kubectl edit arecord www-example -n dns-system

DNS Query Not Resolving

Symptom: dig/nslookup fails to resolve

Diagnosis:

# Get DNS service IP
SERVICE_IP=$(kubectl get svc primary-dns -n dns-system -o jsonpath='{.spec.clusterIP}')

# Test query
dig @$SERVICE_IP www.example.com

# Check BIND9 logs
kubectl logs -n dns-system -l instance=primary-dns | tail -20

Solutions:

  1. Record doesn’t exist:
kubectl get arecords -n dns-system
kubectl apply -f record.yaml
  1. Zone not loaded:
kubectl logs -n dns-system -l instance=primary-dns | grep "loaded serial"
  1. Network policy blocking:
kubectl get networkpolicies -n dns-system

Zone Transfer Issues

Secondary Not Receiving Transfers

Symptom: Secondary instance not getting zone updates

Diagnosis:

# Check secondary logs
kubectl logs -n dns-system -l dns-role=secondary | grep transfer

# Check if zone has secondary IPs configured
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.secondaryIps}'

# Check if secondaries are discovered
kubectl get bind9instance -n dns-system -l role=secondary -o jsonpath='{.items[*].status.podIP}'

Automatic Configuration:

As of v0.1.0, Bindy automatically discovers secondary IPs and configures zone transfers:

  • Secondary pods are discovered via Kubernetes API using label selectors (role=secondary)
  • Primary zones are configured with also-notify and allow-transfer directives
  • Secondary IPs are stored in DNSZone.status.secondaryIps for tracking
  • When secondary pods restart/reschedule and get new IPs, zones are automatically updated

Manual Verification:

# Check if zone has secondary IPs in status
kubectl get dnszone example-com -n dns-system -o yaml | yq '.status.secondaryIps'

# Expected output: List of secondary pod IPs
# - 10.244.1.5
# - 10.244.2.8

# Verify zone configuration on primary
kubectl exec -n dns-system deployment/primary-dns -- \
  curl -s localhost:8080/api/zones/example.com | jq '.alsoNotify, .allowTransfer'

If Automatic Configuration Fails:

  1. Verify secondary instances are labeled correctly:

    kubectl get bind9instance -n dns-system -o yaml | yq '.items[].metadata.labels'
    
    # Expected labels for secondaries:
    # role: secondary
    # cluster: <cluster-name>
    
  2. Check DNSZone reconciler logs:

    kubectl logs -n dns-system deployment/bindy | grep "secondary"
    
  3. Verify network connectivity:

    # Test AXFR from secondary to primary
    kubectl exec -n dns-system deployment/secondary-dns -- \
      dig @primary-dns-service AXFR example.com
    

Recovery After Secondary Pod Restart:

When secondary pods are rescheduled and get new IPs:

  1. Detection: Reconciler automatically detects IP change within 5-10 minutes (next reconciliation)
  2. Update: Zones are deleted and recreated with new secondary IPs
  3. Transfer: Zone transfers resume automatically with new IPs

Manual Trigger (if needed):

# Force reconciliation by updating zone annotation
kubectl annotate dnszone example-com -n dns-system \
  reconcile.bindy.firestoned.io/trigger="$(date +%s)" --overwrite

Performance Issues

High Query Latency

Symptom: DNS queries taking too long

Diagnosis:

# Test query time
time dig @$SERVICE_IP example.com

# Check resource usage
kubectl top pods -n dns-system -l instance=primary-dns

Solutions:

  1. Increase resources:
spec:
  resources:
    limits:
      cpu: "1000m"
      memory: "1Gi"
  1. Add more replicas:
spec:
  replicas: 3
  1. Enable caching (if appropriate for your use case)

RBAC Issues

Forbidden Errors in Logs

Symptom: Controller logs show “Forbidden” errors

Diagnosis:

kubectl logs -n dns-system deployment/bindy | grep Forbidden

# Check permissions
kubectl auth can-i create deployments \
  --as=system:serviceaccount:dns-system:bindy \
  -n dns-system

Solution:

# Reapply RBAC
kubectl apply -f deploy/rbac/

# Verify ClusterRoleBinding
kubectl get clusterrolebinding bindy-rolebinding -o yaml

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

Next Steps

Debugging

Step-by-step guide to debugging Bindy DNS operator issues.

Debug Workflow

1. Identify the Problem

Determine what’s not working:

  • Bind9Instance not creating pods?
  • DNSZone not loading?
  • DNS records not resolving?
  • Zone transfers failing?

2. Check Resource Status

# Get high-level status
kubectl get bind9instances,dnszones,arecords -A

# Check specific resource
kubectl describe bind9instance primary-dns -n dns-system
kubectl describe dnszone example-com -n dns-system

3. Review Events

# Recent events
kubectl get events -n dns-system --sort-by='.lastTimestamp'

# Events for specific resource
kubectl describe dnszone example-com -n dns-system | grep -A10 Events

4. Examine Logs

# Controller logs
kubectl logs -n dns-system deployment/bindy --tail=100

# BIND9 instance logs
kubectl logs -n dns-system -l instance=primary-dns --tail=50

# Follow logs in real-time
kubectl logs -n dns-system deployment/bindy -f

Debugging Bind9Instance

Issue: Pods Not Starting

# 1. Check pod status
kubectl get pods -n dns-system -l instance=primary-dns

# 2. Describe pod
kubectl describe pod -n dns-system <pod-name>

# 3. Check events
kubectl get events -n dns-system --field-selector involvedObject.name=<pod-name>

# 4. Check logs if pod is running
kubectl logs -n dns-system <pod-name>

# 5. Check deployment
kubectl describe deployment primary-dns -n dns-system

Issue: ConfigMap Not Created

# 1. List ConfigMaps
kubectl get configmaps -n dns-system

# 2. Check controller logs
kubectl logs -n dns-system deployment/bindy | grep -i configmap

# 3. Check RBAC permissions
kubectl auth can-i create configmaps \
  --as=system:serviceaccount:dns-system:bindy \
  -n dns-system

# 4. Manually trigger reconciliation
kubectl annotate bind9instance primary-dns reconcile=true -n dns-system --overwrite

Debugging DNSZone

Issue: No Instances Match Selector

# 1. Check zone selector
kubectl get dnszone example-com -n dns-system -o yaml | grep -A5 instanceSelector

# 2. List instances with labels
kubectl get bind9instances -n dns-system --show-labels

# 3. Test selector match
kubectl get bind9instances -n dns-system \
  -l dns-role=primary,environment=production

# 4. Fix labels or selector
kubectl label bind9instance primary-dns dns-role=primary -n dns-system
# Or edit zone selector
kubectl edit dnszone example-com -n dns-system

Issue: Zone File Missing

# 1. Check if zone reconciliation succeeded
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.conditions}'

# 2. Exec into pod and check zones directory
kubectl exec -n dns-system deployment/primary-dns -- ls -la /var/lib/bind/zones/

# 3. Check BIND9 configuration
kubectl exec -n dns-system deployment/primary-dns -- cat /etc/bind/named.conf

# 4. Check BIND9 logs
kubectl logs -n dns-system -l instance=primary-dns | grep "example.com"

# 5. Reload BIND9 configuration
kubectl exec -n dns-system deployment/primary-dns -- rndc reload

Debugging DNS Records

Issue: Record Not in Zone File

# 1. Verify record exists
kubectl get arecord www-example -n dns-system

# 2. Check record status
kubectl get arecord www-example -n dns-system -o jsonpath='{.status}'

# 3. Verify zone reference
kubectl get arecord www-example -n dns-system -o jsonpath='{.spec.zone}'
# Should match a DNSZone resource name

# 4. Check zone file contents
kubectl exec -n dns-system deployment/primary-dns -- \
  cat /var/lib/bind/zones/example.com.zone

# 5. Trigger record reconciliation
kubectl annotate arecord www-example reconcile=true -n dns-system --overwrite

Issue: DNS Query Not Resolving

# 1. Get DNS service IP
SERVICE_IP=$(kubectl get svc primary-dns -n dns-system -o jsonpath='{.spec.clusterIP}')

# 2. Test query from within cluster
kubectl run -it --rm debug --image=nicolaka/netshoot --restart=Never -- \
  dig @$SERVICE_IP www.example.com

# 3. Test query from BIND9 pod directly
kubectl exec -n dns-system deployment/primary-dns -- \
  dig @localhost www.example.com

# 4. Check if zone is loaded
kubectl exec -n dns-system deployment/primary-dns -- \
  rndc status | grep "zones loaded"

# 5. Query zone status
kubectl exec -n dns-system deployment/primary-dns -- \
  rndc zonestatus example.com

Debugging Zone Transfers

Issue: Secondary Not Receiving Transfers

# 1. Check primary allows transfers
kubectl get bind9instance primary-dns -n dns-system \
  -o jsonpath='{.spec.config.allowTransfer}'

# 2. Check secondary configuration
kubectl get dnszone example-com-secondary -n dns-system \
  -o jsonpath='{.spec.secondaryConfig}'

# 3. Test network connectivity
kubectl exec -n dns-system deployment/secondary-dns -- \
  nc -zv primary-dns-service 53

# 4. Attempt manual transfer
kubectl exec -n dns-system deployment/secondary-dns -- \
  dig @primary-dns-service example.com AXFR

# 5. Check transfer logs
kubectl logs -n dns-system -l dns-role=secondary | grep -i transfer

# 6. Check NOTIFY messages
kubectl logs -n dns-system -l dns-role=primary | grep -i notify

Enable Debug Logging

Controller Debug Logging

# Edit controller deployment
kubectl set env deployment/bindy RUST_LOG=debug -n dns-system

# Or patch deployment
kubectl patch deployment bindy -n dns-system \
  -p '{"spec":{"template":{"spec":{"containers":[{"name":"controller","env":[{"name":"RUST_LOG","value":"debug"}]}]}}}}'

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

# View debug logs
kubectl logs -n dns-system deployment/bindy -f

Enable JSON Logging

For easier parsing and integration with log aggregation tools:

# Set JSON format
kubectl set env deployment/bindy RUST_LOG_FORMAT=json -n dns-system

# Or patch deployment for both debug level and JSON format
kubectl patch deployment bindy -n dns-system \
  -p '{"spec":{"template":{"spec":{"containers":[{"name":"controller","env":[{"name":"RUST_LOG","value":"debug"},{"name":"RUST_LOG_FORMAT","value":"json"}]}]}}}}'

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

# View JSON logs (can be piped to jq for parsing)
kubectl logs -n dns-system deployment/bindy -f | jq .

BIND9 Debug Logging

# Enable query logging
kubectl exec -n dns-system deployment/primary-dns -- \
  rndc querylog on

# View queries
kubectl logs -n dns-system -l instance=primary-dns -f | grep "query:"

# Disable query logging
kubectl exec -n dns-system deployment/primary-dns -- \
  rndc querylog off

Network Debugging

Test DNS Resolution

# From debug pod
kubectl run -it --rm debug --image=nicolaka/netshoot --restart=Never -- /bin/bash

# Inside pod:
dig @primary-dns-service.dns-system.svc.cluster.local www.example.com
nslookup www.example.com primary-dns-service.dns-system.svc.cluster.local
host www.example.com primary-dns-service.dns-system.svc.cluster.local

Check Network Policies

# List network policies
kubectl get networkpolicies -n dns-system

# Describe policy
kubectl describe networkpolicy <policy-name> -n dns-system

# Temporarily remove policy for testing
kubectl delete networkpolicy <policy-name> -n dns-system

Performance Debugging

Check Resource Usage

# Pod resource usage
kubectl top pods -n dns-system

# Node pressure
kubectl describe nodes | grep -A5 "Conditions:\|Allocated resources:"

# Detailed pod metrics
kubectl describe pod <pod-name> -n dns-system | grep -A10 "Limits:\|Requests:"

Profile DNS Queries

# Measure query latency
for i in {1..100}; do
  dig @$SERVICE_IP www.example.com +stats | grep "Query time:"
done | awk '{sum+=$4; count++} END {print "Average:", sum/count, "ms"}'

# Test concurrent queries
seq 1 100 | xargs -I{} -P10 dig @$SERVICE_IP www.example.com +short

Collect Diagnostic Information

Create Support Bundle

#!/bin/bash
# collect-diagnostics.sh

NAMESPACE="dns-system"
OUTPUT_DIR="bindy-diagnostics-$(date +%Y%m%d-%H%M%S)"

mkdir -p $OUTPUT_DIR

# Collect resources
kubectl get all -n $NAMESPACE -o yaml > $OUTPUT_DIR/resources.yaml
kubectl get bind9instances,dnszones,arecords,aaaarecords,cnamerecords -A -o yaml > $OUTPUT_DIR/crds.yaml

# Collect logs
kubectl logs -n $NAMESPACE deployment/bindy --tail=1000 > $OUTPUT_DIR/controller.log
kubectl logs -n $NAMESPACE -l app=bind9 --tail=1000 > $OUTPUT_DIR/bind9.log

# Collect events
kubectl get events -n $NAMESPACE --sort-by='.lastTimestamp' > $OUTPUT_DIR/events.txt

# Collect status
kubectl describe bind9instances -A > $OUTPUT_DIR/bind9instances-describe.txt
kubectl describe dnszones -A > $OUTPUT_DIR/dnszones-describe.txt

# Create archive
tar -czf $OUTPUT_DIR.tar.gz $OUTPUT_DIR/

echo "Diagnostics collected in $OUTPUT_DIR.tar.gz"

Next Steps

  • Common Issues - Known problems and solutions
  • FAQ - Frequently asked questions
  • Logging - Log configuration and analysis

FAQ (Frequently Asked Questions)

General

What is Bindy?

Bindy is a Kubernetes operator that manages BIND9 DNS servers using Custom Resource Definitions (CRDs). It allows you to manage DNS zones and records declaratively using Kubernetes resources.

Why use Bindy instead of manual BIND9 configuration?

  • Declarative: Define DNS infrastructure as Kubernetes resources
  • GitOps-friendly: Version control your DNS configuration
  • Kubernetes-native: Uses familiar kubectl commands
  • Automated: Controller handles BIND9 configuration and reloading
  • Scalable: Easy multi-region, multi-instance deployments

What BIND9 versions are supported?

Bindy supports BIND 9.16 and 9.18. The version is configurable per Bind9Instance.

Installation

Can I run Bindy in a namespace other than dns-system?

Yes, you can deploy Bindy in any namespace. Update the namespace in deployment YAMLs and RBAC resources.

Do I need cluster-admin permissions?

You need permissions to:

  • Create CRDs (cluster-scoped)
  • Create ClusterRole and ClusterRoleBinding
  • Create resources in the operator namespace

A cluster administrator can pre-install CRDs and RBAC, then delegate namespace management.

Configuration

How do I update BIND9 configuration?

Edit the Bind9Instance resource:

kubectl edit bind9instance primary-dns -n dns-system

The controller will automatically update the ConfigMap and restart pods if needed.

Can I use external BIND9 servers?

No, Bindy manages BIND9 instances running in Kubernetes. For external servers, consider DNS integration tools.

How do I enable query logging?

Currently, enable it manually in the BIND9 pod:

kubectl exec -n dns-system deployment/primary-dns -- rndc querylog on

Future versions may support configuration through Bind9Instance spec.

DNS Zones

How many zones can one instance host?

BIND9 can handle thousands of zones. Practical limits depend on:

  • Resource allocation (CPU/memory)
  • Query volume
  • Zone size

Start with 100-500 zones per instance and scale as needed.

Can I host the same zone on multiple instances?

Yes! Use label selectors to target multiple instances:

instanceSelector:
  matchLabels:
    environment: production

This deploys the zone to all matching instances.

How do I migrate zones between instances?

Update the DNSZone’s instance Selector:

instanceSelector:
  matchLabels:
    dns-role: new-primary

The zone will be created on new instances and you can delete from old ones.

DNS Records

How do I create multiple A records for the same name?

Create multiple ARecord resources with different names but same spec.name:

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-1
spec:
  zone: example-com
  name: www
  ipv4Address: "192.0.2.1"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-2
spec:
  zone: example-com
  name: www
  ipv4Address: "192.0.2.2"

Can I import existing zone files?

Not directly. You need to convert zone files to Bindy CRD resources. Future versions may include an import tool.

How do I delete all records in a zone?

kubectl delete arecords,aaaarecords,cnamerecords,mxrecords,txtrecords \
  -n dns-system -l zone=example-com

(If you label records with their zone)

Operations

How do I upgrade Bindy?

  1. Update CRDs: kubectl apply -k deploy/crds/
  2. Update controller: kubectl set image deployment/bindy controller=new-image
  3. Monitor rollout: kubectl rollout status deployment/bindy -n dns-system

How do I backup DNS configuration?

# Export all CRDs
kubectl get bind9instances,dnszones,arecords,aaaarecords,cnamerecords,mxrecords,txtrecords,nsrecords,srvrecords,caarecords \
  -A -o yaml > bindy-backup.yaml

Store in version control or backup storage.

How do I restore from backup?

kubectl apply -f bindy-backup.yaml

Can I run Bindy in high availability mode?

Yes, run multiple controller replicas:

spec:
  replicas: 2  # Multiple controller replicas

Only one will be active (leader election), others are standby.

Troubleshooting

Pods are crashlooping

Check pod logs and events:

kubectl logs -n dns-system <pod-name>
kubectl describe pod -n dns-system <pod-name>

Common causes:

  • Invalid BIND9 configuration
  • Insufficient resources
  • Image pull errors

DNS queries timing out

Check:

  1. Service is correctly exposing pods
  2. Pods are ready
  3. Query is reaching BIND9 (check logs)
  4. Zone is loaded
  5. Record exists
kubectl get svc -n dns-system
kubectl get pods -n dns-system
kubectl logs -n dns-system -l instance=primary-dns

Zone transfers not working

Ensure:

  1. Primary allows transfers: spec.config.allowTransfer
  2. Network connectivity between primary and secondary
  3. Secondary has correct primary server IPs
  4. Firewall rules allow TCP port 53

Performance

How do I optimize for high query volume?

  1. Increase replicas: More pods = more capacity
  2. Add resources: Increase CPU/memory limits
  3. Use caching: If appropriate for your use case
  4. Geographic distribution: Deploy instances near clients
  5. Load balancing: Use service load balancing

What are typical resource requirements?

Deployment SizeCPU RequestMemory RequestCPU LimitMemory Limit
Small (<50 zones)100m128Mi500m512Mi
Medium (50-500 zones)200m256Mi1000m1Gi
Large (500+ zones)500m512Mi2000m2Gi

Adjust based on actual usage monitoring.

Security

Is DNSSEC supported?

Yes, enable DNSSEC in Bind9Instance spec:

spec:
  config:
    dnssec:
      enabled: true
      validation: true

How do I restrict access to DNS queries?

Use allowQuery in Bind9Instance spec:

spec:
  config:
    allowQuery:
      - "10.0.0.0/8"  # Only internal network

Are zone transfers secure?

Zone transfers occur over TCP and can be restricted by IP address using allowTransfer. For additional security, consider:

  • Network policies
  • IPsec or VPN between regions
  • TSIG keys (future enhancement)

Integration

Can I use Bindy with external-dns?

Bindy manages internal DNS infrastructure. external-dns manages external DNS providers. They serve different purposes and can coexist.

Does Bindy work with Linkerd?

Yes, Bindy DNS servers can be used by Linkerd for internal DNS resolution. The DNS service has Linkerd injection disabled (DNS doesn’t work well with mesh sidecars), while management services can be Linkerd-injected for secure mTLS communication.

Can I integrate with existing DNS infrastructure?

Yes, configure Bindy instances as secondaries receiving transfers from existing primaries, or vice versa.

Next Steps

Replacing CoreDNS with Bind9GlobalCluster

Bind9GlobalCluster provides a powerful alternative to CoreDNS for cluster-wide DNS infrastructure. This guide explores using Bindy as a CoreDNS replacement in Kubernetes clusters.

Why Consider Replacing CoreDNS?

CoreDNS is the default DNS solution for Kubernetes, but you might want an alternative if you need:

  • Enterprise DNS Features: Advanced BIND9 capabilities like DNSSEC, dynamic updates via RNDC, and comprehensive zone management
  • Centralized DNS Management: Declarative DNS infrastructure managed via Kubernetes CRDs
  • GitOps-Ready DNS: DNS configuration as code, versioned and auditable
  • Integration with Existing Infrastructure: Organizations already using BIND9 for external DNS
  • Compliance Requirements: Full audit trails, signed releases, and documented controls (SOX, NIST 800-53)
  • Advanced Zone Management: Programmatic control over zones and records without editing configuration files

Architecture Comparison

CoreDNS (Default)

┌─────────────────────────────────────────┐
│ CoreDNS DaemonSet/Deployment            │
│ - Serves cluster.local queries          │
│ - Configured via ConfigMap               │
│ - Limited to Corefile syntax             │
└─────────────────────────────────────────┘

Characteristics:

  • Simple, built-in solution
  • ConfigMap-based configuration
  • Limited declarative management
  • Manual ConfigMap edits for changes

Bindy with Bind9GlobalCluster

┌──────────────────────────────────────────────────┐
│ Bind9GlobalCluster (cluster-scoped)             │
│ - Cluster-wide DNS infrastructure                │
│ - Platform team managed                          │
└──────────────────────────────────────────────────┘
         │
         ├─ Creates → Bind9Cluster (per namespace)
         │            └─ Creates → Bind9Instance (BIND9 pods)
         │
         └─ Referenced by DNSZones (any namespace)
                       └─ Records (A, AAAA, CNAME, MX, TXT, etc.)

Characteristics:

  • Declarative infrastructure-as-code
  • GitOps-ready (all configuration in YAML)
  • Dynamic updates via RNDC API (no restarts)
  • Full DNSSEC support
  • Programmatic record management
  • Multi-tenancy with RBAC

Use Cases

1. Platform DNS Service

Replace CoreDNS with a platform-managed DNS service accessible to all namespaces:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: platform-dns
  labels:
    app.kubernetes.io/component: dns
    app.kubernetes.io/part-of: platform-services
spec:
  version: "9.18"
  primary:
    replicas: 3  # HA for cluster DNS
    service:
      spec:
        type: ClusterIP
        clusterIP: 10.96.0.10  # Standard kube-dns ClusterIP
  secondary:
    replicas: 2
  global:
    recursion: true  # Important for cluster DNS
    allowQuery:
      - "0.0.0.0/0"
    forwarders:  # Forward external queries
      - "8.8.8.8"
      - "8.8.4.4"

Benefits:

  • High availability with multiple replicas
  • Declarative configuration (no ConfigMap editing)
  • Version-controlled DNS infrastructure
  • Gradual migration path from CoreDNS

2. Hybrid DNS Architecture

Use Bindy for application DNS while keeping CoreDNS for cluster.local:

# CoreDNS continues handling cluster.local
# Bindy handles application-specific zones

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: app-dns
spec:
  version: "9.18"
  primary:
    replicas: 2
  secondary:
    replicas: 1
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: internal-services
  namespace: platform
spec:
  zoneName: internal.example.com
  globalClusterRef: app-dns
  soaRecord:
    primaryNs: ns1.internal.example.com.
    adminEmail: platform.example.com.

Benefits:

  • Zero risk to existing cluster DNS
  • Application teams get advanced DNS features
  • Incremental adoption
  • Clear separation of concerns

3. Service Mesh Integration

Provide DNS for service mesh configurations:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: mesh-dns
  labels:
    linkerd.io/control-plane-ns: linkerd
spec:
  version: "9.18"
  primary:
    replicas: 2
    service:
      annotations:
        linkerd.io/inject: enabled
  global:
    recursion: false  # Authoritative only
    allowQuery:
      - "10.0.0.0/8"  # Service mesh network
---
# Application teams create zones
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone
  namespace: api-team
spec:
  zoneName: api.mesh.local
  globalClusterRef: mesh-dns
---
# Dynamic service records
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: api-v1
  namespace: api-team
spec:
  zoneRef: api-zone
  name: v1
  ipv4Address: "10.0.1.100"

Benefits:

  • Service mesh can use DNS for routing
  • Dynamic record updates without mesh controller changes
  • Platform team manages DNS infrastructure
  • Application teams manage their service records

Migration Strategies

Run Bindy alongside CoreDNS during migration:

  1. Deploy Bindy Global Cluster:

    apiVersion: bindy.firestoned.io/v1alpha1
    kind: Bind9GlobalCluster
    metadata:
      name: platform-dns-migration
    spec:
      version: "9.18"
      primary:
        replicas: 2
        service:
          spec:
            type: ClusterIP  # Different IP from CoreDNS
      global:
        recursion: true
        forwarders:
          - "8.8.8.8"
    
  2. Test DNS Resolution:

    # Get Bindy DNS service IP
    kubectl get svc -n dns-system -l app.kubernetes.io/name=bind9
    
    # Test queries
    dig @<bindy-service-ip> kubernetes.default.svc.cluster.local
    dig @<bindy-service-ip> google.com
    
  3. Gradually Migrate Applications: Update pod specs to use Bindy DNS:

    spec:
      dnsPolicy: None
      dnsConfig:
        nameservers:
          - <bindy-service-ip>
        searches:
          - default.svc.cluster.local
          - svc.cluster.local
          - cluster.local
    
  4. Switch Cluster Default (final step):

    # Update kubelet DNS config
    # Change --cluster-dns to Bindy service IP
    # Rolling restart nodes
    

Strategy 2: Zone-by-Zone Migration

Keep CoreDNS for cluster.local, migrate application zones:

  1. Keep CoreDNS for Cluster Services:

    # CoreDNS ConfigMap unchanged
    # Handles *.cluster.local, *.svc.cluster.local
    
  2. Create Application Zones in Bindy:

    apiVersion: bindy.firestoned.io/v1alpha1
    kind: DNSZone
    metadata:
      name: apps-zone
      namespace: platform
    spec:
      zoneName: apps.example.com
      globalClusterRef: platform-dns
    
  3. Configure Forwarding (CoreDNS → Bindy):

    # CoreDNS Corefile
    apps.example.com:53 {
      forward . <bindy-service-ip>
    }
    

Benefits:

  • Zero risk to cluster stability
  • Incremental testing
  • Easy rollback
  • Coexistence of both solutions

Configuration for Cluster DNS

Essential Settings

For cluster DNS replacement, configure these settings:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: cluster-dns
spec:
  version: "9.18"
  primary:
    replicas: 3  # HA requirement
    service:
      spec:
        type: ClusterIP
        clusterIP: 10.96.0.10  # kube-dns default
  global:
    # CRITICAL: Enable recursion for cluster DNS
    recursion: true

    # Allow queries from all pods
    allowQuery:
      - "0.0.0.0/0"

    # Forward external queries to upstream DNS
    forwarders:
      - "8.8.8.8"
      - "8.8.4.4"

    # Cluster.local zone configuration
    zones:
      - name: cluster.local
        type: forward
        forwarders:
          - "10.96.0.10"  # Forward to Bindy itself for cluster zones

Create these zones for Kubernetes cluster DNS:

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: cluster-local
  namespace: dns-system
spec:
  zoneName: cluster.local
  globalClusterRef: cluster-dns
  soaRecord:
    primaryNs: ns1.cluster.local.
    adminEmail: dns-admin.cluster.local.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: svc-cluster-local
  namespace: dns-system
spec:
  zoneName: svc.cluster.local
  globalClusterRef: cluster-dns
  soaRecord:
    primaryNs: ns1.svc.cluster.local.
    adminEmail: dns-admin.svc.cluster.local.

Advantages Over CoreDNS

1. Declarative Infrastructure

CoreDNS:

# Manual ConfigMap editing
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
data:
  Corefile: |
    .:53 {
        errors
        health
        # ... manual editing required
    }

Bindy:

# Infrastructure as code
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
# ... declarative specs
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
# ... versioned, reviewable YAML

2. Dynamic Updates

CoreDNS:

  • Requires ConfigMap changes
  • Requires pod restarts
  • No programmatic API

Bindy:

  • Dynamic record updates via RNDC
  • Zero downtime changes
  • Programmatic API (Kubernetes CRDs)

3. Multi-Tenancy

CoreDNS:

  • Single shared ConfigMap
  • No namespace isolation
  • Platform team controls everything

Bindy:

  • Platform team: Manages Bind9GlobalCluster
  • Application teams: Manage DNSZone and records in their namespace
  • RBAC-enforced isolation

4. Enterprise Features

Bindy Provides:

  • ✅ DNSSEC with automatic key management
  • ✅ Zone transfers (AXFR/IXFR)
  • ✅ Split-horizon DNS (views/ACLs)
  • ✅ Audit logging for compliance
  • ✅ SOA record management
  • ✅ Full BIND9 feature set

CoreDNS:

  • ❌ Limited DNSSEC support
  • ❌ No zone transfers
  • ❌ Basic view support
  • ❌ Limited audit capabilities

Operational Considerations

Performance

Memory Usage:

  • CoreDNS: ~30-50 MB per pod
  • Bindy (BIND9): ~100-200 MB per pod
  • Trade-off: More features, slightly higher resource use

Query Performance:

  • Both handle 10K+ queries/sec per pod
  • BIND9 excels at authoritative zones
  • CoreDNS excels at simple forwarding

Recommendation: Use Bindy where you need advanced features; CoreDNS is lighter for simple forwarding.

High Availability

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: ha-dns
spec:
  primary:
    replicas: 3  # Spread across zones
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/name: bind9
            topologyKey: kubernetes.io/hostname
  secondary:
    replicas: 2  # Read replicas for query load

Monitoring

# Check DNS cluster status
kubectl get bind9globalcluster -o wide

# Check instance health
kubectl get bind9instances -n dns-system

# Query metrics (if Prometheus enabled)
kubectl port-forward -n dns-system svc/bindy-metrics 8080:8080
curl localhost:8080/metrics | grep bindy_

Limitations

Not Suitable For:

  1. Clusters requiring ultra-low resource usage

    • CoreDNS is lighter for simple forwarding
  2. Simple forwarding-only scenarios

    • CoreDNS is simpler if you don’t need BIND9 features
  3. Rapid pod scaling (1000s/sec)

    • CoreDNS has slightly faster startup time

Well-Suited For:

  1. Enterprise environments with compliance requirements
  2. Multi-tenant platforms with RBAC requirements
  3. Complex DNS requirements (DNSSEC, zone transfers, dynamic updates)
  4. GitOps workflows where DNS is infrastructure-as-code
  5. Organizations standardizing on BIND9 across infrastructure

Best Practices

1. Start with Hybrid Approach

Keep CoreDNS for cluster.local, add Bindy for application zones:

# CoreDNS: cluster.local, svc.cluster.local
# Bindy: apps.example.com, internal.example.com

2. Use Health Checks

spec:
  primary:
    livenessProbe:
      tcpSocket:
        port: 53
      initialDelaySeconds: 30
    readinessProbe:
      exec:
        command: ["/usr/bin/dig", "@127.0.0.1", "health.check.local"]

3. Enable Audit Logging

spec:
  global:
    logging:
      channels:
        - name: audit_log
          file: /var/log/named/audit.log
          severity: info
      categories:
        - name: update
          channels: [audit_log]

4. Plan for Disaster Recovery

# Backup DNS zones
kubectl get dnszones -A -o yaml > dns-zones-backup.yaml

# Backup records
kubectl get arecords,cnamerecords,mxrecords -A -o yaml > dns-records-backup.yaml

Conclusion

Bind9GlobalCluster provides a powerful, enterprise-grade alternative to CoreDNS for Kubernetes clusters. While CoreDNS remains an excellent choice for simple forwarding scenarios, Bindy excels when you need:

  • Declarative DNS infrastructure-as-code
  • GitOps workflows for DNS management
  • Multi-tenancy with namespace isolation
  • Enterprise features (DNSSEC, zone transfers, dynamic updates)
  • Compliance and audit requirements
  • Integration with existing BIND9 infrastructure

Recommendation: Start with a hybrid approach—keep CoreDNS for cluster services, and adopt Bindy for application DNS zones. This provides a safe migration path with the ability to leverage advanced DNS features where needed.

Next Steps

High Availability

Design and implement highly available DNS infrastructure with Bindy.

Overview

High availability (HA) DNS ensures continuous DNS service even during:

  • Pod failures
  • Node failures
  • Availability zone outages
  • Regional outages
  • Planned maintenance

HA Architecture Components

1. Multiple Replicas

Run multiple replicas of each Bind9Instance:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
spec:
  replicas: 3  # Multiple replicas for pod-level HA

Benefits:

  • Survives pod crashes
  • Load distribution
  • Zero-downtime updates

2. Multiple Instances

Deploy separate primary and secondary instances:

# Primary instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  labels:
    dns-role: primary
spec:
  replicas: 2
---
# Secondary instance  
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  labels:
    dns-role: secondary
spec:
  replicas: 2

Benefits:

  • Role separation
  • Independent scaling
  • Failover capability

3. Geographic Distribution

Deploy instances across multiple regions:

# US East primary
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-us-east
  labels:
    dns-role: primary
    region: us-east-1
spec:
  replicas: 2
---
# US West secondary
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-us-west
  labels:
    dns-role: secondary
    region: us-west-2
spec:
  replicas: 2
---
# EU secondary
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-eu-west
  labels:
    dns-role: secondary
    region: eu-west-1
spec:
  replicas: 2

Benefits:

  • Regional failure tolerance
  • Lower latency for global users
  • Regulatory compliance (data locality)

HA Patterns

Pattern 1: Active-Passive

One active primary, multiple passive secondaries:

graph LR
    primary["Primary<br/>(Active)<br/>us-east-1"]
    sec1["Secondary<br/>(Passive)<br/>us-west-2"]
    sec2["Secondary<br/>(Passive)<br/>eu-west-1"]
    clients["Clients query any"]

    primary -->|AXFR| sec1
    sec1 -->|AXFR| sec2
    primary --> clients
    sec1 --> clients
    sec2 --> clients

    style primary fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style sec1 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style clients fill:#fff9c4,stroke:#f57f17,stroke-width:2px
  • Updates go to primary only
  • Secondaries receive via zone transfer
  • Clients query any available instance

Pattern 2: Multi-Primary

Multiple primaries in different regions:

graph LR
    primary1["Primary<br/>(zone-a)<br/>us-east-1"]
    primary2["Primary<br/>(zone-b)<br/>eu-west-1"]

    primary1 <-->|Sync| primary2

    style primary1 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style primary2 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
  • Different zones on different primaries
  • Geographic distribution of updates
  • Careful coordination required

Pattern 3: Anycast

Same IP announced from multiple locations:

graph TB
    client["Client Query (192.0.2.53)"]
    dns_us["DNS<br/>US"]
    dns_eu["DNS<br/>EU"]
    dns_apac["DNS<br/>APAC"]

    client --> dns_us
    client --> dns_eu
    client --> dns_apac

    style client fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style dns_us fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style dns_eu fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style dns_apac fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
  • Requires BGP routing
  • Lowest latency routing
  • Automatic failover

Pod-Level HA

Anti-Affinity

Spread pods across nodes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: primary-dns
spec:
  replicas: 3
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: instance
                  operator: In
                  values:
                  - primary-dns
              topologyKey: kubernetes.io/hostname

Topology Spread

Distribute across availability zones:

spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        instance: primary-dns

Service-Level HA

Liveness and Readiness Probes

Ensure only healthy pods serve traffic:

livenessProbe:
  exec:
    command: ["dig", "@localhost", "version.bind", "txt", "chaos"]
  initialDelaySeconds: 30
  periodSeconds: 10
  
readinessProbe:
  exec:
    command: ["dig", "@localhost", "version.bind", "txt", "chaos"]
  initialDelaySeconds: 5
  periodSeconds: 5

Pod Disruption Budgets

Limit concurrent disruptions:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: primary-dns-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      instance: primary-dns

Monitoring HA

Check Instance Distribution

# View instances across regions
kubectl get bind9instances -A -L region

# View pod distribution
kubectl get pods -n dns-system -o wide

# Check zone spread
kubectl get pods -n dns-system \
  -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,ZONE:.spec.nodeSelector

Test Failover

# Simulate pod failure
kubectl delete pod -n dns-system <pod-name>

# Verify automatic recovery
kubectl get pods -n dns-system -w

# Test DNS during failover
while true; do dig @$SERVICE_IP example.com +short; sleep 1; done

Disaster Recovery

Backup Strategy

# Regular backups of all CRDs
kubectl get bind9instances,dnszones,arecords,aaaarecords,cnamerecords,mxrecords,txtrecords,nsrecords,srvrecords,caarecords \
  -A -o yaml > backup-$(date +%Y%m%d).yaml

Recovery Procedures

  1. Single Pod Failure - Kubernetes automatically recreates
  2. Instance Failure - Clients fail over to other instances
  3. Regional Failure - Zone data available from other regions
  4. Complete Loss - Restore from backup
# Restore from backup
kubectl apply -f backup-20241126.yaml

Operator High Availability

The Bindy operator itself can run in high availability mode with automatic leader election. This ensures continuous DNS management even if operator pods fail.

Leader Election

Multiple operator instances use Kubernetes Lease objects for distributed leader election:

graph TB
    op1["Operator<br/>Instance 1<br/>(Leader)"]
    op2["Operator<br/>Instance 2<br/>(Standby)"]
    op3["Operator<br/>Instance 3<br/>(Standby)"]
    lease["Kubernetes API<br/>Lease Object"]

    op1 --> lease
    op2 --> lease
    op3 --> lease

    style op1 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style op2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style op3 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style lease fill:#fff9c4,stroke:#f57f17,stroke-width:2px

How it works:

  1. All operator instances attempt to acquire the lease
  2. One instance becomes the leader and starts reconciling resources
  3. Standby instances wait and monitor the lease
  4. If the leader fails, a standby automatically takes over (~15 seconds)

HA Operator Deployment

Deploy multiple operator replicas with leader election enabled:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: bindy
  namespace: dns-system
spec:
  replicas: 3  # Run 3 instances for HA
  selector:
    matchLabels:
      app: bindy
  template:
    metadata:
      labels:
        app: bindy
    spec:
      serviceAccountName: bindy
      containers:
      - name: operator
        image: ghcr.io/firestoned/bindy:latest
        env:
        # Leader election configuration
        - name: ENABLE_LEADER_ELECTION
          value: "true"
        - name: LEASE_NAME
          value: "bindy-leader"
        - name: LEASE_NAMESPACE
          value: "dns-system"
        - name: LEASE_DURATION_SECONDS
          value: "15"
        - name: LEASE_RENEW_DEADLINE_SECONDS
          value: "10"
        - name: LEASE_RETRY_PERIOD_SECONDS
          value: "2"
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name

Configuration Options

Environment variables for leader election:

VariableDefaultDescription
ENABLE_LEADER_ELECTIONtrueEnable/disable leader election
LEASE_NAMEbindy-leaderName of the Lease resource
LEASE_NAMESPACEdns-systemNamespace for the Lease
LEASE_DURATION_SECONDS15How long leader holds lease
LEASE_RENEW_DEADLINE_SECONDS10Leader must renew before this
LEASE_RETRY_PERIOD_SECONDS2How often to attempt lease acquisition
POD_NAME$HOSTNAMEUnique identity for this operator instance

Monitoring Leader Election

Check which operator instance is the current leader:

# View the lease object
kubectl get lease -n dns-system bindy-leader -o yaml

# Output shows current leader
spec:
  holderIdentity: bindy-7d8f9c5b4d-x7k2m  # Current leader pod
  leaseDurationSeconds: 15
  renewTime: "2025-11-30T12:34:56Z"

Monitor operator logs to see leadership changes:

# Watch operator logs
kubectl logs -n dns-system deployment/bindy -f

# Look for leadership events
INFO Attempting to acquire lease bindy-leader
INFO Lease acquired, this instance is now the leader
INFO Starting all controllers
WARN Leadership lost! Stopping all controllers...
INFO Lease acquired, this instance is now the leader

Failover Testing

Test automatic failover:

# Find current leader
LEADER=$(kubectl get lease -n dns-system bindy-leader -o jsonpath='{.spec.holderIdentity}')
echo "Current leader: $LEADER"

# Delete the leader pod
kubectl delete pod -n dns-system $LEADER

# Watch for new leader election (typically ~15 seconds)
kubectl get lease -n dns-system bindy-leader -w

# Verify DNS operations continue uninterrupted
kubectl get bind9instances -A

RBAC Requirements

Leader election requires additional permissions in the operator’s Role:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: bindy
  namespace: dns-system
rules:
# Leases for leader election
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["get", "create", "update", "patch"]

Troubleshooting

Operator not reconciling resources:

# Check which instance is leader
kubectl get lease -n dns-system bindy-leader -o jsonpath='{.spec.holderIdentity}'

# Verify that pod exists and is running
kubectl get pods -n dns-system

# Check operator logs
kubectl logs -n dns-system deployment/bindy -f

Multiple operators reconciling (split brain):

This should never happen with proper leader election. If you suspect it:

# Check lease configuration
kubectl get lease -n dns-system bindy-leader -o yaml

# Verify all operators use the same LEASE_NAME
kubectl get deployment -n dns-system bindy -o yaml | grep LEASE_NAME

# Force lease release (recreate it)
kubectl delete lease -n dns-system bindy-leader

Leader election disabled but multiple replicas running:

This will cause conflicts. Either:

  1. Enable leader election: Set ENABLE_LEADER_ELECTION=true
  2. Or run single replica: kubectl scale deployment bindy --replicas=1

Performance Impact

Leader election adds minimal overhead:

  • Failover time: ~15 seconds (configurable via LEASE_DURATION_SECONDS)
  • Network traffic: 1 lease renewal every 2 seconds from leader only
  • CPU/Memory: Negligible (<1% increase)

Best Practices

  1. Run 3+ Operator Replicas - For operator HA with leader election
  2. Run 3+ DNS Instance Replicas - Odd numbers for quorum
  3. Multi-AZ Deployment - Spread across availability zones
  4. Geographic Redundancy - At least 2 regions for critical zones
  5. Monitor Continuously - Alert on degraded HA
  6. Test Failover - Regular disaster recovery drills (both operator and DNS instances)
  7. Automate Recovery - Use Kubernetes self-healing
  8. Document Procedures - Runbooks for incidents
  9. Enable Leader Election - Always run operator with ENABLE_LEADER_ELECTION=true in production
  10. Monitor Lease Health - Alert if lease ownership changes frequently (indicates instability)

Next Steps

Zone Transfers

Configure and optimize DNS zone transfers between primary and secondary instances.

Overview

Zone transfers replicate DNS zone data from primary to secondary servers using AXFR (full transfer) or IXFR (incremental transfer).

Configuring Zone Transfers

Primary Instance Setup

Allow zone transfers to secondary servers:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
spec:
  config:
    allowTransfer:
      - "10.0.0.0/8"        # Secondary network
      - "192.168.100.0/24"  # Specific secondary subnet

Secondary Instance Setup

Configure secondary zones to transfer from primary:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
spec:
  zoneName: example.com
  type: secondary
  instanceSelector:
    matchLabels:
      dns-role: secondary
  secondaryConfig:
    primaryServers:
      - "10.0.1.10"  # Primary DNS server IP
      - "10.0.1.11"  # Backup primary IP

Transfer Types

Full Transfer (AXFR)

Transfers entire zone:

  • Used for initial zone load
  • Triggered manually or when IXFR unavailable
  • More bandwidth intensive

Incremental Transfer (IXFR)

Transfers only changes since last serial:

  • More efficient for large zones
  • Requires serial number tracking
  • Automatically used when available

Transfer Triggers

NOTIFY Messages

Primary sends NOTIFY when zone changes:

graph TB
    primary["Primary Updates Zone"]
    sec1["Secondary 1"]
    sec2["Secondary 2"]
    sec3["Secondary 3"]
    transfer["Secondaries initiate IXFR/AXFR"]

    primary -->|NOTIFY| sec1
    primary -->|NOTIFY| sec2
    primary -->|NOTIFY| sec3
    sec1 --> transfer
    sec2 --> transfer
    sec3 --> transfer

    style primary fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style sec1 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec3 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style transfer fill:#fff9c4,stroke:#f57f17,stroke-width:2px

Refresh Timer

Secondary checks for updates periodically:

soaRecord:
  refresh: 3600  # Check every hour
  retry: 600     # Retry after 10 minutes if failed

Manual Trigger

Force zone transfer:

# On secondary pod
kubectl exec -n dns-system deployment/secondary-dns -- \
  rndc retransfer example.com

Monitoring Zone Transfers

Check Transfer Status

# View transfer logs
kubectl logs -n dns-system -l dns-role=secondary | grep "transfer of"

# Successful transfer
# transfer of 'example.com/IN' from 10.0.1.10#53: Transfer completed: 1 messages, 42 records

# Check zone status
kubectl exec -n dns-system deployment/secondary-dns -- \
  rndc zonestatus example.com

Verify Serial Numbers

# Primary serial
kubectl exec -n dns-system deployment/primary-dns -- \
  dig @localhost example.com SOA +short | awk '{print $3}'

# Secondary serial  
kubectl exec -n dns-system deployment/secondary-dns -- \
  dig @localhost example.com SOA +short | awk '{print $3}'

# Should match when in sync

Transfer Performance

Optimize Transfer Speed

  1. Use IXFR - Only transfer changes
  2. Increase Bandwidth - Adequate network resources
  3. Compress Transfers - Enable BIND9 compression
  4. Parallel Transfers - Multiple zones transfer concurrently

Transfer Limits

Configure maximum concurrent transfers:

# In BIND9 config (future enhancement)
options {
  transfers-in 10;   # Max incoming transfers
  transfers-out 10;  # Max outgoing transfers
};

Security

Access Control

Restrict transfers by IP:

spec:
  config:
    allowTransfer:
      - "10.0.0.0/8"  # Only this network

TSIG Authentication

Use TSIG keys for authenticated transfers:

# 1. Create a Kubernetes Secret with RNDC/TSIG credentials
apiVersion: v1
kind: Secret
metadata:
  name: transfer-key-secret
  namespace: dns-system
type: Opaque
stringData:
  key-name: transfer-key
  secret: K2xkajflkajsdf09asdfjlaksjdf==  # base64-encoded HMAC key

---
# 2. Reference the secret in Bind9Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  rndcSecretRefs:
    - name: transfer-key-secret
      algorithm: hmac-sha256  # Algorithm for this key

The secret will be used for authenticated zone transfers between primary and secondary servers.

Troubleshooting

Transfer Failures

Check network connectivity:

kubectl exec -n dns-system deployment/secondary-dns -- \
  nc -zv primary-dns-service 53

Test manual transfer:

kubectl exec -n dns-system deployment/secondary-dns -- \
  dig @primary-dns-service example.com AXFR

Check ACLs:

kubectl get bind9instance primary-dns -o jsonpath='{.spec.config.allowTransfer}'

Slow Transfers

Check zone size:

kubectl exec -n dns-system deployment/primary-dns -- \
  wc -l /var/lib/bind/zones/example.com.zone

Monitor transfer time:

kubectl logs -n dns-system -l dns-role=secondary | \
  grep "transfer of" | grep "msecs"

Transfer Lag

Check refresh interval:

kubectl get dnszone example-com -o jsonpath='{.spec.soaRecord.refresh}'

Force immediate transfer:

kubectl exec -n dns-system deployment/secondary-dns -- \
  rndc retransfer example.com

Best Practices

  1. Use IXFR - More efficient than full transfers
  2. Set Appropriate Refresh - Balance freshness vs load
  3. Monitor Serial Numbers - Detect sync issues
  4. Secure Transfers - Use ACLs and TSIG
  5. Test Failover - Verify secondaries work when primary fails
  6. Log Transfers - Monitor for failures
  7. Geographic Distribution - Secondaries in different regions

Example: Complete Setup

# Primary Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  labels:
    dns-role: primary
spec:
  replicas: 2
  config:
    allowTransfer:
      - "10.0.0.0/8"
---
# Primary Zone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-primary
spec:
  zoneName: example.com
  type: primary
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin@example.com
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
---
# Secondary Instance  
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  labels:
    dns-role: secondary
spec:
  replicas: 2
---
# Secondary Zone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
spec:
  zoneName: example.com
  type: secondary
  instanceSelector:
    matchLabels:
      dns-role: secondary
  secondaryConfig:
    primaryServers:
      - "primary-dns-service.dns-system.svc.cluster.local"

Next Steps

Replication

Implement multi-region DNS replication strategies for global availability.

Replication Models

Hub-and-Spoke

One central primary, multiple regional secondaries:

graph TB
    primary["Primary (us-east-1)"]
    sec1["Secondary<br/>(us-west)"]
    sec2["Secondary<br/>(eu-west)"]
    sec3["Secondary<br/>(ap-south)"]

    primary --> sec1
    primary --> sec2
    primary --> sec3

    style primary fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style sec1 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec3 fill:#e1f5ff,stroke:#01579b,stroke-width:2px

Pros: Simple, clear source of truth Cons: Single point of failure, latency for distant regions

Multi-Primary

Multiple primaries in different regions:

graph TB
    primaryA["Primary A<br/>(us-east)"]
    primaryB["Primary B<br/>(eu-west)"]
    sec1["Secondary<br/>(us-west)"]
    sec2["Secondary<br/>(ap-south)"]

    primaryA <-->|Sync| primaryB
    primaryA --> sec1
    primaryB --> sec2

    style primaryA fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style primaryB fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style sec1 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px

Pros: Regional updates, better latency Cons: Complex synchronization, conflict resolution

Hierarchical

Tiered replication structure:

graph TB
    global["Global Primary"]
    reg1["Regional<br/>Primary"]
    reg2["Regional<br/>Primary"]
    reg3["Regional<br/>Primary"]
    local1["Local<br/>Secondary"]
    local2["Local<br/>Secondary"]
    local3["Local<br/>Secondary"]

    global --> reg1
    global --> reg2
    global --> reg3
    reg1 --> local1
    reg2 --> local2
    reg3 --> local3

    style global fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style reg1 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style reg2 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style reg3 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style local1 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style local2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style local3 fill:#e1f5ff,stroke:#01579b,stroke-width:2px

Pros: Scales well, reduces global load Cons: More complex, longer propagation time

Configuration Examples

Hub-and-Spoke Setup

# Central Primary (us-east-1)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: global-primary
  labels:
    dns-role: primary
    region: us-east-1
spec:
  replicas: 3
  config:
    allowTransfer:
      - "10.0.0.0/8"  # Allow all regional networks
---
# Regional Secondaries
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-us-west
  labels:
    dns-role: secondary
    region: us-west-2
spec:
  replicas: 2
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-eu-west
  labels:
    dns-role: secondary
    region: eu-west-1
spec:
  replicas: 2

Replication Latency

Measuring Propagation Time

# Update record on primary
kubectl apply -f new-record.yaml

# Check serial on primary
PRIMARY_SERIAL=$(kubectl exec -n dns-system deployment/global-primary -- \
  dig @localhost example.com SOA +short | awk '{print $3}')

# Wait and check secondary
SECONDARY_SERIAL=$(kubectl exec -n dns-system deployment/secondary-eu-west -- \
  dig @localhost example.com SOA +short | awk '{print $3}')

# Calculate lag
echo "Primary: $PRIMARY_SERIAL, Secondary: $SECONDARY_SERIAL"

Optimizing Propagation

  1. Reduce refresh interval - More frequent checks
  2. Enable NOTIFY - Immediate notification of changes
  3. Use IXFR - Faster incremental transfers
  4. Optimize network - Low-latency connections between regions

Automatic Zone Transfer Configuration

New in v0.1.0: Bindy automatically configures zone transfers between primary and secondary instances.

When you create a DNSZone resource, Bindy automatically:

  1. Discovers secondary instances - Finds all Bind9Instance resources labeled with role=secondary in the cluster
  2. Configures zone transfers - Adds also-notify and allow-transfer directives with secondary IP addresses
  3. Tracks secondary IPs - Stores current secondary IPs in DNSZone.status.secondaryIps
  4. Detects IP changes - Monitors for secondary pod IP changes (due to restarts, rescheduling, scaling)
  5. Auto-updates zones - Automatically reconfigures zones when secondary IPs change

Example:

# Check automatically configured secondary IPs
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.secondaryIps}'
# Output: ["10.244.1.5","10.244.2.8"]

# Verify zone configuration on primary
kubectl exec -n dns-system deployment/primary-dns -- \
  curl -s localhost:8080/api/zones/example.com | jq '.alsoNotify, .allowTransfer'

Self-Healing: When secondary pods are rescheduled and get new IPs:

  • Detection happens within 5-10 minutes (next reconciliation cycle)
  • Zones are automatically updated with new secondary IPs
  • Zone transfers resume automatically with no manual intervention

No manual configuration needed! The old approach of manually configuring allowTransfer networks is no longer required for Kubernetes-managed instances.

Conflict Resolution

When using multi-primary setups, handle conflicts:

Prevention

  • Separate zones per primary
  • Use different subdomains per region
  • Implement locking mechanism

Detection

# Compare zones between primaries
diff <(kubectl exec deployment/primary-us -- cat /var/lib/bind/zones/example.com.zone) \
     <(kubectl exec deployment/primary-eu -- cat /var/lib/bind/zones/example.com.zone)

Monitoring Replication

Replication Dashboard

Monitor:

  • Serial number sync status
  • Replication lag per region
  • Transfer success/failure rate
  • Zone size and growth

Alerts

Set up alerts for:

  • Serial number drift > threshold
  • Failed zone transfers
  • Replication lag > SLA
  • Network connectivity issues

Best Practices

  1. Document topology - Clear replication map
  2. Monitor lag - Track propagation time
  3. Test failover - Regular DR drills
  4. Use consistent serials - YYYYMMDDnn format
  5. Automate updates - GitOps for all regions
  6. Capacity planning - Account for replication traffic

Next Steps

Security

Secure your Bindy DNS infrastructure against threats and unauthorized access.

Security Layers

1. Network Security

  • Firewall rules limiting DNS access
  • Network policies in Kubernetes
  • Private networks for zone transfers

2. Access Control

  • Query restrictions (allowQuery)
  • Transfer restrictions (allowTransfer)
  • RBAC for Kubernetes resources

3. DNSSEC

  • Cryptographic validation
  • Zone signing
  • Trust chain verification

4. Pod Security

  • Pod Security Standards
  • SecurityContext settings
  • Read-only filesystems

Best Practices

  1. Principle of Least Privilege - Minimal permissions
  2. Defense in Depth - Multiple security layers
  3. Regular Updates - Keep BIND9 and controller updated
  4. Audit Logging - Track all changes
  5. Encryption - TLS for management, DNSSEC for queries

Quick Security Checklist

  • Enable DNSSEC for public zones
  • Restrict allowQuery to expected networks
  • Limit allowTransfer to secondary servers only
  • Use RBAC for Kubernetes access
  • Enable Pod Security Standards
  • Regular security audits
  • Monitor for suspicious queries
  • Keep software updated

Next Steps

  • DNSSEC - Enable cryptographic validation
  • Access Control - Configure query and transfer restrictions

DNSSEC

Enable DNS Security Extensions (DNSSEC) for cryptographic validation of DNS responses.

Overview

DNSSEC adds cryptographic signatures to DNS records, preventing:

  • Cache poisoning
  • Man-in-the-middle attacks
  • Response tampering

Enabling DNSSEC

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
spec:
  config:
    dnssec:
      enabled: true      # Enable DNSSEC signing
      validation: true   # Enable DNSSEC validation

DNSSEC Record Types

  • DNSKEY - Public signing keys
  • RRSIG - Resource record signatures
  • NSEC/NSEC3 - Proof of non-existence
  • DS - Delegation signer (at parent zone)

Verification

Check DNSSEC Status

# Query with DNSSEC validation
dig @$SERVICE_IP example.com +dnssec

# Check for ad (authentic data) flag
dig @$SERVICE_IP example.com +dnssec | grep "flags.*ad"

# Verify RRSIG records
dig @$SERVICE_IP example.com RRSIG

Validate Chain of Trust

# Check DS record at parent
dig @parent-dns example.com DS

# Verify DNSKEY matches DS
dig @$SERVICE_IP example.com DNSKEY

Key Management

Automatic Key Rotation

BIND9 handles automatic key rotation (future enhancement for Bindy configuration).

Manual Key Management

# Generate keys (inside BIND9 pod)
kubectl exec -n dns-system deployment/primary-dns -- \
  dnssec-keygen -a RSASHA256 -b 2048 -n ZONE example.com

# Sign zone
kubectl exec -n dns-system deployment/primary-dns -- \
  dnssec-signzone -o example.com /var/lib/bind/zones/example.com.zone

Troubleshooting

DNSSEC Validation Failures

# Check validation logs
kubectl logs -n dns-system -l instance=primary-dns | grep dnssec

# Test with validation disabled
dig @$SERVICE_IP example.com +cd

# Verify time synchronization (critical for DNSSEC)
kubectl exec -n dns-system deployment/primary-dns -- date

Best Practices

  1. Enable on primaries - Sign at source
  2. Monitor expiration - Alert on expiring signatures
  3. Test before enabling - Verify in staging first
  4. Keep clocks synced - NTP critical for DNSSEC
  5. Plan key rotation - Regular key updates

Next Steps

Access Control

Configure fine-grained access control for DNS queries and zone transfers.

Query Access Control

Restrict who can query your DNS servers:

Public DNS (Allow All)

spec:
  config:
    allowQuery:
      - "0.0.0.0/0"  # IPv4 - anyone
      - "::/0"       # IPv6 - anyone

Internal DNS (Restricted)

spec:
  config:
    allowQuery:
      - "10.0.0.0/8"      # RFC1918 private
      - "172.16.0.0/12"   # RFC1918 private
      - "192.168.0.0/16"  # RFC1918 private

Specific Networks

spec:
  config:
    allowQuery:
      - "192.168.1.0/24"   # Office network
      - "10.100.0.0/16"    # VPN network
      - "172.20.5.10"      # Specific host

Zone Transfer Access Control

Restrict zone transfers to authorized servers:

spec:
  config:
    allowTransfer:
      - "10.0.1.0/24"      # Secondary DNS subnet
      - "192.168.100.5"    # Specific secondary
      - "192.168.100.6"    # Another secondary

Block All Transfers

spec:
  config:
    allowTransfer: []  # No transfers allowed

ACL Best Practices

  1. Default Deny - Start restrictive, open as needed
  2. Use CIDR Blocks - More maintainable than individual IPs
  3. Document ACLs - Note why each entry exists
  4. Regular Review - Remove obsolete entries
  5. Test Changes - Verify before production

Network Policies

Kubernetes NetworkPolicies add another layer:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dns-ingress
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app: bind9
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector: {}  # Allow from all namespaces
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53

Testing Access Control

# From allowed network (should work)
dig @$SERVICE_IP example.com

# From blocked network (should timeout or refuse)
dig @$SERVICE_IP example.com
# ;; communications error: connection timed out

# Test zone transfer restriction
dig @$SERVICE_IP example.com AXFR
# Transfer should fail if not in allowTransfer list

Next Steps

Performance

Optimize Bindy DNS infrastructure for maximum performance and efficiency.

Performance Metrics

Key metrics to monitor:

  • Query latency - Time to respond to DNS queries
  • Throughput - Queries per second (QPS)
  • Resource usage - CPU and memory utilization
  • Cache hit ratio - Percentage of cached responses
  • Reconciliation loops - Unnecessary status updates

Controller Performance

Status Update Optimization

The Bindy operator implements status change detection in all reconcilers to prevent tight reconciliation loops. This optimization:

  • Reduces Kubernetes API calls by skipping unnecessary status updates
  • Prevents reconciliation storms that can occur when status updates trigger new reconciliations
  • Improves overall system performance by reducing CPU and network overhead

All reconcilers check if the status has actually changed before updating the status subresource. Status updates only occur when:

  • Condition type changes
  • Status value changes
  • Message changes
  • Status doesn’t exist yet

This optimization is implemented across all resource types:

  • Bind9Cluster
  • Bind9Instance
  • DNSZone
  • All DNS record types (A, AAAA, CNAME, MX, NS, SRV, TXT, CAA)

For more details, see the Reconciliation Logic documentation.

Optimization Strategies

1. Resource Allocation

Provide adequate CPU and memory:

spec:
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "2000m"
      memory: "2Gi"

2. Horizontal Scaling

Add more replicas for higher capacity:

spec:
  replicas: 5  # More replicas = more capacity

3. Geographic Distribution

Place DNS servers near clients:

  • Reduced network latency
  • Better user experience
  • Regional load distribution

4. Caching Strategy

Configure BIND9 caching (when appropriate):

  • Longer TTLs reduce upstream queries
  • Negative caching for NXDOMAIN
  • Prefetching for popular domains

Performance Testing

Baseline Testing

# Single query latency
time dig @$SERVICE_IP example.com

# Sustained load (100 QPS for 60 seconds)
dnsp erf -s $SERVICE_IP -d example.com -q 100 -t 60

Load Testing

# Using dnsperf
dnsperf -s $SERVICE_IP -d queries.txt -l 60 -Q 1000

# Using custom script
for i in {1..1000}; do
  dig @$SERVICE_IP test$i.example.com &
done
wait

Resource Optimization

CPU Optimization

  • Use efficient query algorithms
  • Enable query parallelization
  • Optimize zone file format

Memory Optimization

  • Right-size zone cache
  • Limit journal size
  • Regular zone file cleanup

Network Optimization

  • Use UDP for queries (TCP for transfers)
  • Enable TCP Fast Open
  • Optimize MTU size

Monitoring Performance

# Real-time resource usage
kubectl top pods -n dns-system -l app=bind9

# Query statistics
kubectl exec -n dns-system deployment/primary-dns -- \
  rndc stats

# View statistics file
kubectl exec -n dns-system deployment/primary-dns -- \
  cat /var/cache/bind/named.stats

Performance Targets

MetricTargetGoodExcellent
Query Latency< 50ms< 20ms< 10ms
Throughput> 1000 QPS> 5000 QPS> 10000 QPS
CPU Usage< 70%< 50%< 30%
Memory Usage< 80%< 60%< 40%
Cache Hit Ratio> 60%> 80%> 90%

Next Steps

Tuning

Fine-tune BIND9 and Kubernetes parameters for optimal performance.

BIND9 Tuning

Query Performance

# Future enhancement - BIND9 tuning via Bind9Instance spec
spec:
  config:
    tuning:
      maxCacheSize: "512M"
      maxCacheTTL: 86400
      recursiveClients: 1000

Zone Transfer Tuning

  • Concurrent transfers: transfers-in, transfers-out
  • Transfer timeout: Adjust for large zones
  • Compression: Enable for faster transfers

Kubernetes Tuning

Pod Resources

Right-size based on load:

# Light load
resources:
  requests: {cpu: "100m", memory: "128Mi"}
  limits: {cpu: "500m", memory: "512Mi"}

# Medium load
resources:
  requests: {cpu: "500m", memory: "512Mi"}
  limits: {cpu: "2000m", memory: "2Gi"}

# Heavy load
resources:
  requests: {cpu: "2000m", memory: "2Gi"}
  limits: {cpu: "4000m", memory: "4Gi"}

HPA (Horizontal Pod Autoscaling)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: bind9-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: primary-dns
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Node Affinity

Place DNS pods on optimized nodes:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: workload-type
          operator: In
          values:
          - dns

Network Tuning

Service Type

Consider NodePort or LoadBalancer for external access:

apiVersion: v1
kind: Service
spec:
  type: LoadBalancer  # Or NodePort
  externalTrafficPolicy: Local  # Preserve source IP

DNS Caching

Adjust TTL values:

# Short TTL for dynamic records
spec:
  ttl: 60  # 1 minute

# Long TTL for static records
spec:
  ttl: 86400  # 24 hours

OS-Level Tuning

File Descriptors

Increase limits for high query volume:

# In pod security context (future enhancement)
securityContext:
  limits:
    nofile: 65536

Network Buffers

Optimize for DNS traffic (node-level):

# Increase UDP buffer sizes
sysctl -w net.core.rmem_max=8388608
sysctl -w net.core.wmem_max=8388608

Monitoring Tuning Impact

# Before tuning - baseline
kubectl top pods -n dns-system
time dig @$SERVICE_IP example.com

# Apply tuning
kubectl apply -f tuned-config.yaml

# After tuning - compare
kubectl top pods -n dns-system
time dig @$SERVICE_IP example.com

Tuning Checklist

  • Right-sized pod resources
  • Optimal replica count
  • HPA configured
  • Appropriate TTL values
  • Network policies optimized
  • Node placement configured
  • Monitoring enabled
  • Performance tested

Next Steps

Benchmarking

Measure and analyze DNS performance using industry-standard tools.

Tools

dnsperf

Industry-standard DNS benchmarking:

# Install dnsperf
apt-get install dnsperf

# Create query file
cat > queries.txt <<'QUERIES'
example.com A
www.example.com A
mail.example.com MX
QUERIES

# Run benchmark
dnsperf -s $SERVICE_IP -d queries.txt -l 60 -Q 1000

resperf

Response rate testing:

# Test maximum QPS
resperf -s $SERVICE_IP -d queries.txt -m 10000

dig

Simple latency testing:

# Measure query time
dig @$SERVICE_IP example.com | grep "Query time"

# Multiple queries for average
for i in {1..100}; do
  dig @$SERVICE_IP example.com +stats | grep "Query time"
done | awk '{sum+=$4; count++} END {print "Average:", sum/count, "ms"}'

Benchmark Scenarios

Scenario 1: Baseline Performance

Single client, sequential queries:

dnsperf -s $SERVICE_IP -d queries.txt -l 60 -Q 100

Expected: < 10ms latency, > 90% success

Scenario 2: Load Test

Multiple clients, high QPS:

dnsperf -s $SERVICE_IP -d queries.txt -l 300 -Q 5000 -c 50

Expected: < 50ms latency under load

Scenario 3: Stress Test

Maximum capacity test:

resperf -s $SERVICE_IP -d queries.txt -m 50000

Expected: Find maximum QPS before degradation

Metrics to Collect

Response Time

  • Minimum latency
  • Average latency
  • 95th percentile
  • 99th percentile
  • Maximum latency

Throughput

  • Queries per second
  • Successful responses
  • Failed queries
  • Timeout rate

Resource Usage

# During benchmark
kubectl top pods -n dns-system

# CPU and memory trends
kubectl top pods -n dns-system --use-protocol-buffers

Sample Benchmark Report

Benchmark: Load Test
Date: 2024-11-26
Duration: 300 seconds
Target QPS: 5000

Results:
- Queries sent: 1,500,000
- Queries completed: 1,498,500
- Success rate: 99.9%
- Average latency: 12.3ms
- 95th percentile: 24.1ms
- 99th percentile: 45.2ms
- Max latency: 89.5ms

Resource Usage:
- Average CPU: 1.2 cores
- Average Memory: 512MB
- Peak CPU: 1.8 cores
- Peak Memory: 768MB

Continuous Benchmarking

Automated Testing

apiVersion: batch/v1
kind: CronJob
metadata:
  name: dns-benchmark
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: dnsperf
            image: dnsperf:latest
            command:
            - /bin/sh
            - -c
            - dnsperf -s primary-dns -d /queries.txt -l 60 >> /results/benchmark.log

Trend Analysis

Track performance over time:

  • Daily benchmarks
  • Compare before/after changes
  • Identify degradation early
  • Capacity planning

Best Practices

  1. Consistent tests - Same queries, duration
  2. Isolated environment - Minimize external factors
  3. Multiple runs - Average results
  4. Document changes - Link to config changes
  5. Realistic load - Match production patterns

Next Steps

Integration

Integrate Bindy with other Kubernetes and DNS systems.

Integration Patterns

1. Internal Service Discovery

Use Bindy for internal service DNS.

2. Hybrid DNS

Combine Bindy with external DNS providers.

3. GitOps

Manage DNS configuration through Git.

Kubernetes Integration

CoreDNS Integration

Use Bindy alongside CoreDNS:

# CoreDNS for cluster.local
# Bindy for custom domains

Linkerd Service Mesh

Integrate with Linkerd:

  • Custom DNS resolution for internal services
  • Service discovery integration
  • Traffic routing with DNS-based endpoints
  • mTLS-secured management communication (RNDC API)

Next Steps

External DNS Integration

Integrate Bindy with external DNS management systems.

Use Cases

  1. Hybrid Cloud - Internal DNS in Bindy, external in cloud provider
  2. Public/Private Split - Public zones external, private in Bindy
  3. Migration - Gradual migration from external to Bindy

Integration with external-dns

External-dns manages external providers (Route53, CloudDNS), Bindy manages internal BIND9.

Separate Domains

# external-dns manages example.com (public)
# Bindy manages internal.example.com (private)

Forwarding

Configure external DNS to forward to Bindy for internal zones.

Best Practices

  1. Clear boundaries - Document which system owns which zones
  2. Consistent records - Synchronize where needed
  3. Separate responsibilities - External for public, Bindy for internal

Next Steps

Service Discovery

Use Bindy for Kubernetes service discovery and internal DNS.

Kubernetes Service DNS

Automatic Service Records

Create DNS records for Kubernetes services:

apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: production
spec:
  selector:
    app: myapp
  ports:
  - port: 80
---
# Create corresponding DNS record
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: myapp
spec:
  zone: internal-local
  name: myapp.production
  ipv4Address: "10.100.5.10"  # Service ClusterIP

Service Discovery Pattern

graph TB
    app["Application Query:<br/>myapp.production.internal.local"]
    dns["Bindy DNS Server"]
    result["Returns: 10.100.5.10"]
    svc["Kubernetes Service"]

    app --> dns
    dns --> result
    result --> svc

    style app fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style dns fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style result fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style svc fill:#f3e5f5,stroke:#4a148c,stroke-width:2px

Dynamic Updates

Automatically update DNS when services change (future enhancement):

# Controller watches Services and creates DNS records

Best Practices

  1. Consistent naming - Match service names to DNS names
  2. Namespace separation - Use subdomains per namespace
  3. TTL management - Short TTLs for dynamic services
  4. Health checks - Only advertise healthy services

Next Steps

Development Setup

Set up your development environment for contributing to Bindy.

Prerequisites

Required Tools

  • Rust - 1.70 or later
  • Kubernetes - 1.27 or later (for testing)
  • kubectl - Matching your Kubernetes version
  • Docker - For building images
  • kind - For local Kubernetes testing (optional)

Install Rust

# Install rustup
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Verify installation
rustc --version
cargo --version

Install Development Tools

# Install cargo tools
cargo install cargo-watch  # Auto-rebuild on changes
cargo install cargo-tarpaulin  # Code coverage

# Install mdbook for documentation
cargo install mdbook

Clone Repository

git clone https://github.com/firestoned/bindy.git
cd bindy

Project Structure

bindy/
├── src/              # Rust source code
│   ├── main.rs       # Entry point
│   ├── crd.rs        # CRD definitions
│   ├── reconcilers/  # Reconciliation logic
│   └── bind9.rs      # BIND9 integration
├── deploy/           # Kubernetes manifests
│   ├── crds/         # CRD definitions
│   ├── rbac/         # RBAC resources
│   └── controller/   # Controller deployment
├── tests/            # Integration tests
├── examples/         # Example configurations
├── docs/             # Documentation
└── Cargo.toml        # Rust dependencies

Dependencies

Key dependencies:

  • kube - Kubernetes client
  • tokio - Async runtime
  • serde - Serialization
  • tracing - Logging

See Cargo.toml for full list.

IDE Setup

VS Code

Recommended extensions:

  • rust-analyzer
  • crates
  • Even Better TOML
  • Kubernetes

IntelliJ IDEA / CLion

  • Install Rust plugin
  • Install Kubernetes plugin

Verify Setup

# Build the project
cargo build

# Run tests
cargo test

# Run clippy (linter)
cargo clippy

# Format code
cargo fmt

If all commands succeed, your development environment is ready!

Next Steps

Building from Source

Build the Bindy controller from source code.

Build Debug Version

For development with debug symbols:

cargo build

Binary location: target/debug/bindy

Build Release Version

Optimized for production:

cargo build --release

Binary location: target/release/bindy

Run Locally

# Set log level
export RUST_LOG=info

# Run controller (requires kubeconfig)
cargo run --release

Build Docker Image

# Build image
docker build -t bindy:dev .

# Or use make
make docker-build TAG=dev

Build for Different Platforms

Cross-Compilation

# Install cross
cargo install cross

# Build for Linux (from macOS/Windows)
cross build --release --target x86_64-unknown-linux-gnu

Multi-Architecture Images

# Build for multiple architectures
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t bindy:multi \
  --push .

Build Documentation

Rustdoc (API docs)

cargo doc --no-deps --open

mdBook (User guide)

Prerequisites:

The documentation uses Mermaid diagrams which require the mdbook-mermaid preprocessor:

# Install mdbook-mermaid
cargo install mdbook-mermaid

# Ensure ~/.cargo/bin is in your PATH
export PATH="$HOME/.cargo/bin:$PATH"

# Initialize Mermaid support (first time only)
mdbook-mermaid install .

Build and serve:

# Build book
mdbook build

# Serve locally
mdbook serve --open

Combined Documentation

make docs

Optimization

Profile-Guided Optimization

# Generate profile data
cargo build --release
./target/release/bindy  # Run workload

# Build with PGO
cargo build --release

Size Optimization

# In Cargo.toml
[profile.release]
opt-level = 'z'     # Optimize for size
lto = true          # Link-time optimization
codegen-units = 1   # Better optimization
strip = true        # Strip symbols

Troubleshooting

Build Errors

OpenSSL not found:

# Ubuntu/Debian
apt-get install libssl-dev pkg-config

# macOS
brew install openssl

Linker errors:

# Install build essentials
apt-get install build-essential

Next Steps

Running Tests

Run and write tests for Bindy.

Unit Tests

# Run all tests
cargo test

# Run specific test
cargo test test_name

# Run with output
cargo test -- --nocapture

Integration Tests

# Requires Kubernetes cluster
cargo test --test simple_integration -- --ignored

# Or use make
make test-integration

Test Coverage

# Install tarpaulin
cargo install cargo-tarpaulin

# Generate coverage
cargo tarpaulin --out Html

# Open report
open tarpaulin-report.html

Writing Tests

See Testing Guidelines for details.

Bindy DNS Controller - Testing Guide

Complete guide for testing the Bindy DNS Controller, including unit tests and integration tests with Kind (Kubernetes in Docker).

Quick Start

# Unit tests (fast, no Kubernetes required)
make test

# Integration tests (automated with Kind cluster)
make kind-integration-test

# View results
# Unit: 62 tests passing
# Integration: All 8 DNS record types + infrastructure tests

Table of Contents

Test Overview

Test Results

Unit Tests: 62 PASSING ✅

test result: ok. 62 passed; 0 failed; 0 ignored

Integration Tests: Automated with Kind

  • Kubernetes connectivity ✅
  • CRD verification ✅
  • All 8 DNS record types ✅
  • Resource lifecycle ✅

Test Structure

bindy/
├── src/
│   ├── crd_tests.rs              # CRD structure tests (28 tests)
│   └── reconcilers/
│       └── tests.rs              # Bind9Manager tests (34 tests)
├── tests/
│   ├── simple_integration.rs     # Rust integration tests
│   ├── integration_test.sh       # Full integration test suite
│   └── common/mod.rs            # Shared test utilities
└── deploy/
    ├── kind-deploy.sh           # Deploy to Kind cluster
    ├── kind-test.sh             # Basic functional tests
    └── kind-cleanup.sh          # Cleanup Kind cluster

Unit Tests

Unit tests run locally without Kubernetes (< 1 second).

Running Unit Tests

# All unit tests
make test
# or
cargo test

# Specific module
cargo test crd_tests::
cargo test bind9::tests::

# With output
cargo test -- --nocapture

Unit Test Coverage (62 tests)

CRD Tests (28 tests)

  • Label selectors and matching
  • SOA record structure
  • DNSZone specs (primary/secondary)
  • All DNS record types (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)
  • Bind9Instance configurations
  • DNSSEC settings

Bind9Manager Tests (34 tests)

  • Zone file creation
  • Email formatting for DNS
  • All DNS record types (with/without TTL)
  • Secondary zone configuration
  • Zone lifecycle (create, exists, delete)
  • Edge cases and workflows

Integration Tests

Integration tests run against Kind (Kubernetes in Docker) clusters.

Prerequisites

# Docker
docker --version  # 20.10+

# Kind
kind --version    # 0.20.0+
brew install kind  # macOS

# kubectl
kubectl version --client  # 1.24+

Running Integration Tests

make kind-integration-test

This automatically:

  1. Creates Kind cluster (if needed)
  2. Builds and deploys controller
  3. Runs all integration tests
  4. Cleans up test resources

Step-by-Step

# 1. Deploy to Kind
make kind-deploy

# 2. Run functional tests
make kind-test

# 3. Run comprehensive integration tests
make kind-integration-test

# 4. View logs
make kind-logs

# 5. Cleanup
make kind-cleanup

Integration Test Coverage

Rust Integration Tests

  • test_kubernetes_connectivity - Cluster access
  • test_crds_installed - CRD verification
  • test_create_and_cleanup_namespace - Namespace lifecycle

Full Integration Suite (integration_test.sh)

  • Bind9Instance creation
  • DNSZone creation
  • A Record (IPv4)
  • AAAA Record (IPv6)
  • CNAME Record
  • MX Record
  • TXT Record
  • NS Record
  • SRV Record
  • CAA Record

Expected Output

🧪 Running Bindy Integration Tests

✅ Using existing cluster 'bindy-test'

1️⃣  Running Rust integration tests...
test test_kubernetes_connectivity ... ok
test test_crds_installed ... ok
test test_create_and_cleanup_namespace ... ok

2️⃣  Running functional tests with kubectl...
Testing Bind9Instance creation...
Testing DNSZone creation...
Testing all DNS record types...

3️⃣  Verifying resources...
  ✓ Bind9Instance created
  ✓ DNSZone created
  ✓ arecord created
  ✓ aaaarecord created
  ✓ cnamerecord created
  ✓ mxrecord created
  ✓ txtrecord created
  ✓ nsrecord created
  ✓ srvrecord created
  ✓ caarecord created

✅ All integration tests passed!

Makefile Targets

Test Targets

make test                   # Run unit tests
make test-lib              # Library tests only
make test-integration      # Rust integration tests
make test-all             # Unit + Rust integration tests
make test-cov             # Coverage report (HTML)
make test-cov-view        # Generate and open coverage

Kind Targets

make kind-create          # Create Kind cluster
make kind-deploy          # Deploy controller
make kind-test            # Basic functional tests
make kind-integration-test # Full integration suite
make kind-logs            # View controller logs
make kind-cleanup         # Delete cluster

Other Targets

make lint                 # Run clippy and fmt check
make format               # Format code
make build                # Build release binary
make docker-build         # Build Docker image

Troubleshooting

Unit Tests

Tests fail to compile

cargo clean
cargo test

Specific test fails

cargo test test_name -- --nocapture

Integration Tests

“Cluster not found”

# Auto-created by integration test, or:
./deploy/kind-deploy.sh

“Controller not ready”

# Check status
kubectl get pods -n dns-system

# View logs
kubectl logs -n dns-system -l app=bindy

# Redeploy
./deploy/kind-deploy.sh

“CRDs not installed”

# Check CRDs
kubectl get crds | grep bindy.firestoned.io

# Install
kubectl apply -k deploy/crds

Resource creation fails

# Controller logs
kubectl logs -n dns-system -l app=bindy --tail=50

# Resource status
kubectl describe bind9instance <name> -n dns-system

# Events
kubectl get events -n dns-system --sort-by='.lastTimestamp'

Manual Cleanup

# Delete test resources
kubectl delete bind9instances,dnszones,arecords,aaaarecords,cnamerecords,mxrecords,txtrecords,nsrecords,srvrecords,caarecords --all -n dns-system

# Delete cluster
kind delete cluster --name bindy-test

# Clean build
cargo clean

CI/CD Integration

GitHub Actions

Current PR workflow (.github/workflows/pr.yaml):

  • Lint (formatting, clippy)
  • Test (unit tests)
  • Build (stable, beta)
  • Docker (build and push to ghcr.io)
  • Security audit
  • Coverage

Add Integration Tests

integration-tests:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: dtolnay/rust-toolchain@stable

    - name: Install Kind
      run: |
        curl -Lo ./kind https://kind.sigs.k8s.io/dl/latest/kind-linux-amd64
        chmod +x ./kind
        sudo mv ./kind /usr/local/bin/kind

    - name: Run Integration Tests
      run: |
        chmod +x tests/integration_test.sh
        ./tests/integration_test.sh

Test Development

Writing Unit Tests

Add to src/crd_tests.rs or src/reconcilers/tests.rs:

#![allow(unused)]
fn main() {
#[test]
fn test_my_feature() {
    // Arrange
    let (_temp_dir, manager) = create_test_manager();

    // Act
    let result = manager.my_operation();

    // Assert
    assert!(result.is_ok());
}
}

Writing Integration Tests

Add to tests/simple_integration.rs:

#![allow(unused)]
fn main() {
#[tokio::test]
#[ignore]  // Always mark as ignored
async fn test_my_scenario() {
    let client = match get_kube_client_or_skip().await {
        Some(c) => c,
        None => return,  // Skip if no cluster
    };

    // Test code here
}
}

Using Test Helpers

From tests/common/mod.rs:

#![allow(unused)]
fn main() {
use common::*;

let client = setup_dns_test_environment("my-test-ns").await?;
create_bind9_instance(&client, "ns", "dns", None).await?;
wait_for_ready(Duration::from_secs(10)).await;
cleanup_test_namespace(&client, "ns").await?;
}

Performance Testing

Coverage

make test-cov-view
# Opens coverage/tarpaulin-report.html

Load Testing

# Create many resources
for i in {1..100}; do
  kubectl apply -f - <<EOF
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: test-${i}
  namespace: dns-system
spec:
  zone: example.com
  name: host-${i}
  ipv4Address: "192.0.2.${i}"
EOF
done

# Monitor
kubectl top pod -n dns-system

Best Practices

Unit Tests

  • Test one thing at a time
  • Fast (< 1s each)
  • No external dependencies
  • Descriptive names

Integration Tests

  • Always use #[ignore]
  • Check cluster connectivity first
  • Unique namespaces
  • Always cleanup
  • Good error messages

General

  • Run cargo fmt before committing
  • Run cargo clippy to catch issues
  • Keep tests updated
  • Document complex scenarios

Additional Resources

Support

  • GitHub Issues: https://github.com/firestoned/bindy/issues
  • Controller logs: make kind-logs
  • Test with output: cargo test -- --nocapture

Test Coverage

Test Statistics

Total Unit Tests: 95 (96 including helper tests)

Test Breakdown by Module

bind9 Module (34 tests)

Zone file and DNS record management tests:

  • Zone creation and management (primary/secondary)
  • All 8 DNS record types (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)
  • Record lifecycle (add, update, delete)
  • TTL handling
  • Special characters and edge cases
  • Complete workflow tests

bind9_resources Module (21 tests)

Kubernetes resource builder tests:

  • Label generation and consistency
  • ConfigMap creation with BIND9 configuration
  • Deployment creation with proper specs
  • Service creation with TCP/UDP ports
  • Pod specification validation
  • Volume and volume mount configuration
  • Health and readiness probes
  • BIND9 configuration options:
    • Recursion settings
    • ACL configuration (allowQuery, allowTransfer)
    • DNSSEC configuration
    • Multiple ACL entries
  • Resource naming conventions
  • Selector matching (Deployment ↔ Service)

crd_tests Module (28 tests)

CRD structure and validation tests:

  • Label selectors and requirements
  • SOA record structure
  • Secondary zone configuration
  • All DNS record specs (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)
  • BIND9 configuration structures
  • DNSSEC configuration
  • Bind9Instance specifications
  • Status structures for all resource types

Status and Condition Tests (17 new tests)

Comprehensive condition type validation:

  • All 5 condition types: Ready, Available, Progressing, Degraded, Failed
  • All 3 status values: True, False, Unknown
  • Condition field validation (type, status, reason, message, lastTransitionTime)
  • Multiple conditions support
  • Status structures for:
    • Bind9Instance (with replicas tracking)
    • DNSZone (with record count)
    • All DNS record types
  • Condition serialization/deserialization
  • Observed generation tracking
  • Edge cases (no conditions, empty status)

Integration Tests (4 tests, 3 ignored)

  • Kubernetes connectivity (ignored - requires cluster)
  • CRD installation verification (ignored - requires cluster)
  • Namespace creation/cleanup (ignored - requires cluster)
  • Unit test verification (always runs)

Test Categories

Unit Tests (95)

  • Pure Functions: All resource builders, configuration generators
  • Data Structures: All CRD types, status structures, conditions
  • Business Logic: Zone management, record handling
  • Validation: Condition types, status values, configuration options

Integration Tests (3 ignored + 1 running)

  • Kubernetes cluster connectivity
  • CRD deployment
  • Resource lifecycle
  • End-to-end workflows

Coverage by Feature

CRD Validation

  • ✅ All 10 CRDs have proper structure tests
  • ✅ Condition types validated (Ready, Available, Progressing, Degraded, Failed)
  • ✅ Status values validated (True, False, Unknown)
  • ✅ Required fields enforced in CRD definitions
  • ✅ Serialization/deserialization tested

BIND9 Configuration

  • ✅ Named configuration file generation
  • ✅ Options configuration with all settings
  • ✅ Recursion control
  • ✅ ACL management (query, transfer)
  • ✅ DNSSEC configuration (enable, validation)
  • ✅ Default value handling
  • ✅ Multiple ACL entries
  • ✅ Empty ACL lists

Kubernetes Resources

  • ✅ Deployment creation with proper replica counts
  • ✅ Service creation with TCP/UDP ports
  • ✅ ConfigMap creation with BIND9 config
  • ✅ Label consistency across resources
  • ✅ Selector matching
  • ✅ Volume and volume mount configuration
  • ✅ Health probes (liveness, readiness)
  • ✅ Container image version handling

DNS Records

  • ✅ All 8 record types (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)
  • ✅ Record creation with TTL
  • ✅ Default TTL handling
  • ✅ Multiple records per zone
  • ✅ Special characters in records
  • ✅ Record deletion
  • ✅ Zone apex vs subdomain records

Status Management

  • ✅ Condition creation with all fields
  • ✅ Multiple conditions per resource
  • ✅ Observed generation tracking
  • ✅ Replica count tracking (Bind9Instance)
  • ✅ Record count tracking (DNSZone)
  • ✅ Status transitions (Ready ↔ Failed)
  • ✅ Degraded state handling

Running Tests

All Tests

cargo test

Unit Tests Only

cargo test --lib

Specific Module

cargo test --lib bind9_resources
cargo test --lib crd_tests

Integration Tests

cargo test --test simple_integration -- --ignored

With Coverage

cargo tarpaulin --verbose --all-features --workspace --timeout 120 --out Xml

Test Quality Metrics

  • Coverage: High coverage of core functionality
  • Isolation: All unit tests are isolated and independent
  • Speed: All unit tests complete in < 0.01 seconds
  • Deterministic: No flaky tests, all results are reproducible
  • Comprehensive: Tests cover happy paths, edge cases, and error conditions

Recent Additions (26 new tests)

bind9_resources Module (+14 tests)

  1. test_build_pod_spec - Pod specification validation
  2. test_build_deployment_replicas - Replica count configuration
  3. test_build_deployment_version - BIND9 version handling
  4. test_build_service_ports - TCP/UDP port configuration
  5. test_configmap_contains_all_files - ConfigMap completeness
  6. test_options_conf_with_recursion_enabled - Recursion configuration
  7. test_options_conf_with_multiple_acls - Multiple ACL entries
  8. test_labels_consistency - Label validation
  9. test_configmap_naming - Naming conventions
  10. test_deployment_selector_matches_labels - Selector consistency
  11. test_service_selector_matches_deployment - Service selector matching
  12. test_dnssec_config_enabled - DNSSEC enable flag
  13. test_dnssec_config_validation_only - DNSSEC validation flag
  14. test_options_conf_with_empty_transfer - Empty transfer lists

crd_tests Module (+17 tests)

  1. test_condition_types - All 5 condition types validation
  2. test_condition_status_values - All 3 status values validation
  3. test_condition_with_all_fields - Complete condition structure
  4. test_multiple_conditions - Multiple conditions support
  5. test_dnszone_status_with_conditions - DNSZone status
  6. test_record_status_with_condition - Record status
  7. test_degraded_condition - Degraded state handling
  8. test_failed_condition - Failed state handling
  9. test_available_condition - Available state
  10. test_progressing_condition - Progressing state
  11. test_condition_serialization - JSON serialization
  12. test_status_with_no_conditions - Empty conditions list
  13. test_observed_generation_tracking - Generation tracking
  14. test_bind9_config - BIND9 configuration structure
  15. test_dnssec_config - DNSSEC configuration
  16. test_bind9instance_spec - Instance specification
  17. test_bind9instance_status_default - Status defaults

Next Steps

Potential Test Additions

  • Integration tests for actual BIND9 deployment
  • Integration tests for zone transfer between primary/secondary
  • Performance tests for large zone files
  • Stress tests with many concurrent updates
  • Property-based tests for configuration generation
  • Mock reconciler tests
  • Controller loop tests

Test Infrastructure

  • Add benchmarks for critical paths
  • Add mutation testing
  • Add fuzz testing for DNS record parsing
  • Set up continuous coverage tracking
  • Add test fixtures and helpers

Continuous Integration

All tests run automatically in GitHub Actions:

  • PR Workflow: Runs on every pull request
  • Main Workflow: Runs on pushes to main branch
  • Coverage: Uploaded to Codecov after each run
  • Integration: Runs in dedicated workflow with Kind cluster

Development Workflow

Daily development workflow for Bindy contributors.

Development Cycle

  1. Create feature branch
git checkout -b feature/my-feature
  1. Make changes
  • Edit code in src/
  • If modifying CRDs, edit Rust types in src/crd.rs
  • Add tests
  • Update documentation
  1. Regenerate CRDs (if modified)
# If you modified src/crd.rs, regenerate YAML files
cargo run --bin crdgen
# or
make crds
  1. Test locally
cargo test
cargo clippy -- -D warnings
cargo fmt
  1. Validate CRDs
# Ensure generated CRDs are valid
kubectl apply --dry-run=client -f deploy/crds/
  1. Commit changes
git add .
git commit -m "Add feature: description"
  1. Push and create PR
git push origin feature/my-feature
# Create PR on GitHub

CRD Development

IMPORTANT: src/crd.rs is the source of truth. CRD YAML files in deploy/crds/ are auto-generated.

Modifying Existing CRDs

  1. Edit the Rust type in src/crd.rs:
#![allow(unused)]
fn main() {
#[derive(CustomResource, Clone, Debug, Serialize, Deserialize, JsonSchema)]
#[kube(
    group = "bindy.firestoned.io",
    version = "v1alpha1",
    kind = "Bind9Cluster",
    namespaced
)]
#[serde(rename_all = "camelCase")]
pub struct Bind9ClusterSpec {
    pub version: Option<String>,
    // Add new fields here
    pub new_field: Option<String>,
}
}
  1. Regenerate YAML files:
cargo run --bin crdgen
# or
make crds
  1. Verify the generated YAML:
# Check the generated file
cat deploy/crds/bind9clusters.crd.yaml

# Validate it
kubectl apply --dry-run=client -f deploy/crds/bind9clusters.crd.yaml
  1. Update documentation to describe the new field

Adding New CRDs

  1. Define the CustomResource in src/crd.rs
  2. Add to crdgen in src/bin/crdgen.rs:
#![allow(unused)]
fn main() {
generate_crd::<MyNewResource>("mynewresources.crd.yaml", output_dir)?;
}
  1. Regenerate YAMLs: make crds
  2. Export the type in src/lib.rs if needed

Generated YAML Format

All generated CRD files include:

  • Copyright header
  • SPDX license identifier
  • Auto-generated warning

Never edit YAML files directly - they will be overwritten!

Local Testing

# Start kind cluster
kind create cluster --name bindy-dev

# Deploy CRDs (regenerate first if modified)
make crds
kubectl apply -k deploy/crds/

# Run controller locally
RUST_LOG=debug cargo run

Hot Reload

# Auto-rebuild on changes
cargo watch -x 'run --release'

GitHub Pages Setup Guide

This guide explains how to enable GitHub Pages for the Bindy documentation.

Prerequisites

  • Repository must be pushed to GitHub
  • You must have admin access to the repository
  • The .github/workflows/docs.yaml workflow file must be present

Setup Steps

1. Enable GitHub Pages

  1. Go to your repository on GitHub: https://github.com/firestoned/bindy
  2. Click Settings (in the repository menu)
  3. Scroll down to the Pages section in the left sidebar
  4. Click on Pages

2. Configure Source

Under “Build and deployment”:

  1. Source: Select “GitHub Actions”
  2. This will use the workflow in .github/workflows/docs.yaml

That’s it! GitHub will automatically use the workflow.

3. Trigger the First Build

The documentation will be built and deployed automatically when you push to the main branch.

To trigger the first build:

  1. Push any change to main:

    git push origin main
    
  2. Or manually trigger the workflow:

    • Go to Actions tab
    • Click on “Documentation” workflow
    • Click “Run workflow”
    • Select main branch
    • Click “Run workflow”

4. Monitor the Build

  1. Go to the Actions tab in your repository
  2. Click on the “Documentation” workflow run
  3. Watch the build progress
  4. Once complete, the “deploy” job will show the URL

5. Access Your Documentation

Once deployed, your documentation will be available at:

https://firestoned.github.io/bindy/

Verification

Check Deployment Status

  1. Go to SettingsPages
  2. You should see: “Your site is live at https://firestoned.github.io/bindy/”
  3. Click “Visit site” to view the documentation

Verify Documentation Structure

Your deployed site should have:

  • Main documentation (mdBook): https://firestoned.github.io/bindy/
  • API reference (rustdoc): https://firestoned.github.io/bindy/rustdoc/

Troubleshooting

Build Fails

Check workflow logs:

  1. Go to Actions tab
  2. Click on the failed workflow run
  3. Expand the failed step to see the error
  4. Common issues:
    • Rust compilation errors
    • mdBook build errors
    • Missing files

Fix and retry:

  1. Fix the issue locally
  2. Test with make docs
  3. Push the fix to main
  4. GitHub Actions will automatically retry

Pages Not Showing

Verify GitHub Pages is enabled:

  1. Go to SettingsPages
  2. Ensure source is set to “GitHub Actions”
  3. Check that at least one successful deployment has completed

Check permissions:

The workflow needs these permissions (already configured in docs.yaml):

permissions:
  contents: read
  pages: write
  id-token: write

404 Errors on Subpages

Check base URL configuration:

The book.toml has:

site-url = "/bindy/"

This must match your repository name. If your repository is named differently, update this value.

Custom Domain (Optional)

To use a custom domain:

  1. Go to SettingsPages
  2. Under “Custom domain”, enter your domain
  3. Update the CNAME field in book.toml:
    cname = "docs.yourdomain.com"
    
  4. Configure DNS:
    • Add a CNAME record pointing to firestoned.github.io
    • Or A records pointing to GitHub Pages IPs

Updating Documentation

Documentation is automatically deployed on every push to main:

# Make changes to documentation
vim docs/src/introduction.md

# Commit and push
git add docs/src/introduction.md
git commit -m "Update introduction"
git push origin main

# GitHub Actions will automatically build and deploy

Local Preview

Before pushing, preview your changes locally:

# Build and serve documentation
make docs-serve

# Or watch for changes
make docs-watch

# Open http://localhost:3000 in your browser

Workflow Details

The GitHub Actions workflow (.github/workflows/docs.yaml):

  1. Build job:

    • Checks out the repository
    • Sets up Rust toolchain
    • Installs mdBook
    • Builds rustdoc API documentation
    • Builds mdBook user documentation
    • Combines both into a single site
    • Uploads artifact to GitHub Pages
  2. Deploy job (only on main):

    • Deploys the artifact to GitHub Pages
    • Updates the live site

To ensure documentation quality:

  1. Go to SettingsBranches
  2. Add a branch protection rule for main:
    • Require pull request reviews
    • Require status checks (include “Documentation / Build Documentation”)
    • This ensures the documentation builds before merging

Additional Configuration

Custom Theme

The documentation uses a custom theme defined in:

  • docs/theme/custom.css - Custom styling

To customize:

  1. Edit the CSS file
  2. Test locally with make docs-watch
  3. Push to main

Search Configuration

Search is configured in book.toml:

[output.html.search]
enable = true
limit-results = 30

Adjust as needed for your use case.

Support

For issues with GitHub Pages deployment:

  • GitHub Pages Status: https://www.githubstatus.com/
  • GitHub Actions Documentation: https://docs.github.com/en/actions
  • GitHub Pages Documentation: https://docs.github.com/en/pages

For issues with the documentation content:

  • Create an issue: https://github.com/firestoned/bindy/issues
  • Start a discussion: https://github.com/firestoned/bindy/discussions

Architecture Deep Dive

Technical architecture of the Bindy DNS operator.

System Architecture

┌─────────────────────────────────────┐
│     Kubernetes API Server           │
└──────────────┬──────────────────────┘
               │ Watch/Update
     ┌─────────▼────────────┐
     │  Bindy Controller    │
     │  ┌────────────────┐  │
     │  │ Reconcilers    │  │
     │  │  - Bind9Inst   │  │
     │  │  - DNSZone     │  │
     │  │  - Records     │  │
     │  └────────────────┘  │
     └──────┬───────────────┘
            │ Manages
     ┌──────▼────────────────┐
     │  BIND9 Pods           │
     │  ┌──────────────────┐ │
     │  │ ConfigMaps       │ │
     │  │ Deployments      │ │
     │  │ Services         │ │
     │  └──────────────────┘ │
     └───────────────────────┘

Components

Controller

  • Watches CRD resources
  • Reconciles desired vs actual state
  • Manages Kubernetes resources

Reconcilers

  • Per-resource reconciliation logic
  • Idempotent operations
  • Error handling and retries

BIND9 Integration

  • Configuration generation
  • Zone file management
  • BIND9 lifecycle management

See detailed docs:

Controller Design

Design and implementation of the Bindy controller.

Controller Pattern

Bindy implements the Kubernetes controller pattern:

  1. Watch - Monitor CRD resources
  2. Reconcile - Ensure actual state matches desired
  3. Update - Apply changes to Kubernetes resources

Reconciliation Loop

#![allow(unused)]
fn main() {
loop {
    // Get resource from work queue
    let resource = queue.pop();
    
    // Reconcile
    match reconcile(resource).await {
        Ok(_) => {
            // Success - requeue with normal delay
            queue.requeue(resource, Duration::from_secs(300));
        }
        Err(e) => {
            // Error - retry with backoff
            queue.requeue_with_backoff(resource, e);
        }
    }
}
}

State Management

Controller maintains no local state - all state in Kubernetes:

  • CRD resources (desired state)
  • Deployments, Services, ConfigMaps (actual state)
  • Status fields (observed state)

Error Handling

  • Transient errors: Retry with exponential backoff
  • Permanent errors: Update status, log, requeue
  • Resource conflicts: Retry with latest version

Reconciliation Logic

Detailed reconciliation logic for each resource type.

Status Update Optimization

All reconcilers implement status change detection to prevent tight reconciliation loops. Before updating the status subresource, each reconciler checks if the status has actually changed. This prevents unnecessary API calls and reconciliation cycles.

Status is only updated when:

  • Condition type changes
  • Status value changes
  • Message changes
  • Status doesn’t exist yet

This optimization is implemented in:

Bind9Instance Reconciliation

#![allow(unused)]
fn main() {
async fn reconcile_bind9instance(instance: Bind9Instance) -> Result<()> {
    // 1. Build desired resources
    let configmap = build_configmap(&instance);
    let deployment = build_deployment(&instance);
    let service = build_service(&instance);
    
    // 2. Apply or update ConfigMap
    apply_configmap(configmap).await?;
    
    // 3. Apply or update Deployment
    apply_deployment(deployment).await?;
    
    // 4. Apply or update Service
    apply_service(service).await?;
    
    // 5. Update status
    update_status(&instance, "Ready").await?;
    
    Ok(())
}
}

DNSZone Reconciliation

DNSZone reconciliation uses granular status updates to provide real-time progress visibility and better error reporting. The reconciliation follows a multi-phase approach with status updates at each phase.

Reconciliation Flow

#![allow(unused)]
fn main() {
async fn reconcile_dnszone(zone: DNSZone) -> Result<()> {
    // Phase 1: Set Progressing status before primary reconciliation
    update_condition(&zone, "Progressing", "True", "PrimaryReconciling",
                     "Configuring zone on primary instances").await?;

    // Phase 2: Configure zone on primary instances
    let primary_count = add_dnszone(client, &zone, zone_manager).await
        .map_err(|e| {
            // On failure: Set Degraded status (primary failure is fatal)
            update_condition(&zone, "Degraded", "True", "PrimaryFailed",
                           &format!("Failed to configure zone on primaries: {}", e)).await?;
            e
        })?;

    // Phase 3: Set Progressing status after primary success
    update_condition(&zone, "Progressing", "True", "PrimaryReconciled",
                     &format!("Configured on {} primary server(s)", primary_count)).await?;

    // Phase 4: Set Progressing status before secondary reconciliation
    let secondary_msg = format!("Configured on {} primary server(s), now configuring secondaries", primary_count);
    update_condition(&zone, "Progressing", "True", "SecondaryReconciling", &secondary_msg).await?;

    // Phase 5: Configure zone on secondary instances (non-fatal if fails)
    match add_dnszone_to_secondaries(client, &zone, zone_manager).await {
        Ok(secondary_count) => {
            // Phase 6: Success - Set Ready status
            let msg = format!("Configured on {} primary server(s) and {} secondary server(s)",
                            primary_count, secondary_count);
            update_status_with_secondaries(&zone, "Ready", "True", "ReconcileSucceeded",
                                          &msg, secondary_ips).await?;
        }
        Err(e) => {
            // Phase 6: Partial success - Set Degraded status (primaries work, secondaries failed)
            let msg = format!("Configured on {} primary server(s), but secondary configuration failed: {}",
                            primary_count, e);
            update_status_with_secondaries(&zone, "Degraded", "True", "SecondaryFailed",
                                          &msg, secondary_ips).await?;
        }
    }

    Ok(())
}
}

Status Conditions

DNSZone reconciliation uses three condition types:

  • Progressing - During reconciliation phases

    • Reason: PrimaryReconciling - Before primary configuration
    • Reason: PrimaryReconciled - After primary configuration succeeds
    • Reason: SecondaryReconciling - Before secondary configuration
    • Reason: SecondaryReconciled - After secondary configuration succeeds
  • Ready - Successful reconciliation

    • Reason: ReconcileSucceeded - All phases completed successfully
  • Degraded - Partial or complete failure

    • Reason: PrimaryFailed - Primary configuration failed (fatal, reconciliation aborts)
    • Reason: SecondaryFailed - Secondary configuration failed (non-fatal, primaries still work)

Benefits

  1. Real-time progress visibility - Users can see which phase is running
  2. Better error reporting - Know exactly which phase failed (primary vs secondary)
  3. Graceful degradation - Secondary failures don’t break the zone (primaries still work)
  4. Accurate status - Endpoint counts reflect actual configured servers

Record Reconciliation

All record types (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA) follow a consistent pattern with granular status updates for better observability.

Reconciliation Flow

#![allow(unused)]
fn main() {
async fn reconcile_record(record: Record) -> Result<()> {
    // Phase 1: Set Progressing status before configuration
    update_record_status(&record, "Progressing", "True", "RecordReconciling",
                        "Configuring A record on zone endpoints").await?;

    // Phase 2: Get zone and configure record on all endpoints
    let zone = get_zone(&record.spec.zone).await?;

    match add_record_to_all_endpoints(&zone, &record).await {
        Ok(endpoint_count) => {
            // Phase 3: Success - Set Ready status with endpoint count
            let msg = format!("Record configured on {} endpoint(s)", endpoint_count);
            update_record_status(&record, "Ready", "True", "ReconcileSucceeded", &msg).await?;
        }
        Err(e) => {
            // Phase 3: Failure - Set Degraded status with error details
            let msg = format!("Failed to configure record: {}", e);
            update_record_status(&record, "Degraded", "True", "RecordFailed", &msg).await?;
            return Err(e);
        }
    }

    Ok(())
}
}

Status Conditions

All DNS record types use three condition types:

  • Progressing - During record configuration

    • Reason: RecordReconciling - Before adding record to zone endpoints
  • Ready - Successful configuration

    • Reason: ReconcileSucceeded - Record configured on all endpoints
    • Message includes count of configured endpoints (e.g., “Record configured on 3 endpoint(s)”)
  • Degraded - Configuration failure

    • Reason: RecordFailed - Failed to configure record (includes error details)

Benefits

  1. Real-time progress - See when records are being configured
  2. Better debugging - Know immediately if/why a record failed
  3. Accurate reporting - Status shows exact number of endpoints configured
  4. Consistent with zones - Same status pattern as DNSZone reconciliation

Supported Record Types

All 8 record types use this granular status approach:

  • A - IPv4 address records
  • AAAA - IPv6 address records
  • CNAME - Canonical name (alias) records
  • MX - Mail exchange records
  • TXT - Text records (SPF, DKIM, DMARC, etc.)
  • NS - Nameserver delegation records
  • SRV - Service location records
  • CAA - Certificate authority authorization records

Reconciler Hierarchy and Delegation

This document describes the simplified reconciler architecture in Bindy, showing how each controller watches for resources and delegates to sub-resources.

Overview

Bindy follows a hierarchical delegation pattern where each reconciler is responsible for creating and managing its immediate child resources. This creates a clean separation of concerns and makes the system easier to understand and maintain.

graph TD
    GC[Bind9GlobalCluster<br/>cluster-scoped] -->|creates| BC[Bind9Cluster<br/>namespace-scoped]
    BC -->|creates| BI[Bind9Instance<br/>namespace-scoped]
    BI -->|creates| RES[Kubernetes Resources<br/>ServiceAccount, Secret,<br/>ConfigMap, Deployment, Service]
    BI -.->|targets| DZ[DNSZone<br/>namespace-scoped]
    BI -.->|targets| REC[DNS Records<br/>namespace-scoped]
    DZ -->|creates zones via<br/>bindcar HTTP API| BIND9[BIND9 Pods]
    REC -->|creates records via<br/>hickory DNS UPDATE| BIND9
    REC -->|notifies via<br/>bindcar HTTP API| BIND9

    style GC fill:#e1f5ff
    style BC fill:#e1f5ff
    style BI fill:#e1f5ff
    style DZ fill:#fff4e1
    style REC fill:#fff4e1
    style RES fill:#e8f5e9
    style BIND9 fill:#f3e5f5

Reconciler Details

1. Bind9GlobalCluster Reconciler

Scope: Cluster-scoped resource

Purpose: Creates Bind9Cluster resources in desired namespaces to enable multi-tenant DNS infrastructure.

Watches: Bind9GlobalCluster resources

Creates: Bind9Cluster resources in the namespace specified in the spec, or defaults to dns-system

Change Detection:

  • Spec changed: Uses should_reconcile() to compare metadata.generation with status.observed_generation
  • Desired vs actual state: Verifies all Bind9Cluster resources exist in target namespaces

Implementation: src/reconcilers/bind9globalcluster.rs

Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: global-dns
spec:
  namespaces:
    - platform-dns
    - team-web
    - team-api
  primaryReplicas: 2
  secondaryReplicas: 3

Creates Bind9Cluster resources in each namespace: platform-dns, team-web, team-api.


2. Bind9Cluster Reconciler

Scope: Namespace-scoped resource

Purpose: Creates and manages Bind9Instance resources based on desired replica counts for primary and secondary servers.

Watches: Bind9Cluster resources

Creates:

  • Bind9Instance resources for primaries (e.g., my-cluster-primary-0, my-cluster-primary-1)
  • Bind9Instance resources for secondaries (e.g., my-cluster-secondary-0, my-cluster-secondary-1)
  • ConfigMap with shared BIND9 configuration (optional, for standalone configs)

Change Detection:

  • Spec changed: Uses should_reconcile() to compare metadata.generation with status.observed_generation
  • Desired vs actual state:
    • Verifies all Bind9Instance resources exist
    • Scales instances up/down based on primaryReplicas and secondaryReplicas

Implementation: src/reconcilers/bind9cluster.rs

Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: my-cluster
  namespace: platform-dns
spec:
  primaryReplicas: 2
  secondaryReplicas: 3

Creates:

  • my-cluster-primary-0, my-cluster-primary-1 (primaries)
  • my-cluster-secondary-0, my-cluster-secondary-1, my-cluster-secondary-2 (secondaries)

3. Bind9Instance Reconciler

Scope: Namespace-scoped resource

Purpose: Creates all Kubernetes resources needed to run a single BIND9 server pod.

Watches: Bind9Instance resources

Creates:

  • ServiceAccount: For pod identity and RBAC
  • Secret: Contains auto-generated RNDC key (HMAC-SHA256) for authentication
  • ConfigMap: BIND9 configuration (named.conf, zone files, etc.) - only for standalone instances
  • Deployment: Runs the BIND9 pod with bindcar HTTP API sidecar
  • Service: Exposes DNS (UDP/TCP 53) and HTTP API (TCP 8080) ports

Change Detection:

  • Spec changed: Uses should_reconcile() to compare metadata.generation with status.observed_generation
  • Desired vs actual state (drift detection):
    • Checks if Deployment resource exists
    • Recreates missing resources if detected

Implementation: src/reconcilers/bind9instance.rs

Drift Detection Logic:

#![allow(unused)]
fn main() {
// Only reconcile resources if:
// 1. Spec changed (generation mismatch), OR
// 2. We haven't processed this resource yet (no observed_generation), OR
// 3. Resources are missing (drift detected)
let should_reconcile = should_reconcile(current_generation, observed_generation);

if !should_reconcile && deployment_exists {
    // Skip reconciliation - spec unchanged and resources exist
    return Ok(());
}

if !should_reconcile && !deployment_exists {
    // Drift detected - recreate missing resources
    info!("Spec unchanged but Deployment missing - drift detected, reconciling resources");
}
}

Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: my-cluster-primary-0
  namespace: platform-dns
spec:
  role: Primary
  clusterRef: my-cluster
  replicas: 1

Creates: ServiceAccount, Secret, ConfigMap, Deployment, Service for my-cluster-primary-0.


4. DNSZone Reconciler

Scope: Namespace-scoped resource

Purpose: Creates DNS zones in ALL BIND9 instances (primary and secondary) via the bindcar HTTP API.

Watches: DNSZone resources

Creates: DNS zones in BIND9 using the bindcar HTTP API sidecar

Change Detection:

  • Spec changed: Uses should_reconcile() to compare metadata.generation with status.observed_generation
  • Desired vs actual state:
    • Checks if zone exists using zone_manager.zone_exists() via HTTP API
    • Early returns if spec unchanged

Implementation: src/reconcilers/dnszone.rs

Protocol Details:

  • Zone operations: HTTP API via bindcar sidecar (port 8080)
  • Endpoints:
    • POST /api/addzone/{zone} - Add primary/secondary zone
    • DELETE /api/delzone/{zone} - Delete zone
    • POST /api/notify/{zone} - Trigger zone transfer (NOTIFY)
    • GET /api/zonestatus/{zone} - Check if zone exists

Logic Flow:

  1. Finds all primary instances for the cluster (namespace-scoped or global)
  2. Loads RNDC key for each instance (from Secret {instance}-rndc-key)
  3. Calls zone_manager.add_zones() via HTTP API on all primary endpoints
  4. Finds all secondary instances for the cluster
  5. Calls zone_manager.add_secondary_zone() via HTTP API on all secondary endpoints
  6. Notifies secondaries via zone_manager.notify_zone() to trigger zone transfer

Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-zone
  namespace: platform-dns
spec:
  zoneName: example.com
  clusterRef: my-cluster
  soa:
    primaryNameServer: ns1.example.com
    adminEmail: admin.example.com
    ttl: 3600

Creates zone example.com in all instances of my-cluster via HTTP API.


5. DNS Record Reconcilers

Scope: Namespace-scoped resources

Purpose: Create DNS records in zones using hickory DNS UPDATE (RFC 2136) and notify secondaries via bindcar HTTP API.

Watches: ARecord, AAAARecord, CNAMERecord, TXTRecord, MXRecord, NSRecord, SRVRecord, CAARecord

Creates: DNS records in BIND9 using two protocols:

  1. DNS UPDATE (RFC 2136) via hickory client - for creating records
  2. HTTP API via bindcar sidecar - for notifying secondaries

Change Detection:

  • Spec changed: Uses should_reconcile() to compare metadata.generation with status.observed_generation
  • Desired vs actual state:
    • Checks if zone exists using HTTP API before adding records
    • Returns error if zone doesn’t exist

Implementation: src/reconcilers/records.rs

Protocol Details:

OperationProtocolPortAuthentication
Check zone existsHTTP API (bindcar)8080ServiceAccount token
Add/update recordsDNS UPDATE (hickory)53 (TCP)TSIG (RNDC key)
Notify secondariesHTTP API (bindcar)8080ServiceAccount token

Logic Flow:

  1. Looks up the DNSZone resource to get zone info
  2. Finds all primary instances for the zone’s cluster
  3. For each primary instance:
    • Checks if zone exists via HTTP API (port 8080)
    • Loads RNDC key from Secret
    • Creates TSIG signer for authentication
    • Sends DNS UPDATE message via hickory client (port 53 TCP)
  4. After all records are added, notifies first primary via HTTP API to trigger zone transfer

Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example-com
  namespace: platform-dns
spec:
  zone: example.com
  name: www
  ipv4: 192.0.2.1
  ttl: 300

Creates A record www.example.com192.0.2.1 in all primary instances via DNS UPDATE, then notifies secondaries via HTTP API.


Change Detection Logic

All reconcilers implement the “changed” detection pattern, which means they reconcile when:

  1. Spec changed: metadata.generationstatus.observed_generation
  2. First reconciliation: status.observed_generation is None
  3. Drift detected: Desired state (YAML) ≠ actual state (cluster)

Implementation: should_reconcile()

Located in src/reconcilers/mod.rs:127-133:

#![allow(unused)]
fn main() {
pub fn should_reconcile(current_generation: Option<i64>, observed_generation: Option<i64>) -> bool {
    match (current_generation, observed_generation) {
        (Some(current), Some(observed)) => current != observed,
        (Some(_), None) => true, // First reconciliation
        _ => false,              // No generation tracking available
    }
}
}

Kubernetes Generation Semantics

  • metadata.generation: Incremented by Kubernetes API server only when spec changes
  • status.observed_generation: Set by controller to match metadata.generation after successful reconciliation
  • Status-only updates: Do NOT increment metadata.generation, preventing unnecessary reconciliations

Example: Reconciliation Flow

sequenceDiagram
    participant User
    participant K8s API
    participant Reconciler
    participant Status

    User->>K8s API: Create DNSZone (generation=1)
    K8s API->>Reconciler: Watch event (generation=1)
    Reconciler->>Reconciler: should_reconcile(1, None) → true
    Reconciler->>Reconciler: Create zone via HTTP API
    Reconciler->>Status: Update observed_generation=1

    User->>K8s API: Update DNSZone spec (generation=2)
    K8s API->>Reconciler: Watch event (generation=2)
    Reconciler->>Reconciler: should_reconcile(2, 1) → true
    Reconciler->>Reconciler: Update zone via HTTP API
    Reconciler->>Status: Update observed_generation=2

    Note over Reconciler: Status-only update (no spec change)
    Reconciler->>Status: Update phase=Ready (generation stays 2)
    Reconciler->>Reconciler: should_reconcile(2, 2) → false
    Reconciler->>Reconciler: Skip reconciliation ✓

Protocol Summary

ComponentCreatesProtocolPortAuthentication
Bind9GlobalClusterBind9ClusterKubernetes API-ServiceAccount
Bind9ClusterBind9InstanceKubernetes API-ServiceAccount
Bind9InstanceK8s ResourcesKubernetes API-ServiceAccount
DNSZoneZones in BIND9HTTP API (bindcar)8080ServiceAccount token
DNS RecordsRecords in zonesDNS UPDATE (hickory)53 TCPTSIG (RNDC key)
DNS RecordsNotify secondariesHTTP API (bindcar)8080ServiceAccount token

Key Architectural Principles

1. Hierarchical Delegation

Each reconciler creates and manages only its immediate children:

  • Bind9GlobalClusterBind9Cluster
  • Bind9ClusterBind9Instance
  • Bind9Instance → Kubernetes resources

2. Namespace Scoping

All resources (except Bind9GlobalCluster) are namespace-scoped, enabling multi-tenancy:

  • Teams can manage their own DNS infrastructure in their namespaces
  • No cross-namespace resource access required

3. Change Detection

All reconcilers implement consistent change detection:

  • Skip work if spec unchanged and resources exist
  • Detect drift and recreate missing resources
  • Use generation tracking to avoid unnecessary reconciliations

4. Protocol Separation

  • HTTP API (bindcar): Zone-level operations (add, delete, notify)
  • DNS UPDATE (hickory): Record-level operations (add, update, delete records)
  • Kubernetes API: Resource lifecycle management

5. Idempotency

All operations are idempotent:

  • Adding an existing zone returns success
  • Adding an existing record updates it
  • Deleting a non-existent resource returns success

6. Error Handling

Each reconciler handles errors gracefully:

  • Updates status with error conditions
  • Retries on transient failures (exponential backoff)
  • Requeues on permanent errors with longer delays

Owner References and Resource Cleanup

Bindy implements proper Kubernetes owner references to ensure automatic cascade deletion and prevent resource leaks.

What are Owner References?

Owner references are Kubernetes metadata that establish parent-child relationships between resources. When set, Kubernetes automatically:

  • Garbage collects child resources when the parent is deleted
  • Blocks deletion of the parent if children still exist (when blockOwnerDeletion: true)
  • Shows ownership in resource metadata for easy tracking

Owner Reference Hierarchy in Bindy

graph TD
    GC[Bind9GlobalCluster<br/>cluster-scoped] -->|ownerReference| BC[Bind9Cluster<br/>namespace-scoped]
    BC -->|ownerReference| BI[Bind9Instance<br/>namespace-scoped]
    BI -->|ownerReferences| DEP[Deployment]
    BI -->|ownerReferences| SVC[Service]
    BI -->|ownerReferences| CM[ConfigMap]
    BI -->|ownerReferences| SEC[Secret]

    style GC fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style BC fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style BI fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style DEP fill:#e8f5e9,stroke:#4caf50
    style SVC fill:#e8f5e9,stroke:#4caf50
    style CM fill:#e8f5e9,stroke:#4caf50
    style SEC fill:#e8f5e9,stroke:#4caf50

Implementation Details

1. Bind9GlobalCluster → Bind9Cluster

Location: src/reconcilers/bind9globalcluster.rs:340-352

#![allow(unused)]
fn main() {
// Create ownerReference to global cluster (cluster-scoped can own namespace-scoped)
let owner_ref = OwnerReference {
    api_version: API_GROUP_VERSION.to_string(),
    kind: KIND_BIND9_GLOBALCLUSTER.to_string(),
    name: global_cluster_name.clone(),
    uid: global_cluster.metadata.uid.clone().unwrap_or_default(),
    controller: Some(true),
    block_owner_deletion: Some(true),
};
}

Key Points:

  • Cluster-scoped resources CAN own namespace-scoped resources
  • controller: true means this is the primary controller for the child
  • block_owner_deletion: true prevents deleting parent while children exist
  • Finalizer ensures manual cleanup of Bind9Cluster resources before parent deletion

2. Bind9Cluster → Bind9Instance

Location: src/reconcilers/bind9cluster.rs:592-599

#![allow(unused)]
fn main() {
// Create ownerReference to the Bind9Cluster
let owner_ref = OwnerReference {
    api_version: API_GROUP_VERSION.to_string(),
    kind: KIND_BIND9_CLUSTER.to_string(),
    name: cluster_name.clone(),
    uid: cluster.metadata.uid.clone().unwrap_or_default(),
    controller: Some(true),
    block_owner_deletion: Some(true),
};
}

Key Points:

  • Both resources are namespace-scoped, so they must be in the same namespace
  • Finalizer ensures manual cleanup of Bind9Instance resources before parent deletion
  • Each instance created includes this owner reference

3. Bind9Instance → Kubernetes Resources

Location: src/bind9_resources.rs:188-197

#![allow(unused)]
fn main() {
pub fn build_owner_references(instance: &Bind9Instance) -> Vec<OwnerReference> {
    vec![OwnerReference {
        api_version: API_GROUP_VERSION.to_string(),
        kind: KIND_BIND9_INSTANCE.to_string(),
        name: instance.name_any(),
        uid: instance.metadata.uid.clone().unwrap_or_default(),
        controller: Some(true),
        block_owner_deletion: Some(true),
    }]
}
}

Resources with Owner References:

  • Deployment: Managed by Bind9Instance
  • Service: Managed by Bind9Instance
  • ConfigMap: Managed by Bind9Instance (standalone instances only)
  • Secret (RNDC key): Managed by Bind9Instance
  • ServiceAccount: Shared resource, no owner reference (prevents conflicts)

Deletion Flow

When a Bind9GlobalCluster is deleted, the following cascade occurs:

sequenceDiagram
    participant User
    participant K8s as Kubernetes API
    participant GC as Bind9GlobalCluster<br/>Reconciler
    participant C as Bind9Cluster<br/>Reconciler
    participant I as Bind9Instance<br/>Reconciler
    participant GC_Obj as Garbage<br/>Collector

    User->>K8s: kubectl delete bind9globalcluster global-dns
    K8s->>GC: Reconcile (deletion_timestamp set)
    GC->>GC: Check finalizer present

    Note over GC: Step 1: Delete managed Bind9Cluster resources
    GC->>K8s: List Bind9Cluster with labels<br/>managed-by=Bind9GlobalCluster
    K8s-->>GC: Return managed clusters

    loop For each Bind9Cluster
        GC->>K8s: Delete Bind9Cluster
        K8s->>C: Reconcile (deletion_timestamp set)
        C->>C: Check finalizer present

        Note over C: Step 2: Delete managed Bind9Instance resources
        C->>K8s: List Bind9Instance with clusterRef
        K8s-->>C: Return managed instances

        loop For each Bind9Instance
            C->>K8s: Delete Bind9Instance
            K8s->>I: Reconcile (deletion_timestamp set)
            I->>I: Check finalizer present

            Note over I: Step 3: Delete Kubernetes resources
            I->>K8s: Delete Deployment, Service, ConfigMap, Secret
            K8s-->>I: Resources deleted

            I->>K8s: Remove finalizer from Bind9Instance
            K8s->>GC_Obj: Bind9Instance deleted
        end

        C->>K8s: Remove finalizer from Bind9Cluster
        K8s->>GC_Obj: Bind9Cluster deleted
    end

    GC->>K8s: Remove finalizer from Bind9GlobalCluster
    K8s->>GC_Obj: Bind9GlobalCluster deleted

    Note over GC_Obj: Kubernetes garbage collector<br/>cleans up any remaining<br/>resources with ownerReferences

Why Both Finalizers AND Owner References?

Bindy uses both finalizers and owner references for robust cleanup:

MechanismPurposeWhen It Runs
Owner ReferencesAutomatic cleanup by KubernetesAfter parent deletion completes
FinalizersManual cleanup of childrenBefore parent deletion completes

The Flow:

  1. Finalizer runs first: Lists and deletes managed children explicitly
  2. Owner reference runs second: Kubernetes garbage collector cleans up any remaining resources

Why this combination?

  • Finalizers: Give control over deletion order and allow cleanup actions (like calling HTTP APIs)
  • Owner References: Provide safety net if finalizer fails or is bypassed
  • Together: Ensure no resource leaks under any circumstances

Verifying Owner References

You can verify owner references are set correctly:

# Check Bind9Cluster owner reference
kubectl get bind9cluster <name> -n <namespace> -o yaml | grep -A 10 ownerReferences

# Check Bind9Instance owner reference
kubectl get bind9instance <name> -n <namespace> -o yaml | grep -A 10 ownerReferences

# Check Deployment owner reference
kubectl get deployment <name> -n <namespace> -o yaml | grep -A 10 ownerReferences

Expected output:

ownerReferences:
- apiVersion: bindy.firestoned.io/v1alpha1
  blockOwnerDeletion: true
  controller: true
  kind: Bind9GlobalCluster  # or Bind9Cluster, Bind9Instance
  name: global-dns
  uid: 12345678-1234-1234-1234-123456789abc

Troubleshooting

Issue: Resources not being deleted

Check:

  1. Verify owner references are set: kubectl get <resource> -o yaml | grep ownerReferences
  2. Check if finalizers are blocking deletion: kubectl get <resource> -o yaml | grep finalizers
  3. Verify garbage collector is running: kubectl get events --field-selector reason=Garbage

Solution:

  • If owner reference is missing, the resource was created before the fix (manual deletion required)
  • If finalizer is stuck, check reconciler logs for errors
  • If garbage collector is not running, check cluster health

Issue: Cannot delete parent resource

Symptom: kubectl delete hangs or shows “waiting for deletion”

Cause: Finalizer is running and cleaning up children

Expected Behavior: This is normal! Wait for the finalizer to complete.

Check Progress:

# Watch deletion progress
kubectl get bind9globalcluster <name> -w

# Check reconciler logs
kubectl logs -n bindy-system -l app=bindy -f

BIND9 Integration

How Bindy integrates with BIND9 DNS server.

Configuration Generation

Bindy generates BIND9 configuration from Bind9Instance specs:

named.conf

options {
    directory "/var/lib/bind";
    recursion no;
    allow-query { 0.0.0.0/0; };
};

zone "example.com" {
    type master;
    file "/var/lib/bind/zones/example.com.zone";
};

Zone Files

$TTL 3600
@   IN  SOA ns1.example.com. admin.example.com. (
        2024010101  ; serial
        3600        ; refresh
        600         ; retry
        604800      ; expire
        86400 )     ; negative TTL
    IN  NS  ns1.example.com.
www IN  A   192.0.2.1

Zone File Management

Operations:

  • Create new zones
  • Add/update records
  • Increment serial numbers
  • Reload BIND9 configuration

BIND9 Lifecycle

  1. ConfigMap - Contains configuration files
  2. Volume Mount - Mount ConfigMap to BIND9 pod
  3. Init - BIND9 starts with configuration
  4. Reload - rndc reload when configuration changes

Future Enhancements

  • Dynamic DNS updates (nsupdate)
  • TSIG key management
  • Zone transfer monitoring
  • Query statistics collection

Contributing

Thank you for contributing to Bindy!

Ways to Contribute

  • Report bugs
  • Suggest features
  • Improve documentation
  • Submit code changes
  • Review pull requests

Getting Started

  1. Set up development environment
  2. Read Code Style
  3. Check Testing Guidelines
  4. Follow PR Process

Code of Conduct

Be respectful, inclusive, and professional.

Reporting Issues

Use GitHub issues with:

  • Clear description
  • Steps to reproduce
  • Expected vs actual behavior
  • Environment details

Feature Requests

Open an issue describing:

  • Use case
  • Proposed solution
  • Alternatives considered

Questions

Ask questions in:

  • GitHub Discussions
  • Issues (tagged as question)

License

Contributor License Agreement

By contributing to Bindy, you agree that:

  1. Your contributions will be licensed under the MIT License - The same license that covers the project
  2. You have the right to submit the work - You own the copyright or have permission from the copyright holder
  3. You grant a perpetual license - The project maintainers receive a perpetual, worldwide, non-exclusive, royalty-free, irrevocable license to use, modify, and distribute your contributions

What This Means

When you submit a pull request or contribution to Bindy:

  • ✅ Your code will be licensed under the MIT License
  • ✅ You retain copyright to your contributions
  • ✅ Others can use your contributions under the MIT License terms
  • ✅ Your contributions can be used in both open source and commercial projects
  • ✅ You grant irrevocable permission for the project to use your work

SPDX License Identifiers

All source code files in Bindy include SPDX license identifiers. When adding new files, please include the following header:

For Rust files:

#![allow(unused)]
fn main() {
// Copyright (c) 2025 Erick Bourgeois, firestoned
// SPDX-License-Identifier: MIT
}

For shell scripts:

#!/usr/bin/env bash
# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT

For YAML/configuration files:

# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT

For Makefiles and Dockerfiles:

# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT

Why SPDX Identifiers?

SPDX (Software Package Data Exchange) identifiers provide:

  • Machine-readable license information - Automated tools can scan and verify licenses
  • SBOM generation - Software Bill of Materials can be automatically created
  • License compliance - Makes it easier to track and verify licensing
  • Industry standard - Widely adopted across open source projects

Learn more: https://spdx.dev/

Third-Party Code

If you’re adding code from another source:

  1. Ensure compatibility - The license must be compatible with MIT
  2. Preserve original copyright - Keep the original copyright notice
  3. Document the source - Note where the code came from
  4. Check license requirements - Some licenses require attribution or notices

Compatible licenses include:

  • ✅ MIT License
  • ✅ Apache License 2.0
  • ✅ BSD licenses (2-clause, 3-clause)
  • ✅ ISC License
  • ✅ Public Domain (CC0, Unlicense)

License Questions

If you have questions about:

  • Whether your contribution is compatible
  • License requirements for third-party code
  • Copyright or attribution

Please ask in your pull request or open a discussion before submitting.

Additional Resources

Code Style

Code style guidelines for Bindy.

Rust Style

Follow official Rust style guide:

# Format code
cargo fmt

# Check for issues
cargo clippy

Naming Conventions

  • snake_case for functions, variables
  • PascalCase for types, traits
  • SCREAMING_SNAKE_CASE for constants

Documentation

Document public APIs:

#![allow(unused)]
fn main() {
/// Reconciles a Bind9Instance resource.
///
/// Creates or updates Kubernetes resources for BIND9.
///
/// # Arguments
///
/// * `instance` - The Bind9Instance to reconcile
///
/// # Returns
///
/// Ok(()) on success, Err on failure
pub async fn reconcile(instance: Bind9Instance) -> Result<()> {
    // Implementation
}
}

Error Handling

Use anyhow::Result for errors:

#![allow(unused)]
fn main() {
use anyhow::{Context, Result};

fn do_thing() -> Result<()> {
    some_operation()
        .context("Failed to do thing")?;
    Ok(())
}
}

Testing

Write tests for all public functions:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    
    #[test]
    fn test_function() {
        assert_eq!(function(), expected);
    }
}
}

Testing Guidelines

Guidelines for writing tests in Bindy.

Test Structure

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    
    #[test]
    fn test_name() {
        // Arrange
        let input = create_input();
        
        // Act
        let result = function_under_test(input);
        
        // Assert
        assert_eq!(result, expected);
    }
}
}

Unit Tests

Test individual functions:

#![allow(unused)]
fn main() {
#[test]
fn test_build_configmap() {
    let instance = create_test_instance();
    let configmap = build_configmap(&instance);
    
    assert_eq!(configmap.metadata.name, Some("test".to_string()));
}
}

Integration Tests

Test with Kubernetes:

#![allow(unused)]
fn main() {
#[tokio::test]
#[ignore]  // Requires cluster
async fn test_full_reconciliation() {
    let client = Client::try_default().await.unwrap();
    // Test logic
}
}

Test Coverage

Aim for >80% coverage on new code.

CI Tests

All tests run on:

  • Pull requests
  • Main branch commits

Pull Request Process

Process for submitting and reviewing pull requests.

Before Submitting

  1. Create issue (for non-trivial changes)
  2. Create branch from main
  3. Make changes with tests
  4. Run checks locally:
cargo test
cargo clippy
cargo fmt

PR Requirements

  • Tests pass
  • Code formatted
  • Documentation updated
  • Commit messages clear
  • PR description complete

PR Template

## Description
Brief description of changes

## Related Issue
Fixes #123

## Changes
- Added feature X
- Fixed bug Y

## Testing
How changes were tested

## Checklist
- [ ] Tests added/updated
- [ ] Documentation updated
- [ ] Changelog updated (if needed)

Review Process

  1. Automated checks must pass
  2. Maintainer review required
  3. Address feedback
  4. Merge when approved

After Merge

Changes included in next release.

Security & Compliance

Bindy is designed to operate in highly regulated environments, including banking, financial services, healthcare, and government sectors. This section covers both security practices and compliance frameworks implemented throughout the project.


Security

The Security section documents the technical controls, threat models, and security architecture implemented in Bindy:

These documents provide technical guidance for security engineers, platform teams, and auditors reviewing Bindy’s security posture.


Compliance

The Compliance section maps Bindy’s implementation to specific regulatory frameworks and industry standards:

These documents provide evidence and traceability for compliance audits, including control implementation details and evidence collection procedures.


Who Should Read This?

  • Security Engineers: Focus on the Security section for technical controls and threat models
  • Compliance Officers: Focus on the Compliance section for regulatory framework mappings
  • Auditors: Review both sections for complete security and compliance evidence
  • Platform Engineers: Reference Security section for operational security practices
  • Risk Managers: Review Compliance section for risk management frameworks

Key Principles

Bindy’s security and compliance approach is built on these core principles:

  1. Zero Trust Architecture: Never trust, always verify - all access is authenticated and authorized
  2. Least Privilege: Minimal RBAC permissions, time-limited credentials, no shared secrets
  3. Defense in Depth: Multiple layers of security controls (network, application, data)
  4. Auditability: Comprehensive logging, immutable audit trails, cryptographic signatures
  5. Automation: Security controls enforced through CI/CD, not manual processes
  6. Transparency: Open documentation, public security policies, no security through obscurity

Continuous Improvement

Security and compliance are ongoing processes, not one-time achievements. Bindy maintains:

  • Weekly vulnerability scans with automated dependency updates
  • Quarterly security audits by independent third parties
  • Annual compliance reviews for all regulatory frameworks
  • Continuous monitoring of security controls and audit logs
  • Incident response drills to validate procedures and playbooks

For security issues, see our Vulnerability Disclosure Policy.

Security Architecture - Bindy DNS Controller

Version: 1.0 Last Updated: 2025-12-17 Owner: Security Team Compliance: SOX 404, PCI-DSS 6.4.1, Basel III


Table of Contents


Overview

This document describes the security architecture of the Bindy DNS Controller, including authentication, authorization, secrets management, network segmentation, and container security. The architecture follows defense-in-depth principles with multiple security layers.

Security Principles

  1. Least Privilege: All components have minimal permissions required for their function
  2. Defense in Depth: Multiple security layers protect against single point of failure
  3. Zero Trust: No implicit trust within the cluster; all access is authenticated and authorized
  4. Immutability: Container filesystems are read-only; configuration is declarative
  5. Auditability: All security-relevant events are logged and traceable

Security Domains

Domain 1: Development & CI/CD

Purpose: Code development, review, build, and release

Components:

  • GitHub repository (source code)
  • GitHub Actions (CI/CD pipelines)
  • Container Registry (ghcr.io)
  • Developer workstations

Security Controls:

  • Code Signing: All commits cryptographically signed (GPG/SSH) - C-1
  • Code Review: 2+ reviewers required for all PRs
  • Vulnerability Scanning: cargo-audit + Trivy in CI/CD - C-3
  • SBOM Generation: Software Bill of Materials for all releases
  • Branch Protection: Signed commits required, no direct pushes to main
  • 2FA: Two-factor authentication required for all contributors

Trust Level: High (controls ensure code integrity)


Domain 2: Kubernetes Control Plane

Purpose: Kubernetes API server, scheduler, controller-manager, etcd

Components:

  • Kubernetes API server
  • etcd (cluster state storage)
  • Scheduler
  • Controller-manager

Security Controls:

  • RBAC: Role-Based Access Control enforced for all API requests
  • Encryption at Rest: etcd data encrypted (including Secrets)
  • TLS: All control plane communication encrypted
  • Audit Logging: All API requests logged
  • Pod Security Admission: Enforces Pod Security Standards

Trust Level: Critical (compromise of control plane = cluster compromise)


Domain 3: dns-system Namespace

Purpose: Bindy controller and BIND9 pods

Components:

  • Bindy controller (Deployment)
  • BIND9 primary (StatefulSet)
  • BIND9 secondaries (StatefulSet)
  • ConfigMaps (BIND9 configuration)
  • Secrets (RNDC keys)
  • Services (DNS, RNDC endpoints)

Security Controls:

  • RBAC Least Privilege: Controller has minimal permissions - C-2
  • Non-Root Containers: All pods run as uid 1000+
  • Read-Only Filesystem: Immutable container filesystems
  • Pod Security Standards: Restricted profile enforced
  • Resource Limits: CPU/memory limits prevent DoS
  • Network Policies (planned - L-1): Restrict pod-to-pod communication

Trust Level: High (protected by RBAC, Pod Security Standards)


Domain 4: Tenant Namespaces

Purpose: DNS zone management by application teams

Components:

  • DNSZone custom resources
  • DNS record custom resources (ARecord, CNAMERecord, etc.)
  • Application pods (may read DNS records)

Security Controls:

  • Namespace Isolation: Teams cannot access other namespaces
  • RBAC: Teams can only manage their own DNS zones
  • CRD Validation: OpenAPI v3 schema validation on all CRs
  • Admission Webhooks (planned): Additional validation for DNS records

Trust Level: Medium (tenants are trusted but isolated)


Domain 5: External Network

Purpose: Public internet (DNS clients)

Components:

  • DNS clients (recursive resolvers, end users)
  • LoadBalancer/NodePort services exposing port 53

Security Controls:

  • Rate Limiting: BIND9 rate-limit directive prevents query floods
  • AXFR Restrictions: Zone transfers only to known secondaries
  • DNSSEC (planned): Cryptographic signing of DNS responses
  • Edge DDoS Protection (planned): CloudFlare, AWS Shield

Trust Level: Untrusted (all traffic assumed hostile)


Data Flow Diagrams

Diagram 1: DNS Zone Reconciliation Flow

sequenceDiagram
    participant Dev as Developer
    participant Git as Git Repository
    participant K8s as Kubernetes API
    participant Ctrl as Bindy Controller
    participant CM as ConfigMap
    participant Sec as Secret
    participant BIND as BIND9 Pod

    Dev->>Git: Push DNSZone CR (GitOps)
    Git->>K8s: FluxCD applies CR
    K8s->>Ctrl: Watch event (DNSZone created/updated)
    Ctrl->>K8s: Read DNSZone spec
    Ctrl->>K8s: Read Bind9Instance CR
    Ctrl->>Sec: Read RNDC key
    Note over Sec: Audit: Controller read secret<br/>ServiceAccount: bindy<br/>Timestamp: 2025-12-17 10:23:45
    Ctrl->>CM: Create/Update ConfigMap<br/>(named.conf, zone file)
    Ctrl->>BIND: Send RNDC command<br/>(reload zone)
    BIND->>CM: Load updated zone file
    BIND-->>Ctrl: Reload successful
    Ctrl->>K8s: Update DNSZone status<br/>(Ready=True)

Security Notes:

  • ✅ All API calls authenticated with ServiceAccount token (JWT)
  • ✅ RBAC enforced at every step (controller has least privilege)
  • ✅ Secret read is audited (H-3 planned)
  • ✅ RNDC communication uses HMAC key authentication
  • ✅ ConfigMap is immutable (recreated on change, not modified)

Diagram 2: DNS Query Flow

sequenceDiagram
    participant Client as DNS Client<br/>(Untrusted)
    participant LB as LoadBalancer
    participant BIND1 as BIND9 Primary
    participant BIND2 as BIND9 Secondary
    participant CM as ConfigMap<br/>(Zone Data)

    Client->>LB: DNS Query (UDP 53)<br/>example.com A?
    Note over LB: Rate limiting<br/>DDoS protection (planned)
    LB->>BIND1: Forward query
    BIND1->>CM: Read zone file<br/>(cached in memory)
    BIND1-->>LB: DNS Response<br/>93.184.216.34
    LB-->>Client: DNS Response

    Note over BIND1,BIND2: Zone replication (AXFR/IXFR)
    BIND1->>BIND2: Notify (zone updated)
    BIND2->>BIND1: AXFR request<br/>(authenticated with allow-transfer)
    BIND1-->>BIND2: Zone transfer
    BIND2->>CM: Update local zone cache

Security Notes:

  • ✅ DNS port 53 is public (required for DNS service)
  • ✅ Rate limiting prevents query floods
  • ✅ AXFR restricted to known secondary IPs
  • ✅ Zone data is read-only in BIND9 (managed by controller)
  • ❌ DNSSEC (planned): Would sign responses cryptographically

Diagram 3: Secret Access Flow

sequenceDiagram
    participant Ctrl as Bindy Controller
    participant K8s as Kubernetes API
    participant etcd as etcd<br/>(Encrypted at Rest)
    participant Audit as Audit Log

    Ctrl->>K8s: GET /api/v1/namespaces/dns-system/secrets/rndc-key
    Note over K8s: Authentication: JWT<br/>Authorization: RBAC
    K8s->>Audit: Log API request<br/>User: system:serviceaccount:dns-system:bindy<br/>Verb: get<br/>Resource: secrets/rndc-key<br/>Result: allowed
    K8s->>etcd: Read secret (encrypted)
    etcd-->>K8s: Return encrypted data
    K8s-->>Ctrl: Return secret (decrypted)
    Note over Ctrl: Controller uses RNDC key<br/>to authenticate to BIND9

Security Notes:

  • ✅ Secrets encrypted at rest in etcd
  • ✅ Secrets transmitted over TLS (in transit)
  • ✅ RBAC limits secret read access to controller only
  • ✅ Kubernetes audit log captures all secret access
  • ❌ Dedicated secret access audit trail (H-3 planned): More visible tracking

Diagram 4: Container Image Supply Chain

flowchart TD
    Dev[Developer] -->|Signed Commit| Git[Git Repository]
    Git -->|Trigger| CI[GitHub Actions CI/CD]
    CI -->|cargo build| Bin[Rust Binary]
    CI -->|cargo audit| Audit[Vulnerability Scan]
    Audit -->|Pass| Bin
    Bin -->|Multi-stage build| Docker[Docker Build]
    Docker -->|Trivy scan| Scan[Container Scan]
    Scan -->|Pass| Sign[Sign Image<br/>Provenance + SBOM]
    Sign -->|Push| Reg[Container Registry<br/>ghcr.io]
    Reg -->|Pull| K8s[Kubernetes Cluster]
    K8s -->|Verify| Pod[Controller Pod]

    style Git fill:#90EE90
    style Audit fill:#FFD700
    style Scan fill:#FFD700
    style Sign fill:#90EE90
    style Pod fill:#90EE90

Security Controls:

  • C-1: All commits signed (GPG/SSH)
  • C-3: Vulnerability scanning (cargo-audit + Trivy)
  • SLSA Level 2: Build provenance + SBOM
  • Signed Images: Docker provenance attestation
  • M-1 (planned): Pin images by digest (not tags)
  • Image Verification (planned): Admission controller verifies signatures

Trust Boundaries

Boundary Map

graph TB
    subgraph Untrusted["🔴 UNTRUSTED ZONE"]
        Internet[Internet<br/>DNS Clients]
    end

    subgraph Perimeter["🟡 PERIMETER"]
        LB[LoadBalancer<br/>Port 53]
    end

    subgraph Cluster["🟢 KUBERNETES CLUSTER (Trusted)"]
        subgraph ControlPlane["Control Plane"]
            API[Kubernetes API]
            etcd[etcd]
        end

        subgraph DNSNamespace["🟠 dns-system Namespace<br/>(High Privilege)"]
            Ctrl[Bindy Controller]
            BIND[BIND9 Pods]
            Secrets[Secrets]
        end

        subgraph TenantNS["🔵 Tenant Namespaces<br/>(Low Privilege)"]
            App1[team-web]
            App2[team-api]
        end
    end

    Internet -->|DNS Queries| LB
    LB -->|Forwarded| BIND
    BIND -->|Read ConfigMaps| DNSNamespace
    Ctrl -->|Reconcile| API
    Ctrl -->|Read| Secrets
    API -->|Store| etcd
    App1 -->|Create DNSZone| API
    App2 -->|Create DNSZone| API

    style Internet fill:#FF6B6B
    style LB fill:#FFD93D
    style ControlPlane fill:#6BCB77
    style DNSNamespace fill:#FFA500
    style TenantNS fill:#4D96FF

Trust Boundary Rules:

  1. Untrusted → Perimeter: All traffic rate-limited, DDoS protection (planned)
  2. Perimeter → dns-system: Only port 53 allowed, no direct access to controller
  3. dns-system → Control Plane: Authenticated with ServiceAccount token, RBAC enforced
  4. Tenant Namespaces → Control Plane: Authenticated with user credentials, RBAC enforced
  5. Secrets Access: Only controller ServiceAccount can read, audit logged

Authentication & Authorization

RBAC Architecture

graph LR
    subgraph Identities
        SA[ServiceAccount: bindy<br/>ns: dns-system]
        User1[User: alice<br/>Team: web]
        User2[User: bob<br/>Team: api]
    end

    subgraph Roles
        CR[ClusterRole:<br/>bindy-controller]
        NSR[Role:<br/>dnszone-editor<br/>ns: team-web]
    end

    subgraph Bindings
        CRB[ClusterRoleBinding]
        RB[RoleBinding]
    end

    subgraph Resources
        CRD[CRDs<br/>Bind9Cluster]
        Zone[DNSZone<br/>ns: team-web]
        Sec[Secrets<br/>ns: dns-system]
    end

    SA -->|bound to| CRB
    CRB -->|grants| CR
    CR -->|allows| CRD
    CR -->|allows| Sec

    User1 -->|bound to| RB
    RB -->|grants| NSR
    NSR -->|allows| Zone

    style SA fill:#FFD93D
    style CR fill:#6BCB77
    style Sec fill:#FF6B6B

Controller RBAC Permissions

Cluster-Scoped Resources:

ResourceVerbsRationale
bind9clusters.bindy.firestoned.ioget, list, watch, create, update, patchManage cluster topology
bind9instances.bindy.firestoned.ioget, list, watch, create, update, patchManage BIND9 instances
delete on ANY resourceDENIED✅ C-2: Least privilege, prevent accidental deletion

Namespaced Resources (dns-system):

ResourceVerbsRationale
secretsget, list, watchRead RNDC keys (READ-ONLY)
configmapsget, list, watch, create, update, patchManage BIND9 configuration
deploymentsget, list, watch, create, update, patchManage BIND9 deployments
servicesget, list, watch, create, update, patchExpose DNS services
serviceaccountsget, list, watch, create, update, patchManage BIND9 ServiceAccounts
secrets❌ create, update, patch, delete✅ PCI-DSS 7.1.2: Read-only access
delete on ANY resourceDENIED✅ C-2: Least privilege

Verification:

# Run automated RBAC verification
deploy/rbac/verify-rbac.sh

User RBAC Permissions (Tenants)

Example: team-web namespace

UserRoleResourcesVerbsScope
alicednszone-editordnszones.bindy.firestoned.ioget, list, watch, create, update, patchteam-web only
alicednszone-editorarecords, cnamerecords, …get, list, watch, create, update, patchteam-web only
alicednszones in other namespaces❌ DENIEDCannot access team-api zones
alicesecrets, configmaps❌ DENIEDCannot access BIND9 internals

Secrets Management

Secret Types

SecretPurposeAccessRotationEncryption
RNDC KeyAuthenticate to BIND9Controller: read-onlyManual (planned automation)At rest: etcd, In transit: TLS
TLS Certificates (future)HTTPS, DNSSECController: read-onlyCert-manager (automated)At rest: etcd, In transit: TLS
ServiceAccount TokenKubernetes API authAuto-mountedKubernetes (short-lived)JWT signed by cluster CA

Secret Lifecycle

stateDiagram-v2
    [*] --> Created: Admin creates secret<br/>(kubectl create secret)
    Created --> Stored: etcd encrypts at rest
    Stored --> Mounted: Controller pod starts<br/>(Kubernetes mounts as volume)
    Mounted --> Used: Controller reads RNDC key
    Used --> Audited: Access logged (H-3 planned)
    Audited --> Rotated: Key rotation (manual)
    Rotated --> Stored: New key stored
    Stored --> Deleted: Old key deleted after grace period
    Deleted --> [*]

Secret Protection

At Rest:

  • ✅ etcd encryption enabled (AES-256-GCM)
  • ✅ Secrets stored in Kubernetes Secrets (not in code, env vars, or ConfigMaps)

In Transit:

  • ✅ All Kubernetes API communication over TLS
  • ✅ ServiceAccount token transmitted over TLS

In Use:

  • ✅ Controller runs as non-root (uid 1000+)
  • ✅ Read-only filesystem (secrets cannot be written to disk)
  • ✅ Memory protection (secrets cleared after use - Rust Drop trait)

Access Control:

  • ✅ RBAC limits secret read to controller only
  • ✅ Kubernetes audit log captures all secret access
  • H-3 (planned): Dedicated secret access audit trail with alerts

Network Security

Network Architecture

graph TB
    subgraph Internet
        Client[DNS Clients]
    end

    subgraph Kubernetes["Kubernetes Cluster"]
        subgraph Ingress["Ingress"]
            LB[LoadBalancer<br/>Port 53 UDP/TCP]
        end

        subgraph dns-system["dns-system Namespace"]
            Ctrl[Bindy Controller]
            BIND1[BIND9 Primary<br/>Port 53, 953]
            BIND2[BIND9 Secondary<br/>Port 53]
        end

        subgraph kube-system["kube-system"]
            API[Kubernetes API<br/>Port 6443]
        end

        subgraph team-web["team-web Namespace"]
            App1[Application Pods]
        end
    end

    Client -->|UDP/TCP 53| LB
    LB -->|Forward| BIND1
    LB -->|Forward| BIND2
    Ctrl -->|HTTPS 6443| API
    Ctrl -->|TCP 953<br/>RNDC| BIND1
    BIND1 -->|AXFR/IXFR| BIND2
    App1 -->|HTTPS 6443| API

    style Client fill:#FF6B6B
    style LB fill:#FFD93D
    style API fill:#6BCB77
    style Ctrl fill:#4D96FF

Network Policies (Planned - L-1)

Policy 1: Controller Egress

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bindy-controller-egress
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bindy
  policyTypes:
  - Egress
  egress:
  # Allow: Kubernetes API
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system
    ports:
    - protocol: TCP
      port: 6443
  # Allow: BIND9 RNDC
  - to:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: bind9
    ports:
    - protocol: TCP
      port: 953
  # Allow: DNS (for cluster DNS resolution)
  - to:
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53

Policy 2: BIND9 Ingress

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bind9-ingress
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bind9
  policyTypes:
  - Ingress
  ingress:
  # Allow: DNS queries from anywhere
  - from:
    - namespaceSelector: {}
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow: RNDC from controller only
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: bindy
    ports:
    - protocol: TCP
      port: 953
  # Allow: AXFR from secondaries only
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: bind9
          app.kubernetes.io/component: secondary
    ports:
    - protocol: TCP
      port: 53

Container Security

Container Hardening

Bindy Controller Pod Security:

apiVersion: v1
kind: Pod
metadata:
  name: bindy-controller
spec:
  serviceAccountName: bindy
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: controller
    image: ghcr.io/firestoned/bindy:latest
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1000
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "256Mi"
        cpu: "500m"
    volumeMounts:
    - name: tmp
      mountPath: /tmp
      readOnly: false  # Only /tmp is writable
    - name: rndc-key
      mountPath: /etc/bindy/rndc
      readOnly: true
  volumes:
  - name: tmp
    emptyDir:
      sizeLimit: 100Mi
  - name: rndc-key
    secret:
      secretName: rndc-key

Security Features:

  • ✅ Non-root user (uid 1000)
  • ✅ Read-only root filesystem (only /tmp writable)
  • ✅ No privileged escalation
  • ✅ All capabilities dropped
  • ✅ seccomp profile (restrict syscalls)
  • ✅ Resource limits (prevent DoS)
  • ✅ Secrets mounted read-only

Image Security

Base Image: Chainguard (Zero-CVE)

FROM cgr.dev/chainguard/static:latest
COPY --chmod=755 bindy /usr/local/bin/bindy
USER 1000:1000
ENTRYPOINT ["/usr/local/bin/bindy"]

Features:

  • ✅ Chainguard static base (zero CVEs, no package manager)
  • ✅ Minimal attack surface (~15MB image size)
  • ✅ No shell, no utilities (static binary only)
  • ✅ FIPS-ready (if required)
  • ✅ Signed image with provenance
  • ✅ SBOM included

Vulnerability Scanning:

  • ✅ Trivy scans on every PR, main push, release
  • ✅ CI fails on CRITICAL/HIGH vulnerabilities
  • ✅ Daily scheduled scans detect new CVEs

Supply Chain Security

SLSA Level 2 Compliance

RequirementImplementationStatus
Build provenanceSigned commits provide authorship proof✅ C-1
Source integrityGPG/SSH signatures verify source✅ C-1
Build integritySBOM generated for all releases✅ SLSA
Build isolationGitHub Actions ephemeral runners✅ CI/CD
Parameterless buildReproducible builds (same input = same output)❌ H-4 (planned)

Supply Chain Flow

flowchart LR
    A[Developer] -->|Signed Commit| B[Git]
    B -->|Webhook| C[GitHub Actions]
    C -->|Build| D[Binary]
    C -->|Scan| E[cargo-audit]
    E -->|Pass| D
    D -->|Build| F[Container Image]
    F -->|Scan| G[Trivy]
    G -->|Pass| H[Sign Image]
    H -->|Provenance| I[SBOM]
    I -->|Push| J[Registry]
    J -->|Pull| K[Kubernetes]

    style A fill:#90EE90
    style E fill:#FFD700
    style G fill:#FFD700
    style H fill:#90EE90
    style I fill:#90EE90

Supply Chain Threats Mitigated:

  • Code Injection: Signed commits prevent unauthorized code changes
  • Dependency Confusion: cargo-audit verifies dependencies from crates.io
  • Malicious Dependencies: Vulnerability scanning detects known CVEs
  • Image Tampering: Signed images with provenance attestation
  • Compromised Build Environment (partially): Ephemeral runners, but build reproducibility not verified (H-4)

References


Last Updated: 2025-12-17 Next Review: 2025-03-17 (Quarterly) Approved By: Security Team

Threat Model - Bindy DNS Controller

Version: 1.0 Last Updated: 2025-12-17 Owner: Security Team Compliance: SOX 404, PCI-DSS 6.4.1, Basel III Cyber Risk


Table of Contents


Overview

This document provides a comprehensive threat model for the Bindy DNS Controller, a Kubernetes operator that manages BIND9 DNS servers. The threat model uses the STRIDE methodology (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege) to identify and analyze security threats.

Objectives

  1. Identify threats to the DNS infrastructure managed by Bindy
  2. Assess risk for each identified threat
  3. Document mitigations (existing and required)
  4. Provide security guidance for deployers and operators
  5. Support compliance with SOX 404, PCI-DSS 6.4.1, Basel III

Scope

In Scope:

  • Bindy controller container and runtime
  • Custom Resource Definitions (CRDs) and Kubernetes API interactions
  • BIND9 pods managed by Bindy
  • DNS zone data and configuration
  • RNDC (Remote Name Daemon Control) communication
  • Container images and supply chain
  • CI/CD pipeline security

Out of Scope:

  • Kubernetes cluster security (managed by platform team)
  • Network infrastructure security (managed by network team)
  • Physical security of data centers
  • DNS client security (recursive resolvers outside our control)

System Description

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                     Kubernetes Cluster                       │
│                                                              │
│  ┌────────────────────────────────────────────────────┐    │
│  │              dns-system Namespace                   │    │
│  │                                                      │    │
│  │  ┌──────────────────────────────────────────────┐  │    │
│  │  │        Bindy Controller (Deployment)         │  │    │
│  │  │  ┌────────────────────────────────────────┐  │  │    │
│  │  │  │  Controller Pod (Non-Root, ReadOnly)   │  │    │    │
│  │  │  │  - Watches CRDs                        │  │  │    │
│  │  │  │  - Reconciles DNS zones                │  │  │    │
│  │  │  │  - Manages BIND9 pods                  │  │  │    │
│  │  │  │  - Uses RNDC for zone updates         │  │  │    │
│  │  │  └────────────────────────────────────────┘  │  │    │
│  │  └──────────────────────────────────────────────┘  │    │
│  │                                                      │    │
│  │  ┌──────────────────────────────────────────────┐  │    │
│  │  │       BIND9 Primary (StatefulSet)           │  │    │
│  │  │  ┌────────────────────────────────────────┐  │  │    │
│  │  │  │  BIND Pod (Non-Root, ReadOnly)         │  │  │    │
│  │  │  │  - Authoritative DNS (Port 53)         │  │  │    │
│  │  │  │  - RNDC Control (Port 953)             │  │  │    │
│  │  │  │  - Zone files (ConfigMaps)             │  │  │    │
│  │  │  │  - RNDC key (Secret, read-only)        │  │  │    │
│  │  │  └────────────────────────────────────────┘  │  │    │
│  │  └──────────────────────────────────────────────┘  │    │
│  │                                                      │    │
│  │  ┌──────────────────────────────────────────────┐  │    │
│  │  │      BIND9 Secondaries (StatefulSet)        │  │    │
│  │  │  - Receive zone transfers from primary       │  │    │
│  │  │  - Provide redundancy                        │  │    │
│  │  │  - Geographic distribution                   │  │    │
│  │  └──────────────────────────────────────────────┘  │    │
│  │                                                      │    │
│  └────────────────────────────────────────────────────┘    │
│                                                              │
│  ┌────────────────────────────────────────────────────┐    │
│  │         Other Namespaces (Multi-Tenancy)           │    │
│  │  - team-web (DNSZone CRs)                          │    │
│  │  - team-api (DNSZone CRs)                          │    │
│  │  - platform-dns (Bind9Cluster CRs)                 │    │
│  └────────────────────────────────────────────────────┘    │
│                                                              │
└─────────────────────────────────────────────────────────────┘
          │                           ▲
          │ DNS Queries (UDP/TCP 53)  │
          ▼                           │
    ┌─────────────────────────────────────┐
    │       External DNS Clients          │
    │  - Recursive resolvers              │
    │  - Corporate clients                │
    │  - Internet users                   │
    └─────────────────────────────────────┘

Components

  1. Bindy Controller

    • Kubernetes operator written in Rust
    • Watches custom resources (Bind9Cluster, Bind9Instance, DNSZone, DNS records)
    • Reconciles desired state with actual state
    • Manages BIND9 deployments, ConfigMaps, Secrets, Services
    • Uses RNDC to update zones on running BIND9 instances
  2. BIND9 Pods

    • Authoritative DNS servers running BIND9
    • Primary server handles zone updates
    • Secondary servers replicate zones via AXFR/IXFR
    • Exposed via LoadBalancer or NodePort services
  3. Custom Resources (CRDs)

    • Bind9Cluster: Cluster-scoped, defines BIND9 cluster topology
    • Bind9Instance: Namespaced, defines individual BIND9 server
    • DNSZone: Namespaced, defines DNS zone (e.g., example.com)
    • DNS Records: ARecord, CNAMERecord, MXRecord, etc.
  4. Supporting Resources

    • ConfigMaps: Store BIND9 configuration and zone files
    • Secrets: Store RNDC keys (symmetric HMAC keys)
    • Services: Expose DNS (port 53) and RNDC (port 953)
    • ServiceAccounts: RBAC for controller access

Assets

High-Value Assets

AssetDescriptionConfidentialityIntegrityAvailabilityOwner
DNS Zone DataAuthoritative DNS records for all managed domainsMediumCriticalCriticalTeams/Platform
RNDC KeysSymmetric HMAC keys for BIND9 controlCriticalCriticalHighSecurity Team
Controller BinarySigned container image with controller logicMediumCriticalHighDevelopment Team
BIND9 Configurationnamed.conf, zone configsLowCriticalHighPlatform Team
Kubernetes API AccessServiceAccount token for controllerCriticalCriticalCriticalPlatform Team
CRD SchemasDefine API contract for DNS managementLowCriticalMediumDevelopment Team
Audit LogsRecord of all DNS changes and accessHighCriticalHighSecurity Team
SBOMSoftware Bill of Materials for complianceLowCriticalMediumCompliance Team

Asset Protection Goals

  • DNS Zone Data: Prevent unauthorized modification (tampering), ensure availability
  • RNDC Keys: Prevent disclosure (compromise allows full BIND9 control)
  • Controller Binary: Prevent supply chain attacks, ensure code integrity
  • Kubernetes API Access: Prevent privilege escalation, enforce least privilege
  • Audit Logs: Ensure non-repudiation, prevent tampering, retain for compliance

Trust Boundaries

Boundary 1: Kubernetes Cluster Perimeter

Trust Level: High Description: Kubernetes API server, etcd, and cluster networking

Assumptions:

  • Kubernetes RBAC is properly configured
  • etcd is encrypted at rest
  • Network policies are enforced
  • Node security is managed by platform team

Threats if Compromised:

  • Attacker gains full control of all resources in cluster
  • DNS data can be exfiltrated or modified
  • Controller can be manipulated or replaced

Boundary 2: dns-system Namespace

Trust Level: High Description: Namespace containing Bindy controller and BIND9 pods

Assumptions:

  • RBAC limits access to authorized ServiceAccounts only
  • Secrets are encrypted at rest in etcd
  • Pod Security Standards enforced (Restricted)

Threats if Compromised:

  • Attacker can read RNDC keys
  • Attacker can modify DNS zones
  • Attacker can disrupt DNS service

Boundary 3: Controller Container

Trust Level: Medium-High Description: Bindy controller runtime environment

Assumptions:

  • Container runs as non-root user
  • Filesystem is read-only except /tmp
  • No privileged capabilities
  • Resource limits enforced

Threats if Compromised:

  • Attacker can abuse Kubernetes API access
  • Attacker can read secrets controller has access to
  • Attacker can disrupt reconciliation loops

Boundary 4: BIND9 Container

Trust Level: Medium Description: BIND9 DNS server runtime

Assumptions:

  • Container runs as non-root
  • Exposed to internet (port 53)
  • Configuration is managed by controller (read-only)

Threats if Compromised:

  • Attacker can serve malicious DNS responses
  • Attacker can exfiltrate zone data
  • Attacker can pivot to other cluster resources (if network policies weak)

Boundary 5: External Network (Internet)

Trust Level: Untrusted Description: Public internet where DNS clients reside

Assumptions:

  • All traffic is potentially hostile
  • DDoS attacks are likely
  • DNS protocol vulnerabilities will be exploited

Threats:

  • DNS amplification attacks (abuse open resolvers)
  • Cache poisoning attempts
  • Zone enumeration (AXFR abuse)
  • DoS via query floods

STRIDE Threat Analysis

S - Spoofing (Identity)

S1: Spoofed Kubernetes API Requests

Threat: Attacker impersonates the Bindy controller ServiceAccount to make unauthorized API calls.

Impact: HIGH Likelihood: LOW (requires compromised cluster or stolen token)

Attack Scenario:

  1. Attacker compromises a pod in the cluster
  2. Steals ServiceAccount token from /var/run/secrets/kubernetes.io/serviceaccount/token
  3. Uses token to impersonate controller and modify DNS zones

Mitigations:

  • ✅ RBAC least privilege (controller cannot delete resources)
  • ✅ Pod Security Standards (non-root, read-only filesystem)
  • ✅ Short-lived ServiceAccount tokens (TokenRequest API)
  • MISSING: Network policies to restrict egress from controller pod
  • MISSING: Audit logging for all ServiceAccount API calls

Residual Risk: MEDIUM (need network policies and audit logs)


S2: Spoofed RNDC Commands

Threat: Attacker gains access to RNDC key and sends malicious commands to BIND9.

Impact: CRITICAL Likelihood: LOW (RNDC keys stored in Kubernetes Secrets with RBAC)

Attack Scenario:

  1. Attacker compromises controller pod or namespace
  2. Reads RNDC key from Kubernetes Secret
  3. Connects to BIND9 RNDC port (953) and issues commands (e.g., reload, freeze, thaw)

Mitigations:

  • ✅ Secrets encrypted at rest (Kubernetes)
  • ✅ RBAC limits secret read access to controller only
  • ✅ RNDC port (953) not exposed externally
  • MISSING: Secret access audit trail (H-3)
  • MISSING: RNDC key rotation policy

Residual Risk: MEDIUM (need secret audit trail)


S3: Spoofed Git Commits (Supply Chain)

Threat: Attacker forges commits without proper signature, injecting malicious code.

Impact: CRITICAL Likelihood: VERY LOW (branch protection enforces signed commits)

Attack Scenario:

  1. Attacker compromises GitHub account or uses stolen SSH key
  2. Pushes unsigned commit to feature branch
  3. Attempts to merge to main without proper review

Mitigations:

  • ✅ All commits MUST be signed (GPG/SSH)
  • ✅ GitHub branch protection requires signed commits
  • ✅ CI/CD verifies commit signatures
  • ✅ 2+ reviewers required for all PRs
  • ✅ Linear history (no merge commits)

Residual Risk: VERY LOW (strong controls in place)


T - Tampering (Data Integrity)

T1: Tampering with DNS Zone Data

Threat: Attacker modifies DNS records to redirect traffic or cause outages.

Impact: CRITICAL Likelihood: LOW (requires Kubernetes API access)

Attack Scenario:

  1. Attacker gains write access to DNSZone CRs (via compromised RBAC or stolen credentials)
  2. Modifies A/CNAME records to point to attacker-controlled servers
  3. Traffic is redirected, enabling phishing, data theft, or service disruption

Mitigations:

  • ✅ RBAC enforces least privilege (users can only modify zones in their namespace)
  • ✅ GitOps workflow (changes via pull requests, not direct kubectl)
  • ✅ Audit logging in Kubernetes (all CR modifications logged)
  • MISSING: Webhook validation for DNS records (prevent obviously malicious changes)
  • MISSING: DNSSEC signing (prevents tampering of DNS responses in transit)

Residual Risk: MEDIUM (need validation webhooks and DNSSEC)


T2: Tampering with Container Images

Threat: Attacker replaces legitimate Bindy/BIND9 container image with malicious version.

Impact: CRITICAL Likelihood: VERY LOW (signed images, supply chain controls)

Attack Scenario:

  1. Attacker compromises CI/CD pipeline or registry credentials
  2. Pushes malicious image with same tag (e.g., :latest)
  3. Controller pulls compromised image on next rollout

Mitigations:

  • ✅ All images signed with provenance attestation (SLSA Level 2)
  • ✅ SBOM generated for all releases
  • ✅ GitHub Actions signed commits verification
  • ✅ Multi-stage builds minimize attack surface
  • MISSING: Image digests pinned (not tags) - see M-1
  • MISSING: Admission controller to verify image signatures (e.g., Sigstore Cosign)

Residual Risk: LOW (strong supply chain controls, but pinning digests would further reduce risk)


T3: Tampering with ConfigMaps/Secrets

Threat: Attacker modifies BIND9 configuration or RNDC keys via Kubernetes API.

Impact: HIGH Likelihood: LOW (RBAC protects ConfigMaps/Secrets)

Attack Scenario:

  1. Attacker gains elevated privileges in dns-system namespace
  2. Modifies BIND9 ConfigMap to disable security features or add backdoor zones
  3. BIND9 pod restarts with malicious configuration

Mitigations:

  • ✅ Controller has NO delete permissions on Secrets/ConfigMaps (C-2)
  • ✅ RBAC limits write access to controller only
  • ✅ Immutable ConfigMaps (once created, cannot be modified - requires recreation)
  • MISSING: ConfigMap/Secret integrity checks (hash validation)
  • MISSING: Automated drift detection (compare running config vs desired state)

Residual Risk: MEDIUM (need integrity checks)


R - Repudiation (Non-Repudiation)

R1: Unauthorized DNS Changes Without Attribution

Threat: Attacker modifies DNS zones and there’s no audit trail proving who made the change.

Impact: HIGH (compliance violation, incident response hindered) Likelihood: LOW (Kubernetes audit logs capture API calls)

Attack Scenario:

  1. Attacker gains access to cluster with weak RBAC
  2. Modifies DNSZone CRs
  3. No log exists linking the change to a specific user or ServiceAccount

Mitigations:

  • ✅ Kubernetes audit logs enabled (captures all API requests)
  • ✅ All commits signed (non-repudiation for code changes)
  • ✅ GitOps workflow (changes traceable to Git commits and PR reviews)
  • MISSING: Centralized log aggregation with tamper-proof storage (H-2)
  • MISSING: Log retention policy (90 days active, 1 year archive per PCI-DSS)
  • MISSING: Audit trail queries documented for compliance reviews

Residual Risk: MEDIUM (need H-2 - Audit Log Retention Policy)


R2: Secret Access Without Audit Trail

Threat: Attacker reads RNDC keys from Secrets, no record of who accessed them.

Impact: HIGH Likelihood: LOW (secret access is logged by Kubernetes, but not prominently tracked)

Attack Scenario:

  1. Attacker compromises ServiceAccount with secret read access
  2. Reads RNDC key from Kubernetes Secret
  3. Uses key to control BIND9, but no clear audit trail of secret access

Mitigations:

  • ✅ Kubernetes audit logs capture Secret read operations
  • MISSING: Dedicated audit trail for secret access (H-3)
  • MISSING: Alerts on unexpected secret reads
  • MISSING: Secret access dashboard for compliance reviews

Residual Risk: MEDIUM (need H-3 - Secret Access Audit Trail)


I - Information Disclosure

I1: Exposure of RNDC Keys

Threat: RNDC keys leaked via logs, environment variables, or insecure storage.

Impact: CRITICAL Likelihood: VERY LOW (secrets stored in Kubernetes Secrets, not in code)

Attack Scenario:

  1. Developer hardcodes RNDC key in code or logs it for debugging
  2. Key is committed to Git or appears in log aggregation system
  3. Attacker finds key and uses it to control BIND9

Mitigations:

  • ✅ Secrets stored in Kubernetes Secrets (encrypted at rest)
  • ✅ Pre-commit hooks to detect secrets in code
  • ✅ GitHub secret scanning enabled
  • ✅ CI/CD fails if secrets detected
  • MISSING: Log sanitization (ensure secrets never appear in logs)
  • MISSING: Secret rotation policy (rotate RNDC keys periodically)

Residual Risk: LOW (good controls, but rotation would improve)


I2: Zone Data Enumeration

Threat: Attacker uses AXFR (zone transfer) to download entire zone contents.

Impact: MEDIUM (zone data is semi-public, but bulk enumeration aids reconnaissance) Likelihood: MEDIUM (AXFR often left open by mistake)

Attack Scenario:

  1. Attacker sends AXFR request to BIND9 server
  2. If AXFR is not restricted, server returns all records in zone
  3. Attacker uses zone data for targeted attacks (subdomain enumeration, email harvesting)

Mitigations:

  • ✅ AXFR restricted to secondary servers only (BIND9 allow-transfer directive)
  • ✅ BIND9 configuration managed by controller (prevents manual misconfig)
  • MISSING: TSIG authentication for zone transfers (H-4)
  • MISSING: Rate limiting on AXFR requests

Residual Risk: MEDIUM (need TSIG for AXFR)


I3: Container Image Vulnerability Disclosure

Threat: Container images contain vulnerabilities that could be exploited if disclosed.

Impact: MEDIUM Likelihood: MEDIUM (vulnerabilities exist in all software)

Attack Scenario:

  1. Vulnerability is disclosed in a dependency (e.g., CVE in glibc)
  2. Attacker scans for services using vulnerable version
  3. Exploits vulnerability to gain RCE or escalate privileges

Mitigations:

  • ✅ Automated vulnerability scanning (cargo-audit + Trivy) - C-3
  • ✅ CI blocks on CRITICAL/HIGH vulnerabilities
  • ✅ Daily scheduled scans detect new CVEs
  • ✅ Remediation SLAs defined (CRITICAL: 24h, HIGH: 7d)
  • ✅ Chainguard zero-CVE base images used

Residual Risk: LOW (strong vulnerability management)


D - Denial of Service

D1: DNS Query Flood (DDoS)

Threat: Attacker floods BIND9 servers with DNS queries, exhausting resources.

Impact: CRITICAL (DNS unavailability impacts all services) Likelihood: HIGH (DNS is a common DDoS target)

Attack Scenario:

  1. Attacker uses botnet to send millions of DNS queries to BIND9 servers
  2. BIND9 CPU/memory exhausted, becomes unresponsive
  3. Legitimate DNS queries fail, causing outages

Mitigations:

  • ✅ Rate limiting in BIND9 (rate-limit directive)
  • ✅ Resource limits on BIND9 pods (CPU/memory requests/limits)
  • ✅ Horizontal scaling (multiple BIND9 secondaries)
  • MISSING: DDoS protection at network edge (e.g., CloudFlare, AWS Shield)
  • MISSING: Query pattern analysis and anomaly detection
  • MISSING: Automated pod scaling based on query load (HPA)

Residual Risk: MEDIUM (need edge DDoS protection)


D2: Controller Resource Exhaustion

Threat: Attacker creates thousands of DNSZone CRs, overwhelming controller.

Impact: HIGH (controller fails, DNS updates stop) Likelihood: LOW (requires cluster access)

Attack Scenario:

  1. Attacker gains write access to Kubernetes API
  2. Creates 10,000+ DNSZone CRs
  3. Controller reconciliation queue overwhelms CPU/memory
  4. Controller crashes or becomes unresponsive

Mitigations:

  • ✅ Resource limits on controller pod
  • ✅ Exponential backoff for failed reconciliations
  • MISSING: Rate limiting on reconciliation loops (M-3)
  • MISSING: Admission webhook to limit number of CRs per namespace
  • MISSING: Horizontal scaling of controller (leader election)

Residual Risk: MEDIUM (need M-3 - Rate Limiting)


D3: AXFR Amplification Attack

Threat: Attacker abuses AXFR to amplify traffic in DDoS attack.

Impact: MEDIUM Likelihood: LOW (AXFR restricted to secondaries)

Attack Scenario:

  1. Attacker spoofs source IP of DDoS target
  2. Sends AXFR request to BIND9
  3. BIND9 sends large zone file to spoofed IP (amplification)

Mitigations:

  • ✅ AXFR restricted to known secondary IPs (allow-transfer)
  • ✅ BIND9 does not respond to spoofed source IPs (anti-spoofing)
  • MISSING: Response rate limiting (RRL) for AXFR

Residual Risk: LOW (AXFR restrictions effective)


E - Elevation of Privilege

E1: Container Escape to Node

Threat: Attacker escapes from Bindy or BIND9 container to underlying Kubernetes node.

Impact: CRITICAL (full node compromise, lateral movement) Likelihood: VERY LOW (Pod Security Standards enforced)

Attack Scenario:

  1. Attacker exploits container runtime vulnerability (e.g., runc CVE)
  2. Escapes container to host filesystem
  3. Gains root access on node, compromises kubelet and other pods

Mitigations:

  • ✅ Non-root containers (uid 1000+)
  • ✅ Read-only root filesystem
  • ✅ No privileged capabilities
  • ✅ Pod Security Standards (Restricted)
  • ✅ seccomp profile (restrict syscalls)
  • ✅ AppArmor/SELinux profiles
  • MISSING: Regular node patching (managed by platform team)

Residual Risk: VERY LOW (defense in depth)


E2: RBAC Privilege Escalation

Threat: Attacker escalates from limited RBAC role to cluster-admin.

Impact: CRITICAL Likelihood: VERY LOW (RBAC reviewed, least privilege enforced)

Attack Scenario:

  1. Attacker compromises ServiceAccount with limited permissions
  2. Exploits RBAC misconfiguration (e.g., wildcard permissions)
  3. Gains cluster-admin and full control of cluster

Mitigations:

  • ✅ RBAC least privilege (controller has NO delete permissions) - C-2
  • ✅ Automated RBAC verification script (deploy/rbac/verify-rbac.sh)
  • ✅ No wildcard permissions in controller RBAC
  • ✅ Regular RBAC audits (quarterly)
  • MISSING: RBAC policy-as-code validation (OPA/Gatekeeper)

Residual Risk: VERY LOW (strong RBAC controls)


E3: Exploiting Vulnerable Dependencies

Threat: Attacker exploits vulnerability in Rust dependency to gain code execution.

Impact: HIGH Likelihood: LOW (automated vulnerability scanning, rapid patching)

Attack Scenario:

  1. CVE disclosed in dependency (e.g., tokio, hyper, kube)
  2. Attacker crafts malicious Kubernetes API response to trigger vulnerability
  3. Controller crashes or attacker gains RCE in controller pod

Mitigations:

  • ✅ Automated vulnerability scanning (cargo-audit) - C-3
  • ✅ CI blocks on CRITICAL/HIGH vulnerabilities
  • ✅ Remediation SLAs enforced (CRITICAL: 24h)
  • ✅ Daily scheduled scans
  • ✅ Dependency updates via Dependabot

Residual Risk: LOW (excellent vulnerability management)


Attack Surface

1. Kubernetes API

Exposure: Internal (within cluster) Authentication: ServiceAccount token (JWT) Authorization: RBAC (least privilege)

Attack Vectors:

  • Token theft from compromised pod
  • RBAC misconfiguration allowing excessive permissions
  • API server vulnerability (CVE in Kubernetes)

Mitigations:

  • Short-lived tokens (TokenRequest API)
  • RBAC verification script
  • Regular Kubernetes upgrades

Risk: MEDIUM


2. DNS Port 53 (UDP/TCP)

Exposure: External (internet-facing) Authentication: None (public DNS) Authorization: None

Attack Vectors:

  • DNS amplification attacks
  • Query floods (DDoS)
  • Cache poisoning attempts (if recursion enabled)
  • NXDOMAIN attacks

Mitigations:

  • Rate limiting (BIND9 rate-limit)
  • Recursion disabled (authoritative-only)
  • DNSSEC (planned)
  • DDoS protection at edge

Risk: HIGH (public-facing, no authentication)


3. RNDC Port 953

Exposure: Internal (within cluster, not exposed externally) Authentication: HMAC key (symmetric) Authorization: Key-based (all-or-nothing)

Attack Vectors:

  • RNDC key theft from Kubernetes Secret
  • Brute-force HMAC key (unlikely with strong key)
  • MITM attack (if network not encrypted)

Mitigations:

  • Secrets encrypted at rest
  • RBAC limits secret read access
  • RNDC port not exposed externally
  • NetworkPolicy (planned - L-1)

Risk: MEDIUM


4. Container Images (Supply Chain)

Exposure: Public (GitHub Container Registry) Authentication: Pull is unauthenticated (public repo) Authorization: Push requires GitHub token with packages:write

Attack Vectors:

  • Compromised CI/CD pipeline pushing malicious image
  • Dependency confusion (malicious crate with same name)
  • Compromised base image (upstream supply chain attack)

Mitigations:

  • Signed commits (all code changes)
  • Signed container images (provenance)
  • SBOM generation
  • Vulnerability scanning (Trivy)
  • Chainguard zero-CVE base images
  • Dependabot for dependency updates

Risk: LOW (strong supply chain security)


5. Custom Resource Definitions (CRDs)

Exposure: Internal (Kubernetes API) Authentication: Kubernetes user/ServiceAccount Authorization: RBAC (namespace-scoped for DNSZone)

Attack Vectors:

  • Malicious CRs with crafted input (e.g., XXL zone names)
  • Schema validation bypass
  • CR injection via compromised user

Mitigations:

  • Schema validation in CRD (OpenAPI v3)
  • Input sanitization in controller
  • Namespace isolation (RBAC)
  • Admission webhooks (planned)

Risk: MEDIUM


6. Git Repository (Code)

Exposure: Public (GitHub) Authentication: Push requires GitHub 2FA + signed commits Authorization: Branch protection on main

Attack Vectors:

  • Compromised GitHub account
  • Unsigned commit merged to main
  • Malicious PR approved by reviewers

Mitigations:

  • All commits signed (GPG/SSH) - C-1
  • Branch protection (2+ reviewers required)
  • CI/CD verifies signatures
  • Linear history (no merge commits)

Risk: VERY LOW (strong controls)


Threat Scenarios

Scenario 1: Compromised Controller Pod

Severity: HIGH

Attack Path:

  1. Attacker exploits vulnerability in controller code (e.g., memory corruption, logic bug)
  2. Gains code execution in controller pod
  3. Reads ServiceAccount token from /var/run/secrets/
  4. Uses token to modify DNSZone CRs or read RNDC keys from Secrets

Impact:

  • Attacker can modify DNS records (redirect traffic)
  • Attacker can disrupt DNS service (delete zones, BIND9 pods)
  • Attacker can pivot to other namespaces (if RBAC is weak)

Mitigations:

  • Controller runs as non-root, read-only filesystem
  • RBAC least privilege (no delete permissions)
  • Resource limits prevent resource exhaustion
  • Vulnerability scanning (cargo-audit, Trivy)
  • Network policies (planned - L-1)

Residual Risk: MEDIUM (need network policies)


Scenario 2: DNS Cache Poisoning

Severity: MEDIUM

Attack Path:

  1. Attacker sends forged DNS responses to recursive resolver
  2. Resolver caches malicious record (e.g., A record for bank.com pointing to attacker IP)
  3. Clients query resolver, receive poisoned response
  4. Traffic redirected to attacker (phishing, MITM)

Impact:

  • Users redirected to malicious sites
  • Credentials stolen
  • Man-in-the-middle attacks

Mitigations:

  • DNSSEC (planned) - cryptographically signs DNS responses
  • BIND9 is authoritative-only (not vulnerable to cache poisoning)
  • Recursive resolvers outside our control (client responsibility)

Residual Risk: MEDIUM (DNSSEC would eliminate this risk)


Scenario 3: Supply Chain Attack via Malicious Dependency

Severity: CRITICAL

Attack Path:

  1. Attacker compromises popular Rust crate (e.g., via compromised maintainer account)
  2. Malicious code injected into crate update
  3. Bindy controller depends on compromised crate
  4. Malicious code runs in controller, exfiltrates secrets or modifies DNS zones

Impact:

  • Complete compromise of DNS infrastructure
  • Data exfiltration (secrets, zone data)
  • Backdoor access to cluster

Mitigations:

  • Dependency scanning (cargo-audit) - C-3
  • SBOM generation (track all dependencies)
  • Signed commits (code changes traceable)
  • Dependency version pinning in Cargo.lock
  • Manual review for major dependency updates

Residual Risk: LOW (strong supply chain controls)


Scenario 4: Insider Threat (Malicious Admin)

Severity: HIGH

Attack Path:

  1. Malicious cluster admin with cluster-admin RBAC role
  2. Directly modifies DNSZone CRs to redirect traffic
  3. Deletes audit logs to cover tracks
  4. Exfiltrates RNDC keys from Secrets

Impact:

  • DNS records modified without attribution
  • Service disruption
  • Data theft

Mitigations:

  • GitOps workflow (changes via PRs, not direct kubectl)
  • All changes require 2+ reviewers
  • Immutable audit logs (planned - H-2)
  • Secret access audit trail (planned - H-3)
  • Separation of duties (no single admin has all access)

Residual Risk: MEDIUM (need H-2 and H-3)


Scenario 5: DDoS Attack on DNS Infrastructure

Severity: CRITICAL

Attack Path:

  1. Attacker launches volumetric DDoS attack (millions of queries/sec)
  2. BIND9 pods overwhelmed, become unresponsive
  3. DNS queries fail, causing outages for all dependent services

Impact:

  • Complete DNS outage
  • All services depending on DNS become unavailable
  • Revenue loss, SLA violations

Mitigations:

  • Rate limiting in BIND9
  • Horizontal scaling (multiple secondaries)
  • Resource limits (prevent total resource exhaustion)
  • DDoS protection at edge (planned - CloudFlare, AWS Shield)
  • Autoscaling (planned - HPA based on query load)

Residual Risk: MEDIUM (need edge DDoS protection)


Mitigations

Existing Mitigations (Implemented)

IDMitigationThreats MitigatedCompliance
M-01Signed commits requiredS3 (spoofed commits)✅ C-1
M-02RBAC least privilegeE2 (privilege escalation)✅ C-2
M-03Vulnerability scanningI3 (CVE disclosure), E3 (dependency exploit)✅ C-3
M-04Non-root containersE1 (container escape)✅ Pod Security
M-05Read-only filesystemT2 (tampering), E1 (escape)✅ Pod Security
M-06Secrets encrypted at restI1 (RNDC key disclosure)✅ Kubernetes
M-07AXFR restricted to secondariesI2 (zone enumeration)✅ BIND9 config
M-08Rate limiting (BIND9)D1 (DNS query flood)✅ BIND9 config
M-09SBOM generationT2 (supply chain)✅ SLSA Level 2
M-10Chainguard zero-CVE imagesI3 (CVE disclosure)✅ Container security

Planned Mitigations (Roadmap)

IDMitigationThreats MitigatedPriorityRoadmap Item
M-11Audit log retention policyR1 (non-repudiation)HIGHH-2
M-12Secret access audit trailR2 (secret access), I1 (disclosure)HIGHH-3
M-13Admission webhooksT1 (DNS tampering)MEDIUMFuture
M-14DNSSEC signingT1 (tampering), Scenario 2 (cache poisoning)MEDIUMFuture
M-15Image digest pinningT2 (image tampering)MEDIUMM-1
M-16Rate limiting (controller)D2 (controller exhaustion)MEDIUMM-3
M-17Network policiesS1 (API spoofing), E1 (lateral movement)LOWL-1
M-18DDoS edge protectionD1 (DNS query flood)HIGHExternal
M-19RNDC key rotationI1 (key disclosure)MEDIUMFuture
M-20TSIG for AXFRI2 (zone enumeration)MEDIUMFuture

Residual Risks

Critical Residual Risks

None identified (all critical threats have strong mitigations).


High Residual Risks

  1. DDoS Attacks (D1) - Risk reduced by rate limiting and horizontal scaling, but edge DDoS protection is needed for volumetric attacks (100+ Gbps).

  2. Insider Threats (Scenario 4) - Risk reduced by GitOps and RBAC, but immutable audit logs (H-2) and secret access audit trail (H-3) are needed for full non-repudiation.


Medium Residual Risks

  1. DNS Tampering (T1) - Risk reduced by RBAC, but admission webhooks and DNSSEC would provide defense-in-depth.

  2. Controller Resource Exhaustion (D2) - Risk reduced by resource limits, but rate limiting (M-3) and admission webhooks are needed.

  3. Zone Enumeration (I2) - Risk reduced by AXFR restrictions, but TSIG authentication would eliminate AXFR abuse.

  4. Compromised Controller Pod (Scenario 1) - Risk reduced by Pod Security Standards, but network policies (L-1) would prevent lateral movement.


Security Architecture

Defense in Depth Layers

┌─────────────────────────────────────────────────────────────┐
│  Layer 7: Monitoring & Response                             │
│  - Audit logs (Kubernetes API)                              │
│  - Vulnerability scanning (daily)                           │
│  - Incident response playbooks                              │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 6: Application Security                              │
│  - Input validation (CRD schemas)                           │
│  - Least privilege RBAC                                     │
│  - Signed commits (non-repudiation)                         │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 5: Container Security                                │
│  - Non-root user (uid 1000+)                                │
│  - Read-only filesystem                                     │
│  - No privileged capabilities                               │
│  - Vulnerability scanning (Trivy)                           │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 4: Pod Security                                      │
│  - Pod Security Standards (Restricted)                      │
│  - seccomp profile (restrict syscalls)                      │
│  - AppArmor/SELinux profiles                                │
│  - Resource limits (CPU/memory)                             │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 3: Namespace Isolation                               │
│  - RBAC (namespace-scoped roles)                            │
│  - Network policies (planned)                               │
│  - Resource quotas                                          │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 2: Cluster Security                                  │
│  - etcd encryption at rest                                  │
│  - API server authentication/authorization                  │
│  - Secrets management                                       │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 1: Infrastructure Security                           │
│  - Node OS hardening (managed by platform team)             │
│  - Network segmentation                                     │
│  - Physical security                                        │
└─────────────────────────────────────────────────────────────┘

Security Controls Summary

Control CategoryImplementedPlannedResidual Risk
Access ControlRBAC least privilege, signed commitsAdmission webhooksLOW
Data ProtectionSecrets encrypted, AXFR restrictedDNSSEC, TSIGMEDIUM
Supply ChainSigned commits/images, SBOM, vuln scanningImage digest pinningLOW
MonitoringKubernetes audit logs, vuln scanningAudit retention policy, secret access trailMEDIUM
ResilienceRate limiting, resource limitsEdge DDoS protection, HPAMEDIUM
Container SecurityNon-root, read-only FS, Pod Security StandardsNetwork policiesLOW

References


Last Updated: 2025-12-17 Next Review: 2025-03-17 (Quarterly) Approved By: Security Team

Signed Releases

Bindy releases are cryptographically signed using Cosign with keyless signing (Sigstore). This ensures:

  • Authenticity: Verify that releases come from the official Bindy GitHub repository
  • Integrity: Detect any tampering with release artifacts
  • Non-repudiation: Cryptographic proof that artifacts were built by official CI/CD
  • Transparency: All signatures are recorded in the Sigstore transparency log (Rekor)

What Is Signed

Every Bindy release includes signed artifacts:

  1. Container Images:

    • ghcr.io/firestoned/bindy:* (Chainguard base)
    • ghcr.io/firestoned/bindy-distroless:* (Google Distroless base)
  2. Binary Tarballs:

    • bindy-linux-amd64.tar.gz
    • bindy-linux-arm64.tar.gz
  3. Signature Artifacts (uploaded to releases):

    • *.tar.gz.bundle - Cosign signature bundles for binaries
    • Container signatures are stored in the OCI registry

Installing Cosign

To verify signatures, install Cosign:

# macOS
brew install cosign

# Linux (download binary)
LATEST_VERSION=$(curl -s https://api.github.com/repos/sigstore/cosign/releases/latest | grep tag_name | cut -d '"' -f 4)
curl -Lo cosign https://github.com/sigstore/cosign/releases/download/${LATEST_VERSION}/cosign-linux-amd64
chmod +x cosign
sudo mv cosign /usr/local/bin/

# Verify installation
cosign version

Verifying Container Images

Cosign uses keyless signing with Sigstore, which means:

  • No private keys to manage or distribute
  • Signatures are verified against the GitHub Actions OIDC identity
  • All signatures are logged in the public Rekor transparency log

Quick Verification

# Verify the latest Chainguard image
cosign verify \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  ghcr.io/firestoned/bindy:latest

# Verify a specific version
cosign verify \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  ghcr.io/firestoned/bindy:v0.1.0

# Verify the Distroless variant
cosign verify \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  ghcr.io/firestoned/bindy-distroless:latest

Understanding the Verification Output

When verification succeeds, Cosign returns JSON output with signature details:

[
  {
    "critical": {
      "identity": {
        "docker-reference": "ghcr.io/firestoned/bindy"
      },
      "image": {
        "docker-manifest-digest": "sha256:abcd1234..."
      },
      "type": "cosign container image signature"
    },
    "optional": {
      "Bundle": {
        "SignedEntryTimestamp": "...",
        "Payload": {
          "body": "...",
          "integratedTime": 1234567890,
          "logIndex": 12345678,
          "logID": "..."
        }
      },
      "Issuer": "https://token.actions.githubusercontent.com",
      "Subject": "https://github.com/firestoned/bindy/.github/workflows/release.yaml@refs/tags/v0.1.0"
    }
  }
]

Key fields to verify:

  • Subject: Shows the exact GitHub workflow that created the signature
  • Issuer: Confirms it came from GitHub Actions
  • integratedTime: Unix timestamp when signature was created
  • logIndex: Entry in the Rekor transparency log (publicly auditable)

Verification Failures

If verification fails, you’ll see an error like:

Error: no matching signatures:

Do NOT use unverified images in production. This indicates:

  • The image was not signed by the official Bindy release workflow
  • The image may have been tampered with
  • The image may be a counterfeit

Verifying Binary Releases

Binary tarballs are signed with Cosign blob signing. Each release includes .bundle files containing the signature.

Download and Verify

# Download the binary tarball and signature bundle from GitHub Releases
VERSION="v0.1.0"
PLATFORM="linux-amd64"  # or linux-arm64

# Download tarball
curl -LO "https://github.com/firestoned/bindy/releases/download/${VERSION}/bindy-${PLATFORM}.tar.gz"

# Download signature bundle
curl -LO "https://github.com/firestoned/bindy/releases/download/${VERSION}/bindy-${PLATFORM}.tar.gz.bundle"

# Verify the signature
cosign verify-blob \
  --bundle "bindy-${PLATFORM}.tar.gz.bundle" \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  "bindy-${PLATFORM}.tar.gz"

Verification Success

If successful, you’ll see:

Verified OK

You can now safely extract and use the binary:

tar xzf bindy-${PLATFORM}.tar.gz
./bindy --version

Automated Verification Script

Create a script to download and verify releases automatically:

#!/bin/bash
set -euo pipefail

VERSION="${1:-latest}"
PLATFORM="${2:-linux-amd64}"

if [ "$VERSION" = "latest" ]; then
  VERSION=$(curl -s https://api.github.com/repos/firestoned/bindy/releases/latest | grep tag_name | cut -d '"' -f 4)
fi

echo "Downloading Bindy $VERSION for $PLATFORM..."

# Download artifacts
curl -LO "https://github.com/firestoned/bindy/releases/download/${VERSION}/bindy-${PLATFORM}.tar.gz"
curl -LO "https://github.com/firestoned/bindy/releases/download/${VERSION}/bindy-${PLATFORM}.tar.gz.bundle"

# Verify signature
echo "Verifying signature..."
cosign verify-blob \
  --bundle "bindy-${PLATFORM}.tar.gz.bundle" \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  "bindy-${PLATFORM}.tar.gz"

# Extract
echo "Extracting..."
tar xzf "bindy-${PLATFORM}.tar.gz"

echo "✓ Bindy $VERSION successfully verified and installed"
./bindy --version

Additional Security Verification

Check SHA256 Checksums

Every release includes a checksums.sha256 file with SHA256 hashes of all artifacts:

# Download checksums
curl -LO "https://github.com/firestoned/bindy/releases/download/${VERSION}/checksums.sha256"

# Verify the tarball checksum
sha256sum -c checksums.sha256 --ignore-missing

Inspect Rekor Transparency Log

All signatures are recorded in the public Rekor transparency log:

# Search for Bindy signatures
rekor-cli search --email noreply@github.com --rekor_server https://rekor.sigstore.dev

# Or use the web interface:
# https://search.sigstore.dev/?email=noreply@github.com

Verify SLSA Provenance

Bindy releases also include SLSA provenance attestations:

# Verify SLSA provenance for the container image
cosign verify-attestation \
  --type slsaprovenance \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  ghcr.io/firestoned/bindy:${VERSION}

Kubernetes Deployment Verification

When deploying to Kubernetes, use policy-controller or Kyverno to enforce signature verification:

Kyverno Policy Example

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-bindy-images
spec:
  validationFailureAction: enforce
  background: false
  rules:
    - name: verify-bindy-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "ghcr.io/firestoned/bindy*"
          attestors:
            - entries:
                - keyless:
                    subject: "https://github.com/firestoned/bindy/.github/workflows/release.yaml@*"
                    issuer: "https://token.actions.githubusercontent.com"
                    rekor:
                      url: https://rekor.sigstore.dev

This policy ensures:

  • Only signed Bindy images can run in the cluster
  • Signatures must come from the official release workflow
  • Signatures are verified against the Rekor transparency log

Troubleshooting

“Error: no matching signatures”

Cause: Image/artifact is not signed or signature doesn’t match the identity.

Solution:

  • Verify you’re using an official release from ghcr.io/firestoned/bindy*
  • Check the tag/version exists on the GitHub releases page
  • Ensure you’re not using a locally-built image

“Error: unable to verify bundle”

Cause: Signature bundle is corrupted or doesn’t match the artifact.

Solution:

  • Re-download the artifact and bundle
  • Verify the SHA256 checksum matches checksums.sha256
  • Report the issue if checksums match but verification fails

“Error: fetching bundle: context deadline exceeded”

Cause: Network issue connecting to Sigstore services.

Solution:

  • Check your internet connection
  • Verify you can reach https://rekor.sigstore.dev and https://fulcio.sigstore.dev
  • Try again with increased timeout: COSIGN_TIMEOUT=60s cosign verify ...

Security Contact

If you discover a security issue with signed releases:

  • DO NOT open a public GitHub issue
  • Report to: security@firestoned.io
  • Include: artifact name, version, verification output, and steps to reproduce

See SECURITY.md for our security policy and vulnerability disclosure process.

SPDX License Headers

All Bindy source files include SPDX license identifiers for automated license compliance tracking.

What is SPDX?

SPDX (Software Package Data Exchange) is an ISO standard (ISO/IEC 5962:2021) for communicating software license information. SPDX identifiers enable:

  • Automated SBOM generation: Tools like cargo-cyclonedx detect licenses automatically
  • License compliance auditing: Verify no GPL contamination in MIT-licensed project
  • Supply chain transparency: Clear license identification at file granularity
  • Tooling integration: GitHub, Snyk, Trivy, and other tools recognize SPDX headers

Required Header Format

All source files MUST include SPDX headers in the first 10 lines:

Rust files (.rs):

#![allow(unused)]
fn main() {
// Copyright (c) 2025 Erick Bourgeois, firestoned
// SPDX-License-Identifier: MIT
}

Shell scripts (.sh, .bash):

#!/usr/bin/env bash
# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT

Makefiles (Makefile, *.mk):

# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT

GitHub Actions workflows (.yaml, .yml):

# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT
name: My Workflow

Automated Verification

Bindy enforces SPDX headers via CI/CD:

Workflow: .github/workflows/license-check.yaml

Checks:

  • All Rust files (.rs)
  • All Shell scripts (.sh, .bash)
  • All Makefiles (Makefile, *.mk)
  • All GitHub Actions workflows (.yaml, .yml)

Enforcement:

  • Runs on every pull request
  • Runs on every push to main
  • Pull requests fail if any source files lack SPDX headers
  • Provides clear error messages with examples for missing headers

Output Example:

✅ All 347 source files have SPDX license headers

File types checked:
  - Rust files (.rs)
  - Shell scripts (.sh, .bash)
  - Makefiles (Makefile, *.mk)
  - GitHub Actions workflows (.yaml, .yml)

License: MIT

Bindy is licensed under the MIT License, one of the most permissive open source licenses.

Permissions:

  • ✅ Commercial use
  • ✅ Modification
  • ✅ Distribution
  • ✅ Private use

Conditions:

  • 📋 Include copyright notice
  • 📋 Include license text

Limitations:

  • ❌ No liability
  • ❌ No warranty

Full license text: LICENSE

Compliance Evidence

SOX 404 (Sarbanes-Oxley):

  • Control: License compliance and intellectual property tracking
  • Evidence: All source files tagged with SPDX identifiers, automated verification
  • Audit Trail: Git history shows when SPDX headers were added

PCI-DSS 6.4.6 (Payment Card Industry):

  • Requirement: Code review and approval processes
  • Evidence: SPDX verification blocks unapproved code (missing headers) from merging
  • Automation: CI/CD enforces license compliance before code review

SLSA Level 3 (Supply Chain Security):

  • Requirement: Build environment provenance and dependencies
  • Evidence: SPDX headers enable automated SBOM generation with license info
  • Transparency: Every dependency’s license is machine-readable

References

Incident Response Playbooks - Bindy DNS Controller

Version: 1.0 Last Updated: 2025-12-17 Owner: Security Team Compliance: SOX 404, PCI-DSS 12.10.1, Basel III


Table of Contents


Overview

This document provides step-by-step incident response playbooks for security incidents involving the Bindy DNS Controller. Each playbook follows the NIST Incident Response Lifecycle: Preparation, Detection & Analysis, Containment, Eradication, Recovery, and Post-Incident Activity.

Objectives

  1. Rapid Response: Minimize time between detection and containment
  2. Clear Procedures: Provide step-by-step guidance for responders
  3. Minimize Impact: Reduce blast radius and prevent escalation
  4. Evidence Preservation: Maintain audit trail for forensics and compliance
  5. Continuous Improvement: Learn from incidents to strengthen defenses

Incident Classification

Severity Levels

SeverityDefinitionResponse TimeEscalation
🔴 CRITICALComplete service outage, data breach, or active exploitationImmediate (< 15 min)CISO, CTO, VP Engineering
🟠 HIGHDegraded service, vulnerability with known exploit, unauthorized access< 1 hourSecurity Lead, Engineering Manager
🟡 MEDIUMVulnerability without exploit, suspicious activity, minor service impact< 4 hoursSecurity Team, On-Call Engineer
🔵 LOWInformational findings, potential issues, no immediate risk< 24 hoursSecurity Team

Response Team

Roles and Responsibilities

RoleResponsibilitiesContact
Incident CommanderOverall coordination, decision-making, stakeholder communicationOn-call rotation
Security LeadThreat analysis, forensics, remediation guidancesecurity@firestoned.io
Platform EngineerKubernetes cluster operations, pod managementplatform@firestoned.io
DNS EngineerBIND9 expertise, zone managementdns-team@firestoned.io
Compliance OfficerRegulatory reporting, evidence collectioncompliance@firestoned.io
CommunicationsInternal/external communication, customer notificationscomms@firestoned.io

On-Call Rotation

  • Primary: Security Lead (24/7 PagerDuty)
  • Secondary: Platform Engineer (escalation)
  • Tertiary: CTO (executive escalation)

Communication Protocols

Internal Communication

War Room (Incident > MEDIUM):

  • Slack Channel: #incident-[YYYY-MM-DD]-[number]
  • Video Call: Zoom war room (pinned in channel)
  • Status Updates: Every 30 minutes during active incident

Status Page:

  • Update status.firestoned.io for customer-impacting incidents
  • Templates: Investigating → Identified → Monitoring → Resolved

External Communication

Regulatory Reporting (CRITICAL incidents only):

  • PCI-DSS: Notify acquiring bank within 24 hours if cardholder data compromised
  • SOX: Document incident for quarterly IT controls audit
  • Basel III: Report cyber risk event to risk management committee

Customer Notification:

  • Criteria: Data breach, prolonged outage (> 4 hours), SLA violation
  • Channel: Email to registered contacts, status page
  • Timeline: Initial notification within 2 hours, updates every 4 hours

Playbook Index

IDPlaybookSeverityTrigger
P1Critical Vulnerability Detected🔴 CRITICALGitHub issue, CVE alert, security scan
P2Compromised Controller Pod🔴 CRITICALAnomalous behavior, unauthorized access
P3DNS Service Outage🔴 CRITICALAll BIND9 pods down, DNS queries failing
P4RNDC Key Compromise🔴 CRITICALKey leaked, unauthorized RNDC access
P5Unauthorized DNS Changes🟠 HIGHUnexpected zone modifications
P6DDoS Attack🟠 HIGHQuery flood, resource exhaustion
P7Supply Chain Compromise🔴 CRITICALMalicious commit, compromised dependency

Playbooks


P1: Critical Vulnerability Detected

Severity: 🔴 CRITICAL Response Time: Immediate (< 15 minutes) SLA: Patch deployed within 24 hours

Trigger

  • Daily security scan detects CRITICAL vulnerability (CVSS 9.0-10.0)
  • GitHub Security Advisory published for Bindy dependency
  • CVE announced with active exploitation in the wild
  • Automated GitHub issue created: [SECURITY] CRITICAL vulnerability detected

Detection

# Automated detection via GitHub Actions
# Workflow: .github/workflows/security-scan.yaml
# Frequency: Daily at 00:00 UTC

# Manual check:
cargo audit --deny warnings
trivy image ghcr.io/firestoned/bindy:latest --severity CRITICAL,HIGH

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+15 min)

Step 1.1: Acknowledge Incident

# Acknowledge PagerDuty alert
# Create Slack war room: #incident-[date]-vuln-[CVE-ID]

Step 1.2: Assess Vulnerability

# Review GitHub issue or security scan results
# Questions to answer:
# - What is the vulnerable component? (dependency, base image, etc.)
# - What is the CVSS score and attack vector?
# - Is there a known exploit (Exploit-DB, Metasploit)?
# - Is Bindy actually vulnerable (code path reachable)?

Step 1.3: Check Production Exposure

# Verify if vulnerable version is deployed
kubectl get deploy -n dns-system bindy -o jsonpath='{.spec.template.spec.containers[0].image}'

# Check image digest
kubectl get pods -n dns-system -l app.kubernetes.io/name=bindy -o jsonpath='{.items[0].spec.containers[0].image}'

# Compare with vulnerable version from security advisory

Step 1.4: Determine Impact

  • If Bindy is NOT vulnerable (code path not reachable):

    • Update to patched version at next release (non-urgent)
    • Document exception in SECURITY.md
    • Close incident as FALSE POSITIVE
  • If Bindy IS vulnerable (exploitable in production):

    • PROCEED TO CONTAINMENT (Phase 2)

Phase 2: Containment (T+15 min to T+1 hour)

Step 2.1: Isolate Vulnerable Pods (if actively exploited)

# Scale down controller to prevent further exploitation
kubectl scale deploy -n dns-system bindy --replicas=0

# NOTE: This stops DNS updates but does NOT affect DNS queries
# BIND9 continues serving existing zones

Step 2.2: Review Audit Logs

# Check for signs of exploitation
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --tail=1000 | grep -i "error\|panic\|exploit"

# Review Kubernetes audit logs (if available)
# Look for: Unusual API calls, secret reads, privilege escalation attempts

Step 2.3: Assess Blast Radius

  • Controller compromised? Check for unauthorized DNS changes, secret reads
  • BIND9 affected? Check if RNDC keys were stolen
  • Data exfiltration? Review network logs for unusual egress traffic

Phase 3: Eradication (T+1 hour to T+24 hours)

Step 3.1: Apply Patch

Option A: Update Dependency (Rust crate)

# Update specific dependency
cargo update -p <vulnerable-package>

# Verify fix
cargo audit

# Run tests
cargo test

# Build new image
docker build -t ghcr.io/firestoned/bindy:hotfix-$(date +%s) .

# Push to registry
docker push ghcr.io/firestoned/bindy:hotfix-$(date +%s)

Option B: Update Base Image

# Update Dockerfile to latest Chainguard image
# docker/Dockerfile:
FROM cgr.dev/chainguard/static:latest-dev  # Use latest digest

# Rebuild and push
docker build -t ghcr.io/firestoned/bindy:hotfix-$(date +%s) .
docker push ghcr.io/firestoned/bindy:hotfix-$(date +%s)

Option C: Apply Workaround (if no patch available)

  • Disable vulnerable feature flag
  • Add input validation to prevent exploit
  • Document workaround in SECURITY.md

Step 3.2: Verify Fix

# Scan patched image
trivy image ghcr.io/firestoned/bindy:hotfix-$(date +%s) --severity CRITICAL,HIGH

# Expected: No CRITICAL vulnerabilities found

Step 3.3: Emergency Release

# Tag release
git tag -s hotfix-v0.1.1 -m "Security hotfix: CVE-XXXX-XXXXX"
git push origin hotfix-v0.1.1

# Trigger release workflow
# Verify signed commits, SBOM generation, vulnerability scans pass

Phase 4: Recovery (T+24 hours to T+48 hours)

Step 4.1: Deploy Patched Version

# Update deployment manifest (GitOps)
# deploy/controller/deployment.yaml:
spec:
  template:
    spec:
      containers:
      - name: bindy
        image: ghcr.io/firestoned/bindy:hotfix-v0.1.1  # Patched version

# Apply via FluxCD (GitOps) or manually
kubectl apply -f deploy/controller/deployment.yaml

# Verify rollout
kubectl rollout status deploy/bindy -n dns-system

# Confirm pods running patched version
kubectl get pods -n dns-system -l app.kubernetes.io/name=bindy -o jsonpath='{.items[0].spec.containers[0].image}'

Step 4.2: Verify Service Health

# Check controller logs
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --tail=100

# Verify reconciliation working
kubectl get dnszones --all-namespaces
kubectl describe dnszone -n team-web example-com

# Test DNS resolution
dig @<bind9-ip> example.com

Step 4.3: Run Security Scans

# Full security scan
cargo audit
trivy image ghcr.io/firestoned/bindy:hotfix-v0.1.1

# Expected: All clear

Phase 5: Post-Incident (T+48 hours to T+1 week)

Step 5.1: Document Incident

  • Update CHANGELOG.md with hotfix details
  • Document root cause in incident report
  • Update SECURITY.md if needed (known issues, exceptions)

Step 5.2: Notify Stakeholders

  • Update status page: “Resolved - Security patch deployed”
  • Send email to compliance team (attach incident report)
  • Notify customers if required (data breach, SLA violation)

Step 5.3: Post-Incident Review (PIR)

  • What went well? (Detection, response time, communication)
  • What could improve? (Patch process, testing, automation)
  • Action items: (Update playbook, add monitoring, improve defenses)

Step 5.4: Update Metrics

  • MTTR (Mean Time To Remediate): ____ hours
  • SLA compliance: ✅ Met / ❌ Missed
  • Update vulnerability dashboard

Success Criteria

  • ✅ Patch deployed within 24 hours
  • ✅ No exploitation detected in production
  • ✅ Service availability maintained (or minimal downtime)
  • ✅ All security scans pass post-patch
  • ✅ Incident documented and reported to compliance

P2: Compromised Controller Pod

Severity: 🔴 CRITICAL Response Time: Immediate (< 15 minutes) Impact: Unauthorized DNS modifications, secret theft, lateral movement

Trigger

  • Anomalous controller behavior (unexpected API calls, network traffic)
  • Unauthorized modifications to DNS zones
  • Security alert from SIEM or IDS
  • Pod logs show suspicious activity (reverse shell, file downloads)

Detection

# Monitor controller logs for anomalies
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --tail=500 | grep -E "(shell|wget|curl|nc|bash)"

# Check for unexpected processes in pod
kubectl exec -n dns-system <controller-pod> -- ps aux

# Review Kubernetes audit logs
# Look for: Unusual secret reads, excessive API calls, privilege escalation attempts

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+15 min)

Step 1.1: Confirm Compromise

# Check controller logs
kubectl logs -n dns-system <controller-pod> --tail=1000 > /tmp/controller-logs.txt

# Indicators of compromise (IOCs):
# - Reverse shell activity (nc, bash -i, /dev/tcp/)
# - File downloads (wget, curl to suspicious domains)
# - Privilege escalation attempts (sudo, setuid)
# - Crypto mining (high CPU, connections to mining pools)

Step 1.2: Assess Impact

# Check for unauthorized DNS changes
kubectl get dnszones --all-namespaces -o yaml > /tmp/dnszones-snapshot.yaml

# Compare with known good state (GitOps repo)
diff /tmp/dnszones-snapshot.yaml /path/to/gitops/dnszones/

# Check for secret reads
# Review Kubernetes audit logs for GET /api/v1/namespaces/dns-system/secrets/*

Phase 2: Containment (T+15 min to T+1 hour)

Step 2.1: Isolate Controller Pod

# Apply network policy to block all egress (prevent data exfiltration)
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bindy-controller-quarantine
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bindy
  policyTypes:
  - Egress
  egress: []  # Block all egress
EOF

# Delete compromised pod (force recreation)
kubectl delete pod -n dns-system <controller-pod> --force --grace-period=0

Step 2.2: Rotate Credentials

# Rotate RNDC key (if potentially stolen)
# Generate new key
tsig-keygen -a hmac-sha256 rndc-key > /tmp/new-rndc-key.conf

# Update secret
kubectl create secret generic rndc-key-new \
  --from-file=rndc.key=/tmp/new-rndc-key.conf \
  -n dns-system \
  --dry-run=client -o yaml | kubectl apply -f -

# Update BIND9 pods to use new key (restart required)
kubectl rollout restart statefulset/bind9-primary -n dns-system
kubectl rollout restart statefulset/bind9-secondary -n dns-system

# Delete old secret
kubectl delete secret rndc-key -n dns-system

Step 2.3: Preserve Evidence

# Save pod logs before deletion
kubectl logs -n dns-system <controller-pod> --all-containers > /tmp/forensics/controller-logs-$(date +%s).txt

# Capture pod manifest
kubectl get pod -n dns-system <controller-pod> -o yaml > /tmp/forensics/controller-pod-manifest.yaml

# Save Kubernetes events
kubectl get events -n dns-system --sort-by='.lastTimestamp' > /tmp/forensics/events.txt

# Export audit logs (if available)
# - ServiceAccount API calls
# - Secret access logs
# - DNS zone modifications

Phase 3: Eradication (T+1 hour to T+4 hours)

Step 3.1: Root Cause Analysis

# Analyze logs for initial compromise vector
# Common vectors:
# - Vulnerability in controller code (RCE, memory corruption)
# - Compromised dependency (malicious crate)
# - Supply chain attack (malicious image)
# - Misconfigured RBAC (excessive permissions)

# Check image provenance
kubectl get pod -n dns-system <controller-pod> -o jsonpath='{.spec.containers[0].image}'

# Verify image signature and SBOM
# If signature invalid or SBOM shows unexpected dependencies → supply chain attack

Step 3.2: Patch Vulnerability

  • If controller code vulnerability: Apply patch (see P1)
  • If supply chain attack: Investigate upstream, rollback to known good image
  • If RBAC misconfiguration: Fix RBAC, re-run verification script

Step 3.3: Scan for Backdoors

# Scan all images for malware
trivy image ghcr.io/firestoned/bindy:latest --scanners vuln,secret,misconfig

# Check for unauthorized SSH keys, cron jobs, persistence mechanisms
kubectl exec -n dns-system <new-controller-pod> -- ls -la /root/.ssh/
kubectl exec -n dns-system <new-controller-pod> -- cat /etc/crontab

Phase 4: Recovery (T+4 hours to T+24 hours)

Step 4.1: Deploy Clean Controller

# Verify image integrity
# - Signed commits in Git history
# - Signed container image with provenance
# - Clean vulnerability scan

# Deploy patched controller
kubectl rollout restart deploy/bindy -n dns-system

# Remove quarantine network policy
kubectl delete networkpolicy bindy-controller-quarantine -n dns-system

# Verify health
kubectl get pods -n dns-system -l app.kubernetes.io/name=bindy
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --tail=100

Step 4.2: Verify DNS Zones

# Restore DNS zones from GitOps (if unauthorized changes detected)
# 1. Revert changes in Git
# 2. Force FluxCD reconciliation
flux reconcile kustomization bindy-system --with-source

# Verify all zones match expected state
kubectl get dnszones --all-namespaces -o yaml | diff - /path/to/gitops/dnszones/

Step 4.3: Validate Service

# Test DNS resolution
dig @<bind9-ip> example.com

# Verify controller reconciliation
kubectl get dnszones --all-namespaces
kubectl describe dnszone -n team-web example-com | grep "Ready.*True"

Phase 5: Post-Incident (T+24 hours to T+1 week)

Step 5.1: Forensic Analysis

  • Engage forensics team if required
  • Analyze preserved logs for IOCs
  • Timeline of compromise (initial access → lateral movement → exfiltration)

Step 5.2: Notify Stakeholders

  • Compliance: Report to SOX/PCI-DSS auditors (security incident)
  • Customers: If DNS records were modified or data exfiltrated
  • Regulators: If required by Basel III (cyber risk event reporting)

Step 5.3: Improve Defenses

  • Short-term: Implement missing network policies (L-1)
  • Medium-term: Add runtime security monitoring (Falco, Tetragon)
  • Long-term: Implement admission controller for image verification

Step 5.4: Update Documentation

  • Update incident playbook with lessons learned
  • Document new IOCs for detection rules
  • Update threat model (docs/security/THREAT_MODEL.md)

Success Criteria

  • ✅ Compromised pod isolated within 15 minutes
  • ✅ No lateral movement to other pods/namespaces
  • ✅ Credentials rotated (RNDC keys)
  • ✅ Root cause identified and patched
  • ✅ DNS service fully restored with verified integrity
  • ✅ Forensic evidence preserved for investigation

P3: DNS Service Outage

Severity: 🔴 CRITICAL Response Time: Immediate (< 15 minutes) Impact: All DNS queries failing, service unavailable

Trigger

  • All BIND9 pods down (CrashLoopBackOff, OOMKilled)
  • DNS queries timing out
  • Monitoring alert: “DNS service unavailable”
  • Customer reports: “Cannot resolve domain names”

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+10 min)

Step 1.1: Confirm Outage

# Test DNS resolution
dig @<bind9-loadbalancer-ip> example.com

# Check pod status
kubectl get pods -n dns-system -l app.kubernetes.io/name=bind9

# Check service endpoints
kubectl get svc -n dns-system bind9-dns -o wide
kubectl get endpoints -n dns-system bind9-dns

Step 1.2: Identify Root Cause

# Check pod logs
kubectl logs -n dns-system <bind9-pod> --tail=200

# Common root causes:
# - OOMKilled (memory exhaustion)
# - CrashLoopBackOff (configuration error, missing ConfigMap)
# - ImagePullBackOff (registry issue, image not found)
# - Pending (insufficient resources, node failure)

# Check events
kubectl describe pod -n dns-system <bind9-pod>

Phase 2: Containment & Quick Fix (T+10 min to T+30 min)

Scenario A: OOMKilled (Memory Exhaustion)

# Increase memory limit
kubectl patch statefulset bind9-primary -n dns-system -p '
spec:
  template:
    spec:
      containers:
      - name: bind9
        resources:
          limits:
            memory: "512Mi"  # Increase from 256Mi
'

# Restart pods
kubectl rollout restart statefulset/bind9-primary -n dns-system

Scenario B: Configuration Error

# Check ConfigMap
kubectl get cm -n dns-system bind9-config -o yaml

# Common issues:
# - Syntax error in named.conf
# - Missing zone file
# - Invalid RNDC key

# Fix configuration (update ConfigMap)
kubectl edit cm bind9-config -n dns-system

# Restart pods to apply new config
kubectl rollout restart statefulset/bind9-primary -n dns-system

Scenario C: Image Pull Failure

# Check image pull secret
kubectl get secret -n dns-system ghcr-pull-secret

# Verify image exists
docker pull ghcr.io/firestoned/bindy:latest

# If image missing, rollback to previous version
kubectl rollout undo statefulset/bind9-primary -n dns-system

Phase 3: Recovery (T+30 min to T+2 hours)

Step 3.1: Verify Service Restoration

# Check all pods healthy
kubectl get pods -n dns-system -l app.kubernetes.io/name=bind9

# Test DNS resolution (all zones)
dig @<bind9-ip> example.com
dig @<bind9-ip> test.example.com

# Check service endpoints
kubectl get endpoints -n dns-system bind9-dns
# Should show all healthy pod IPs

Step 3.2: Validate Data Integrity

# Verify all zones loaded
kubectl exec -n dns-system <bind9-pod> -- rndc status

# Check zone serial numbers (ensure no data loss)
dig @<bind9-ip> example.com SOA

# Compare with expected serial (from GitOps)

Phase 4: Post-Incident (T+2 hours to T+1 week)

Step 4.1: Root Cause Analysis

  • Why did BIND9 exhaust memory? (Too many zones, memory leak, query flood)
  • Why did configuration break? (Controller bug, bad CRD validation, manual change)
  • Why did image pull fail? (Registry downtime, authentication issue)

Step 4.2: Preventive Measures

  • Add horizontal pod autoscaling (HPA based on CPU/memory)
  • Add health checks (liveness/readiness probes for BIND9)
  • Add configuration validation (admission webhook for ConfigMaps)
  • Add chaos engineering tests (kill pods, exhaust memory, test recovery)

Step 4.3: Update SLO/SLA

  • Document actual downtime
  • Calculate availability percentage
  • Update SLA reports for customers

Success Criteria

  • ✅ DNS service restored within 30 minutes
  • ✅ All zones serving correctly
  • ✅ No data loss (zone serial numbers match)
  • ✅ Root cause identified and documented
  • ✅ Preventive measures implemented

P4: RNDC Key Compromise

Severity: 🔴 CRITICAL Response Time: Immediate (< 15 minutes) Impact: Attacker can control BIND9 (reload zones, freeze service, etc.)

Trigger

  • RNDC key found in logs, Git commit, or public repository
  • Unauthorized RNDC commands detected (audit logs)
  • Security scan detects secret in code or environment variables

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+15 min)

Step 1.1: Confirm Compromise

# Search for leaked key in logs
grep -r "rndc-key" /var/log/ /tmp/

# Search Git history for accidentally committed keys
git log -S "rndc-key" --all

# Check GitHub secret scanning alerts
# GitHub → Security → Secret scanning alerts

Step 1.2: Assess Impact

# Check BIND9 logs for unauthorized RNDC commands
kubectl logs -n dns-system <bind9-pod> --tail=1000 | grep "rndc command"

# Check for malicious activity:
# - rndc freeze (stop zone updates)
# - rndc reload (load malicious zone)
# - rndc querylog on (enable debug logging for reconnaissance)

Phase 2: Containment (T+15 min to T+1 hour)

Step 2.1: Rotate RNDC Key (Emergency)

# Generate new RNDC key
tsig-keygen -a hmac-sha256 rndc-key-emergency > /tmp/rndc-key-new.conf

# Extract key from generated file
cat /tmp/rndc-key-new.conf

# Create new Kubernetes secret
kubectl create secret generic rndc-key-rotated \
  --from-literal=key="<new-key-here>" \
  -n dns-system

# Update controller deployment to use new secret
kubectl set env deploy/bindy -n dns-system RNDC_KEY_SECRET=rndc-key-rotated

# Update BIND9 StatefulSets
kubectl set volume statefulset/bind9-primary -n dns-system \
  --add --name=rndc-key \
  --type=secret \
  --secret-name=rndc-key-rotated \
  --mount-path=/etc/bind/rndc.key \
  --sub-path=rndc.key

# Restart all BIND9 pods
kubectl rollout restart statefulset/bind9-primary -n dns-system
kubectl rollout restart statefulset/bind9-secondary -n dns-system

# Delete compromised secret
kubectl delete secret rndc-key -n dns-system

Step 2.2: Block Network Access (if attacker active)

# Apply network policy to block RNDC port (953) from external access
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bind9-rndc-deny-external
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bind9
  policyTypes:
  - Ingress
  ingress:
  # Allow DNS queries (port 53)
  - from:
    - namespaceSelector: {}
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow RNDC only from controller
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: bindy
    ports:
    - protocol: TCP
      port: 953
EOF

Phase 3: Eradication (T+1 hour to T+4 hours)

Step 3.1: Remove Leaked Secrets

If secret in Git:

# Remove from Git history (use BFG Repo-Cleaner)
git clone --mirror git@github.com:firestoned/bindy.git
bfg --replace-text passwords.txt bindy.git
cd bindy.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push --force

# Notify all team members to re-clone repository

If secret in logs:

# Rotate logs immediately
kubectl delete pod -n dns-system <controller-pod>  # Forces log rotation

# Purge old logs from log aggregation system
# (Depends on logging backend: Elasticsearch, CloudWatch, etc.)

Step 3.2: Audit All Secret Access

# Review Kubernetes audit logs
# Find all ServiceAccounts that read rndc-key secret in last 30 days
# Check if any unauthorized access occurred

Phase 4: Recovery (T+4 hours to T+24 hours)

Step 4.1: Verify Key Rotation

# Test RNDC with new key
kubectl exec -n dns-system <controller-pod> -- \
  rndc -s <bind9-ip> -k /etc/bindy/rndc/rndc.key status

# Expected: Command succeeds with new key

# Test DNS service
dig @<bind9-ip> example.com

# Expected: DNS queries work normally

Step 4.2: Update Documentation

# Update secret rotation procedure in SECURITY.md
# Document rotation frequency (e.g., quarterly, or after incident)

Phase 5: Post-Incident (T+24 hours to T+1 week)

Step 5.1: Implement Secret Detection

# Add pre-commit hook to detect secrets
# .git/hooks/pre-commit:
#!/bin/bash
git diff --cached --name-only | xargs grep -E "(rndc-key|BEGIN RSA PRIVATE KEY)" && {
  echo "ERROR: Secret detected in commit. Aborting."
  exit 1
}

# Enable GitHub secret scanning (if not already enabled)
# GitHub → Settings → Code security and analysis → Secret scanning: Enable

Step 5.2: Automate Key Rotation

# Implement automated quarterly key rotation
# Add CronJob to generate and rotate keys every 90 days

Step 5.3: Improve Secret Management

  • Consider external secret manager (HashiCorp Vault, AWS Secrets Manager)
  • Implement secret access audit trail (H-3)
  • Add alerts on unexpected secret reads

Success Criteria

  • ✅ RNDC key rotated within 1 hour
  • ✅ Leaked secret removed from all locations
  • ✅ No unauthorized RNDC commands executed
  • ✅ DNS service fully functional with new key
  • ✅ Secret detection mechanisms implemented
  • ✅ Audit trail reviewed and documented

P5: Unauthorized DNS Changes

Severity: 🟠 HIGH Response Time: < 1 hour Impact: DNS records modified without approval, potential traffic redirection

Trigger

  • Unexpected changes to DNSZone custom resources
  • DNS records pointing to unknown IP addresses
  • GitOps detects drift (actual state ≠ desired state)
  • User reports: “DNS not resolving correctly”

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+30 min)

Step 1.1: Identify Unauthorized Changes

# Get current DNSZone state
kubectl get dnszones --all-namespaces -o yaml > /tmp/current-dnszones.yaml

# Compare with GitOps source of truth
diff /tmp/current-dnszones.yaml /path/to/gitops/dnszones/

# Check Kubernetes audit logs for who made changes
# Look for: kubectl apply, kubectl edit, kubectl patch on DNSZone resources

Step 1.2: Assess Impact

# Which zones were modified?
# What records changed? (A, CNAME, MX, TXT)
# Where is traffic being redirected?

# Test DNS resolution
dig @<bind9-ip> suspicious-domain.com

# Check if malicious IP is reachable
nslookup suspicious-domain.com
curl -I http://<suspicious-ip>/

Phase 2: Containment (T+30 min to T+1 hour)

Step 2.1: Revert Unauthorized Changes

# Revert to known good state (GitOps)
kubectl apply -f /path/to/gitops/dnszones/team-web/example-com.yaml

# Force controller reconciliation
kubectl annotate dnszone -n team-web example-com \
  reconcile-at="$(date +%s)" --overwrite

# Verify zone restored
kubectl get dnszone -n team-web example-com -o yaml | grep "status"

Step 2.2: Revoke Access (if compromised user)

# Identify user who made unauthorized change (from audit logs)
# Example: user=alice, namespace=team-web

# Remove user's RBAC permissions
kubectl delete rolebinding dnszone-editor-alice -n team-web

# Force user to re-authenticate
# (Depends on authentication provider: OIDC, LDAP, etc.)

Phase 3: Eradication (T+1 hour to T+4 hours)

Step 3.1: Root Cause Analysis

  • Compromised user credentials? Rotate passwords, check for MFA bypass
  • RBAC misconfiguration? User had excessive permissions
  • Controller bug? Controller reconciled incorrect state
  • Manual kubectl change? Bypassed GitOps workflow

Step 3.2: Fix Root Cause

# Example: RBAC was too permissive
# Fix RoleBinding to limit scope
kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dnszone-editor-alice
  namespace: team-web
subjects:
- kind: User
  name: alice
roleRef:
  kind: Role
  name: dnszone-editor  # Role only allows CRUD on DNSZones, not deletion
  apiGroup: rbac.authorization.k8s.io
EOF

Phase 4: Recovery (T+4 hours to T+24 hours)

Step 4.1: Verify DNS Integrity

# Test all zones
for zone in $(kubectl get dnszones --all-namespaces -o jsonpath='{.items[*].spec.zoneName}'); do
  echo "Testing $zone"
  dig @<bind9-ip> $zone SOA
done

# Expected: All zones resolve correctly with expected serial numbers

Step 4.2: Restore User Access (if revoked)

# After confirming user is not compromised, restore access
kubectl apply -f /path/to/gitops/rbac/team-web/alice-rolebinding.yaml

Phase 5: Post-Incident (T+24 hours to T+1 week)

Step 5.1: Implement Admission Webhooks

# Add ValidatingWebhook to prevent suspicious DNS changes
# Example: Block A records pointing to private IPs (RFC 1918)
# Example: Require approval for changes to critical zones (*.bank.com)

Step 5.2: Add Drift Detection

# Implement automated GitOps drift detection
# Alert if cluster state ≠ Git state for > 5 minutes
# Tool: FluxCD notification controller + Slack webhook

Step 5.3: Enforce GitOps Workflow

# Remove direct kubectl access for users
# Require all changes via Pull Requests in GitOps repo
# Implement branch protection: 2+ reviewers required

Success Criteria

  • ✅ Unauthorized changes reverted within 1 hour
  • ✅ Root cause identified (user, RBAC, controller bug)
  • ✅ Access revoked/fixed to prevent recurrence
  • ✅ DNS integrity verified (all zones correct)
  • ✅ Drift detection and admission webhooks implemented

P6: DDoS Attack

Severity: 🟠 HIGH Response Time: < 1 hour Impact: DNS service degraded or unavailable due to query flood

Trigger

  • High query rate (> 10,000 QPS per pod)
  • BIND9 pods high CPU/memory utilization
  • Monitoring alert: “DNS response time elevated”
  • Users report: “DNS slow or timing out”

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+15 min)

Step 1.1: Confirm DDoS Attack

# Check BIND9 query rate
kubectl exec -n dns-system <bind9-pod> -- rndc status | grep "queries resulted"

# Check pod resource utilization
kubectl top pods -n dns-system -l app.kubernetes.io/name=bind9

# Analyze query patterns
kubectl exec -n dns-system <bind9-pod> -- rndc dumpdb -zones
kubectl exec -n dns-system <bind9-pod> -- cat /var/cache/bind/named_dump.db | head -100

Step 1.2: Identify Attack Type

  • Volumetric attack: Millions of queries from many IPs (botnet)
  • Amplification attack: Abusing AXFR or ANY queries
  • NXDOMAIN attack: Flood of queries for non-existent domains

Phase 2: Containment (T+15 min to T+1 hour)

Step 2.1: Enable Rate Limiting (BIND9)

# Update BIND9 configuration
kubectl edit cm -n dns-system bind9-config

# Add rate-limit directive:
# named.conf:
rate-limit {
    responses-per-second 10;
    nxdomains-per-second 5;
    errors-per-second 5;
    window 10;
};

# Restart BIND9 to apply config
kubectl rollout restart statefulset/bind9-primary -n dns-system

Step 2.2: Scale Up BIND9 Pods

# Horizontal scaling
kubectl scale statefulset bind9-secondary -n dns-system --replicas=5

# Vertical scaling (if needed)
kubectl patch statefulset bind9-primary -n dns-system -p '
spec:
  template:
    spec:
      containers:
      - name: bind9
        resources:
          requests:
            cpu: "1000m"
            memory: "1Gi"
          limits:
            cpu: "2000m"
            memory: "2Gi"
'

Step 2.3: Block Malicious IPs (if identifiable)

# If attack comes from small number of IPs, block at firewall/LoadBalancer
# Example: AWS Network ACL, GCP Cloud Armor

# Add NetworkPolicy to block specific CIDRs
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: block-attacker-ips
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bind9
  policyTypes:
  - Ingress
  ingress:
  - from:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 192.0.2.0/24  # Attacker CIDR
        - 198.51.100.0/24  # Attacker CIDR
EOF

Phase 3: Eradication (T+1 hour to T+4 hours)

Step 3.1: Engage DDoS Protection Service

# If volumetric attack (> 10 Gbps), edge DDoS protection required
# Options:
# - CloudFlare DNS (proxy DNS through CloudFlare)
# - AWS Shield Advanced
# - Google Cloud Armor

# Migrate DNS to CloudFlare (example):
# 1. Add zone to CloudFlare
# 2. Update NS records at domain registrar
# 3. Configure CloudFlare → Origin (BIND9 backend)

Step 3.2: Implement Response Rate Limiting (RRL)

# BIND9 RRL configuration (more aggressive)
rate-limit {
    responses-per-second 5;
    nxdomains-per-second 2;
    referrals-per-second 5;
    nodata-per-second 5;
    errors-per-second 2;
    window 5;
    log-only no;  # Actually drop packets (not just log)
    slip 2;  # Send truncated response every 2nd rate-limited query
    max-table-size 20000;
};

Phase 4: Recovery (T+4 hours to T+24 hours)

Step 4.1: Monitor Service Health

# Check query rate stabilized
kubectl exec -n dns-system <bind9-pod> -- rndc status

# Check pod resource utilization
kubectl top pods -n dns-system

# Test DNS resolution
dig @<bind9-ip> example.com

# Expected: Normal response times (< 50ms)

Step 4.2: Scale Down (if attack subsided)

# Return to normal replica count
kubectl scale statefulset bind9-secondary -n dns-system --replicas=2

Phase 5: Post-Incident (T+24 hours to T+1 week)

Step 5.1: Implement Permanent DDoS Protection

  • Edge DDoS protection: CloudFlare, AWS Shield, Google Cloud Armor
  • Anycast DNS: Distribute load across multiple geographic locations
  • Autoscaling: HPA based on query rate, CPU, memory

Step 5.2: Improve Monitoring

# Add Prometheus metrics for query rate
# Add alerts:
# - Query rate > 5000 QPS per pod
# - NXDOMAIN rate > 50%
# - Response time > 100ms (p95)

Step 5.3: Document Attack Details

  • Attack duration: ____ hours
  • Peak query rate: ____ QPS
  • Attack type: Volumetric / Amplification / NXDOMAIN
  • Attack sources: IP ranges, ASNs, geolocation
  • Mitigation effectiveness: RRL / Scaling / Edge protection

Success Criteria

  • ✅ DNS service restored within 1 hour
  • ✅ Query rate normalized (< 1000 QPS per pod)
  • ✅ Response times < 50ms (p95)
  • ✅ Permanent DDoS protection implemented (CloudFlare, etc.)
  • ✅ Autoscaling and monitoring in place

P7: Supply Chain Compromise

Severity: 🔴 CRITICAL Response Time: Immediate (< 15 minutes) Impact: Malicious code in controller, backdoor access, data exfiltration

Trigger

  • Malicious commit detected in Git history
  • Dependency vulnerability with active exploit (supply chain attack)
  • Image signature verification fails
  • SBOM shows unexpected dependency or binary

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+30 min)

Step 1.1: Identify Compromised Component

# Check Git commit signatures
git log --show-signature | grep "BAD signature"

# Check image provenance
docker buildx imagetools inspect ghcr.io/firestoned/bindy:latest --format '{{ json .Provenance }}'

# Expected: Valid signature from GitHub Actions

# Check SBOM for unexpected dependencies
# Download SBOM from GitHub release artifacts
curl -L https://github.com/firestoned/bindy/releases/download/v1.0.0/sbom.json | jq '.components[].name'

# Expected: Only known dependencies from Cargo.toml

Step 1.2: Assess Impact

# Check if compromised version deployed to production
kubectl get deploy -n dns-system bindy -o jsonpath='{.spec.template.spec.containers[0].image}'

# If compromised image is running → **CRITICAL** (proceed to containment)
# If compromised image NOT deployed → **HIGH** (patch and prevent deployment)

Phase 2: Containment (T+30 min to T+2 hours)

Step 2.1: Isolate Compromised Controller

# Scale down compromised controller
kubectl scale deploy -n dns-system bindy --replicas=0

# Apply network policy to block egress (prevent exfiltration)
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bindy-quarantine
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bindy
  policyTypes:
  - Egress
  egress: []
EOF

Step 2.2: Preserve Evidence

# Save pod logs
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --all-containers > /tmp/forensics/controller-logs.txt

# Save compromised image for analysis
docker pull ghcr.io/firestoned/bindy:compromised-tag
docker save ghcr.io/firestoned/bindy:compromised-tag > /tmp/forensics/compromised-image.tar

# Scan for malware
trivy image ghcr.io/firestoned/bindy:compromised-tag --scanners vuln,secret,misconfig

Step 2.3: Rotate All Credentials

# Rotate RNDC keys
# See P4: RNDC Key Compromise

# Rotate ServiceAccount tokens (if controller potentially stole them)
kubectl delete secret -n dns-system $(kubectl get secrets -n dns-system | grep bindy-token | awk '{print $1}')
kubectl rollout restart deploy/bindy -n dns-system  # Will generate new token

Phase 3: Eradication (T+2 hours to T+8 hours)

Step 3.1: Root Cause Analysis

# Identify how malicious code was introduced:
# - Compromised developer account?
# - Malicious dependency in Cargo.toml?
# - Compromised CI/CD pipeline?
# - Insider threat?

# Check Git history for unauthorized commits
git log --all --show-signature

# Check CI/CD logs for anomalies
# GitHub Actions → Workflow runs → Check for unusual activity

# Check dependency sources
cargo tree | grep -v "crates.io"
# Expected: All dependencies from crates.io (no git dependencies)

Step 3.2: Clean Git History (if malicious commit)

# Identify malicious commit
git log --all --oneline | grep "suspicious"

# Revert malicious commit
git revert <malicious-commit-sha>

# Force push (if malicious code not yet merged to main)
git push --force origin feature-branch

# If malicious code merged to main → Contact GitHub Security
# Request help with incident response and forensics

Step 3.3: Rebuild from Clean Source

# Checkout known good commit (before compromise)
git checkout <last-known-good-commit>

# Rebuild binaries
cargo build --release

# Rebuild container image
docker build -t ghcr.io/firestoned/bindy:clean-$(date +%s) .

# Scan for vulnerabilities
cargo audit
trivy image ghcr.io/firestoned/bindy:clean-$(date +%s)

# Expected: All clean

# Push to registry
docker push ghcr.io/firestoned/bindy:clean-$(date +%s)

Phase 4: Recovery (T+8 hours to T+24 hours)

Step 4.1: Deploy Clean Controller

# Update deployment manifest
kubectl set image deploy/bindy -n dns-system \
  bindy=ghcr.io/firestoned/bindy:clean-$(date +%s)

# Remove quarantine network policy
kubectl delete networkpolicy bindy-quarantine -n dns-system

# Verify health
kubectl get pods -n dns-system -l app.kubernetes.io/name=bindy
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --tail=100

Step 4.2: Verify Service Integrity

# Test DNS resolution
dig @<bind9-ip> example.com

# Verify all zones correct
kubectl get dnszones --all-namespaces -o yaml | diff - /path/to/gitops/dnszones/

# Expected: No drift

Phase 5: Post-Incident (T+24 hours to T+1 week)

Step 5.1: Implement Supply Chain Security

# Enable Dependabot security updates
# .github/dependabot.yml:
version: 2
updates:
  - package-ecosystem: "cargo"
    directory: "/"
    schedule:
      interval: "daily"
    open-pull-requests-limit: 10

# Pin dependencies by hash (Cargo.lock already does this)
# Verify Cargo.lock is committed to Git

# Implement image signing verification
# Add admission controller (Kyverno, OPA Gatekeeper) to verify image signatures before deployment

Step 5.2: Implement Code Review Enhancements

# Require 2+ reviewers for all PRs (already implemented)
# Add CODEOWNERS for sensitive files:
# .github/CODEOWNERS:
/Cargo.toml @security-team
/Cargo.lock @security-team
/Dockerfile @security-team
/.github/workflows/ @security-team

Step 5.3: Notify Stakeholders

  • Users: Email notification about supply chain incident
  • Regulators: Report to SOX/PCI-DSS auditors (security incident)
  • GitHub Security: Report compromised dependency or account

Step 5.4: Update Documentation

  • Document supply chain incident in threat model
  • Update supply chain security controls in SECURITY.md
  • Add supply chain attack scenarios to threat model

Success Criteria

  • ✅ Compromised component identified within 30 minutes
  • ✅ Malicious code removed from Git history
  • ✅ Clean controller deployed within 24 hours
  • ✅ All credentials rotated
  • ✅ Supply chain security improvements implemented
  • ✅ Stakeholders notified and incident documented

Post-Incident Activities

Post-Incident Review (PIR) Template

Incident ID: INC-YYYY-MM-DD-XXXX Severity: 🔴 / 🟠 / 🟡 / 🔵 Incident Commander: [Name] Date: [YYYY-MM-DD] Duration: [Detection to resolution]

Summary

[1-2 paragraph summary of incident]

Timeline

TimeEventAction Taken
T+0[Detection event][Action]
T+15min[Analysis][Action]
T+1hr[Containment][Action]
T+4hr[Eradication][Action]
T+24hr[Recovery][Action]

Root Cause

[Detailed root cause analysis]

What Went Well ✅

  • [Detection was fast]
  • [Playbook was clear]
  • [Team communication was effective]

What Could Improve ❌

  • [Monitoring gaps]
  • [Playbook outdated]
  • [Slow escalation]

Action Items

ActionOwnerDue DateStatus
[Implement network policies]Platform Team2025-01-15🔄 In Progress
[Add monitoring alerts]SRE Team2025-01-10✅ Complete
[Update playbook]Security Team2025-01-05✅ Complete

Metrics

  • MTTD (Mean Time To Detect): [X] minutes
  • MTTR (Mean Time To Remediate): [X] hours
  • SLA Met: ✅ Yes / ❌ No
  • Downtime: [X] minutes
  • Customers Impacted: [N]

References


Last Updated: 2025-12-17 Next Review: 2025-03-17 (Quarterly) Approved By: Security Team

Vulnerability Management Policy

Version: 1.0 Last Updated: 2025-12-17 Owner: Security Team Compliance: PCI-DSS 6.2, SOX 404, Basel III Cyber Risk


Table of Contents


Overview

This document defines the vulnerability management policy for the Bindy DNS Controller project. The policy ensures that security vulnerabilities in dependencies, container images, and source code are identified, tracked, and remediated in a timely manner to maintain compliance with PCI-DSS, SOX, and Basel III requirements.

Objectives

  1. Identify vulnerabilities in all software components before deployment
  2. Remediate vulnerabilities within defined SLAs based on severity
  3. Track and report vulnerability metrics for compliance audits
  4. Prevent deployment of code with CRITICAL/HIGH vulnerabilities
  5. Maintain audit trail of vulnerability management activities

Scope

This policy applies to:

  • Rust dependencies (direct and transitive) listed in Cargo.lock
  • Container base images (Debian, Alpine, etc.)
  • Container runtime dependencies (libraries, binaries)
  • Development dependencies used in CI/CD pipelines
  • Third-party libraries and tools

Out of Scope

  • Kubernetes cluster vulnerabilities (managed by platform team)
  • Infrastructure vulnerabilities (managed by operations team)
  • Application logic vulnerabilities (covered by code review process)

Vulnerability Severity Levels

Vulnerabilities are classified using the Common Vulnerability Scoring System (CVSS v3) and mapped to severity levels:

🔴 CRITICAL (CVSS 9.0-10.0)

Definition: Vulnerabilities that can be exploited remotely without authentication and lead to:

  • Remote code execution (RCE)
  • Complete system compromise
  • Data exfiltration of sensitive information
  • Denial of service affecting multiple systems

Examples:

  • Unauthenticated RCE in web server
  • SQL injection with admin access
  • Memory corruption leading to arbitrary code execution

SLA: 24 hours


🟠 HIGH (CVSS 7.0-8.9)

Definition: Vulnerabilities that can be exploited with limited user interaction or authentication and lead to:

  • Privilege escalation
  • Unauthorized data access
  • Significant denial of service
  • Bypass of authentication/authorization controls

Examples:

  • Authenticated RCE
  • Cross-site scripting (XSS) with session hijacking
  • Path traversal allowing file read/write
  • Insecure deserialization

SLA: 7 days


🟡 MEDIUM (CVSS 4.0-6.9)

Definition: Vulnerabilities that require significant user interaction or specific conditions and lead to:

  • Limited information disclosure
  • Localized denial of service
  • Minor authorization bypass
  • Reduced system functionality

Examples:

  • Information disclosure (non-sensitive data)
  • CSRF with limited impact
  • Reflected XSS
  • Resource exhaustion (single process)

SLA: 30 days


🔵 LOW (CVSS 0.1-3.9)

Definition: Vulnerabilities with minimal impact that require significant preconditions:

  • Cosmetic issues
  • Minor information disclosure
  • Difficult-to-exploit conditions
  • No direct security impact

Examples:

  • Version disclosure
  • Clickjacking on non-critical pages
  • Minor configuration issues

SLA: 90 days or next release


Remediation SLAs

SeverityCVSS ScoreDetection to FixApproval to DeployExceptions
🔴 CRITICAL9.0-10.024 hours4 hoursCISO approval required
🟠 HIGH7.0-8.97 days1 business daySecurity lead approval
🟡 MEDIUM4.0-6.930 daysNext sprintTeam lead approval
🔵 LOW0.1-3.990 daysNext releaseAuto-approved

SLA Clock

  • Starts: When vulnerability is first detected by automated scan
  • Pauses: When risk acceptance or exception is granted
  • Stops: When patch is deployed to production OR exception is approved

SLA Escalation

If SLA is at risk of being missed:

  • T-50%: Notification to team lead
  • T-80%: Notification to security team
  • T-100%: Escalation to CISO and incident response team

Scanning Process

Automated Scanning

1. Continuous Integration (CI) Scanning

Frequency: Every PR and commit to main branch

Tools:

  • cargo audit for Rust dependencies
  • Trivy for container images

Process:

  1. PR is opened or updated
  2. CI workflow runs security scans
  3. If CRITICAL/HIGH vulnerabilities found:
    • CI fails
    • PR is blocked from merging
    • GitHub issue is created automatically
  4. Developer must remediate before merge

Workflow: .github/workflows/pr.yaml

2. Scheduled Scanning

Frequency: Daily at 00:00 UTC

Tools:

  • cargo audit for dependencies
  • Trivy for published container images

Process:

  1. Scan runs automatically via GitHub Actions
  2. Results are uploaded to GitHub Security tab
  3. If vulnerabilities found:
    • GitHub issue is created with details
    • Security team is notified
  4. Vulnerabilities are tracked until remediation

Workflow: .github/workflows/security-scan.yaml

3. Release Scanning

Frequency: Every release tag

Tools:

  • cargo audit for final dependency snapshot
  • Trivy for release container image

Process:

  1. Release is tagged
  2. Security scans run before deployment
  3. If CRITICAL/HIGH vulnerabilities found:
    • Release fails
    • Issue is created for emergency fix
  4. Release proceeds only if all scans pass

Workflow: .github/workflows/release.yaml

Manual Scanning

Developers should run scans locally before committing:

# Scan Rust dependencies
cargo audit

# Scan container image
trivy image ghcr.io/firestoned/bindy:latest

Remediation Process

Step 1: Triage (Within 4 hours for CRITICAL, 24 hours for HIGH)

  1. Verify vulnerability applies to Bindy:

    • Check if vulnerable code path is used
    • Verify affected version matches
    • Assess exploitability in Bindy’s context
  2. Assess impact:

    • What data/systems are at risk?
    • What is the attack vector?
    • Is there a known exploit?
  3. Determine remediation approach:

    • Update dependency to patched version
    • Apply workaround/mitigation
    • Accept risk (if low impact)

Step 2: Remediation (Within SLA)

Option A: Update Dependency

# Update single dependency
cargo update -p <package-name>

# Verify fix
cargo audit

# Test
cargo test

Option B: Upgrade Major Version

# Update Cargo.toml
vim Cargo.toml  # Change version constraint

# Update lockfile
cargo update

# Test for breaking changes
cargo test

Option C: Apply Workaround

If no patch is available:

  1. Disable vulnerable feature flag
  2. Implement input validation
  3. Add runtime checks
  4. Document in SECURITY.md

Option D: Request Exception (See Exception Process)

Step 3: Verification

  1. Run cargo audit to confirm vulnerability is resolved
  2. Run cargo test to ensure no regressions
  3. Run integration tests
  4. Document fix in PR description

Step 4: Deployment

  1. Create PR with fix
  2. PR passes all CI checks (including security scans)
  3. Code review and approval
  4. Merge to main
  5. Deploy to production
  6. Close GitHub issue

Step 5: Post-Deployment

  1. Verify vulnerability is resolved in production
  2. Update metrics dashboard
  3. Document lessons learned
  4. Update runbooks if needed

Exception Process

When to Request an Exception

  • No patch available and vulnerability has low exploitability
  • Patch introduces breaking changes requiring extended migration
  • Vulnerability does not apply to Bindy’s use case
  • Compensating controls mitigate the risk

Exception Request Process

  1. Create exception request (GitHub issue or security ticket):

    • Vulnerability ID (CVE, RUSTSEC-ID)
    • Severity and CVSS score
    • Justification for exception
    • Compensating controls
    • Expiration date (max 90 days)
  2. Approval required:

    • CRITICAL: CISO approval
    • HIGH: Security lead approval
    • MEDIUM: Team lead approval
    • LOW: Auto-approved
  3. Document in SECURITY.md:

    ## Known Vulnerabilities (Risk Accepted)
    
    ### CVE-2024-XXXXX - <Package Name>
    - **Severity:** HIGH
    - **Affected Version:** 1.2.3
    - **Status:** Risk Accepted
    - **Justification:** Vulnerability requires local file system access, which is not available in Kubernetes pod security context.
    - **Compensating Controls:** Pod security policy enforces readOnlyRootFilesystem=true
    - **Expiration:** 2025-03-01
    - **Approved By:** Jane Doe (Security Lead)
    - **Date:** 2025-01-15
    
  4. Review exceptions monthly:

    • Check if patch is now available
    • Verify compensating controls are still effective
    • Renew or remediate before expiration

Reporting and Metrics

Weekly Report

Recipients: Development team, Security team

Contents:

  • New vulnerabilities detected
  • Vulnerabilities remediated
  • Open vulnerabilities by severity
  • SLA compliance percentage
  • Aging vulnerabilities (open >30 days)

Source: GitHub Security tab + automated report workflow

Monthly Report

Recipients: Management, Compliance team

Contents:

  • Vulnerability trends (month-over-month)
  • Mean time to remediate (MTTR) by severity
  • SLA compliance rate
  • Exception requests and approvals
  • Top 5 vulnerable dependencies
  • Compliance attestation

Source: Security metrics dashboard

Quarterly Report

Recipients: Executive team, Audit team

Contents:

  • Vulnerability management effectiveness
  • Policy compliance audit results
  • Risk acceptance report
  • Remediation process improvements
  • Compliance attestation (PCI-DSS, SOX, Basel III)

Source: Compliance reporting system

Key Metrics

  1. Mean Time to Detect (MTTD): Time from CVE disclosure to detection in Bindy

    • Target: <24 hours
  2. Mean Time to Remediate (MTTR):

    • CRITICAL: <24 hours
    • HIGH: <7 days
    • MEDIUM: <30 days
  3. SLA Compliance Rate: Percentage of vulnerabilities remediated within SLA

    • Target: >95%
  4. Vulnerability Backlog: Open vulnerabilities by severity

    • Target: Zero CRITICAL, <5 HIGH
  5. Scan Coverage: Percentage of releases scanned

    • Target: 100%

Roles and Responsibilities

Development Team

  • Run local security scans before committing
  • Remediate vulnerabilities assigned to them
  • Create PRs with security fixes
  • Test fixes for regressions
  • Document security changes in CHANGELOG

Security Team

  • Monitor daily scan results
  • Triage and assign vulnerabilities
  • Approve risk exceptions
  • Conduct weekly vulnerability reviews
  • Maintain this policy document
  • Report metrics to management

DevOps/SRE Team

  • Maintain CI/CD scanning infrastructure
  • Deploy security patches to production
  • Monitor for new container base image vulnerabilities
  • Coordinate emergency patching

Compliance Team

  • Review quarterly vulnerability reports
  • Validate SLA compliance for audits
  • Maintain audit trail documentation
  • Coordinate with external auditors

Compliance Requirements

PCI-DSS 6.2

Requirement: Protect all system components from known vulnerabilities by installing applicable security patches/updates.

Implementation:

  • Automated vulnerability scanning (cargo audit + Trivy)
  • Patch within SLA (CRITICAL: 24h, HIGH: 7d)
  • Audit trail of remediation activities
  • Quarterly vulnerability reports

Evidence:

  • GitHub Actions scan logs
  • Security dashboard showing zero CRITICAL vulnerabilities
  • CHANGELOG entries documenting patches
  • Exception approval records

SOX 404 - IT General Controls

Requirement: IT systems must have controls to identify and remediate security vulnerabilities.

Implementation:

  • Documented vulnerability management policy (this document)
  • Automated scanning in CI/CD pipeline
  • SLA-based remediation tracking
  • Monthly compliance reports

Evidence:

  • This policy document
  • CI/CD workflow configurations
  • GitHub issues tracking remediation
  • Monthly vulnerability management reports

Basel III - Operational/Cyber Risk

Requirement: Banks must manage cyber risk through preventive controls.

Implementation:

  • Preventive control: Block deployment of vulnerable code (CI gate)
  • Detective control: Daily scheduled scans
  • Corrective control: SLA-based remediation process
  • Risk acceptance: Exception process with approvals

Evidence:

  • Failed CI builds due to vulnerabilities
  • Scheduled scan results
  • Remediation SLA metrics
  • Exception approval documentation

References


Policy Review

This policy is reviewed and updated:

  • Quarterly: By security team
  • Annually: By compliance team
  • Ad-hoc: When compliance requirements change

Last Review: 2025-12-17 Next Review: 2025-03-17 Approved By: Security Team

Build Reproducibility Verification

Status: ✅ Implemented Compliance: SLSA Level 3, SOX 404 (Supply Chain), PCI-DSS 6.4.6 (Code Review) Last Updated: 2025-12-18 Owner: Security Team


Table of Contents

  1. Overview
  2. SLSA Level 3 Requirements
  3. Build Reproducibility Verification
  4. Sources of Non-Determinism
  5. Verification Process
  6. Container Image Reproducibility
  7. Continuous Verification
  8. Troubleshooting

Overview

Build reproducibility (also called “deterministic builds” or “reproducible builds”) means that building the same source code twice produces bit-for-bit identical binaries. This is critical for:

  • Supply Chain Security: Verify released binaries match source code (detect tampering)
  • SLSA Level 3 Compliance: Required for software supply chain integrity
  • SOX 404 Compliance: Ensures change management controls are effective
  • Incident Response: Verify binaries in production match known-good builds

Why Reproducibility Matters

Attack Scenario (Without Reproducibility):

  1. Attacker compromises CI/CD pipeline or build server
  2. Injects malicious code during build process (e.g., backdoor in binary)
  3. Source code in Git is clean, but distributed binary contains malware
  4. Users cannot verify if binary matches source code

Defense (With Reproducibility):

  1. Independent party rebuilds from source code
  2. Compares hash of rebuilt binary with released binary
  3. If hashes match → binary is authentic ✅
  4. If hashes differ → binary was tampered with 🚨

Current Status

Bindy’s build process is mostly reproducible with the following exceptions:

Build ArtifactReproducible?Status
Rust binary (target/release/bindy)✅ YESDeterministic with Cargo.lock pinned
Container image (Chainguard)⚠️ PARTIALBase image updates break reproducibility
Container image (Distroless)⚠️ PARTIALBase image updates break reproducibility
CRD YAML files✅ YESGenerated from Rust types (deterministic)
SBOM (Software Bill of Materials)✅ YESGenerated from Cargo.lock (deterministic)

Goal: Achieve 100% reproducibility by pinning base image digests and using reproducible timestamps.


SLSA Level 3 Requirements

SLSA (Supply Chain Levels for Software Artifacts) Level 3 requires:

SLSA RequirementBindy ImplementationStatus
Build provenance✅ Signed commits, SBOM, container attestation✅ Complete
Source integrity✅ GPG/SSH signed commits, branch protection✅ Complete
Build integrity✅ Reproducible builds (this document)✅ Complete
Hermetic builds⚠️ Docker builds use network (cargo fetch)⚠️ Partial
Build as code✅ Dockerfile and Makefile in version control✅ Complete
Verification✅ Automated reproducibility checks in CI✅ Complete

SLSA Level 3 Build Requirements

  1. Reproducible: Same source + same toolchain = same binary
  2. Hermetic: Build process has no network access (all deps pre-fetched)
  3. Isolated: Build cannot access secrets or external state
  4. Auditable: Build process fully documented and verifiable

Bindy’s Approach:

  • ✅ Reproducible: Cargo.lock pins all dependencies, Dockerfile uses pinned base images
  • ⚠️ Hermetic: Docker build uses network (acceptable for SLSA Level 2, working toward Level 3)
  • ✅ Isolated: CI/CD builds in ephemeral containers, no persistent state
  • ✅ Auditable: Build process in Makefile, Dockerfile, and GitHub Actions workflows

Build Reproducibility Verification

Prerequisites

To verify build reproducibility, you need:

  1. Same source code: Exact commit hash (e.g., git checkout v0.1.0)
  2. Same toolchain: Same Rust version (e.g., rustc 1.91.0)
  3. Same dependencies: Same Cargo.lock (committed to Git)
  4. Same build flags: Same optimization level, target triple, features

Step 1: Rebuild from Source

# Clone the repository
git clone https://github.com/firestoned/bindy.git
cd bindy

# Check out the exact release tag
git checkout v0.1.0

# Verify commit signature
git verify-commit v0.1.0

# Verify toolchain version matches release
rustc --version
# Expected: rustc 1.91.0 (stable 2024-10-17)

# Build release binary
cargo build --release --locked

# Calculate SHA-256 hash of binary
sha256sum target/release/bindy

Example Output:

abc123def456789... target/release/bindy

Step 2: Compare with Released Binary

# Download released binary from GitHub Releases
curl -LO https://github.com/firestoned/bindy/releases/download/v0.1.0/bindy-linux-amd64

# Calculate SHA-256 hash of released binary
sha256sum bindy-linux-amd64

Expected Output:

abc123def456789... bindy-linux-amd64

Verification:

  • PASS - Hashes match → Binary is authentic and reproducible
  • 🚨 FAIL - Hashes differ → Binary may be tampered or build is non-deterministic

Step 3: Investigate Hash Mismatch

If hashes differ, check the following:

# 1. Verify Rust toolchain version
rustc --version
cargo --version

# 2. Verify Cargo.lock is identical
git diff v0.1.0 -- Cargo.lock

# 3. Verify build flags
cargo build --release --locked --verbose | grep "Running.*rustc"

# 4. Check for timestamp differences
objdump -s -j .comment target/release/bindy

Common Causes of Non-Determinism:

  1. Different Rust toolchain version
  2. Modified Cargo.lock (dependency version mismatch)
  3. Different build flags or features
  4. Embedded timestamps in binary (see Sources of Non-Determinism)

Sources of Non-Determinism

1. Timestamps

Problem: Build timestamps embedded in binaries make them non-reproducible.

Sources in Rust:

  • env!("CARGO_PKG_VERSION") → OK (from Cargo.toml, deterministic)
  • env!("BUILD_DATE") → ❌ NON-DETERMINISTIC (changes every build)
  • File modification times (mtime) → ❌ NON-DETERMINISTIC

Fix:

#![allow(unused)]
fn main() {
// ❌ BAD - Embeds build timestamp
const BUILD_DATE: &str = env!("BUILD_DATE");

// ✅ GOOD - Use Git commit timestamp (deterministic)
const BUILD_DATE: &str = env!("VERGEN_GIT_COMMIT_TIMESTAMP");
}

Using vergen for Deterministic Build Info:

Add to Cargo.toml:

[build-dependencies]
vergen = { version = "8", features = ["git", "gitcl"] }

Create build.rs:

use vergen::EmitBuilder;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    EmitBuilder::builder()
        .git_commit_timestamp()  // Use Git commit timestamp (deterministic)
        .git_sha(false)          // Short Git SHA (deterministic)
        .emit()?;
    Ok(())
}

Use in main.rs:

#![allow(unused)]
fn main() {
const BUILD_DATE: &str = env!("VERGEN_GIT_COMMIT_TIMESTAMP");
const GIT_SHA: &str = env!("VERGEN_GIT_SHA");

println!("Bindy {} ({})", env!("CARGO_PKG_VERSION"), GIT_SHA);
println!("Built: {}", BUILD_DATE);
}

Why This Works:

  • Git commit timestamp is fixed for a given commit (never changes)
  • Independent builds of the same commit will use the same timestamp
  • Verifiable by anyone with access to the Git repository

2. Filesystem Order

Problem: Reading files in directory order is non-deterministic (depends on filesystem).

Example:

#![allow(unused)]
fn main() {
// ❌ BAD - Directory order is non-deterministic
for entry in std::fs::read_dir("zones")? {
    let file = entry?.path();
    process_zone(file);
}

// ✅ GOOD - Sort files before processing
let mut files: Vec<_> = std::fs::read_dir("zones")?
    .collect::<Result<_, _>>()?;
files.sort_by_key(|e| e.path());
for entry in files {
    process_zone(entry.path());
}
}

3. HashMap Iteration Order

Problem: Rust HashMap iteration order is randomized for security (hash DoS protection).

Example:

#![allow(unused)]
fn main() {
use std::collections::HashMap;

// ❌ BAD - HashMap iteration order is non-deterministic
let mut zones = HashMap::new();
zones.insert("example.com", "10.0.0.1");
zones.insert("test.com", "10.0.0.2");

for (zone, ip) in &zones {
    println!("{} -> {}", zone, ip);  // Order is random!
}

// ✅ GOOD - Use BTreeMap for deterministic iteration
use std::collections::BTreeMap;

let mut zones = BTreeMap::new();
zones.insert("example.com", "10.0.0.1");
zones.insert("test.com", "10.0.0.2");

for (zone, ip) in &zones {
    println!("{} -> {}", zone, ip);  // Sorted order (deterministic)
}
}

When This Matters:

  • Generating configuration files (BIND9 named.conf)
  • Serializing data to JSON/YAML
  • Logging or printing debug output that’s included in build artifacts

4. Parallelism and Race Conditions

Problem: Parallel builds may produce different results if intermediate files are generated in different orders.

Example:

#![allow(unused)]
fn main() {
// ❌ BAD - Parallel iterators may produce non-deterministic output
use rayon::prelude::*;

let output = zones.par_iter()
    .map(|zone| generate_config(zone))
    .collect::<Vec<_>>()
    .join("\n");  // Order depends on which thread finishes first!

// ✅ GOOD - Sort after parallel processing
let mut output = zones.par_iter()
    .map(|zone| generate_config(zone))
    .collect::<Vec<_>>();
output.sort();  // Deterministic order
let output = output.join("\n");
}

5. Base Image Updates (Container Images)

Problem: Docker base images update frequently, breaking reproducibility.

Example:

# ❌ BAD - Uses latest version (non-reproducible)
FROM cgr.dev/chainguard/static:latest

# ✅ GOOD - Pin to specific digest
FROM cgr.dev/chainguard/static:latest@sha256:abc123def456...

How to Pin Base Image Digest:

# Get current digest
docker pull cgr.dev/chainguard/static:latest
docker inspect cgr.dev/chainguard/static:latest | jq -r '.[0].RepoDigests[0]'
# Output: cgr.dev/chainguard/static:latest@sha256:abc123def456...

# Update Dockerfile
sed -i 's|cgr.dev/chainguard/static:latest|cgr.dev/chainguard/static:latest@sha256:abc123def456...|' docker/Dockerfile.chainguard

Trade-Off:

  • Pro: Reproducible builds (same base image every time)
  • ⚠️ Con: No automatic security updates (must manually update digest)

Recommended Approach:

  • Pin digest for releases (v0.1.0, v0.2.0, etc.) → Reproducibility
  • Use latest for development builds → Automatic security updates
  • Update base image digest monthly or after CVE disclosures

Verification Process

Automated Verification (CI/CD)

Goal: Rebuild every release and verify the binary hash matches the released artifact.

GitHub Actions Workflow:

# .github/workflows/verify-reproducibility.yaml
name: Verify Build Reproducibility

on:
  release:
    types: [published]
  workflow_dispatch:
    inputs:
      tag:
        description: 'Git tag to verify (e.g., v0.1.0)'
        required: true

jobs:
  verify-reproducibility:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout source code
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.inputs.tag || github.event.release.tag_name }}

      - name: Install Rust toolchain
        uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: 1.91.0  # Match release toolchain

      - name: Rebuild binary
        run: cargo build --release --locked

      - name: Calculate hash of rebuilt binary
        id: rebuilt-hash
        run: |
          HASH=$(sha256sum target/release/bindy | awk '{print $1}')
          echo "hash=$HASH" >> $GITHUB_OUTPUT
          echo "Rebuilt binary hash: $HASH"

      - name: Download released binary
        run: |
          TAG=${{ github.event.inputs.tag || github.event.release.tag_name }}
          curl -LO https://github.com/firestoned/bindy/releases/download/$TAG/bindy-linux-amd64

      - name: Calculate hash of released binary
        id: released-hash
        run: |
          HASH=$(sha256sum bindy-linux-amd64 | awk '{print $1}')
          echo "hash=$HASH" >> $GITHUB_OUTPUT
          echo "Released binary hash: $HASH"

      - name: Compare hashes
        run: |
          REBUILT="${{ steps.rebuilt-hash.outputs.hash }}"
          RELEASED="${{ steps.released-hash.outputs.hash }}"

          if [ "$REBUILT" == "$RELEASED" ]; then
            echo "✅ PASS: Hashes match - Build is reproducible"
            exit 0
          else
            echo "🚨 FAIL: Hashes differ - Build is NOT reproducible"
            echo "Rebuilt:  $REBUILT"
            echo "Released: $RELEASED"
            exit 1
          fi

      - name: Upload verification report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: reproducibility-report
          path: |
            target/release/bindy
            bindy-linux-amd64

When to Run:

  • Automatically: After every release (GitHub Actions release event)
  • Manually: On-demand for any Git tag (workflow_dispatch)
  • Scheduled: Monthly verification of latest release

Manual Verification (External Auditors)

Goal: Allow external auditors to independently verify builds without access to CI/CD.

Verification Script (scripts/verify-build.sh):

#!/usr/bin/env bash
# Verify build reproducibility for a Bindy release
#
# Usage:
#   ./scripts/verify-build.sh v0.1.0
#
# Requirements:
#   - Git
#   - Rust toolchain (rustc 1.91.0)
#   - curl, sha256sum

set -euo pipefail

TAG="${1:-}"
if [ -z "$TAG" ]; then
  echo "Usage: $0 <git-tag>"
  echo "Example: $0 v0.1.0"
  exit 1
fi

echo "============================================"
echo "Verifying build reproducibility for $TAG"
echo "============================================"

# 1. Check out the source code
echo ""
echo "[1/6] Checking out source code..."
git fetch --tags
git checkout "$TAG"
git verify-commit "$TAG" || {
  echo "⚠️  WARNING: Commit signature verification failed"
}

# 2. Verify Rust toolchain version
echo ""
echo "[2/6] Verifying Rust toolchain..."
EXPECTED_RUSTC="rustc 1.91.0"
ACTUAL_RUSTC=$(rustc --version)
if [[ "$ACTUAL_RUSTC" != "$EXPECTED_RUSTC"* ]]; then
  echo "⚠️  WARNING: Rust version mismatch"
  echo "   Expected: $EXPECTED_RUSTC"
  echo "   Actual:   $ACTUAL_RUSTC"
  echo "   Continuing anyway..."
fi

# 3. Rebuild binary
echo ""
echo "[3/6] Building release binary..."
cargo build --release --locked

# 4. Calculate hash of rebuilt binary
echo ""
echo "[4/6] Calculating hash of rebuilt binary..."
REBUILT_HASH=$(sha256sum target/release/bindy | awk '{print $1}')
echo "   Rebuilt hash: $REBUILT_HASH"

# 5. Download released binary
echo ""
echo "[5/6] Downloading released binary..."
RELEASE_URL="https://github.com/firestoned/bindy/releases/download/$TAG/bindy-linux-amd64"
curl -sL -o bindy-released "$RELEASE_URL"

# 6. Calculate hash of released binary
echo ""
echo "[6/6] Calculating hash of released binary..."
RELEASED_HASH=$(sha256sum bindy-released | awk '{print $1}')
echo "   Released hash: $RELEASED_HASH"

# Compare hashes
echo ""
echo "============================================"
echo "VERIFICATION RESULT"
echo "============================================"
if [ "$REBUILT_HASH" == "$RELEASED_HASH" ]; then
  echo "✅ PASS: Hashes match"
  echo ""
  echo "The released binary is reproducible and matches the source code."
  echo "This confirms the binary was built from the tagged commit without tampering."
  exit 0
else
  echo "🚨 FAIL: Hashes differ"
  echo ""
  echo "Rebuilt:  $REBUILT_HASH"
  echo "Released: $RELEASED_HASH"
  echo ""
  echo "The released binary does NOT match the rebuilt binary."
  echo "Possible causes:"
  echo "  - Different Rust toolchain version"
  echo "  - Non-deterministic build process"
  echo "  - Binary tampering (SECURITY INCIDENT)"
  echo ""
  echo "Next steps:"
  echo "  1. Verify Rust toolchain: rustc --version"
  echo "  2. Check build.rs for timestamps or randomness"
  echo "  3. Contact security@firestoned.io if tampering suspected"
  exit 1
fi

Make executable:

chmod +x scripts/verify-build.sh

Usage:

./scripts/verify-build.sh v0.1.0

Container Image Reproducibility

Challenge: Docker Layers are Non-Deterministic

Docker images are harder to reproduce than binaries because:

  1. Base image updates (even with same tag, digest changes)
  2. File timestamps in layers (mtime)
  3. Layer order affects final hash
  4. Docker build cache affects output

Solution: Use SOURCE_DATE_EPOCH for Reproducible Timestamps

Dockerfile Best Practices:

# docker/Dockerfile.chainguard
# Pin base image digest for reproducibility
ARG BASE_IMAGE_DIGEST=sha256:abc123def456...
FROM cgr.dev/chainguard/static:latest@${BASE_IMAGE_DIGEST}

# Use SOURCE_DATE_EPOCH for reproducible timestamps
ARG SOURCE_DATE_EPOCH
ENV SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH}

# Copy binary (built with same SOURCE_DATE_EPOCH)
COPY --chmod=755 target/release/bindy /usr/local/bin/bindy

USER nonroot:nonroot
ENTRYPOINT ["/usr/local/bin/bindy"]

Build with Reproducible Timestamp:

# Get Git commit timestamp (deterministic)
export SOURCE_DATE_EPOCH=$(git log -1 --format=%ct)

# Build container image
docker build \
  --build-arg SOURCE_DATE_EPOCH=$SOURCE_DATE_EPOCH \
  --build-arg BASE_IMAGE_DIGEST=sha256:abc123def456... \
  -t ghcr.io/firestoned/bindy:v0.1.0 \
  -f docker/Dockerfile.chainguard \
  .

Verify Image Reproducibility:

# Build image twice
docker build ... -t bindy:build1
docker build ... -t bindy:build2

# Compare image digests
docker inspect bindy:build1 | jq -r '.[0].Id'
docker inspect bindy:build2 | jq -r '.[0].Id'

# If digests match → Reproducible ✅
# If digests differ → Non-deterministic 🚨

Multi-Stage Build for Reproducibility

Recommended Pattern:

# Stage 1: Build binary (reproducible)
FROM rust:1.91-alpine AS builder
WORKDIR /build

# Copy dependency manifests
COPY Cargo.toml Cargo.lock ./

# Pre-fetch dependencies (layer cached, reproducible)
RUN cargo fetch --locked

# Copy source code
COPY src/ ./src/
COPY build.rs ./

# Build binary with reproducible timestamp
ARG SOURCE_DATE_EPOCH
ENV SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH}
RUN cargo build --release --locked --offline

# Stage 2: Runtime image (reproducible with pinned base)
ARG BASE_IMAGE_DIGEST=sha256:abc123def456...
FROM cgr.dev/chainguard/static:latest@${BASE_IMAGE_DIGEST}

# Copy binary from builder
COPY --from=builder --chmod=755 /build/target/release/bindy /usr/local/bin/bindy

USER nonroot:nonroot
ENTRYPOINT ["/usr/local/bin/bindy"]

Why This Works:

  • Layer 1 (dependencies): Deterministic (Cargo.lock pinned)
  • Layer 2 (source code): Deterministic (Git commit)
  • Layer 3 (build): Deterministic (SOURCE_DATE_EPOCH)
  • Layer 4 (runtime): Deterministic (pinned base image digest)

Continuous Verification

Daily Verification Checks

Goal: Catch non-determinism regressions early (before releases).

Scheduled GitHub Actions:

# .github/workflows/reproducibility-check.yaml
name: Reproducibility Check

on:
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM UTC
  push:
    branches:
      - main

jobs:
  build-twice:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: 1.91.0

      # Build 1
      - name: Build binary (attempt 1)
        run: cargo build --release --locked

      - name: Calculate hash (attempt 1)
        id: hash1
        run: |
          HASH=$(sha256sum target/release/bindy | awk '{print $1}')
          echo "hash=$HASH" >> $GITHUB_OUTPUT
          mv target/release/bindy bindy-build1

      # Clean build directory
      - name: Clean build artifacts
        run: cargo clean

      # Build 2
      - name: Build binary (attempt 2)
        run: cargo build --release --locked

      - name: Calculate hash (attempt 2)
        id: hash2
        run: |
          HASH=$(sha256sum target/release/bindy | awk '{print $1}')
          echo "hash=$HASH" >> $GITHUB_OUTPUT
          mv target/release/bindy bindy-build2

      # Compare
      - name: Verify reproducibility
        run: |
          HASH1="${{ steps.hash1.outputs.hash }}"
          HASH2="${{ steps.hash2.outputs.hash }}"

          if [ "$HASH1" == "$HASH2" ]; then
            echo "✅ PASS: Builds are reproducible"
            exit 0
          else
            echo "🚨 FAIL: Builds are NOT reproducible"
            echo "Build 1: $HASH1"
            echo "Build 2: $HASH2"

            # Show differences
            objdump -s bindy-build1 > build1.dump
            objdump -s bindy-build2 > build2.dump
            diff -u build1.dump build2.dump || true

            exit 1
          fi

When to Alert:

  • Daily check PASS: No action needed
  • 🚨 Daily check FAIL: Alert security team, investigate non-determinism

Troubleshooting

Build Hash Mismatch Debugging

Step 1: Verify Toolchain

# Check Rust version
rustc --version
cargo --version

# Check installed targets
rustup show

# Check default toolchain
rustup default

Expected:

rustc 1.91.0 (stable 2024-10-17)
cargo 1.91.0

Step 2: Compare Build Metadata

# Extract build metadata from binary
strings target/release/bindy | grep -E "(rustc|cargo|VERGEN)"

# Compare with released binary
strings bindy-released | grep -E "(rustc|cargo|VERGEN)"

Look for:

  • Different Rust version strings
  • Different Git commit SHAs
  • Embedded timestamps

Step 3: Disassemble and Diff

# Disassemble both binaries
objdump -d target/release/bindy > rebuilt.asm
objdump -d bindy-released > released.asm

# Diff assembly code
diff -u rebuilt.asm released.asm | head -n 100

Common Patterns:

  • Timestamp differences in .rodata section
  • Different symbol addresses (ASLR-related, cosmetic)
  • Random padding bytes

Step 4: Check for Timestamps

# Search for ISO 8601 timestamps in binary
strings target/release/bindy | grep -E "[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}"

# Search for Unix timestamps
strings target/release/bindy | grep -E "^[0-9]{10}$"

If found: Update source code to use VERGEN_GIT_COMMIT_TIMESTAMP instead of env!("BUILD_DATE")


Container Image Hash Mismatch

Step 1: Verify Base Image Digest

# Get current base image digest
docker pull cgr.dev/chainguard/static:latest
docker inspect cgr.dev/chainguard/static:latest | jq -r '.[0].RepoDigests[0]'

# Compare with Dockerfile
grep "FROM cgr.dev/chainguard/static" docker/Dockerfile.chainguard

If digests differ: Update Dockerfile to pin correct digest


Step 2: Check Layer Timestamps

# Extract image layers
docker save bindy:v0.1.0 | tar -xv

# Check layer timestamps
tar -tvzf <layer-hash>.tar.gz | head -n 20

Look for:

  • Recent timestamps (should all match SOURCE_DATE_EPOCH)
  • Different file mtimes between builds

Step 3: Rebuild with Verbose Output

# Rebuild with verbose Docker output
docker build --no-cache --progress=plain \
  --build-arg SOURCE_DATE_EPOCH=$(git log -1 --format=%ct) \
  -t bindy:debug \
  -f docker/Dockerfile.chainguard \
  . 2>&1 | tee build.log

# Compare build logs
diff -u build1.log build2.log

References


Last Updated: 2025-12-18 Next Review: 2026-03-18 (Quarterly)

Secret Access Audit Trail

Status: ✅ Implemented Compliance: SOX 404 (Access Controls), PCI-DSS 7.1.2 (Least Privilege), Basel III (Cyber Risk) Last Updated: 2025-12-18 Owner: Security Team


Table of Contents

  1. Overview
  2. Secret Access Monitoring
  3. Audit Policy Configuration
  4. Audit Queries
  5. Alerting Rules
  6. Compliance Requirements
  7. Incident Response

Overview

This document describes Bindy’s secret access audit trail implementation, which provides:

  • Comprehensive Logging: All secret access (get, list, watch) is logged via Kubernetes audit logs
  • Immutable Storage: Audit logs stored in S3 with WORM (Object Lock) for tamper-proof retention
  • Real-Time Alerting: Prometheus/Alertmanager alerts on anomalous secret access patterns
  • Compliance Queries: Pre-built queries for SOX 404, PCI-DSS, and Basel III audit reviews
  • Retention: 7-year retention (SOX 404 requirement) with 90-day active storage (Elasticsearch)

Secrets Covered

Bindy audit logging covers all Kubernetes Secrets in the dns-system namespace:

Secret NamePurposeAccess Pattern
rndc-key-*RNDC authentication keys for BIND9 controlController reads on reconciliation (every 5 minutes)
tls-cert-*TLS certificates for DNS-over-TLS/HTTPSBIND9 pods read on startup
Custom secretsUser-defined secrets for DNS credentialsVaries by use case

Compliance Mapping

FrameworkRequirementHow We Comply
SOX 404IT General Controls - Access ControlAudit logs show who accessed secrets and when (7-year retention)
PCI-DSS 7.1.2Restrict access to privileged user IDsRBAC limits secret access to controller (read-only) + audit trail
PCI-DSS 10.2.1Audit log all access to cardholder dataSecret access logged with user, timestamp, action, outcome
Basel IIICyber Risk - Access MonitoringReal-time alerting on anomalous secret access, quarterly reviews

Secret Access Monitoring

What is Logged

Every secret access operation generates an audit log entry with:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "a4b5c6d7-e8f9-0a1b-2c3d-4e5f6a7b8c9d",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/dns-system/secrets/rndc-key-primary",
  "verb": "get",
  "user": {
    "username": "system:serviceaccount:dns-system:bindy-controller",
    "uid": "abc123",
    "groups": ["system:serviceaccounts", "system:serviceaccounts:dns-system"]
  },
  "sourceIPs": ["10.244.1.15"],
  "userAgent": "bindy/v0.1.0 (linux/amd64) kubernetes/abc123",
  "objectRef": {
    "resource": "secrets",
    "namespace": "dns-system",
    "name": "rndc-key-primary",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "code": 200
  },
  "requestReceivedTimestamp": "2025-12-18T12:34:56.789Z",
  "stageTimestamp": "2025-12-18T12:34:56.790Z"
}

Key Fields for Auditing

FieldDescriptionAudit Use Case
user.usernameServiceAccount or user who accessed the secretWho accessed the secret
sourceIPsPod IP or client IP that made the requestWhere the request came from
objectRef.nameSecret name (e.g., rndc-key-primary)What secret was accessed
verbAction performed (get, list, watch)How the secret was accessed
responseStatus.codeHTTP status code (200 = success, 403 = denied)Outcome of the access attempt
requestReceivedTimestampWhen the request was madeWhen the access occurred
userAgentClient application (e.g., bindy/v0.1.0)Which application accessed the secret

Audit Policy Configuration

Kubernetes Audit Policy

The audit policy is configured in /etc/kubernetes/audit-policy.yaml on the Kubernetes control plane.

Relevant Section for Secret Access (H-3 Requirement):

apiVersion: audit.k8s.io/v1
kind: Policy
metadata:
  name: bindy-secret-access-audit
rules:
  # ============================================================================
  # H-3: Secret Access Audit Trail
  # ============================================================================

  # Log ALL secret access in dns-system namespace (read operations)
  - level: Metadata
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["secrets"]
    namespaces: ["dns-system"]
    omitStages:
      - "RequestReceived"  # Only log after response is sent

  # Log ALL secret modifications (should be DENIED by RBAC, but log anyway)
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: ""
        resources: ["secrets"]
    namespaces: ["dns-system"]
    omitStages:
      - "RequestReceived"

  # Log secret access failures (403 Forbidden)
  # This catches unauthorized access attempts
  - level: Metadata
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
    resources:
      - group: ""
        resources: ["secrets"]
    namespaces: ["dns-system"]
    omitStages:
      - "RequestReceived"

Audit Log Rotation

Audit logs are rotated and forwarded using Fluent Bit:

# /etc/fluent-bit/fluent-bit.conf
[INPUT]
    Name              tail
    Path              /var/log/kubernetes/audit.log
    Parser            json
    Tag               kube.audit
    Refresh_Interval  5
    Mem_Buf_Limit     50MB
    Skip_Long_Lines   On

[FILTER]
    Name    grep
    Match   kube.audit
    Regex   objectRef.resource secrets

[OUTPUT]
    Name                s3
    Match               kube.audit
    bucket              bindy-audit-logs
    region              us-east-1
    store_dir           /var/log/fluent-bit/s3
    total_file_size     100M
    upload_timeout      10m
    use_put_object      On
    s3_key_format       /audit/secrets/%Y/%m/%d/$UUID.json.gz
    compression         gzip

Key Points:

  • Audit logs filtered to only include secret access (objectRef.resource secrets)
  • Uploaded to S3 in /audit/secrets/ prefix for easy querying
  • Compressed with gzip (10:1 compression ratio)
  • WORM protection via S3 Object Lock (see AUDIT_LOG_RETENTION.md)

Audit Queries

Pre-Built Queries for Compliance Reviews

These queries are designed for use in Elasticsearch (Kibana) or direct S3 queries (Athena).

Q1: All Secret Access by ServiceAccount (Last 90 Days)

Use Case: SOX 404 quarterly access review

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.namespace": "dns-system" } },
        { "range": { "requestReceivedTimestamp": { "gte": "now-90d" } } }
      ]
    }
  },
  "aggs": {
    "by_service_account": {
      "terms": {
        "field": "user.username.keyword",
        "size": 50
      },
      "aggs": {
        "by_secret": {
          "terms": {
            "field": "objectRef.name.keyword",
            "size": 20
          },
          "aggs": {
            "access_count": {
              "value_count": {
                "field": "auditID"
              }
            }
          }
        }
      }
    }
  },
  "size": 0
}

Expected Output:

{
  "aggregations": {
    "by_service_account": {
      "buckets": [
        {
          "key": "system:serviceaccount:dns-system:bindy-controller",
          "doc_count": 25920,
          "by_secret": {
            "buckets": [
              {
                "key": "rndc-key-primary",
                "doc_count": 12960,
                "access_count": { "value": 12960 }
              },
              {
                "key": "rndc-key-secondary-1",
                "doc_count": 6480,
                "access_count": { "value": 6480 }
              }
            ]
          }
        }
      ]
    }
  }
}

Interpretation:

  • Controller accessed rndc-key-primary 12,960 times in 90 days
  • Expected: ~144 times/day (reconciliation every 10 minutes = 6 times/hour × 24 hours)
  • 12,960 / 90 days = 144 accesses/day ✅ NORMAL

Q2: Secret Access by Non-Controller ServiceAccounts

Use Case: Detect unauthorized secret access (should be ZERO)

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.namespace": "dns-system" } }
      ],
      "must_not": [
        { "term": { "user.username.keyword": "system:serviceaccount:dns-system:bindy-controller" } }
      ]
    }
  },
  "sort": [
    { "requestReceivedTimestamp": { "order": "desc" } }
  ],
  "size": 100
}

Expected Output: 0 hits (only controller should access secrets)

If non-zero: 🚨 ALERT - Unauthorized secret access detected, trigger incident response (see INCIDENT_RESPONSE.md)


Q3: Failed Secret Access Attempts (403 Forbidden)

Use Case: Detect brute-force attacks or misconfigurations

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.namespace": "dns-system" } },
        { "term": { "responseStatus.code": 403 } }
      ]
    }
  },
  "aggs": {
    "by_user": {
      "terms": {
        "field": "user.username.keyword",
        "size": 50
      },
      "aggs": {
        "by_secret": {
          "terms": {
            "field": "objectRef.name.keyword",
            "size": 20
          }
        }
      }
    }
  },
  "sort": [
    { "requestReceivedTimestamp": { "order": "desc" } }
  ],
  "size": 100
}

Expected Output: Low volume (< 10/day) for misconfigured pods or during upgrades

If high volume (> 100/day): 🚨 ALERT - Potential brute-force attack, investigate source IPs


Q4: Secret Access Outside Business Hours

Use Case: Detect after-hours access (potential insider threat)

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.namespace": "dns-system" } }
      ],
      "should": [
        {
          "range": {
            "requestReceivedTimestamp": {
              "gte": "now/d",
              "lte": "now/d+8h",
              "time_zone": "America/New_York"
            }
          }
        },
        {
          "range": {
            "requestReceivedTimestamp": {
              "gte": "now/d+18h",
              "lte": "now/d+24h",
              "time_zone": "America/New_York"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "aggs": {
    "by_hour": {
      "date_histogram": {
        "field": "requestReceivedTimestamp",
        "calendar_interval": "hour",
        "time_zone": "America/New_York"
      }
    }
  },
  "size": 100
}

Expected Output: Consistent volume (automated reconciliation runs 24/7)

Anomalies:

  • Sudden spike in after-hours access → 🚨 Investigate source IPs and ServiceAccounts
  • Human users accessing secrets after hours → 🚨 Verify with change management records

Q5: Specific Secret Access History (e.g., rndc-key-primary)

Use Case: Compliance audit - “Show me all access to RNDC key in Q4 2025”

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.name.keyword": "rndc-key-primary" } },
        { "term": { "objectRef.namespace": "dns-system" } },
        {
          "range": {
            "requestReceivedTimestamp": {
              "gte": "2025-10-01T00:00:00Z",
              "lte": "2025-12-31T23:59:59Z"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "access_by_day": {
      "date_histogram": {
        "field": "requestReceivedTimestamp",
        "calendar_interval": "day"
      },
      "aggs": {
        "by_service_account": {
          "terms": {
            "field": "user.username.keyword",
            "size": 10
          }
        }
      }
    }
  },
  "sort": [
    { "requestReceivedTimestamp": { "order": "asc" } }
  ],
  "size": 10000
}

Expected Output: Daily access pattern showing controller accessing key ~144 times/day

Export for Auditors:

# Export to CSV for external auditors
curl -X POST "https://elasticsearch:9200/bindy-audit-*/_search?scroll=5m" \
  -H 'Content-Type: application/json' \
  -d @query-q5.json | \
  jq -r '.hits.hits[]._source | [
    .requestReceivedTimestamp,
    .user.username,
    .objectRef.name,
    .verb,
    .responseStatus.code,
    .sourceIPs[0]
  ] | @csv' > secret-access-q4-2025.csv

Alerting Rules

Prometheus Alerting for Secret Access Anomalies

Prerequisites:

  • Prometheus configured to scrape audit logs from Elasticsearch
  • Alertmanager configured for email/Slack/PagerDuty notifications

Alert: Unauthorized Secret Access

# /etc/prometheus/rules/bindy-secret-access.yaml
groups:
  - name: bindy_secret_access
    interval: 1m
    rules:
      # CRITICAL: Non-controller ServiceAccount accessed secrets
      - alert: UnauthorizedSecretAccess
        expr: |
          sum(rate(kubernetes_audit_event_total{
            objectRef_resource="secrets",
            objectRef_namespace="dns-system",
            user_username!~"system:serviceaccount:dns-system:bindy-controller"
          }[5m])) > 0
        for: 1m
        labels:
          severity: critical
          compliance: "SOX-404,PCI-DSS-7.1.2"
        annotations:
          summary: "Unauthorized secret access detected in dns-system namespace"
          description: |
            ServiceAccount {{ $labels.user_username }} accessed secret {{ $labels.objectRef_name }}.
            This violates least privilege RBAC policy (only bindy-controller should access secrets).

            Investigate immediately:
            1. Check source IP: {{ $labels.sourceIP }}
            2. Review audit logs for full context
            3. Verify RBAC policy is applied correctly
            4. Follow incident response: docs/security/INCIDENT_RESPONSE.md#p4
          runbook_url: "https://github.com/firestoned/bindy/blob/main/docs/security/INCIDENT_RESPONSE.md#p4-rndc-key-compromise"

      # HIGH: Excessive secret access (potential compromised controller)
      - alert: ExcessiveSecretAccess
        expr: |
          sum(rate(kubernetes_audit_event_total{
            objectRef_resource="secrets",
            objectRef_namespace="dns-system",
            user_username="system:serviceaccount:dns-system:bindy-controller"
          }[5m])) > 10
        for: 10m
        labels:
          severity: warning
          compliance: "SOX-404"
        annotations:
          summary: "Controller accessing secrets at abnormally high rate"
          description: |
            Bindy controller is accessing secrets at {{ $value }}/sec (expected: ~0.5/sec).
            This may indicate:
            - Reconciliation loop bug (rapid retries)
            - Compromised controller pod
            - Performance issue causing excessive reconciliations

            Actions:
            1. Check controller logs for errors
            2. Verify reconciliation requeue times are correct
            3. Check for BIND9 pod restart loops
          runbook_url: "https://github.com/firestoned/bindy/blob/main/docs/troubleshooting.md"

      # MEDIUM: Failed secret access attempts (brute force detection)
      - alert: FailedSecretAccessAttempts
        expr: |
          sum(rate(kubernetes_audit_event_total{
            objectRef_resource="secrets",
            objectRef_namespace="dns-system",
            responseStatus_code="403"
          }[5m])) > 1
        for: 5m
        labels:
          severity: warning
          compliance: "PCI-DSS-10.2.1"
        annotations:
          summary: "Multiple failed secret access attempts detected"
          description: |
            {{ $value }} failed secret access attempts per second.
            This may indicate:
            - Misconfigured pod trying to access secrets without RBAC
            - Attacker probing for secrets
            - RBAC policy change breaking legitimate access

            Actions:
            1. Review audit logs to identify source ServiceAccount/IP
            2. Verify RBAC policy is correct
            3. Check for recent RBAC changes
          runbook_url: "https://github.com/firestoned/bindy/blob/main/docs/security/SECRET_ACCESS_AUDIT.md#q3-failed-secret-access-attempts-403-forbidden"

Alertmanager Routing

# /etc/alertmanager/config.yaml
route:
  group_by: ['alertname', 'severity']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'security-team'
  routes:
    # CRITICAL alerts go to PagerDuty + Slack
    - match:
        severity: critical
      receiver: 'pagerduty-security'
      continue: true
    - match:
        severity: critical
      receiver: 'slack-security'

receivers:
  - name: 'security-team'
    email_configs:
      - to: 'security@firestoned.io'
        from: 'alertmanager@firestoned.io'
        smarthost: 'smtp.sendgrid.net:587'

  - name: 'pagerduty-security'
    pagerduty_configs:
      - service_key: '<PagerDuty Integration Key>'
        description: '{{ .GroupLabels.alertname }}: {{ .Annotations.summary }}'

  - name: 'slack-security'
    slack_configs:
      - api_url: '<Slack Webhook URL>'
        channel: '#security-alerts'
        title: '🚨 {{ .GroupLabels.alertname }}'
        text: |
          *Severity:* {{ .Labels.severity }}
          *Compliance:* {{ .Labels.compliance }}

          {{ .Annotations.description }}

          *Runbook:* {{ .Annotations.runbook_url }}

Compliance Requirements

SOX 404 - IT General Controls

Control Objective: Ensure only authorized users access sensitive secrets

How We Comply:

SOX 404 RequirementBindy ImplementationEvidence
Access logs for all privileged accounts✅ Kubernetes audit logs capture all secret accessQuery Q1 (quarterly review)
Logs retained for 7 years✅ S3 Glacier with WORM (Object Lock)AUDIT_LOG_RETENTION.md
Quarterly access reviews✅ Run Query Q1, review access patternsScheduled Kibana report
Separation of duties (no single person can access + modify)✅ Controller has read-only access (cannot create/update/delete)RBAC policy verification

Quarterly Review Process:

  1. Week 1 of each quarter (Jan, Apr, Jul, Oct):

    • Security team runs Query Q1 (All Secret Access by ServiceAccount)
    • Export results to CSV for offline review
    • Verify only bindy-controller accessed secrets
  2. Anomaly Investigation:

    • If non-controller access detected → Run Query Q2, follow incident response
    • If excessive access detected → Run Query Q3, check for reconciliation loop bugs
  3. Document Review:

    • Create quarterly access review report (template below)
    • File report in docs/compliance/access-reviews/YYYY-QN.md
    • Retain for 7 years (SOX requirement)

Quarterly Review Report Template:

# Secret Access Review - Q4 2025

**Reviewer:** [Name]
**Date:** 2025-12-31
**Period:** 2025-10-01 to 2025-12-31 (90 days)

## Summary
- **Total secret access events:** 25,920
- **ServiceAccounts with access:** 1 (bindy-controller)
- **Secrets accessed:** 2 (rndc-key-primary, rndc-key-secondary-1)
- **Unauthorized access:** 0 ✅
- **Failed access attempts:** 12 (misconfigured test pod)

## Findings
- ✅ **PASS** - Only authorized ServiceAccount (bindy-controller) accessed secrets
- ✅ **PASS** - Access frequency matches expected reconciliation rate (~144/day)
- ⚠️ **MINOR** - 12 failed attempts from test pod (fixed on 2025-11-15)

## Actions
- None required - all access authorized and expected

## Approval
- **Reviewed by:** [Security Manager]
- **Approved by:** [CISO]
- **Date:** 2025-12-31

PCI-DSS 7.1.2 - Restrict Access to Privileged User IDs

Requirement: Limit access to system components and cardholder data to only those individuals whose job requires such access.

How We Comply:

PCI-DSS RequirementBindy ImplementationEvidence
Least privilege access✅ Only bindy-controller ServiceAccount can read secretsRBAC policy (deploy/rbac/)
No modify/delete permissions✅ Controller CANNOT create/update/patch/delete secretsRBAC policy verification script
Audit trail for all access✅ Kubernetes audit logs capture all secret accessQuery Q1, Q5
Regular access reviews✅ Quarterly reviews using pre-built queriesQuarterly review reports

Annual PCI-DSS Audit Evidence:

Provide auditors with:

  1. RBAC Policy: deploy/rbac/clusterrole.yaml (shows read-only secret access)
  2. RBAC Verification: deploy/rbac/verify-rbac.sh output (proves no modify permissions)
  3. Audit Logs: Query Q5 results for last 365 days (shows all access)
  4. Quarterly Reviews: 4 quarterly review reports (proves regular monitoring)

PCI-DSS 10.2.1 - Audit Logs for Access to Cardholder Data

Requirement: Implement automated audit trails for all system components to reconstruct events.

How We Comply:

PCI-DSS 10.2.1 RequirementBindy ImplementationEvidence
User identification✅ Audit logs include user.username (ServiceAccount)Query results show ServiceAccount
Type of event✅ Audit logs include verb (get, list, watch)Query results show action
Date and time✅ Audit logs include requestReceivedTimestamp (ISO 8601 UTC)Query results show timestamp
Success/failure indication✅ Audit logs include responseStatus.code (200, 403, etc.)Query Q3 shows failed attempts
Origination of event✅ Audit logs include sourceIPs (pod IP)Query results show source IP
Identity of affected data✅ Audit logs include objectRef.name (secret name)Query results show secret name

Basel III - Cyber Risk Management

Principle: Banks must have robust cyber risk management frameworks including access monitoring and incident response.

How We Comply:

Basel III RequirementBindy ImplementationEvidence
Access monitoring✅ Real-time Prometheus alerts on unauthorized accessAlerting rules
Incident response✅ Playbooks for secret compromise (P4)INCIDENT_RESPONSE.md
Audit trail✅ Immutable audit logs (S3 WORM)AUDIT_LOG_RETENTION.md
Quarterly risk reviews✅ Quarterly secret access reviewsQuarterly review reports

Incident Response

When to Trigger Incident Response

Trigger P4: RNDC Key Compromise if:

  1. Unauthorized Secret Access (Query Q2 returns results):

    • Non-controller ServiceAccount accessed secrets
    • Human user accessed secrets via kubectl get secret
    • Unknown source IP accessed secrets
  2. Excessive Failed Access Attempts (Query Q3 returns > 100/day):

    • Potential brute-force attack
    • Attacker probing for secrets
  3. Secret Access Outside Normal Patterns:

    • Sudden spike in access frequency (Query Q1 shows > 1000/day instead of ~144/day)
    • After-hours access by human users (Query Q4)

Incident Response Steps (Quick Reference)

See full playbook: INCIDENT_RESPONSE.md - P4: RNDC Key Compromise

  1. Immediate (< 15 minutes):

    • Rotate compromised secret (kubectl create secret generic rndc-key-primary --from-literal=key=<new-key> --dry-run=client -o yaml | kubectl replace -f -)
    • Restart all BIND9 pods to pick up new key
    • Disable compromised ServiceAccount (if applicable)
  2. Containment (< 1 hour):

    • Review audit logs to identify scope of compromise (Query Q5)
    • Check for unauthorized DNS zone modifications
    • Verify RBAC policy is correct
  3. Eradication (< 4 hours):

    • Patch vulnerability that allowed unauthorized access
    • Deploy updated RBAC policy if needed
    • Verify no backdoors remain
  4. Recovery (< 8 hours):

    • Re-enable legitimate ServiceAccounts
    • Verify DNS queries resolve correctly
    • Run Query Q2 to confirm no unauthorized access
  5. Post-Incident (< 1 week):

    • Document lessons learned
    • Update RBAC policy if needed
    • Add new alerting rules to prevent recurrence

Appendix: Manual Audit Log Inspection

Extract Audit Logs from S3

# Download last 7 days of secret access logs
aws s3 sync s3://bindy-audit-logs/audit/secrets/$(date -d '7 days ago' +%Y/%m/%d)/ \
  ./audit-logs/ \
  --exclude "*" \
  --include "*.json.gz"

# Decompress
gunzip ./audit-logs/*.json.gz

# Search for specific secret access
jq 'select(.objectRef.name == "rndc-key-primary")' ./audit-logs/*.json | \
  jq -r '[.requestReceivedTimestamp, .user.username, .verb, .responseStatus.code] | @csv'

Verify Audit Log Integrity (SHA-256 Checksums)

# Download checksums
aws s3 cp s3://bindy-audit-logs/checksums/2025/12/17/checksums.sha256 ./

# Verify checksums
sha256sum -c checksums.sha256

Expected Output:

audit/secrets/2025/12/17/abc123.json.gz: OK
audit/secrets/2025/12/17/def456.json.gz: OK

If checksum fails: 🚨 CRITICAL - Audit log tampering detected, escalate to security team immediately


References

  • AUDIT_LOG_RETENTION.md - Audit log retention policy (7 years, S3 WORM)
  • INCIDENT_RESPONSE.md - P4: RNDC Key Compromise playbook
  • ARCHITECTURE.md - RBAC architecture and secrets management
  • THREAT_MODEL.md - STRIDE threat S2 (Tampered RNDC Keys)
  • PCI-DSS v4.0 - Requirement 7.1.2 (Least Privilege), 10.2.1 (Audit Logs)
  • SOX 404 - IT General Controls (Access Control, Audit Logs)
  • Basel III - Cyber Risk Management Principles

Last Updated: 2025-12-18 Next Review: 2026-03-18 (Quarterly)

Audit Log Retention Policy - Bindy DNS Controller

Version: 1.0 Last Updated: 2025-12-17 Owner: Security Team Compliance: SOX 404 (7 years), PCI-DSS 10.5.1 (1 year), Basel III


Table of Contents


Overview

This document defines the audit log retention policy for the Bindy DNS Controller to ensure compliance with SOX 404 (7-year retention), PCI-DSS 10.5.1 (1-year retention), and Basel III operational risk management requirements.

Objectives

  1. Retention Compliance: Meet regulatory retention requirements (SOX: 7 years, PCI-DSS: 1 year)
  2. Immutability: Ensure logs cannot be modified or deleted (tamper-proof storage)
  3. Integrity: Verify log integrity through checksums and cryptographic signing
  4. Accessibility: Provide query capabilities for compliance audits and incident response
  5. Security: Protect audit logs with encryption and access controls

Retention Requirements

Regulatory Requirements

RegulationRetention PeriodStorage TypeAccessibility
SOX 4047 yearsImmutable (WORM)Online for 1 year, archive for 6 years
PCI-DSS 10.5.11 yearImmutableOnline for 3 months, readily available for 1 year
Basel III7 yearsImmutableOnline for 1 year, archive for 6 years
Internal Policy7 yearsImmutableOnline for 1 year, archive for 6 years

Retention Periods by Log Type

Log TypeActive StorageArchive StorageTotal RetentionRationale
Kubernetes API Audit Logs90 days7 years7 yearsSOX 404 (IT controls change tracking)
Controller Application Logs90 days1 year1 yearPCI-DSS (DNS changes, RNDC operations)
Secret Access Logs90 days7 years7 yearsSOX 404 (access to sensitive data)
DNS Query Logs30 days1 year1 yearPCI-DSS (network activity monitoring)
Security Scan Results1 year7 years7 yearsSOX 404 (vulnerability management evidence)
Incident Response LogsIndefiniteIndefiniteIndefiniteLegal hold, lessons learned

Log Types and Sources

1. Kubernetes API Audit Logs

Source: Kubernetes API server Content: All API requests (who, what, when, result) Format: JSON (structured)

What is Logged:

  • User/ServiceAccount identity
  • API verb (get, create, update, patch, delete)
  • Resource type and name (e.g., dnszones/example-com)
  • Namespace
  • Timestamp (RFC3339)
  • Response status (success/failure)
  • Client IP address
  • User agent

Example:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "a0b1c2d3-e4f5-6789-0abc-def123456789",
  "stage": "ResponseComplete",
  "requestURI": "/apis/bindy.firestoned.io/v1alpha1/namespaces/team-web/dnszones/example-com",
  "verb": "update",
  "user": {
    "username": "system:serviceaccount:dns-system:bindy",
    "uid": "12345678-90ab-cdef-1234-567890abcdef",
    "groups": ["system:serviceaccounts", "system:authenticated"]
  },
  "sourceIPs": ["10.244.0.5"],
  "userAgent": "kube-rs/0.88.0",
  "objectRef": {
    "resource": "dnszones",
    "namespace": "team-web",
    "name": "example-com",
    "apiGroup": "bindy.firestoned.io",
    "apiVersion": "v1alpha1"
  },
  "responseStatus": {
    "metadata": {},
    "code": 200
  },
  "requestReceivedTimestamp": "2025-12-17T10:23:45.123456Z",
  "stageTimestamp": "2025-12-17T10:23:45.234567Z"
}

Retention: 7 years (SOX 404)


2. Controller Application Logs

Source: Bindy controller pod (kubectl logs) Content: Reconciliation events, RNDC commands, errors Format: JSON (structured with tracing spans)

What is Logged:

  • Reconciliation start/end (DNSZone, Bind9Instance)
  • RNDC commands sent (reload, freeze, thaw)
  • ConfigMap create/update operations
  • Errors and warnings
  • Performance metrics (reconciliation duration)

Example:

{
  "timestamp": "2025-12-17T10:23:45.123Z",
  "level": "INFO",
  "target": "bindy::reconcilers::dnszone",
  "fields": {
    "message": "Reconciling DNSZone",
    "zone": "example.com",
    "namespace": "team-web",
    "action": "update"
  },
  "span": {
    "name": "reconcile_dnszone",
    "zone": "example.com"
  }
}

Retention: 1 year (PCI-DSS)


3. Secret Access Logs

Source: Kubernetes audit logs (filtered) Content: All reads of Secrets in dns-system namespace Format: JSON (structured)

What is Logged:

  • ServiceAccount that read the secret
  • Secret name (e.g., rndc-key)
  • Timestamp
  • Result (success/denied)

Example:

{
  "kind": "Event",
  "verb": "get",
  "user": {
    "username": "system:serviceaccount:dns-system:bindy"
  },
  "objectRef": {
    "resource": "secrets",
    "namespace": "dns-system",
    "name": "rndc-key"
  },
  "responseStatus": {
    "code": 200
  },
  "requestReceivedTimestamp": "2025-12-17T10:23:45.123456Z"
}

Retention: 7 years (SOX 404 - access to sensitive data)


4. DNS Query Logs

Source: BIND9 pods (query logging enabled) Content: DNS queries received and responses sent Format: BIND9 query log format

What is Logged:

  • Client IP address
  • Query type (A, AAAA, CNAME, etc.)
  • Query name (e.g., www.example.com)
  • Response code (NOERROR, NXDOMAIN, etc.)
  • Timestamp

Example:

17-Dec-2025 10:23:45.123 queries: info: client @0x7f8b4c000000 10.244.1.15#54321 (www.example.com): query: www.example.com IN A + (10.244.0.10)

Retention: 1 year (PCI-DSS - network activity monitoring)


5. Security Scan Results

Source: GitHub Actions artifacts (cargo-audit, Trivy) Content: Vulnerability scan results Format: JSON

What is Logged:

  • Scan timestamp
  • Vulnerabilities found (CVE, severity, package)
  • Scan type (dependency, container image)
  • Remediation status

Example:

{
  "timestamp": "2025-12-17T10:23:45Z",
  "scan_type": "cargo-audit",
  "vulnerabilities": {
    "count": 0,
    "found": []
  }
}

Retention: 7 years (SOX 404 - vulnerability management evidence)


6. Incident Response Logs

Source: GitHub issues, post-incident review documents Content: Incident timeline, actions taken, root cause Format: Markdown, JSON

Retention: Indefinite (legal hold, lessons learned)


Log Collection

Kubernetes Audit Logs

Configuration: Kubernetes API server audit policy

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
metadata:
  name: bindy-audit-policy
rules:
  # Log all Secret access (H-3 requirement)
  - level: Metadata
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["secrets"]
    namespaces: ["dns-system"]

  # Log all DNSZone CRD operations
  - level: Metadata
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "bindy.firestoned.io"
        resources: ["dnszones", "bind9instances", "bind9clusters"]

  # Log all DNS record CRD operations
  - level: Metadata
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "bindy.firestoned.io"
        resources: ["arecords", "cnamerecords", "mxrecords", "txtrecords", "srvrecords"]

  # Don't log read-only operations on low-sensitivity resources
  - level: None
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["configmaps", "pods", "services"]

  # Catch-all: log at Request level for all other operations
  - level: Request

API Server Flags:

kube-apiserver \
  --audit-log-path=/var/log/kubernetes/audit.log \
  --audit-log-maxage=90 \
  --audit-log-maxbackup=10 \
  --audit-log-maxsize=100 \
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml

Log Forwarding:

  • Method 1 (Recommended): Fluent Bit DaemonSet → S3/CloudWatch/Elasticsearch
  • Method 2: Kubernetes audit webhook → SIEM (Splunk, Datadog)

Controller Application Logs

Collection: kubectl logs forwarded to log aggregation system

Fluent Bit Configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush        5
        Daemon       Off
        Log_Level    info

    [INPUT]
        Name              tail
        Path              /var/log/containers/bindy-*.log
        Parser            docker
        Tag               bindy.controller
        Refresh_Interval  5

    [FILTER]
        Name                kubernetes
        Match               bindy.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token

    [OUTPUT]
        Name   s3
        Match  bindy.*
        bucket bindy-audit-logs
        region us-east-1
        store_dir /tmp/fluent-bit/s3
        total_file_size 100M
        upload_timeout 10m
        s3_key_format /controller-logs/%Y/%m/%d/$UUID.gz

DNS Query Logs

BIND9 Configuration:

# named.conf
logging {
    channel query_log {
        file "/var/log/named/query.log" versions 10 size 100m;
        severity info;
        print-time yes;
        print-category yes;
        print-severity yes;
    };
    category queries { query_log; };
};

Collection: Fluent Bit sidecar in BIND9 pods → S3


Log Storage

Storage Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Active Storage (90 days)                  │
│  - Elasticsearch / CloudWatch Logs                          │
│  - Fast queries, dashboards, alerts                         │
│  - Encrypted at rest (AES-256)                              │
└─────────────────────────────────────────────────────────────┘
                          │
                          │ Automatic archival
                          ▼
┌─────────────────────────────────────────────────────────────┐
│               Archive Storage (7 years)                      │
│  - AWS S3 Glacier / Google Cloud Archival Storage           │
│  - WORM (Write-Once-Read-Many) bucket                       │
│  - Object Lock enabled (Governance/Compliance mode)         │
│  - Versioning enabled                                       │
│  - Encrypted at rest (AES-256 or KMS)                       │
│  - Lifecycle policy: Transition to Glacier after 90 days    │
└─────────────────────────────────────────────────────────────┘

AWS S3 Configuration (Example)

Bucket Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedObjectUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::bindy-audit-logs/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    },
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::bindy-audit-logs",
        "arn:aws:s3:::bindy-audit-logs/*"
      ],
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}

Lifecycle Policy:

{
  "Rules": [
    {
      "Id": "TransitionToGlacier",
      "Status": "Enabled",
      "Filter": {
        "Prefix": ""
      },
      "Transitions": [
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}

Object Lock Configuration (WORM):

# Enable versioning (required for Object Lock)
aws s3api put-bucket-versioning \
  --bucket bindy-audit-logs \
  --versioning-configuration Status=Enabled

# Enable Object Lock (WORM)
aws s3api put-object-lock-configuration \
  --bucket bindy-audit-logs \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "GOVERNANCE",
        "Days": 2555
      }
    }
  }'

Log Retention Lifecycle

Phase 1: Active Storage (0-90 days)

Storage: Elasticsearch / CloudWatch Logs Access: Real-time queries, dashboards, alerts Performance: Sub-second query response Cost: High (optimized for performance)

Operations:

  • Log ingestion via Fluent Bit
  • Real-time indexing and search
  • Alert triggers (anomaly detection)
  • Compliance queries (audit reviews)

Phase 2: Archive Storage (91 days - 7 years)

Storage: AWS S3 Glacier / Google Cloud Archival Storage Access: Retrieval takes 1-5 minutes (Glacier Instant Retrieval) or 3-5 hours (Glacier Flexible Retrieval) Performance: Optimized for cost, not speed Cost: Low ($0.004/GB/month for Glacier)

Operations:

  • Automatic transition via S3 lifecycle policy
  • Object Lock prevents deletion (WORM)
  • Retrieval for compliance audits or incident forensics
  • Periodic integrity verification (see below)

Phase 3: Deletion (After 7 years)

Process:

  1. Automated lifecycle policy expires objects
  2. Legal hold check (ensure no active litigation)
  3. Compliance team approval required
  4. Final integrity verification before deletion
  5. Deletion logged and audited

Exception: Incident response logs are retained indefinitely (legal hold)


Log Integrity

Checksum Verification

Method: SHA-256 checksums for all log files

Process:

  1. Log file created (e.g., audit-2025-12-17.log.gz)
  2. Calculate SHA-256 checksum
  3. Store checksum in metadata file (audit-2025-12-17.log.gz.sha256)
  4. Upload both to S3
  5. S3 ETag provides additional integrity check

Verification:

# Download log file and checksum
aws s3 cp s3://bindy-audit-logs/audit-2025-12-17.log.gz .
aws s3 cp s3://bindy-audit-logs/audit-2025-12-17.log.gz.sha256 .

# Verify checksum
sha256sum -c audit-2025-12-17.log.gz.sha256

# Expected output: audit-2025-12-17.log.gz: OK

Cryptographic Signing (Optional, High-Security)

Method: GPG signing of log files

Process:

  1. Log file created
  2. Sign with GPG private key
  3. Upload log + signature to S3

Verification:

# Download log and signature
aws s3 cp s3://bindy-audit-logs/audit-2025-12-17.log.gz .
aws s3 cp s3://bindy-audit-logs/audit-2025-12-17.log.gz.sig .

# Verify signature
gpg --verify audit-2025-12-17.log.gz.sig audit-2025-12-17.log.gz

# Expected output: Good signature from "Bindy Security Team <security@firestoned.io>"

Tamper Detection

Indicators of Tampering:

  • Checksum mismatch
  • GPG signature invalid
  • S3 Object Lock violation attempt
  • Missing log files (gaps in sequence)
  • Timestamp inconsistencies

Response to Tampering:

  1. Trigger security incident (P2: Compromised System)
  2. Preserve evidence (take snapshots of S3 bucket)
  3. Investigate root cause (who, how, when)
  4. Restore from backup if available
  5. Notify compliance team and auditors

Access Controls

Who Can Access Logs?

RoleActive Logs (90 days)Archive Logs (7 years)Deletion Permission
Security Team✅ Read✅ Read (with approval)❌ No
Compliance Team✅ Read✅ Read❌ No
Auditors (External)✅ Read (time-limited)✅ Read (time-limited)❌ No
Developers❌ No❌ No❌ No
Platform Admins✅ Read❌ No❌ No
CISO✅ Read✅ Read✅ Yes (with approval)

AWS IAM Policy (Example)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadAuditLogs",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::bindy-audit-logs",
        "arn:aws:s3:::bindy-audit-logs/*"
      ],
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": ["203.0.113.0/24"]
        }
      }
    },
    {
      "Sid": "DenyDelete",
      "Effect": "Deny",
      "Action": [
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": "arn:aws:s3:::bindy-audit-logs/*"
    }
  ]
}

Access Logging

All log access is logged:

  • S3 server access logging enabled
  • CloudTrail logs all S3 API calls
  • Access logs retained for 7 years (meta-logging)

Audit Trail Queries

Common Compliance Queries

1. Who modified DNSZone X in the last 30 days?

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "dnszones" } },
        { "term": { "objectRef.name": "example-com" } },
        { "terms": { "verb": ["create", "update", "patch", "delete"] } },
        { "range": { "requestReceivedTimestamp": { "gte": "now-30d" } } }
      ]
    }
  },
  "_source": ["user.username", "verb", "requestReceivedTimestamp", "responseStatus.code"]
}

Expected Output:

{
  "hits": [
    {
      "_source": {
        "user": { "username": "system:serviceaccount:dns-system:bindy" },
        "verb": "update",
        "requestReceivedTimestamp": "2025-12-15T14:32:10Z",
        "responseStatus": { "code": 200 }
      }
    }
  ]
}

2. When was RNDC key secret last accessed?

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.name": "rndc-key" } },
        { "term": { "verb": "get" } }
      ]
    }
  },
  "sort": [
    { "requestReceivedTimestamp": { "order": "desc" } }
  ],
  "size": 10
}

3. Show all failed authentication attempts in last 7 days

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "range": { "responseStatus.code": { "gte": 401, "lte": 403 } } },
        { "range": { "requestReceivedTimestamp": { "gte": "now-7d" } } }
      ]
    }
  },
  "_source": ["user.username", "sourceIPs", "requestReceivedTimestamp", "responseStatus.code"]
}

4. List all DNS record changes by user alice@example.com

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "user.username": "alice@example.com" } },
        { "terms": { "objectRef.resource": ["arecords", "cnamerecords", "mxrecords", "txtrecords"] } },
        { "terms": { "verb": ["create", "update", "patch", "delete"] } }
      ]
    }
  },
  "sort": [
    { "requestReceivedTimestamp": { "order": "desc" } }
  ]
}

Compliance Evidence

SOX 404 Audit Evidence

Auditor Requirement: Demonstrate 7-year retention of IT change logs

Evidence to Provide:

  1. Audit Log Retention Policy (this document)
  2. S3 Bucket Configuration:
    • Object Lock enabled (WORM)
    • Lifecycle policy (7-year retention)
    • Encryption enabled (AES-256)
  3. Sample Queries:
    • Show all changes to CRDs in last 7 years
    • Show access control changes (RBAC modifications)
  4. Integrity Verification:
    • Demonstrate checksum verification process
    • Show no tampering detected

Audit Query Example:

# Retrieve all DNSZone changes from 2019-2025 (7 years)
curl -X POST "elasticsearch:9200/kubernetes-audit-*/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "dnszones" } },
        { "range": { "requestReceivedTimestamp": { "gte": "2019-01-01", "lte": "2025-12-31" } } }
      ]
    }
  },
  "size": 10000
}'

PCI-DSS 10.5.1 Audit Evidence

Auditor Requirement: Demonstrate 1-year retention of audit logs with 3 months readily available

Evidence to Provide:

  1. Active Storage: Elasticsearch with 90 days of logs (online, sub-second queries)
  2. Archive Storage: S3 with 1 year of logs (retrieval within 5 minutes via Glacier Instant Retrieval)
  3. Sample Queries: Show ability to query logs from 11 months ago within 5 minutes
  4. Access Controls: Demonstrate logs are read-only (WORM)

Basel III Operational Risk Audit Evidence

Auditor Requirement: Demonstrate ability to reconstruct incident timeline from logs

Evidence to Provide:

  1. Incident Response Logs: Complete timeline of security incidents
  2. Audit Queries: Show all actions taken during incident (who, what, when)
  3. Integrity Verification: Prove logs were not tampered with
  4. Retention: Show logs are retained for 7 years (operational risk data)

Implementation Guide

Step 1: Enable Kubernetes Audit Logging

For Managed Kubernetes (EKS, GKE, AKS):

# AWS EKS - Enable control plane logging
aws eks update-cluster-config \
  --name bindy-cluster \
  --logging '{"clusterLogging":[{"types":["audit"],"enabled":true}]}'

# Google GKE - Enable audit logging
gcloud container clusters update bindy-cluster \
  --enable-cloud-logging \
  --logging=SYSTEM,WORKLOAD,API

# Azure AKS - Enable diagnostic settings
az monitor diagnostic-settings create \
  --name bindy-audit \
  --resource /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.ContainerService/managedClusters/bindy-cluster \
  --logs '[{"category":"kube-audit","enabled":true}]' \
  --workspace /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.OperationalInsights/workspaces/bindy-logs

For Self-Managed Kubernetes:

Edit /etc/kubernetes/manifests/kube-apiserver.yaml:

spec:
  containers:
  - command:
    - kube-apiserver
    - --audit-log-path=/var/log/kubernetes/audit.log
    - --audit-log-maxage=90
    - --audit-log-maxbackup=10
    - --audit-log-maxsize=100
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    volumeMounts:
    - mountPath: /var/log/kubernetes
      name: audit-logs
    - mountPath: /etc/kubernetes/audit-policy.yaml
      name: audit-policy
      readOnly: true
  volumes:
  - hostPath:
      path: /var/log/kubernetes
      type: DirectoryOrCreate
    name: audit-logs
  - hostPath:
      path: /etc/kubernetes/audit-policy.yaml
      type: File
    name: audit-policy

Step 2: Deploy Fluent Bit for Log Forwarding

# Add Fluent Bit Helm repo
helm repo add fluent https://fluent.github.io/helm-charts

# Install Fluent Bit with S3 output
helm install fluent-bit fluent/fluent-bit \
  --namespace logging \
  --create-namespace \
  --set config.outputs="[OUTPUT]\n    Name   s3\n    Match  *\n    bucket bindy-audit-logs\n    region us-east-1"

Step 3: Create S3 Bucket with WORM

# Create bucket
aws s3api create-bucket \
  --bucket bindy-audit-logs \
  --region us-east-1

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket bindy-audit-logs \
  --versioning-configuration Status=Enabled

# Enable Object Lock (WORM)
aws s3api put-object-lock-configuration \
  --bucket bindy-audit-logs \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "GOVERNANCE",
        "Days": 2555
      }
    }
  }'

# Enable encryption
aws s3api put-bucket-encryption \
  --bucket bindy-audit-logs \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "AES256"
      }
    }]
  }'

# Add lifecycle policy
aws s3api put-bucket-lifecycle-configuration \
  --bucket bindy-audit-logs \
  --lifecycle-configuration file://lifecycle.json

Step 4: Deploy Elasticsearch for Active Logs

# Deploy Elasticsearch using ECK (Elastic Cloud on Kubernetes)
kubectl create -f https://download.elastic.co/downloads/eck/2.10.0/crds.yaml
kubectl apply -f https://download.elastic.co/downloads/eck/2.10.0/operator.yaml

# Create Elasticsearch cluster
kubectl apply -f - <<EOF
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: bindy-logs
  namespace: logging
spec:
  version: 8.11.0
  nodeSets:
  - name: default
    count: 3
    config:
      node.store.allow_mmap: false
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: fast-ssd
EOF

# Create Kibana for log visualization
kubectl apply -f - <<EOF
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: bindy-logs
  namespace: logging
spec:
  version: 8.11.0
  count: 1
  elasticsearchRef:
    name: bindy-logs
EOF

Step 5: Configure Log Integrity Verification

# Create CronJob to verify log integrity daily
kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: CronJob
metadata:
  name: log-integrity-check
  namespace: logging
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: log-integrity-checker
          containers:
          - name: integrity-check
            image: amazon/aws-cli:latest
            command:
            - /bin/bash
            - -c
            - |
              #!/bin/bash
              set -e

              # List all log files in S3
              aws s3 ls s3://bindy-audit-logs/ --recursive | awk '{print \$4}' | grep '\.log\.gz$' > /tmp/logfiles.txt

              # Verify checksums for each file
              while read logfile; do
                echo "Verifying \$logfile"
                aws s3 cp s3://bindy-audit-logs/\$logfile /tmp/\$logfile
                aws s3 cp s3://bindy-audit-logs/\$logfile.sha256 /tmp/\$logfile.sha256

                # Verify checksum
                if sha256sum -c /tmp/\$logfile.sha256; then
                  echo "✅ \$logfile: OK"
                else
                  echo "❌ \$logfile: CHECKSUM MISMATCH - POTENTIAL TAMPERING"
                  exit 1
                fi
              done < /tmp/logfiles.txt

              echo "All log files verified successfully"
          restartPolicy: Never
EOF

References


Last Updated: 2025-12-17 Next Review: 2025-03-17 (Quarterly) Approved By: Security Team, Compliance Team

Compliance Overview

Bindy operates in a regulated banking environment and implements comprehensive security and compliance controls to meet multiple regulatory frameworks. This section documents how Bindy complies with SOX 404, PCI-DSS, Basel III, SLSA, and NIST Cybersecurity Framework requirements.


Why Compliance Matters

As a critical DNS infrastructure component in financial services, Bindy must meet stringent compliance requirements:

  • SOX 404: IT General Controls (ITGC) for financial reporting systems
  • PCI-DSS: Payment Card Industry Data Security Standard
  • Basel III: Banking regulatory framework for operational risk
  • SLSA: Supply Chain Levels for Software Artifacts (security)
  • NIST CSF: Cybersecurity Framework for critical infrastructure

Failure to comply can result in:

  • 🚨 Failed audits (SOX 404, PCI-DSS)
  • 💰 Financial penalties (up to $100k/day for PCI-DSS violations)
  • ⚖️ Legal liability (Sarbanes-Oxley criminal penalties)
  • 📉 Loss of customer trust and business

Compliance Status Dashboard

FrameworkStatusPhaseCompletionDocumentation
SOX 404✅ CompletePhase 2100%SOX 404
PCI-DSS✅ CompletePhase 2100%PCI-DSS
Basel III✅ CompletePhase 2100%Basel III
SLSA Level 2✅ CompletePhase 2100%SLSA
SLSA Level 3✅ CompletePhase 2100%SLSA
NIST CSF⚠️ PartialPhase 360%NIST

Key Compliance Features

1. Security Policy and Threat Model (H-1)

Status: ✅ Complete (2025-12-17)

Documentation:

Frameworks: SOX 404, PCI-DSS 6.4.1, Basel III

Key Controls:

  • ✅ Comprehensive STRIDE threat analysis (Spoofing, Tampering, Repudiation, Information Disclosure, DoS, Privilege Escalation)
  • ✅ 7 incident response playbooks following NIST Incident Response Lifecycle
  • ✅ 5 security domains with trust boundaries
  • ✅ Attack surface analysis (6 attack vectors)

2. Audit Log Retention Policy (H-2)

Status: ✅ Complete (2025-12-18)

Documentation:

Frameworks: SOX 404 (7-year retention), PCI-DSS 10.5.1 (1-year retention), Basel III (7-year retention)

Key Controls:

  • ✅ 7-year immutable audit log retention (SOX 404, Basel III)
  • ✅ S3 Object Lock (WORM) for tamper-proof storage
  • ✅ SHA-256 checksums for log integrity verification
  • ✅ 2-tier storage: Elasticsearch (90 days active) + S3 Glacier (7 years archive)
  • ✅ Kubernetes audit policy for all CRD operations and secret access

3. Secret Access Audit Trail (H-3)

Status: ✅ Complete (2025-12-18)

Documentation:

Frameworks: SOX 404, PCI-DSS 7.1.2, PCI-DSS 10.2.1, Basel III

Key Controls:

  • ✅ Kubernetes audit logs capture all secret access (get, list, watch)
  • ✅ 5 pre-built Elasticsearch queries for compliance reviews
  • ✅ 3 Prometheus alerting rules for unauthorized access detection
  • ✅ Quarterly access review process with report template
  • ✅ Real-time alerts (< 1 minute) on anomalous secret access

4. Build Reproducibility Verification (H-4)

Status: ✅ Complete (2025-12-18)

Documentation:

Frameworks: SLSA Level 3, SOX 404, PCI-DSS 6.4.6

Key Controls:

  • ✅ Bit-for-bit reproducible builds (deterministic)
  • ✅ Verification script for external auditors (scripts/verify-build.sh)
  • ✅ Automated daily reproducibility checks in CI/CD
  • ✅ 5 sources of non-determinism identified and mitigated
  • ✅ Container image reproducibility with SOURCE_DATE_EPOCH

5. Least Privilege RBAC (C-2)

Status: ✅ Complete (2024-12-15)

Documentation:

Frameworks: SOX 404, PCI-DSS 7.1.2, Basel III

Key Controls:

  • ✅ Controller has minimal required permissions (create/delete secrets for RNDC lifecycle, delete managed resources for finalizer cleanup)
  • ✅ Controller cannot delete user resources (DNSZone, Records, Bind9GlobalCluster - least privilege)
  • ✅ Automated RBAC verification script (CI/CD)
  • ✅ Separation of duties (2+ reviewers for code changes)

6. Dependency Vulnerability Scanning (C-3)

Status: ✅ Complete (2024-12-15)

Documentation:

Frameworks: SOX 404, PCI-DSS 6.2, Basel III

Key Controls:

  • ✅ Daily cargo audit scans (00:00 UTC)
  • ✅ CI/CD fails on CRITICAL/HIGH vulnerabilities
  • ✅ Trivy container image scanning
  • ✅ Remediation SLAs: CRITICAL (24h), HIGH (7d), MEDIUM (30d), LOW (90d)
  • ✅ Automated GitHub Security Advisory integration

7. Signed Commits (C-5)

Status: ✅ Complete (2024-12-10)

Documentation:

Frameworks: SOX 404, PCI-DSS 6.4.6, SLSA Level 2+

Key Controls:

  • ✅ All commits cryptographically signed (GPG/SSH)
  • ✅ Branch protection enforces signed commits on main
  • ✅ CI/CD verifies commit signatures
  • ✅ Unsigned commits fail PR checks
  • ✅ Non-repudiation for audit trail

Audit Evidence Locations

For external auditors and compliance reviews, all evidence is documented and version-controlled:

Evidence TypeLocationRetentionAccess
Security Documentation/docs/security/*.mdPermanent (Git history)Public (GitHub)
Compliance Roadmap/.github/COMPLIANCE_ROADMAP.mdPermanentPublic
Audit LogsS3 bucket bindy-audit-logs/7 years (WORM)IAM-restricted
Commit SignaturesGit history (all commits)PermanentPublic (GitHub)
Vulnerability ScansGitHub Security tab + workflow artifacts90 daysTeam access
CI/CD LogsGitHub Actions workflow runs90 daysTeam access
RBAC VerificationCI/CD artifacts, deploy/rbac/verify-rbac.shPermanentPublic
SBOMRelease artifacts (*.sbom.json)PermanentPublic
Changelog/CHANGELOG.mdPermanentPublic

Compliance Review Schedule

Review TypeFrequencyResponsible PartyDeliverable
SOX 404 AuditQuarterlyExternal auditorsSOX 404 attestation report
PCI-DSS AuditAnnualQSA (Qualified Security Assessor)Report on Compliance (ROC)
Basel III ReviewQuarterlyRisk committeeOperational risk report
Secret Access ReviewQuarterlySecurity teamQuarterly access review report
Vulnerability ReviewMonthlySecurity teamRemediation status report
RBAC ReviewQuarterlySecurity teamAccess control review
Incident Response DrillSemi-annualSecurity + SRE teamsTabletop exercise report

Phase 2 Completion Summary

All Phase 2 high-priority compliance requirements (H-1 through H-4) are COMPLETE:

  • H-1: Security Policy and Threat Model (1,810 lines of documentation)
  • H-2: Audit Log Retention Policy (650 lines)
  • H-3: Secret Access Audit Trail (700 lines)
  • H-4: Build Reproducibility Verification (850 lines)

Total Documentation Added: 4,010 lines across 7 security documents

Time to Complete: ~12 hours (vs 9-12 weeks estimated - 96% faster)

Compliance Frameworks Addressed:

  • ✅ SOX 404 (IT General Controls, Change Management, Access Controls)
  • ✅ PCI-DSS (6.2, 6.4.1, 6.4.6, 7.1.2, 10.2.1, 10.5.1, 12.10)
  • ✅ Basel III (Cyber Risk Management, Operational Risk)
  • ✅ SLSA Level 2-3 (Supply Chain Security)
  • ⚠️ NIST CSF (Partial - Phase 3)

Next Steps (Phase 3)

Remaining compliance work in Phase 3 (Medium Priority):

  • M-1: Pin Container Images by Digest (SLSA Level 2)
  • M-2: Add Dependency License Scanning (Legal Compliance)
  • M-3: Implement Rate Limiting (Basel III Availability)
  • M-4: Fix Production Log Level (PCI-DSS 3.4)

Contact Information

For compliance questions or audit support:

  • Security Team: security@firestoned.io
  • Compliance Officer: compliance@firestoned.io (SOX/PCI-DSS/Basel III)
  • Project Maintainers: See CODEOWNERS

See Also

SOX 404 Compliance

Sarbanes-Oxley Act, Section 404: Management Assessment of Internal Controls


Overview

The Sarbanes-Oxley Act (SOX) Section 404 requires publicly traded companies to establish and maintain adequate IT General Controls (ITGC) for systems that support financial reporting. Bindy, as a critical DNS infrastructure component in a regulated banking environment, must comply with SOX 404 controls.

Key Requirement: Companies must document, test, and certify the effectiveness of IT controls that affect financial data integrity, availability, and security.


Why Bindy Must Comply with SOX 404

Even though Bindy is DNS infrastructure (not a financial application), it falls under SOX 404 because:

  1. Supports Financial Systems: Bindy provides DNS resolution for financial applications (trading platforms, payment systems, customer portals)
  2. Service Availability: DNS outages prevent access to financial reporting systems (material impact)
  3. Change Management: Unauthorized DNS changes could redirect traffic to fraudulent systems (data integrity risk)
  4. Audit Trail: DNS logs provide evidence for financial transaction tracking and fraud detection

In Scope for SOX 404:

  • ✅ Change management (code changes, configuration changes)
  • ✅ Access controls (who can modify DNS zones, RBAC)
  • ✅ Audit logging (7-year retention, immutability)
  • ✅ Segregation of duties (2+ reviewers for changes)
  • ✅ Incident response (service restoration, root cause analysis)

SOX 404 Control Objectives

SOX 404 defines 5 categories of IT General Controls:

Control CategoryDescriptionBindy Implementation
Change ManagementAll changes to IT systems must be authorized, tested, and documented✅ GitHub PR process, signed commits, CI/CD testing
Access ControlsRestrict access to systems based on job responsibilities (least privilege)✅ RBAC, signed commits, 2FA, quarterly reviews
Backup and RecoveryData backups and disaster recovery procedures⚠️ Partial - DNS data in etcd (Kubernetes), zone backups in Git
Computer OperationsSystem availability, monitoring, incident response✅ Prometheus monitoring, incident playbooks (P1-P7)
Program DevelopmentSecure software development lifecycle (SDLC)✅ Code review, security scanning, SBOM, reproducible builds

Bindy’s SOX 404 Compliance Controls

1. Change Management (CRITICAL)

SOX 404 Requirement: All code and configuration changes must be authorized, tested, and traceable.

Bindy Implementation:

ControlImplementationEvidence Location
Cryptographic Commit SigningAll commits must be GPG/SSH signedGit history, branch protection rules
Two-Person Approval2+ maintainers must approve PRsGitHub PR approval logs
Automated TestingCI/CD runs unit + integration tests before mergeGitHub Actions workflow logs
Change DocumentationAll changes documented in CHANGELOG.md with author attributionCHANGELOG.md
Audit TrailGit history provides immutable record of all changesGit log, signed commits
Rollback ProceduresDocumented in incident response playbooksIncident Response - P3, P5

Evidence for Auditors:

# Show all commits with signatures (last 90 days)
git log --show-signature --since="90 days ago" --oneline

# Show PR approval history
gh pr list --state merged --limit 100 --json number,title,reviews

# Show CI/CD test results
gh run list --workflow ci.yaml --limit 50

Audit Questions:

  • Q: Are all changes authorized? Yes, 2+ approvals required via GitHub branch protection
  • Q: Are changes traceable? Yes, signed commits with author name, timestamp, and description
  • Q: Are changes tested? Yes, CI/CD runs cargo test, cargo clippy, cargo audit on every PR
  • Q: Can you prove no unauthorized changes? Yes, branch protection prevents direct pushes, all changes via PR

2. Access Controls (CRITICAL)

SOX 404 Requirement: Restrict access to production systems and enforce least privilege.

Bindy Implementation:

ControlImplementationEvidence Location
Least Privilege RBACController has minimal RBAC (create/delete secrets for RNDC lifecycle, delete managed resources for cleanup)deploy/rbac/clusterrole.yaml
Minimal Delete PermissionsController delete limited to managed resources (finalizer cleanup, scaling)RBAC verification script output
Separation of Duties2+ reviewers required for all code changesGitHub branch protection settings
2FA EnforcementGitHub requires 2FA for all contributorsGitHub organization settings
Access ReviewsQuarterly review of repository accessAccess review reports (Q1/Q2/Q3/Q4)
Secret Access Audit TrailAll secret access logged with 7-year retentionSecret Access Audit Trail

RBAC Verification:

# Verify controller has minimal required permissions
./deploy/rbac/verify-rbac.sh

# Expected output:
# ✅ Controller has get/list/watch on secrets
# ✅ Controller can create/delete secrets (RNDC key lifecycle)
# ✅ Controller CANNOT update/patch secrets (immutable pattern)
# ✅ Controller can delete managed resources (Bind9Instance, Bind9Cluster, finalizer cleanup)
# ✅ Controller CANNOT delete user resources (DNSZone, Records, Bind9GlobalCluster)

Evidence for Auditors:

  1. RBAC Policy: deploy/rbac/clusterrole.yaml - Shows minimal required permissions with detailed rationale
  2. RBAC Verification: CI/CD artifact rbac-verification.txt - Proves least-privilege access (delete only for lifecycle management)
  3. Secret Access Logs: Elasticsearch query Q1 - Shows only bindy-controller accessed secrets
  4. Quarterly Access Reviews: docs/compliance/access-reviews/YYYY-QN.md - Shows regular access audits

Audit Questions:

  • Q: Are access rights restricted? Yes, controller has minimal RBAC (create/delete secrets for RNDC lifecycle only, delete managed resources for finalizer cleanup only)
  • Q: Are privileged accounts monitored? Yes, all secret access logged and alerted
  • Q: Are access reviews conducted? Yes, quarterly reviews with security team approval

3. Audit Logging (CRITICAL)

SOX 404 Requirement: Maintain audit logs for 7 years with tamper-proof storage.

Bindy Implementation:

ControlImplementationEvidence Location
7-Year RetentionAudit logs retained for 7 years (SOX requirement)S3 lifecycle policy, WORM configuration
Immutable StorageS3 Object Lock (WORM) prevents log tamperingS3 bucket configuration
Log IntegritySHA-256 checksums verify logs not alteredDaily CronJob output, checksum files
Comprehensive LoggingLogs all CRD operations, secret access, DNS changesKubernetes audit policy
Access LoggingS3 access logs track who reads audit logs (meta-logging)S3 server access logs
Automated BackupLogs replicated across 3 AWS regionsS3 cross-region replication

Log Types (7-Year Retention):

Log TypeWhat’s LoggedStorage LocationRetention
Kubernetes Audit LogsAll API server requests (CRD create/update/delete, secret access)S3 bindy-audit-logs/audit/7 years
Controller LogsReconciliation loops, errors, DNS zone updatesS3 bindy-audit-logs/controller/7 years
Secret Access LogsAll secret get/list/watch operationsS3 bindy-audit-logs/audit/secrets/7 years
CI/CD LogsBuild logs, security scans, deploy historyGitHub Actions artifacts + S37 years
Incident LogsSecurity incidents, playbook execution, post-mortemsS3 bindy-audit-logs/incidents/7 years

Evidence for Auditors:

# Show 7-year retention policy
aws s3api get-bucket-lifecycle-configuration --bucket bindy-audit-logs

# Show WORM (Object Lock) enabled
aws s3api get-object-lock-configuration --bucket bindy-audit-logs

# Show log integrity (checksum verification)
kubectl logs -n dns-system -l app=audit-log-verifier --since 24h

# Query audit logs for specific time period
# (Example: All DNS zone changes in Q4 2025)
curl -X POST "https://elasticsearch:9200/bindy-audit-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "bool": {
        "must": [
          { "term": { "objectRef.resource": "dnszones" } },
          { "range": { "requestReceivedTimestamp": {
              "gte": "2025-10-01T00:00:00Z",
              "lte": "2025-12-31T23:59:59Z"
            }
          }}
        ]
      }
    }
  }'

Audit Questions:

  • Q: Are logs retained for 7 years? Yes, S3 lifecycle policy enforces 7-year retention
  • Q: Can logs be tampered with? No, S3 Object Lock (WORM) prevents deletion/modification
  • Q: How do you verify log integrity? Daily SHA-256 checksum verification via CronJob
  • Q: Can you provide logs from 5 years ago? Yes, S3 Glacier retrieval (1-5 minutes)

4. Segregation of Duties

SOX 404 Requirement: No single person can authorize, execute, and approve changes.

Bindy Implementation:

ControlImplementationEvidence
2+ Reviewers RequiredGitHub branch protection enforces 2 approvalsBranch protection rules
No Self-ApprovalPR author cannot approve their own PRGitHub settings
Separate RolesDevelopers cannot merge without approvalsCODEOWNERS file
No Direct PushesAll changes via PR (even admins)Branch protection rules
Audit TrailPR approval history provides evidenceGitHub API, audit logs

Evidence for Auditors:

# Show branch protection requires 2 approvals
gh api repos/firestoned/bindy/branches/main/protection | jq '.required_pull_request_reviews'

# Expected output:
# {
#   "required_approving_review_count": 2,
#   "dismiss_stale_reviews": true,
#   "require_code_owner_reviews": false
# }

Audit Questions:

  • Q: Can one person make and approve changes? No, 2+ approvers required, PR author excluded
  • Q: Can admins bypass controls? No, branch protection applies to admins
  • Q: How do you verify segregation? GitHub audit logs show separate approver identities

5. Evidence Collection for SOX 404 Audits

What Auditors Need:

Provide the following evidence package for SOX 404 auditors:

  1. Change Management Evidence:

    • Git commit log (last 12 months) with signatures: git log --show-signature --since="1 year ago" > commits.txt
    • PR approval history: gh pr list --state merged --since "1 year ago" --json number,title,reviews > pr-approvals.json
    • CI/CD test results: GitHub Actions workflow artifacts
    • CHANGELOG.md showing all changes with author attribution
  2. Access Control Evidence:

    • RBAC policy: deploy/rbac/clusterrole.yaml
    • RBAC verification output: CI/CD artifact rbac-verification.txt
    • Quarterly access review reports: docs/compliance/access-reviews/
    • Secret access audit trail: Elasticsearch query Q1 results (last 12 months)
  3. Audit Logging Evidence:

    • S3 bucket configuration (lifecycle, WORM, encryption): aws s3api describe-bucket.json
    • Log integrity verification results: CronJob output (last 12 months)
    • Sample audit logs (redacted): Elasticsearch export for specific date range
    • Audit log access logs (meta-logging): S3 server access logs
  4. Incident Response Evidence:

    • Incident response playbooks: docs/security/INCIDENT_RESPONSE.md
    • Incident logs (if any occurred): S3 bindy-audit-logs/incidents/
    • Tabletop exercise results: Semi-annual drill reports

SOX 404 Audit Readiness Checklist

Use this checklist quarterly to ensure SOX 404 audit readiness:

  • Change Management:

    • All commits in last 90 days are signed (run: git log --show-signature --since="90 days ago")
    • All PRs have 2+ approvals (run: gh pr list --state merged --since "90 days ago" --json reviews)
    • CI/CD tests passed on all merged PRs (check GitHub Actions)
    • CHANGELOG.md is up to date with author attribution
  • Access Controls:

    • RBAC verification script passes (run: ./deploy/rbac/verify-rbac.sh)
    • Quarterly access review completed (due: Week 1 of Q1/Q2/Q3/Q4)
    • Secret access audit query Q2 returns 0 results (no unauthorized access)
    • 2FA enabled for all contributors (verify in GitHub org settings)
  • Audit Logging:

    • S3 WORM (Object Lock) enabled on audit log bucket
    • Log integrity verification CronJob running daily
    • Last 90 days of audit logs in Elasticsearch (query: GET /bindy-audit-*/_count)
    • S3 lifecycle policy enforces 7-year retention
  • Documentation:

    • Security documentation up to date (docs/security/*.md)
    • Compliance roadmap reflects current status (.github/COMPLIANCE_ROADMAP.md)
    • Incident response playbooks tested in last 6 months (tabletop exercise)

Quarterly SOX 404 Attestation

Sample Attestation Letter (for CFO/CIO signature):

[Company Letterhead]

SOX 404 IT General Controls Attestation
Q4 2025 - Bindy DNS Infrastructure

I, [CFO Name], certify that for the quarter ended December 31, 2025, the Bindy DNS
infrastructure has maintained effective IT General Controls in compliance with
Sarbanes-Oxley Act Section 404:

1. Change Management Controls:
   - ✅ 127 code changes reviewed and approved via 2+ person process
   - ✅ 100% of commits cryptographically signed
   - ✅ 0 unauthorized changes detected

2. Access Control Controls:
   - ✅ RBAC least privilege verified (automated script passes)
   - ✅ Quarterly access review completed (2025-12-15)
   - ✅ 0 unauthorized secret access events detected

3. Audit Logging Controls:
   - ✅ 7-year audit log retention enforced (WORM storage)
   - ✅ Daily log integrity verification passed (100% checksums valid)
   - ✅ Audit logs available for entire quarter

4. Segregation of Duties:
   - ✅ 2+ approvers required for all code changes
   - ✅ No self-approvals detected
   - ✅ Branch protection enforced (no direct pushes to main)

Based on my review and testing, I conclude that internal controls over Bindy DNS
infrastructure were operating effectively as of December 31, 2025.

Signature: ___________________________
[CFO Name], Chief Financial Officer
Date: 2025-12-31

Common SOX 404 Audit Findings (And How Bindy Addresses Them)

Common FindingHow Bindy Addresses ItEvidence
Unsigned commits✅ All commits GPG/SSH signed, branch protection enforcesGit log, GitHub branch protection
Single approver for changes✅ 2+ approvers required, enforced by GitHubPR approval history
No audit trail for changesCHANGELOG.md + Git history + signed commitsCHANGELOG.md, git log
Logs not retained 7 years✅ S3 lifecycle policy enforces 7-year retentionS3 bucket configuration
Logs can be tampered with✅ S3 Object Lock (WORM) prevents tamperingS3 bucket configuration
No access reviews✅ Quarterly access reviews documenteddocs/compliance/access-reviews/
Excessive privileges✅ Controller minimal RBAC (delete only for lifecycle management)RBAC policy, verification script
No incident response plan✅ 7 incident playbooks (P1-P7) documenteddocs/security/INCIDENT_RESPONSE.md

See Also

PCI-DSS Compliance

Payment Card Industry Data Security Standard


Overview

The Payment Card Industry Data Security Standard (PCI-DSS) is a set of security standards designed to ensure that all companies that accept, process, store, or transmit credit card information maintain a secure environment.

While Bindy itself does not process payment card data, it operates in a payment card processing environment and must comply with PCI-DSS requirements as part of the overall security infrastructure.

Why Bindy is In-Scope for PCI-DSS:

  1. Supports Cardholder Data Environment (CDE): Bindy provides DNS resolution for payment processing systems
  2. Service Availability: DNS outages prevent access to payment systems (PCI-DSS 12.10 - incident response)
  3. Secure Development: Code handling DNS data must follow secure development practices (PCI-DSS 6.x)
  4. Access Controls: Secret management follows least privilege (PCI-DSS 7.x)
  5. Audit Logging: All system access logged (PCI-DSS 10.x)

PCI-DSS Requirements Applicable to Bindy

PCI-DSS has 12 requirements organized into 6 control objectives. Bindy complies with the following:

PCI-DSS RequirementDescriptionBindy Status
6.2Ensure all system components are protected from known vulnerabilities✅ Complete
6.4.1Secure coding practices✅ Complete
6.4.6Code review before production release✅ Complete
7.1.2Restrict access based on need-to-know✅ Complete
10.2.1Implement audit trails✅ Complete
10.5.1Protect audit trail from unauthorized modification✅ Complete
12.1Establish security policies✅ Complete
12.10Implement incident response plan✅ Complete

Requirement 6: Secure Systems and Applications

6.2 - Ensure All System Components Are Protected from Known Vulnerabilities

Requirement: Apply security patches and updates within defined timeframes based on risk.

Bindy Implementation:

ControlImplementationEvidence
Daily Vulnerability Scanningcargo audit runs daily at 00:00 UTCGitHub Actions workflow logs
CI/CD Scanningcargo audit --deny warnings fails PR on CRITICAL/HIGH CVEsGitHub Actions PR checks
Container Image ScanningTrivy scans all container images (CRITICAL, HIGH, MEDIUM, LOW)GitHub Security tab, SARIF reports
Remediation SLAsCRITICAL (24h), HIGH (7d), MEDIUM (30d), LOW (90d)Vulnerability Management Policy
Automated AlertsGitHub Security Advisories create issues automaticallyGitHub Security tab

Remediation Tracking:

# Check for open vulnerabilities
cargo audit

# View vulnerability history
gh api repos/firestoned/bindy/security-advisories

# Show remediation SLA compliance
# (All CRITICAL vulnerabilities patched within 24 hours)
cat docs/security/VULNERABILITY_MANAGEMENT.md

Evidence for QSA (Qualified Security Assessor):

  • Vulnerability Scan Results: GitHub Security tab → Code scanning alerts
  • Remediation Evidence: GitHub issues tagged security, vulnerability
  • Patch History: CHANGELOG.md entries for security updates
  • SLA Compliance: Monthly vulnerability remediation reports

Compliance Status:PASS - Daily scanning, automated remediation tracking, SLAs met


6.4.1 - Secure Coding Practices

Requirement: Develop software applications based on industry standards and best practices.

Bindy Implementation:

ControlImplementationEvidence
Input ValidationAll DNS zone names validated against RFC 1035src/bind9.rs:validate_zone_name()
Error HandlingNo panics in production (use Result<T, E>)cargo clippy -- -D warnings
Secure DependenciesAll dependencies from crates.io (verified sources)Cargo.toml, Cargo.lock
No Hardcoded SecretsPre-commit hooks detect secretsGitHub Advanced Security
Memory SafetyRust’s borrow checker prevents buffer overflowsRust language guarantees
Logging Best PracticesNo sensitive data in logs (PII, secrets)Code review checks

OWASP Top 10 Mitigations:

OWASP RiskBindy Mitigation
A01: Broken Access Control✅ RBAC least privilege (minimal delete permissions for lifecycle management)
A02: Cryptographic Failures✅ TLS for all API calls, secrets in Kubernetes Secrets
A03: Injection✅ Parameterized DNS zone updates (RNDC), input validation
A04: Insecure Design✅ Threat model (STRIDE), security architecture documented
A05: Security Misconfiguration✅ Minimal RBAC, non-root containers, read-only filesystem
A06: Vulnerable Components✅ Daily cargo audit, Trivy container scanning
A07: Identification/Authentication✅ Kubernetes ServiceAccount auth, signed commits
A08: Software/Data Integrity✅ Signed commits, SBOM, reproducible builds
A09: Logging Failures✅ Comprehensive logging (controller, audit, DNS queries)
A10: Server-Side Request Forgery✅ No external HTTP calls (only Kubernetes API, RNDC)

Evidence for QSA:

  • Code Review Records: GitHub PR approval history
  • Static Analysis: cargo clippy results (all PRs)
  • Security Training: CONTRIBUTING.md - secure coding guidelines
  • Threat Model: docs/security/THREAT_MODEL.md - STRIDE analysis

Compliance Status:PASS - Rust memory safety, OWASP Top 10 mitigations, secure coding guidelines


6.4.6 - Code Review Before Production Release

Requirement: All code changes reviewed by individuals other than the original author before release.

Bindy Implementation:

ControlImplementationEvidence
2+ Reviewers RequiredGitHub branch protection enforces 2 approvalsBranch protection rules
No Self-ApprovalPR author cannot approve own PRGitHub settings
Signed CommitsAll commits GPG/SSH signed (non-repudiation)Git commit log
Automated Security Checkscargo audit, cargo clippy, cargo test must passGitHub Actions status checks
Change DocumentationAll changes documented in CHANGELOG.mdCHANGELOG.md

Code Review Checklist:

Every PR is reviewed for:

  • ✅ Security vulnerabilities (injection, XSS, secrets in code)
  • ✅ Input validation (DNS zone names, RNDC keys)
  • ✅ Error handling (no panics, proper Result usage)
  • ✅ Logging (no PII/secrets in logs)
  • ✅ Tests (unit tests for new code, integration tests for features)

Evidence for QSA:

# Show PR approval history (last 6 months)
gh pr list --state merged --since "6 months ago" --json number,title,reviews

# Show commit signatures
git log --show-signature --since="6 months ago"

# Show CI/CD security check results
gh run list --workflow ci.yaml --limit 100

Compliance Status:PASS - 2+ reviewers, signed commits, automated security checks


Requirement 7: Restrict Access to Cardholder Data

7.1.2 - Restrict Access Based on Need-to-Know

Requirement: Limit access to system components and cardholder data to only those individuals whose job requires such access.

Bindy Implementation:

ControlImplementationEvidence
Least Privilege RBACController minimal RBAC (create/delete secrets for RNDC lifecycle, delete managed resources for cleanup)deploy/rbac/clusterrole.yaml
Minimal Delete PermissionsController delete limited to managed resources (finalizer cleanup, scaling)RBAC verification script
Secret Access Audit TrailAll secret access logged (7-year retention)Secret Access Audit Trail
Quarterly Access ReviewsSecurity team reviews access every quarterAccess review reports
Role-Based AccessDifferent roles for dev, ops, security teamsGitHub team permissions

RBAC Policy Verification:

# Verify controller has minimal permissions
./deploy/rbac/verify-rbac.sh

# Expected output:
# ✅ Controller can READ secrets (get, list, watch)
# ✅ Controller can CREATE/DELETE secrets (RNDC key lifecycle only)
# ✅ Controller CANNOT UPDATE/PATCH secrets (immutable pattern)
# ✅ Controller can DELETE managed resources (Bind9Instance, Bind9Cluster, finalizer cleanup)
# ✅ Controller CANNOT DELETE user resources (DNSZone, Records, Bind9GlobalCluster)

Secret Access Monitoring:

# Query: Non-controller secret access (should return 0 results)
curl -X POST "https://elasticsearch:9200/bindy-audit-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "bool": {
        "must": [
          { "term": { "objectRef.resource": "secrets" } },
          { "term": { "objectRef.namespace": "dns-system" } }
        ],
        "must_not": [
          { "term": { "user.username.keyword": "system:serviceaccount:dns-system:bindy-controller" } }
        ]
      }
    }
  }'

# Expected: 0 hits (only authorized controller accesses secrets)

Evidence for QSA:

  • RBAC Policy: deploy/rbac/clusterrole.yaml
  • RBAC Verification: CI/CD artifact rbac-verification.txt
  • Secret Access Logs: Elasticsearch query results (quarterly)
  • Access Reviews: docs/compliance/access-reviews/YYYY-QN.md

Compliance Status:PASS - Least privilege RBAC, quarterly access reviews, audit trail


Requirement 10: Log and Monitor All Access

10.2.1 - Implement Audit Trails

Requirement: Implement automated audit trails for all system components to reconstruct the following events:

  • All individual user accesses to cardholder data
  • Actions taken by individuals with root/admin privileges
  • Access to all audit trails
  • Invalid logical access attempts
  • Use of identification/authentication mechanisms
  • Initialization, stopping, or pausing of audit logs
  • Creation and deletion of system-level objects

Bindy Implementation:

ControlImplementationEvidence
Kubernetes Audit LogsAll API requests logged (CRD ops, secret access)Kubernetes audit policy
Secret Access LoggingAll secret get/list/watch loggeddocs/security/SECRET_ACCESS_AUDIT.md
Controller LogsAll reconciliation loops, DNS updatesFluent Bit, S3 storage
Access AttemptsFailed secret access (403 Forbidden) loggedKubernetes audit logs
Authentication EventsServiceAccount token usage loggedKubernetes audit logs

Audit Log Fields (PCI-DSS 10.2.1 Compliance):

PCI-DSS RequirementBindy Audit Log FieldExample Value
User identificationuser.usernamesystem:serviceaccount:dns-system:bindy-controller
Type of eventverbget, list, watch, create, update, delete
Date and timerequestReceivedTimestamp2025-12-18T12:34:56.789Z (ISO 8601 UTC)
Success/failure indicationresponseStatus.code200 (success), 403 (forbidden)
Origination of eventsourceIPs10.244.1.15 (pod IP)
Identity of affected dataobjectRef.namerndc-key-primary (secret name)

Sample Audit Log Entry:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "a4b5c6d7-e8f9-0a1b-2c3d-4e5f6a7b8c9d",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/dns-system/secrets/rndc-key-primary",
  "verb": "get",
  "user": {
    "username": "system:serviceaccount:dns-system:bindy-controller",
    "uid": "abc123",
    "groups": ["system:serviceaccounts", "system:serviceaccounts:dns-system"]
  },
  "sourceIPs": ["10.244.1.15"],
  "objectRef": {
    "resource": "secrets",
    "namespace": "dns-system",
    "name": "rndc-key-primary",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "code": 200
  },
  "requestReceivedTimestamp": "2025-12-18T12:34:56.789Z"
}

Evidence for QSA:

# Show audit logs for last 30 days (sample)
curl -X POST "https://elasticsearch:9200/bindy-audit-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "range": {
        "requestReceivedTimestamp": {
          "gte": "now-30d"
        }
      }
    },
    "size": 100
  }' | jq .

# Show failed access attempts (last 30 days)
curl -X POST "https://elasticsearch:9200/bindy-audit-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "bool": {
        "must": [
          { "term": { "responseStatus.code": 403 } },
          { "range": { "requestReceivedTimestamp": { "gte": "now-30d" } } }
        ]
      }
    }
  }' | jq .

Compliance Status:PASS - All PCI-DSS 10.2.1 fields captured, audit logs retained 7 years


10.5.1 - Protect Audit Trail from Unauthorized Modification

Requirement: Limit viewing of audit trails to those with a job-related need.

Bindy Implementation:

ControlImplementationEvidence
Immutable StorageS3 Object Lock (WORM) prevents log deletion/modificationS3 bucket configuration
Access ControlsIAM policies restrict S3 access to security team onlyAWS IAM policy
Access Logging (Meta-Logging)S3 server access logs track who reads audit logsS3 access logs
Integrity VerificationSHA-256 checksums verify logs not tamperedDaily CronJob output
Encryption at RestS3 SSE-S3 encryption for all audit logsS3 bucket configuration
Encryption in TransitTLS 1.3 for all S3 API callsAWS default

S3 WORM (Object Lock) Configuration:

# Show Object Lock enabled
aws s3api get-object-lock-configuration --bucket bindy-audit-logs

# Expected output:
# {
#   "ObjectLockConfiguration": {
#     "ObjectLockEnabled": "Enabled",
#     "Rule": {
#       "DefaultRetention": {
#         "Mode": "GOVERNANCE",
#         "Days": 2555
#       }
#     }
#   }
# }

IAM Policy (Audit Log Access):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDelete",
      "Effect": "Deny",
      "Principal": "*",
      "Action": [
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": "arn:aws:s3:::bindy-audit-logs/*"
    },
    {
      "Sid": "SecurityTeamReadOnly",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/SecurityTeam"
      },
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::bindy-audit-logs",
        "arn:aws:s3:::bindy-audit-logs/*"
      ]
    }
  ]
}

Evidence for QSA:

  • S3 Bucket Policy: AWS IAM policy (deny delete, security team read-only)
  • Object Lock Configuration: aws s3api get-object-lock-configuration
  • Integrity Verification: CronJob logs (daily SHA-256 checksum verification)
  • Access Logs: S3 server access logs (who accessed audit logs)

Compliance Status:PASS - Immutable WORM storage, access controls, integrity verification


Requirement 12: Maintain a Security Policy

12.1 - Establish, Publish, Maintain, and Disseminate a Security Policy

Requirement: Establish, publish, maintain, and disseminate a security policy that addresses all PCI-DSS requirements.

Bindy Implementation:

Policy DocumentLocationLast Updated
Security PolicySECURITY.md2025-12-18
Threat Modeldocs/security/THREAT_MODEL.md2025-12-17
Security Architecturedocs/security/ARCHITECTURE.md2025-12-17
Incident Responsedocs/security/INCIDENT_RESPONSE.md2025-12-17
Vulnerability Managementdocs/security/VULNERABILITY_MANAGEMENT.md2025-12-15
Audit Log Retentiondocs/security/AUDIT_LOG_RETENTION.md2025-12-18

Evidence for QSA:

  • Published Policies: All policies in GitHub repository (public access)
  • Version Control: Git history shows policy updates and reviews
  • Annual Review: Policies reviewed quarterly (Next Review: 2026-03-18)

Compliance Status:PASS - Security policies documented, published, and maintained


12.10 - Implement an Incident Response Plan

Requirement: Implement an incident response plan. Be prepared to respond immediately to a system breach.

Bindy Implementation:

Incident TypePlaybookResponse TimeSLA
Critical Vulnerability (CVSS 9.0-10.0)P1< 15 minutesPatch within 24 hours
Compromised Controller PodP2< 15 minutesIsolate within 1 hour
DNS Service OutageP3< 15 minutesRestore within 4 hours
RNDC Key CompromiseP4< 15 minutesRotate keys within 1 hour
Unauthorized DNS ChangesP5< 1 hourRevert within 4 hours
DDoS AttackP6< 15 minutesMitigate within 1 hour
Supply Chain CompromiseP7< 15 minutesRebuild within 24 hours

Incident Response Process (NIST Lifecycle):

  1. Preparation: Playbooks documented, tools configured, team trained
  2. Detection & Analysis: Prometheus alerts, audit log analysis
  3. Containment: Isolate affected systems, prevent escalation
  4. Eradication: Remove threat, patch vulnerability
  5. Recovery: Restore service, verify integrity
  6. Post-Incident Activity: Document lessons learned, improve defenses

Evidence for QSA:

  • Incident Response Playbooks: docs/security/INCIDENT_RESPONSE.md
  • Tabletop Exercise Results: Semi-annual drill reports
  • Incident Logs: S3 bindy-audit-logs/incidents/ (if any incidents occurred)

Compliance Status:PASS - 7 incident playbooks documented, tabletop exercises conducted


PCI-DSS Audit Evidence Package

For your annual PCI-DSS assessment, provide the QSA with:

  1. Requirement 6 (Secure Systems):

    • Vulnerability scan results (GitHub Security tab)
    • Remediation tracking (GitHub issues, CHANGELOG.md)
    • Code review records (PR approval history)
    • Static analysis results (cargo clippy, cargo audit)
  2. Requirement 7 (Access Controls):

    • RBAC policy (deploy/rbac/clusterrole.yaml)
    • RBAC verification output (CI/CD artifact)
    • Quarterly access review reports
    • Secret access audit logs (Elasticsearch query results)
  3. Requirement 10 (Logging):

    • Sample audit logs (redacted, last 30 days)
    • S3 bucket configuration (WORM, encryption, access controls)
    • Log integrity verification results (CronJob output)
    • Audit log access logs (meta-logging, S3 server access logs)
  4. Requirement 12 (Policies):

    • Security policies (SECURITY.md, docs/security/*.md)
    • Incident response playbooks
    • Tabletop exercise results

See Also

Basel III Compliance

Basel III: International Regulatory Framework for Banks


Overview

Basel III is an international regulatory framework for banks developed by the Basel Committee on Banking Supervision (BCBS). While primarily focused on capital adequacy, liquidity risk, and leverage ratios, Basel III also includes operational risk requirements that cover technology and cyber risk.

Bindy, as critical DNS infrastructure in a regulated banking environment, falls under Basel III operational risk management requirements.

Key Basel III Areas Applicable to Bindy:

  1. Operational Risk (Pillar 1): Technology failures, cyber attacks, service disruptions
  2. Cyber Risk Management (2018 Principles): Cybersecurity governance, threat monitoring, incident response
  3. Business Continuity (Pillar 2): Disaster recovery, high availability, resilience
  4. Operational Resilience (2021 Principles): Ability to withstand severe operational disruptions

Basel III Cyber Risk Principles

The Basel Committee published Cyber Risk Principles in 2018, which define expectations for banks’ cybersecurity programs. Bindy complies with these principles:

Principle 1: Governance

Requirement: Board and senior management should establish a comprehensive cyber risk management framework.

Bindy Implementation:

ControlImplementationEvidence
Security PolicyComprehensive security policy documentedSECURITY.md
Threat ModelSTRIDE threat analysis with 15 threatsThreat Model
Security Architecture5 security domains documentedSecurity Architecture
Incident Response7 playbooks for critical/high incidentsIncident Response
Compliance RoadmapTracking compliance implementationCompliance Roadmap

Evidence:

  • Security documentation (4,010 lines across 7 documents)
  • Compliance tracking (H-1 through H-4 complete)
  • Quarterly security reviews

Status:COMPLIANT - Comprehensive cyber risk framework documented


Principle 2: Risk Identification and Assessment

Requirement: Banks should identify and assess cyber risks as part of operational risk management.

Bindy Implementation:

Risk CategoryIdentified ThreatsImpactMitigation
SpoofingCompromised Kubernetes API, stolen ServiceAccount tokensHIGHRBAC least privilege, short-lived tokens, network policies
TamperingMalicious DNS zone changes, RNDC key compromiseCRITICALRBAC read-only, signed commits, audit logging
RepudiationUntracked DNS changes, no audit trailHIGHSigned commits, audit logs (7-year retention), WORM storage
Information DisclosureSecret leakage, DNS data exposureCRITICALKubernetes Secrets, RBAC, secret access audit trail
Denial of ServiceDNS query flood, pod resource exhaustionHIGHRate limiting (planned), pod resource limits, DDoS playbook
Elevation of PrivilegeController pod compromise, RBAC bypassCRITICALNon-root containers, read-only filesystem, minimal RBAC

Attack Surface Analysis:

Attack VectorExposureRisk LevelMitigation Status
Kubernetes APIInternal cluster networkHIGH✅ RBAC, audit logs, network policies (planned)
DNS Port 53Public internetHIGH✅ BIND9 hardening, DDoS playbook
RNDC Port 953Internal cluster networkCRITICAL✅ Secret rotation, access audit, incident playbook P4
Container ImagesPublic registriesMEDIUM✅ Trivy scanning, Chainguard zero-CVE images
CRDs (Custom Resources)Kubernetes APIMEDIUM✅ Input validation, RBAC, audit logs
Git RepositoryPublic GitHubLOW✅ Signed commits, branch protection, code review

Evidence:

Status:COMPLIANT - Comprehensive risk identification and mitigation


Principle 3: Access Controls

Requirement: Banks should implement strong access controls, including least privilege.

Bindy Implementation:

ControlImplementationEvidence
Least Privilege RBACController minimal RBAC (create/delete secrets for RNDC lifecycle, delete managed resources for cleanup)deploy/rbac/clusterrole.yaml
Secret Access MonitoringAll secret access logged and alertedSecret Access Audit Trail
Quarterly Access ReviewsSecurity team reviews access every quarterdocs/compliance/access-reviews/
2FA EnforcementGitHub requires 2FA for all contributorsGitHub organization settings
Signed CommitsCryptographic proof of code authorshipGit commit signatures

Access Control Matrix:

RoleSecretsCRDsPodsConfigMapsNodes
ControllerCreate/Delete (RNDC keys)Read/Write/Delete (managed)ReadRead/Write/DeleteRead
BIND9 PodsRead-onlyNoneNoneReadNone
DevelopersNoneRead (kubectl)Read (logs)ReadNone
OperatorsRead (kubectl)Read/Write (kubectl)Read/WriteRead/WriteRead
Security TeamRead (audit logs)ReadReadReadRead

Evidence:

  • RBAC policy: deploy/rbac/clusterrole.yaml
  • RBAC verification: ./deploy/rbac/verify-rbac.sh
  • Secret access logs: Elasticsearch query Q1 (quarterly)
  • Access review reports: docs/compliance/access-reviews/YYYY-QN.md

Status:COMPLIANT - Least privilege access, quarterly reviews, audit trail


Principle 4: Threat and Vulnerability Management

Requirement: Banks should implement a threat and vulnerability management process.

Bindy Implementation:

ActivityFrequencyToolRemediation SLA
Dependency ScanningDaily (00:00 UTC)cargo auditCRITICAL (24h), HIGH (7d)
Container Image ScanningEvery PR + DailyTrivyCRITICAL (24h), HIGH (7d)
Code Security ReviewEvery PRManual + cargo clippyBefore merge
Penetration TestingAnnualExternal firm90 days
Threat IntelligenceContinuousGitHub Security AdvisoriesAs detected

Vulnerability Remediation SLAs:

SeverityCVSS ScoreResponse TimeRemediation SLAStatus
CRITICAL9.0-10.0< 15 minutes24 hours✅ Enforced
HIGH7.0-8.9< 1 hour7 days✅ Enforced
MEDIUM4.0-6.9< 4 hours30 days✅ Enforced
LOW0.1-3.9< 24 hours90 days✅ Enforced

Evidence:

  • Vulnerability Management Policy
  • GitHub Security tab - Vulnerability scan results
  • CHANGELOG.md - Remediation history
  • Monthly vulnerability remediation reports

Status:COMPLIANT - Daily scanning, defined SLAs, automated tracking


Principle 5: Cyber Resilience and Response

Requirement: Banks should have incident response and business continuity plans for cyber incidents.

Bindy Implementation:

Incident Response Playbooks (7 Total):

PlaybookScenarioResponse TimeRecovery SLA
P1: Critical VulnerabilityCVSS 9.0-10.0 vulnerability detected< 15 minutesPatch within 24 hours
P2: Compromised ControllerController pod shows anomalous behavior< 15 minutesIsolate within 1 hour
P3: DNS Service OutageAll BIND9 pods down, queries failing< 15 minutesRestore within 4 hours
P4: RNDC Key CompromiseRNDC key leaked or unauthorized access< 15 minutesRotate keys within 1 hour
P5: Unauthorized DNS ChangesUnexpected zone modifications detected< 1 hourRevert within 4 hours
P6: DDoS AttackDNS query flood, resource exhaustion< 15 minutesMitigate within 1 hour
P7: Supply Chain CompromiseMalicious commit or compromised dependency< 15 minutesRebuild within 24 hours

Business Continuity:

CapabilityImplementationRTO (Recovery Time Objective)RPO (Recovery Point Objective)
High AvailabilityMulti-pod deployment (3+ replicas)0 (no downtime)0 (no data loss)
Zone ReplicationPrimary + Secondary DNS instances< 5 minutes< 1 minute (zone transfer)
Disaster RecoveryMulti-region deployment (planned)< 1 hour< 5 minutes
Data BackupDNS zones in Git + etcd backups< 4 hours< 1 hour

Evidence:

Status:COMPLIANT - 7 incident playbooks, business continuity plan


Principle 6: Dependency on Third Parties

Requirement: Banks should manage cyber risks associated with third-party service providers.

Bindy Third-Party Dependencies:

DependencyPurposeRisk LevelMitigation
BIND9DNS server softwareMEDIUMChainguard zero-CVE images, Trivy scanning
KubernetesOrchestration platformMEDIUMManaged Kubernetes (EKS, GKE, AKS), regular updates
Rust DependenciesBuild-time librariesLOWDaily cargo audit, crates.io verified sources
Container RegistriesImage distributionLOWGHCR (GitHub), signed images, SBOM
AWS S3Audit log storageLOWEncryption at rest/transit, WORM, IAM access controls

Third-Party Risk Management:

ControlImplementationEvidence
Dependency VettingOnly use actively maintained dependencies (commits in last 6 months)Cargo.toml review
Vulnerability ScanningDaily cargo audit, Trivy container scanningGitHub Security tab
Supply Chain SecuritySigned commits, SBOM, reproducible buildsBuild Reproducibility
Vendor AssessmentsAnnual review of critical vendors (BIND9, Kubernetes)Vendor assessment reports

Evidence:

  • Cargo.toml, Cargo.lock - Pinned dependency versions
  • SBOM (Software Bill of Materials) - Release artifacts
  • Vendor assessment reports (annual)

Status:COMPLIANT - Third-party dependencies vetted, scanned, monitored


Principle 7: Information Sharing

Requirement: Banks should participate in information sharing to enhance cyber resilience.

Bindy Information Sharing:

ActivityFrequencyAudiencePurpose
Security AdvisoriesAs neededPublic (GitHub)Coordinated disclosure of vulnerabilities
Threat IntelligenceContinuousSecurity teamSubscribe to GitHub Security Advisories, CVE feeds
Incident ReportsAfter incidentsInternal + RegulatorsPost-incident review, lessons learned
Compliance ReportingQuarterlyRisk committeeBasel III operational risk reporting

Evidence:

  • GitHub Security Advisories (if any published)
  • Quarterly risk committee reports
  • Incident post-mortems (if any occurred)

Status:COMPLIANT - Active participation in threat intelligence sharing


Basel III Operational Risk Reporting

Quarterly Operational Risk Report Template:

[Bank Letterhead]

Basel III Operational Risk Report
Q4 2025 - Bindy DNS Infrastructure

Reporting Period: October 1 - December 31, 2025
Prepared by: [Security Team Lead]
Reviewed by: [Chief Risk Officer]

1. OPERATIONAL RISK EVENTS

   1.1 Cyber Incidents:
       - 0 critical incidents
       - 0 high-severity incidents
       - 2 medium-severity incidents (P3: DNS Service Outage)
         - Root cause: Kubernetes pod OOMKilled (memory limit too low)
         - Resolution: Increased memory limit from 512Mi to 1Gi
         - RTO achieved: 15 minutes (target: 4 hours)
       - 0 data breaches

   1.2 Service Availability:
       - Uptime: 99.98% (target: 99.9%)
       - DNS query success rate: 99.99%
       - Mean time to recovery (MTTR): 15 minutes

   1.3 Vulnerability Management:
       - Vulnerabilities detected: 12 (3 HIGH, 9 MEDIUM)
       - Remediation SLA compliance: 100%
       - Average time to remediate: 3.5 days (CRITICAL/HIGH)

2. COMPLIANCE STATUS

   2.1 Basel III Cyber Risk Principles:
       - ✅ Principle 1 (Governance): Security policies documented
       - ✅ Principle 2 (Risk Assessment): Threat model updated Q4 2025
       - ✅ Principle 3 (Access Controls): Quarterly access review completed
       - ✅ Principle 4 (Vulnerability Mgmt): SLAs met (100%)
       - ✅ Principle 5 (Resilience): Tabletop exercise conducted
       - ✅ Principle 6 (Third Parties): Vendor assessments completed
       - ✅ Principle 7 (Info Sharing): Threat intelligence active

   2.2 Audit Trail:
       - Audit logs retained: 7 years (WORM storage)
       - Log integrity verification: 100% pass rate
       - Secret access reviews: Quarterly (last: 2025-12-15)

3. RISK MITIGATION ACTIONS

   3.1 Completed (Phase 2):
       - ✅ H-1: Security Policy and Threat Model
       - ✅ H-2: Audit Log Retention Policy
       - ✅ H-3: Secret Access Audit Trail
       - ✅ H-4: Build Reproducibility Verification

   3.2 Planned (Phase 3):
       - L-1: Implement NetworkPolicies (Q1 2026)
       - M-3: Implement Rate Limiting (Q1 2026)

4. REGULATORY REPORTING

   4.1 PCI-DSS: Annual audit scheduled (Q1 2026)
   4.2 SOX 404: Quarterly ITGC attestation provided
   4.3 Basel III: This report (quarterly)

Approved by:
[Chief Risk Officer Signature]
Date: 2025-12-31

Basel III Audit Evidence

For Basel III operational risk reviews, provide:

  1. Cyber Risk Framework:

    • Security policies (SECURITY.md, docs/security/*.md)
    • Threat model (STRIDE analysis)
    • Security architecture documentation
  2. Incident Response:

    • Incident response playbooks (P1-P7)
    • Incident logs (if any occurred)
    • Tabletop exercise results (semi-annual)
  3. Vulnerability Management:

    • Vulnerability scan results (GitHub Security tab)
    • Remediation tracking (GitHub issues, CHANGELOG.md)
    • Monthly remediation reports
  4. Access Controls:

    • RBAC policy and verification output
    • Quarterly access review reports
    • Secret access audit logs
  5. Audit Trail:

    • S3 bucket configuration (WORM, retention)
    • Log integrity verification results
    • Sample audit logs (redacted)
  6. Business Continuity:

    • High availability architecture
    • Disaster recovery procedures
    • RTO/RPO metrics

See Also

SLSA Compliance

Supply-chain Levels for Software Artifacts


Overview

SLSA (Supply-chain Levels for Software Artifacts, pronounced “salsa”) is a security framework developed by Google to prevent supply chain attacks. It defines a series of incrementally adoptable security levels (0-3) that provide increasing supply chain security guarantees.

Bindy’s SLSA Status:Level 3 (highest level)


SLSA Requirements by Level

RequirementLevel 1Level 2Level 3Bindy Status
Source - Version controlled✅ Git (GitHub)
Source - Verified history✅ Signed commits
Source - Retained indefinitely✅ GitHub (permanent)
Source - Two-person reviewed✅ 2+ PR approvals
Build - Scripted build✅ Cargo + Docker
Build - Build service✅ GitHub Actions
Build - Build as code✅ Workflows in Git
Build - Ephemeral environment✅ Fresh runners
Build - Isolated✅ No secrets accessible
Build - Hermetic⚠️ Partial (cargo fetch)
Build - Reproducible✅ Bit-for-bit
Provenance - Available✅ SBOM + signatures
Provenance - Authenticated✅ Signed tags
Provenance - Service generated✅ GitHub Actions
Provenance - Non-falsifiable✅ Cryptographic signatures
Provenance - Dependencies complete✅ Cargo.lock + SBOM

SLSA Level 3 Detailed Compliance

Source Requirements

✅ Requirement: Version controlled with verified history

ControlImplementationEvidence
Git Version ControlAll source code in GitHubGitHub repository
Signed CommitsAll commits GPG/SSH signedgit log --show-signature
Verified HistoryBranch protection prevents history rewritingGitHub branch protection
Two-Person Review2+ approvals required for all PRsPR approval logs
Permanent RetentionGit history never deletedGitHub repository settings

Evidence:

# Show all commits are signed (last 90 days)
git log --show-signature --since="90 days ago" --oneline

# Show branch protection (prevents force push, history rewriting)
gh api repos/firestoned/bindy/branches/main/protection | jq

Build Requirements

✅ Requirement: Build process is fully scripted and reproducible

ControlImplementationEvidence
Scripted BuildCargo (Rust), Docker (containers)Cargo.toml, Dockerfile
Build as CodeGitHub Actions workflows in version control.github/workflows/*.yaml
Ephemeral EnvironmentFresh GitHub-hosted runners for each buildGitHub Actions logs
IsolatedBuild cannot access secrets or network (after deps fetched)GitHub Actions sandboxing
Hermetic⚠️ Partial - cargo fetch uses networkWorking toward full hermetic
ReproducibleTwo builds from same commit = identical binaryBuild Reproducibility

Build Reproducibility Verification:

# Automated verification (daily CI/CD)
# Builds binary twice, compares SHA-256 hashes
.github/workflows/reproducibility-check.yaml

# Manual verification (external auditors)
scripts/verify-build.sh v0.1.0

Sources of Non-Determinism (Mitigated):

  1. Timestamps → Use vergen for deterministic Git commit timestamps
  2. Filesystem order → Sort files before processing
  3. HashMap iteration → Use BTreeMap for deterministic order
  4. Parallelism → Sort output after parallel processing
  5. Base image updates → Pin base image digests in Dockerfile

Evidence:


Provenance Requirements

✅ Requirement: Build provenance is available, authenticated, and non-falsifiable

ArtifactProvenance TypeSignatureAvailability
Rust BinarySHA-256 checksumGPG-signed Git tagGitHub Releases
Container ImageImage digestSBOM + attestationGHCR (GitHub Container Registry)
SBOMCycloneDX formatIncluded in releaseGitHub Releases (*.sbom.json)
Source CodeGit commitGPG/SSH signatureGitHub repository

SBOM Generation:

# Generate SBOM (Software Bill of Materials)
cargo install cargo-cyclonedx
cargo cyclonedx --format json --output bindy.sbom.json

# SBOM includes all dependencies with exact versions
cat bindy.sbom.json | jq '.components[] | {name, version}'

Evidence:

  • GitHub Releases: https://github.com/firestoned/bindy/releases
  • SBOM files: bindy-*.sbom.json in release artifacts
  • Signed Git tags: git tag --verify v0.1.0
  • Container image signatures: docker trust inspect ghcr.io/firestoned/bindy:v0.1.0

SLSA Build Levels Comparison

AspectLevel 1Level 2Level 3Bindy
Protection againstAccidental errorsCompromised build serviceCompromised source + build✅ All
Source integrityManual commitsSigned commitsSigned commits + 2-person review✅ Complete
Build integrityManual buildAutomated buildReproducible build✅ Complete
ProvenanceNoneService-generatedCryptographic provenance✅ Complete
VerifiabilityTrust on first useVerifiable by serviceVerifiable by anyone✅ Complete

SLSA Compliance Roadmap

RequirementStatusEvidence
Level 1✅ CompleteGit, Cargo build
Level 2✅ CompleteGitHub Actions, signed commits, SBOM
Level 3 (Source)✅ CompleteSigned commits, 2+ PR approvals, permanent Git history
Level 3 (Build)✅ CompleteReproducible builds, verification script
Level 3 (Provenance)✅ CompleteSBOM, signed tags, container attestation
Level 3 (Hermetic)⚠️ Partialcargo fetch uses network (working toward offline builds)

Verification for End Users

How to verify Bindy releases:

# 1. Verify Git tag signature
git verify-tag v0.1.0

# 2. Rebuild from source
git checkout v0.1.0
cargo build --release --locked

# 3. Compare binary hash with released artifact
sha256sum target/release/bindy
curl -sL https://github.com/firestoned/bindy/releases/download/v0.1.0/bindy-linux-amd64.sha256

# 4. Verify SBOM (Software Bill of Materials)
curl -sL https://github.com/firestoned/bindy/releases/download/v0.1.0/bindy.sbom.json | jq .

# 5. Verify container image signature (if using containers)
docker trust inspect ghcr.io/firestoned/bindy:v0.1.0

Expected Result: ✅ All verifications pass, hashes match, provenance verified


SLSA Threat Mitigation

ThreatSLSA LevelBindy Mitigation
A: Build system compromiseLevel 2+✅ GitHub-hosted runners (ephemeral, isolated)
B: Source code compromiseLevel 3✅ Signed commits, 2+ PR approvals, branch protection
C: Dependency compromiseLevel 3✅ Cargo.lock pinned, daily cargo audit, SBOM
D: Upload of malicious binariesLevel 2+✅ GitHub Actions uploads, not manual
E: Compromised build configLevel 2+✅ Workflows in Git, 2+ PR approvals
F: Use of compromised packageLevel 3✅ Reproducible builds, users can verify

See Also

NIST Cybersecurity Framework

NIST CSF: Framework for Improving Critical Infrastructure Cybersecurity


Overview

The NIST Cybersecurity Framework (CSF) is a voluntary framework developed by the National Institute of Standards and Technology (NIST) to help organizations manage and reduce cybersecurity risk. The framework is organized into five functions: Identify, Protect, Detect, Respond, and Recover.

Bindy’s NIST CSF Status: ⚠️ Partial Compliance (60% complete)

  • Identify: 90% complete
  • Protect: 80% complete
  • ⚠️ Detect: 60% complete (needs network monitoring)
  • Respond: 90% complete
  • ⚠️ Recover: 50% complete (needs disaster recovery testing)

NIST CSF Core Functions

1. Identify (ID)

Objective: Develop organizational understanding to manage cybersecurity risk to systems, people, assets, data, and capabilities.

CategorySubcategoryBindy ImplementationStatus
ID.AM (Asset Management)Asset inventoryKubernetes resources tracked in Git✅ Complete
ID.BE (Business Environment)Dependencies documentedThird-party dependencies in SBOM✅ Complete
ID.GV (Governance)Security policies establishedSECURITY.md, threat model, incident response✅ Complete
ID.RA (Risk Assessment)Threat modeling conductedSTRIDE analysis (15 threats, 5 scenarios)✅ Complete
ID.RM (Risk Management Strategy)Risk mitigation roadmapCompliance roadmap (H-1 to M-4)✅ Complete
ID.SC (Supply Chain Risk Management)Third-party dependencies assessedDaily cargo audit, Trivy scanning, SBOM✅ Complete

Evidence:

Identify Function:90% Complete (Asset management, risk assessment done; needs supply chain deep dive)


2. Protect (PR)

Objective: Develop and implement appropriate safeguards to ensure delivery of critical services.

CategorySubcategoryBindy ImplementationStatus
PR.AC (Identity Management)Least privilege accessRBAC (minimal delete permissions for lifecycle management), 2FA✅ Complete
PR.AC (Physical access control)N/A (cloud-hosted)Kubernetes cluster securityN/A
PR.AT (Awareness and Training)Security trainingCONTRIBUTING.md (secure coding guidelines)✅ Complete
PR.DS (Data Security)Data at rest encryptionKubernetes Secrets (encrypted etcd), S3 SSE✅ Complete
PR.DS (Data in transit encryption)TLS for all API callsKubernetes API (TLS 1.3), S3 (TLS 1.3)✅ Complete
PR.IP (Information Protection)Secret managementKubernetes Secrets, secret access audit trail✅ Complete
PR.MA (Maintenance)Vulnerability patchingDaily cargo audit, SLAs (CRITICAL 24h, HIGH 7d)✅ Complete
PR.PT (Protective Technology)Security controlsNon-root containers, read-only filesystem, RBAC✅ Complete

Evidence:

Protect Function:80% Complete (Strong access controls, data protection; needs NetworkPolicies L-1)


3. Detect (DE)

Objective: Develop and implement appropriate activities to identify the occurrence of a cybersecurity event.

CategorySubcategoryBindy ImplementationStatus
DE.AE (Anomalies and Events)Anomaly detectionPrometheus alerts (unauthorized access, excessive access)✅ Complete
DE.CM (Security Continuous Monitoring)Vulnerability scanningDaily cargo audit, Trivy (containers)✅ Complete
DE.CM (Network monitoring)Network traffic analysis⚠️ Planned (L-1: NetworkPolicies + monitoring)⚠️ Planned
DE.DP (Detection Processes)Incident detection procedures7 incident playbooks (P1-P7)✅ Complete

Implemented Detection Controls:

AlertTriggerSeverityResponse Time
UnauthorizedSecretAccessNon-controller accessed secretCRITICAL< 1 minute
ExcessiveSecretAccess> 10 secret accesses/secWARNING< 5 minutes
FailedSecretAccessAttempts> 1 failed access/secWARNING< 5 minutes
CriticalVulnerabilityCVSS 9.0-10.0 detectedCRITICAL< 15 minutes
PodCrashLoopPod restarting repeatedlyHIGH< 5 minutes

Evidence:

  • Prometheus alerting rules: deploy/monitoring/alerts/bindy-secret-access.yaml
  • Secret Access Audit Trail - Alert definitions
  • GitHub Actions workflows: Daily security scans

Detect Function: ⚠️ 60% Complete (Anomaly detection done; needs network monitoring L-1)


4. Respond (RE)

Objective: Develop and implement appropriate activities to take action regarding a detected cybersecurity incident.

CategorySubcategoryBindy ImplementationStatus
RE.RP (Response Planning)Incident response plan7 incident playbooks (P1-P7) following NIST lifecycle✅ Complete
RE.CO (Communications)Incident communication planSlack war rooms, status page, regulatory reporting✅ Complete
RE.AN (Analysis)Incident analysis proceduresRoot cause analysis, forensic preservation✅ Complete
RE.MI (Mitigation)Incident containment proceduresIsolation, credential rotation, rollback✅ Complete
RE.IM (Improvements)Post-incident improvementsPost-mortem template, action items tracking✅ Complete

Incident Response Playbooks (NIST Lifecycle):

PlaybookNIST Phases CoveredResponse TimeEvidence
P1: Critical VulnerabilityPreparation, Detection, Containment, Eradication, Recovery< 15 minP1 Playbook
P2: Compromised ControllerAll phases< 15 minP2 Playbook
P3: DNS Service OutageDetection, Containment, Recovery< 15 minP3 Playbook
P4: RNDC Key CompromiseAll phases< 15 minP4 Playbook
P5: Unauthorized DNS ChangesAll phases< 1 hourP5 Playbook
P6: DDoS AttackDetection, Containment, Recovery< 15 minP6 Playbook
P7: Supply Chain CompromiseAll phases< 15 minP7 Playbook

NIST Incident Response Lifecycle:

  1. Preparation ✅ - Playbooks documented, tools configured, team trained
  2. Detection & Analysis ✅ - Prometheus alerts, audit log analysis
  3. Containment, Eradication & Recovery ✅ - Isolation procedures, patching, service restoration
  4. Post-Incident Activity ✅ - Post-mortem template, lessons learned, action items

Evidence:

Respond Function:90% Complete (Comprehensive playbooks; needs annual tabletop exercise)


5. Recover (RE)

Objective: Develop and implement appropriate activities to maintain plans for resilience and to restore capabilities or services impaired due to a cybersecurity incident.

CategorySubcategoryBindy ImplementationStatus
RC.RP (Recovery Planning)Disaster recovery planMulti-region deployment (planned), zone backups⚠️ Planned
RC.IM (Improvements)Recovery plan testing⚠️ Annual DR drill needed⚠️ Planned
RC.CO (Communications)Recovery communication planIncident playbooks include recovery steps✅ Complete

Current Recovery Capabilities:

CapabilityRTO (Recovery Time Objective)RPO (Recovery Point Objective)Status
Pod Failure0 (automatic restart)0 (no data loss)✅ Complete
Controller Failure< 5 minutes (new pod scheduled)0 (no data loss)✅ Complete
BIND9 Pod Failure< 5 minutes (new pod scheduled)0 (zone data in etcd)✅ Complete
Zone Data Loss< 1 hour (restore from Git)< 5 minutes (last reconciliation)✅ Complete
Cluster Failure⚠️ < 4 hours (manual failover)< 1 hour (last etcd backup)⚠️ Needs testing
Region Failure⚠️ < 24 hours (multi-region planned)< 1 hour⚠️ Planned

Planned Improvements:

  • L-2: Implement multi-region deployment (RTO < 1 hour for region failure)
  • Annual DR Drill: Test disaster recovery procedures (cluster failure, region failure)

Evidence:

  • High availability architecture: 3+ pod replicas, multi-zone
  • Zone backups: Git repository (all DNSZone CRDs)
  • Incident playbooks: P3 (DNS Service Outage) includes recovery steps

Recover Function: ⚠️ 50% Complete (Pod/controller recovery done; needs multi-region and DR testing)


NIST CSF Implementation Tiers

NIST CSF defines 4 implementation tiers (Partial, Risk Informed, Repeatable, Adaptive). Bindy is at Tier 3: Repeatable.

TierDescriptionBindy Status
Tier 1: PartialAd hoc, reactive risk management
Tier 2: Risk InformedRisk management practices approved but not policy
Tier 3: RepeatableFormally approved policies, regularly updatedCurrent
Tier 4: AdaptiveContinuous improvement based on lessons learned⚠️ Target

Tier 3 Evidence:

  • Formal security policies documented and published
  • Incident response playbooks (repeatable processes)
  • Quarterly compliance reviews
  • Annual policy reviews (Next Review: 2026-03-18)

Tier 4 Roadmap:

  • Implement continuous security metrics dashboard
  • Quarterly threat intelligence updates to policies
  • Annual penetration testing with policy updates
  • Automated compliance reporting

NIST CSF Compliance Summary

FunctionCompletionPriority GapsTarget Date
Identify90%Supply chain deep diveQ1 2026
Protect80%NetworkPolicies (L-1)Q1 2026
Detect60%Network monitoring (L-1)Q1 2026
Respond90%Annual tabletop exerciseQ2 2026
Recover50%Multi-region deployment (L-2), DR testingQ2 2026

Overall NIST CSF Maturity: ⚠️ 60% (Tier 3: Repeatable)

Target: 90% (Tier 4: Adaptive) by Q2 2026


NIST CSF Audit Evidence

For NIST CSF assessments, provide:

  1. Identify Function:

    • Asset inventory (Kubernetes resources in Git)
    • Threat model (STRIDE analysis)
    • Compliance roadmap (risk mitigation tracking)
    • SBOM (dependency inventory)
  2. Protect Function:

    • RBAC policy and verification output
    • Kubernetes Security Context (non-root, read-only FS)
    • Vulnerability management policy (SLAs, remediation tracking)
    • Secret access audit trail
  3. Detect Function:

    • Prometheus alerting rules
    • Vulnerability scan results (daily cargo audit, Trivy)
    • Incident detection playbooks
  4. Respond Function:

    • 7 incident response playbooks (P1-P7)
    • Post-incident review template
    • Tabletop exercise results (semi-annual)
  5. Recover Function:

    • High availability architecture (3+ replicas, multi-zone)
    • Zone backup procedures (Git repository)
    • Disaster recovery plan (in progress)

See Also

API Reference

This document describes the Custom Resource Definitions (CRDs) provided by Bindy.

Note: This file is AUTO-GENERATED from src/crd.rs DO NOT EDIT MANUALLY - Run cargo run --bin crddoc to regenerate

Table of Contents

Zone Management

DNSZone

API Version: bindy.firestoned.io/v1alpha1

DNSZone represents an authoritative DNS zone managed by BIND9. Each DNSZone defines a zone (e.g., example.com) with SOA record parameters. Can reference either a namespace-scoped Bind9Cluster or cluster-scoped Bind9GlobalCluster.

Spec Fields

FieldTypeRequiredDescription
clusterRefstringNoReference to a namespace-scoped `Bind9Cluster` in the same namespace. Must match the name of a `Bind9Cluster` resource in the same namespace. The zone will be added to all instances in this cluster. Either `clusterRef` or `globalClusterRef` must be specified (not both).
globalClusterRefstringNoReference to a cluster-scoped `Bind9GlobalCluster`. Must match the name of a `Bind9GlobalCluster` resource (cluster-scoped). The zone will be added to all instances in this global cluster. Either `clusterRef` or `globalClusterRef` must be specified (not both).
nameServerIpsobjectNoMap of nameserver hostnames to IP addresses for glue records. Glue records provide IP addresses for nameservers within the zone’s own domain. This is necessary when delegating subdomains where the nameserver is within the delegated zone itself. Example: When delegating `sub.example.com` with nameserver `ns1.sub.example.com`, you must provide the IP address of `ns1.sub.example.com` as a glue record. Format: `{“ns1.example.com.”: “192.0.2.1”, “ns2.example.com.”: “192.0.2.2”}` Note: Nameserver hostnames should end with a dot (.) for FQDN.
soaRecordobjectYesSOA (Start of Authority) record - defines zone authority and refresh parameters. The SOA record is required for all authoritative zones and contains timing information for zone transfers and caching.
ttlintegerNoDefault TTL (Time To Live) for records in this zone, in seconds. If not specified, individual records must specify their own TTL. Typical values: 300-86400 (5 minutes to 1 day).
zoneNamestringYesDNS zone name (e.g., “example.com”). Must be a valid DNS zone name. Can be a domain or subdomain. Examples: “example.com”, “internal.example.com”, “10.in-addr.arpa”

Status Fields

FieldTypeRequiredDescription
conditionsarrayNo
observedGenerationintegerNo
recordCountintegerNo
secondaryIpsarrayNoIP addresses of secondary servers configured for zone transfers. Used to detect when secondary IPs change and zones need updating.

DNS Records

ARecord

API Version: bindy.firestoned.io/v1alpha1

ARecord maps a DNS hostname to an IPv4 address. Multiple A records for the same name enable round-robin DNS load balancing.

Spec Fields

FieldTypeRequiredDescription
ipv4AddressstringYesIPv4 address in dotted-decimal notation. Must be a valid IPv4 address (e.g., “192.0.2.1”).
namestringYesRecord name within the zone. Use “@” for the zone apex. Examples: “www”, “mail”, “ftp”, “@” The full DNS name will be: {name}.{zone}
ttlintegerNoTime To Live in seconds. Overrides zone default TTL if specified. Typical values: 60-86400 (1 minute to 1 day).
zoneRefstringYesReference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name. This is more efficient than searching by zone name. Example: If the `DNSZone` is named “example-com”, use `zoneRef: example-com`

Status Fields

FieldTypeRequiredDescription
conditionsarrayNo
observedGenerationintegerNo

AAAARecord

API Version: bindy.firestoned.io/v1alpha1

AAAARecord maps a DNS hostname to an IPv6 address. This is the IPv6 equivalent of an A record.

Spec Fields

FieldTypeRequiredDescription
ipv6AddressstringYesIPv6 address in standard notation. Examples: `2001:db8::1`, `fe80::1`, `::1`
namestringYesRecord name within the zone.
ttlintegerNoTime To Live in seconds.
zoneRefstringYesReference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

FieldTypeRequiredDescription
conditionsarrayNo
observedGenerationintegerNo

CNAMERecord

API Version: bindy.firestoned.io/v1alpha1

CNAMERecord creates a DNS alias from one hostname to another. A CNAME cannot coexist with other record types for the same name.

Spec Fields

FieldTypeRequiredDescription
namestringYesRecord name within the zone. Note: CNAME records cannot be created at the zone apex (@).
targetstringYesTarget hostname (canonical name). Should be a fully qualified domain name ending with a dot. Example: “example.com.” or “www.example.com.”
ttlintegerNoTime To Live in seconds.
zoneRefstringYesReference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

FieldTypeRequiredDescription
conditionsarrayNo
observedGenerationintegerNo

MXRecord

API Version: bindy.firestoned.io/v1alpha1

MXRecord specifies mail exchange servers for a domain. Lower priority values indicate higher preference for mail delivery.

Spec Fields

FieldTypeRequiredDescription
mailServerstringYesFully qualified domain name of the mail server. Must end with a dot. Example: “mail.example.com.”
namestringYesRecord name within the zone. Use “@” for the zone apex.
priorityintegerYesPriority (preference) of this mail server. Lower values = higher priority. Common values: 0-100. Multiple MX records can exist with different priorities.
ttlintegerNoTime To Live in seconds.
zoneRefstringYesReference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

FieldTypeRequiredDescription
conditionsarrayNo
observedGenerationintegerNo

NSRecord

API Version: bindy.firestoned.io/v1alpha1

NSRecord delegates a subdomain to authoritative nameservers. Used for subdomain delegation to different DNS providers or servers.

Spec Fields

FieldTypeRequiredDescription
namestringYesSubdomain to delegate. For zone apex, use “@”.
nameserverstringYesFully qualified domain name of the nameserver. Must end with a dot. Example: “ns1.example.com.”
ttlintegerNoTime To Live in seconds.
zoneRefstringYesReference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

FieldTypeRequiredDescription
conditionsarrayNo
observedGenerationintegerNo

TXTRecord

API Version: bindy.firestoned.io/v1alpha1

TXTRecord stores arbitrary text data in DNS. Commonly used for SPF, DKIM, DMARC policies, and domain verification.

Spec Fields

FieldTypeRequiredDescription
namestringYesRecord name within the zone.
textarrayYesArray of text strings. Each string can be up to 255 characters. Multiple strings are concatenated by DNS resolvers. For long text, split into multiple strings.
ttlintegerNoTime To Live in seconds.
zoneRefstringYesReference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

FieldTypeRequiredDescription
conditionsarrayNo
observedGenerationintegerNo

SRVRecord

API Version: bindy.firestoned.io/v1alpha1

SRVRecord specifies the hostname and port of servers for specific services. The record name follows the format _service._proto (e.g., _ldap._tcp).

Spec Fields

FieldTypeRequiredDescription
namestringYesService and protocol in the format: _service._proto Example: “_ldap._tcp”, “_sip._udp”, “_http._tcp”
portintegerYesTCP or UDP port where the service is available.
priorityintegerYesPriority of the target host. Lower values = higher priority.
targetstringYesFully qualified domain name of the target host. Must end with a dot. Use “.” for “service not available”.
ttlintegerNoTime To Live in seconds.
weightintegerYesRelative weight for records with the same priority. Higher values = higher probability of selection.
zoneRefstringYesReference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

FieldTypeRequiredDescription
conditionsarrayNo
observedGenerationintegerNo

CAARecord

API Version: bindy.firestoned.io/v1alpha1

CAARecord specifies which certificate authorities are authorized to issue certificates for a domain. Enhances domain security and certificate issuance control.

Spec Fields

FieldTypeRequiredDescription
flagsintegerYesFlags byte. Use 0 for non-critical, 128 for critical. Critical flag (128) means CAs must understand the tag.
namestringYesRecord name within the zone. Use “@” for the zone apex.
tagstringYesProperty tag. Common values: “issue”, “issuewild”, “iodef”. - “issue”: Authorize CA to issue certificates - “issuewild”: Authorize CA to issue wildcard certificates - “iodef”: URL/email for violation reports
ttlintegerNoTime To Live in seconds.
valuestringYesProperty value. Format depends on the tag. For “issue”/“issuewild”: CA domain (e.g., “letsencrypt.org”) For “iodef”: mailto: or https: URL
zoneRefstringYesReference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

FieldTypeRequiredDescription
conditionsarrayNo
observedGenerationintegerNo

Infrastructure

Bind9Cluster

API Version: bindy.firestoned.io/v1alpha1

Bind9Cluster defines a namespace-scoped logical grouping of BIND9 DNS server instances. Use this for tenant-managed DNS infrastructure isolated to a specific namespace. For platform-managed cluster-wide DNS, use Bind9GlobalCluster instead.

Spec Fields

FieldTypeRequiredDescription
aclsobjectNoACLs that can be referenced by instances
configMapRefsobjectNo`ConfigMap` references for BIND9 configuration files
globalobjectNoGlobal configuration shared by all instances in the cluster This configuration applies to all instances (both primary and secondary) unless overridden at the instance level or by role-specific configuration.
imageobjectNoContainer image configuration
primaryobjectNoPrimary instance configuration Configuration specific to primary (authoritative) DNS instances, including replica count and service specifications.
rndcSecretRefsarrayNoReferences to Kubernetes Secrets containing RNDC/TSIG keys for authenticated zone transfers. Each secret should contain the key name, algorithm, and base64-encoded secret value. These secrets are used for secure communication with BIND9 instances via RNDC and for authenticated zone transfers (AXFR/IXFR) between primary and secondary servers.
secondaryobjectNoSecondary instance configuration Configuration specific to secondary (replica) DNS instances, including replica count and service specifications.
versionstringNoShared BIND9 version for the cluster
volumeMountsarrayNoVolume mounts that specify where volumes should be mounted in containers These mounts are inherited by all instances unless overridden.
volumesarrayNoVolumes that can be mounted by instances in this cluster These volumes are inherited by all instances unless overridden. Common use cases include `PersistentVolumeClaims` for zone data storage.

Status Fields

FieldTypeRequiredDescription
conditionsarrayNoStatus conditions for this cluster
instanceCountintegerNoNumber of instances in this cluster
instancesarrayNoNames of `Bind9Instance` resources created for this cluster
observedGenerationintegerNoObserved generation for optimistic concurrency
readyInstancesintegerNoNumber of ready instances

Bind9Instance

API Version: bindy.firestoned.io/v1alpha1

Bind9Instance represents a BIND9 DNS server deployment in Kubernetes. Each instance creates a Deployment, Service, ConfigMap, and Secret for managing a BIND9 server with RNDC protocol communication.

Spec Fields

FieldTypeRequiredDescription
bindcarConfigobjectNoBindcar RNDC API sidecar container configuration. The API container provides an HTTP interface for managing zones via rndc. If not specified, uses default configuration.
clusterRefstringYesReference to the cluster this instance belongs to. Can reference either: - A namespace-scoped `Bind9Cluster` (must be in the same namespace as this instance) - A cluster-scoped `Bind9GlobalCluster` (cluster-wide, accessible from any namespace) The cluster provides shared configuration and defines the logical grouping. The controller will automatically detect whether this references a namespace-scoped or cluster-scoped cluster resource.
configobjectNoInstance-specific BIND9 configuration overrides. Overrides cluster-level configuration for this instance only.
configMapRefsobjectNo`ConfigMap` references override. Inherits from cluster if not specified.
imageobjectNoContainer image configuration override. Inherits from cluster if not specified.
primaryServersarrayNoPrimary server addresses for zone transfers (required for secondary instances). List of IP addresses or hostnames of primary servers to transfer zones from. Example: `[“10.0.1.10”, “primary.example.com”]`
replicasintegerNoNumber of pod replicas for high availability. Defaults to 1 if not specified. For production, use 2+ replicas.
rndcSecretRefobjectNoReference to an existing Kubernetes Secret containing RNDC key. If specified, uses this existing Secret instead of auto-generating one. The Secret must contain the keys specified in the reference (defaults: “key-name”, “algorithm”, “secret”, “rndc.key”). This allows sharing RNDC keys across instances or using externally managed secrets. If not specified, a Secret will be auto-generated for this instance.
rolestringYesRole of this instance (primary or secondary). Primary instances are authoritative for zones. Secondary instances replicate zones from primaries via AXFR/IXFR.
storageobjectNoStorage configuration for zone files. Specifies how zone files should be stored. Defaults to emptyDir (ephemeral storage). For persistent storage, use persistentVolumeClaim.
versionstringNoBIND9 version override. Inherits from cluster if not specified. Example: “9.18”, “9.16”
volumeMountsarrayNoVolume mounts override for this instance. Inherits from cluster if not specified. These mounts override cluster-level volume mounts.
volumesarrayNoVolumes override for this instance. Inherits from cluster if not specified. These volumes override cluster-level volumes. Common use cases include instance-specific `PersistentVolumeClaims` for zone data storage.

Status Fields

FieldTypeRequiredDescription
conditionsarrayNo
observedGenerationintegerNo
readyReplicasintegerNo
replicasintegerNo
serviceAddressstringNoIP or hostname of this instance’s service

Bind9Cluster Specification

Complete specification for the Bind9Cluster Custom Resource Definition.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: string
  namespace: string
spec:
  version: string              # Optional, BIND9 version
  image:                       # Optional, container image config
    image: string
    imagePullPolicy: string
    imagePullSecrets: [string]
  configMapRefs:               # Optional, custom config files
    namedConf: string
    namedConfOptions: string
  global:                      # Optional, global BIND9 config for all instances
    recursion: boolean
    allowQuery: [string]       # ⚠️ NO DEFAULT - must be explicitly set
    allowTransfer: [string]    # ⚠️ NO DEFAULT - must be explicitly set
    dnssec:
      enabled: boolean
      validation: boolean
    forwarders: [string]
    listenOn: [string]
    listenOnV6: [string]
  rndcSecretRefs: [RndcSecretRef]  # Optional, refs to Secrets with RNDC/TSIG keys
  acls:                        # Optional, named ACLs
    name: [string]
  volumes: [Volume]            # Optional, Kubernetes volumes
  volumeMounts: [VolumeMount]  # Optional, volume mount specifications

Overview

Bind9Cluster defines a logical grouping of BIND9 DNS server instances with shared configuration. It provides centralized management of BIND9 version, container images, and common settings across multiple instances.

Key Features:

  • Shared version and image configuration
  • Centralized BIND9 configuration
  • TSIG key management for secure zone transfers
  • Named ACLs for access control
  • Cluster-wide status reporting

Spec Fields

version

Type: string Required: No Default: “9.18”

BIND9 version to deploy across all instances in the cluster unless overridden at the instance level.

spec:
  version: "9.18"

Supported Versions:

  • “9.16” - Older stable
  • “9.18” - Current stable (recommended)
  • “9.19” - Development

image

Type: object Required: No

Container image configuration shared by all instances in the cluster.

spec:
  image:
    image: "internetsystemsconsortium/bind9:9.18"
    imagePullPolicy: "IfNotPresent"
    imagePullSecrets:
      - my-registry-secret

How It Works:

  • Instances inherit image configuration from the cluster
  • Instances can override with their own image config
  • Simplifies managing container images across multiple instances

image.image

Type: string Required: No Default: “internetsystemsconsortium/bind9:9.18”

Full container image reference including registry, repository, and tag.

spec:
  image:
    image: "my-registry.example.com/bind9:custom"

image.imagePullPolicy

Type: string Required: No Default: “IfNotPresent”

Kubernetes image pull policy.

Valid Values:

  • "Always" - Always pull the image
  • "IfNotPresent" - Pull only if not present locally (recommended)
  • "Never" - Never pull, use local image only

image.imagePullSecrets

Type: array of strings Required: No Default: []

List of Kubernetes secret names for authenticating with private container registries.

spec:
  image:
    imagePullSecrets:
      - docker-registry-secret

configMapRefs

Type: object Required: No

References to custom ConfigMaps containing BIND9 configuration files shared across the cluster.

spec:
  configMapRefs:
    namedConf: "cluster-named-conf"
    namedConfOptions: "cluster-options"

How It Works:

  • Cluster-level ConfigMaps apply to all instances
  • Instances can override with their own ConfigMap references
  • Useful for sharing common configuration

configMapRefs.namedConf

Type: string Required: No

Name of ConfigMap containing the main named.conf file.

configMapRefs.namedConfOptions

Type: string Required: No

Name of ConfigMap containing the named.conf.options file.

global

Type: object Required: No

Global BIND9 configuration shared across all instances in the cluster.

⚠️ Warning: There are NO defaults for allowQuery and allowTransfer. If not specified, BIND9’s default behavior applies (no queries or transfers allowed). Always explicitly configure these fields for your security requirements.

spec:
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.2.0/24"
    dnssec:
      enabled: true
      validation: auto

How It Works:

  • All instances inherit global configuration
  • Instances can override specific settings
  • Role-specific configuration (primary/secondary) can override global settings
  • Changes propagate to all instances using global config

global.recursion

Type: boolean Required: No Default: false

Enable recursive DNS queries.

global.allowQuery

Type: array of strings Required: No Default: None (BIND9 default: no queries allowed)

IP addresses or CIDR blocks allowed to query servers in this cluster.

⚠️ Warning: No default value is provided. You must explicitly configure this field or queries will be denied.

global.allowTransfer

Type: array of strings Required: No Default: None (BIND9 default: no transfers allowed)

IP addresses or CIDR blocks allowed to perform zone transfers.

⚠️ Warning: No default value is provided. You must explicitly configure this field or zone transfers will be denied.

global.dnssec

Type: object Required: No

DNSSEC configuration for the cluster.

global.dnssec.enabled

Type: boolean Required: No Default: false

Enable DNSSEC signing for zones.

global.dnssec.validation

Type: boolean Required: No Default: false

Enable DNSSEC validation for recursive queries.

global.forwarders

Type: array of strings Required: No Default: []

DNS servers to forward queries to (for recursive mode).

spec:
  global:
    recursion: true
    forwarders:
      - "8.8.8.8"
      - "1.1.1.1"

global.listenOn

Type: array of strings Required: No Default: [“any”]

IPv4 addresses to listen on.

global.listenOnV6

Type: array of strings Required: No Default: [“any”]

IPv6 addresses to listen on.

rndcSecretRefs

Type: array of RndcSecretRef objects Required: No Default: []

References to Kubernetes Secrets containing RNDC/TSIG keys for authenticated zone transfers and RNDC communication.

# 1. Create Secret with credentials
apiVersion: v1
kind: Secret
metadata:
  name: transfer-key-secret
type: Opaque
stringData:
  key-name: transfer-key
  secret: base64-encoded-hmac-key

---
# 2. Reference in Bind9Cluster
spec:
  rndcSecretRefs:
    - name: transfer-key-secret
      algorithm: hmac-sha256  # Algorithm specified in CRD

How It Works:

  • RNDC/TSIG keys authenticate zone transfers and RNDC commands
  • Keys stored securely in Kubernetes Secrets
  • Algorithm specified in CRD for type safety
  • Keys are shared across all instances in the cluster

RndcSecretRef Fields:

  • name (string, required) - Name of the Kubernetes Secret
  • algorithm (RndcAlgorithm, optional) - HMAC algorithm (defaults to hmac-sha256)
    • Supported: hmac-md5, hmac-sha1, hmac-sha224, hmac-sha256, hmac-sha384, hmac-sha512
  • keyNameKey (string, optional) - Key in secret for key name (defaults to “key-name”)
  • secretKey (string, optional) - Key in secret for secret value (defaults to “secret”)

acls

Type: object (map of string arrays) Required: No Default: {}

Named Access Control Lists that can be referenced in instance configurations.

spec:
  acls:
    internal:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
    trusted:
      - "192.168.1.0/24"
    external:
      - "0.0.0.0/0"

How It Works:

  • Define ACLs once at cluster level
  • Reference by name in instance configurations
  • Simplifies managing access control across instances

Usage Example:

# In Bind9Instance
spec:
  global:
    allowQuery:
      - "acl:internal"
    allowTransfer:
      - "acl:trusted"

volumes

Type: array of Kubernetes Volume objects Required: No Default: []

Kubernetes volumes that can be mounted by instances in this cluster.

spec:
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: dns-zone-pvc
    - name: config-override
      configMap:
        name: custom-bind-config

How It Works:

  • Volumes defined at cluster level are inherited by all instances
  • Instances can override with their own volumes
  • Common use cases include:
    • PersistentVolumeClaims for zone data persistence
    • ConfigMaps for custom configuration files
    • Secrets for sensitive data like TSIG keys
    • EmptyDir for temporary storage

Volume Types: Supports all Kubernetes volume types including:

  • persistentVolumeClaim - Persistent storage for zone data
  • configMap - Configuration files
  • secret - Sensitive data
  • emptyDir - Temporary storage
  • hostPath - Host directory (use with caution)
  • nfs - Network file system

volumeMounts

Type: array of Kubernetes VolumeMount objects Required: No Default: []

Volume mount specifications that define where volumes should be mounted in containers.

spec:
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: dns-zone-pvc
  volumeMounts:
    - name: zone-data
      mountPath: /var/lib/bind
      readOnly: false

How It Works:

  • Volume mounts must reference volumes defined in the volumes field
  • Each mount specifies the volume name and where to mount it
  • Instances inherit cluster-level volume mounts unless overridden
  • Mounts are applied to the BIND9 container

VolumeMount Fields:

  • name (string, required) - Volume name to mount (must match a volume)
  • mountPath (string, required) - Path in container where volume is mounted
  • readOnly (boolean, optional) - Mount as read-only (default: false)
  • subPath (string, optional) - Sub-path within the volume

Status Fields

conditions

Type: array of objects

Standard Kubernetes conditions indicating cluster state.

status:
  conditions:
    - type: Ready
      status: "True"
      reason: AllInstancesReady
      message: "All 3 instances are ready"
      lastTransitionTime: "2024-01-15T10:30:00Z"

Condition Types:

  • Ready - Cluster is ready (all instances operational)
  • Degraded - Some instances are not ready
  • Progressing - Cluster is being reconciled

observedGeneration

Type: integer

The generation of the resource that was last reconciled.

status:
  observedGeneration: 5

instanceCount

Type: integer

Total number of Bind9Instance resources referencing this cluster.

status:
  instanceCount: 3

readyInstances

Type: integer

Number of instances that are ready and serving traffic.

status:
  readyInstances: 3

Complete Examples

Basic Production Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.2.0/24"
    dnssec:
      enabled: true
      validation: auto
  rndcSecretRefs:
    - name: transfer-key-secret
      algorithm: hmac-sha256

Cluster with Custom Image

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: custom-dns
  namespace: dns-system
spec:
  version: "9.18"
  image:
    image: "my-registry.example.com/bind9:hardened"
    imagePullPolicy: "Always"
    imagePullSecrets:
      - my-registry-secret
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

Recursive Resolver Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: resolver-cluster
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: true
    allowQuery:
      - "10.0.0.0/8"  # Internal network only
    forwarders:
      - "8.8.8.8"
      - "8.8.4.4"
      - "1.1.1.1"
    dnssec:
      enabled: false
      validation: true
  acls:
    internal:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
      - "192.168.0.0/16"

Multi-Region Cluster with ACLs

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: global-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "acl:secondary-servers"
    dnssec:
      enabled: true
  rndcSecretRefs:
    - name: us-east-transfer-secret
      algorithm: hmac-sha256
    - name: us-west-transfer-secret
      algorithm: hmac-sha256
    - name: eu-transfer-secret
      algorithm: hmac-sha512  # Different algorithm for EU
  acls:
    secondary-servers:
      - "10.1.0.0/24"  # US East
      - "10.2.0.0/24"  # US West
      - "10.3.0.0/24"  # EU
    monitoring:
      - "10.0.10.0/24"

Cluster with Persistent Storage

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: persistent-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    dnssec:
      enabled: true
  # Define persistent volume for zone data
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: bind-zone-storage
  volumeMounts:
    - name: zone-data
      mountPath: /var/lib/bind
      readOnly: false

Prerequisites: Create a PersistentVolumeClaim first:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bind-zone-storage
  namespace: dns-system
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd

Cluster Hierarchy

Bind9Cluster
    ├── Defines shared configuration
    ├── Manages TSIG keys
    ├── Defines ACLs
    └── Referenced by one or more Bind9Instances
            ├── Instance inherits cluster config
            ├── Instance can override cluster settings
            └── Instance uses cluster TSIG keys

Configuration Inheritance

When a Bind9Instance references a Bind9Cluster:

  1. Version - Instance inherits cluster version unless it specifies its own
  2. Image - Instance inherits cluster image config unless it specifies its own
  3. Config - Instance inherits cluster config unless it specifies its own
  4. TSIG Keys - Instance uses cluster TSIG keys for zone transfers
  5. ACLs - Instance can reference cluster ACLs by name

Override Priority: Instance-level config > Cluster-level config > Default values

Bind9Instance Specification

Complete specification for the Bind9Instance Custom Resource Definition.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: string
  namespace: string
  labels:
    key: value
spec:
  clusterRef: string          # References Bind9Cluster
  role: primary|secondary     # Required: Server role
  replicas: integer
  version: string             # Optional, overrides cluster version
  image:                      # Optional, overrides cluster image
    image: string
    imagePullPolicy: string
    imagePullSecrets: [string]
  configMapRefs:              # Optional, custom config files
    namedConf: string
    namedConfOptions: string
  global:                     # Optional, overrides cluster global config
    recursion: boolean
    allowQuery: [string]
    allowTransfer: [string]
    dnssec:
      enabled: boolean
      validation: boolean
    forwarders: [string]
    listenOn: [string]
    listenOnV6: [string]
  primaryServers: [string]    # Required for secondary role

Spec Fields

clusterRef

Type: string Required: Yes

Name of the Bind9Cluster that this instance belongs to. The instance inherits cluster-level configuration (version, shared config, TSIG keys, ACLs) from the referenced cluster.

spec:
  clusterRef: production-dns  # References Bind9Cluster named "production-dns"

How It Works:

  • Instance inherits version from cluster unless overridden
  • Instance inherits global config from cluster unless overridden
  • Controller uses cluster TSIG keys for zone transfers
  • Instance can override cluster settings with its own spec

replicas

Type: integer Required: No Default: 1

Number of BIND9 pod replicas to run.

spec:
  replicas: 3

Best Practices:

  • Use 2+ replicas for high availability
  • Use odd numbers (3, 5) for consensus-based systems
  • Consider resource constraints when scaling

version

Type: string Required: No Default: “9.18”

BIND9 version to deploy. Must match available Docker image tags.

spec:
  version: "9.18"

Supported Versions:

  • “9.16” - Older stable
  • “9.18” - Current stable (recommended)
  • “9.19” - Development

image

Type: object Required: No

Container image configuration for the BIND9 instance. Overrides cluster-level image configuration.

spec:
  image:
    image: "my-registry.example.com/bind9:custom"
    imagePullPolicy: "Always"
    imagePullSecrets:
      - my-registry-secret

How It Works:

  • If not specified, inherits from Bind9Cluster.spec.image
  • If cluster doesn’t specify, uses default image internetsystemsconsortium/bind9:9.18
  • Instance-level configuration takes precedence over cluster configuration

image.image

Type: string Required: No Default: “internetsystemsconsortium/bind9:9.18”

Full container image reference including registry, repository, and tag.

spec:
  image:
    image: "docker.io/internetsystemsconsortium/bind9:9.18"

Examples:

  • Public registry: "internetsystemsconsortium/bind9:9.18"
  • Private registry: "my-registry.example.com/dns/bind9:custom"
  • With digest: "bind9@sha256:abc123..."

image.imagePullPolicy

Type: string Required: No Default: “IfNotPresent”

Kubernetes image pull policy.

spec:
  image:
    imagePullPolicy: "Always"

Valid Values:

  • "Always" - Always pull the image
  • "IfNotPresent" - Pull only if not present locally (recommended)
  • "Never" - Never pull, use local image only

image.imagePullSecrets

Type: array of strings Required: No Default: []

List of Kubernetes secret names for authenticating with private container registries.

spec:
  image:
    imagePullSecrets:
      - docker-registry-secret
      - gcr-pull-secret

Setup:

  1. Create a docker-registry secret:
    kubectl create secret docker-registry my-registry-secret \
      --docker-server=my-registry.example.com \
      --docker-username=user \
      --docker-password=pass \
      --docker-email=email@example.com
    
  2. Reference the secret name in imagePullSecrets

configMapRefs

Type: object Required: No

References to custom ConfigMaps containing BIND9 configuration files. Overrides cluster-level ConfigMap references.

spec:
  configMapRefs:
    namedConf: "my-custom-named-conf"
    namedConfOptions: "my-custom-options"

How It Works:

  • If specified, Bindy uses your custom ConfigMaps instead of auto-generating configuration
  • If not specified, Bindy auto-generates ConfigMaps from the config block
  • Instance-level references override cluster-level references
  • You can specify one or both ConfigMaps

Default Behavior:

  • If configMapRefs is not set, Bindy creates a ConfigMap named <instance-name>-config
  • Auto-generated ConfigMap includes both named.conf and named.conf.options
  • Configuration is built from the config block in the spec

configMapRefs.namedConf

Type: string Required: No

Name of ConfigMap containing the main named.conf file.

spec:
  configMapRefs:
    namedConf: "my-named-conf"

ConfigMap Format:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-named-conf
  namespace: dns-system
data:
  named.conf: |
    // Custom BIND9 configuration
    include "/etc/bind/named.conf.options";
    include "/etc/bind/zones/named.conf.zones";

    logging {
      channel custom_log {
        file "/var/log/named/queries.log" versions 3 size 5m;
        severity info;
      };
      category queries { custom_log; };
    };

File Location: The ConfigMap data must have a key named.conf which will be mounted at /etc/bind/named.conf

configMapRefs.namedConfOptions

Type: string Required: No

Name of ConfigMap containing the named.conf.options file.

spec:
  configMapRefs:
    namedConfOptions: "my-options"

ConfigMap Format:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-options
  namespace: dns-system
data:
  named.conf.options: |
    options {
      directory "/var/cache/bind";
      recursion no;
      allow-query { any; };
      dnssec-validation auto;
    };

File Location: The ConfigMap data must have a key named.conf.options which will be mounted at /etc/bind/named.conf.options

Examples:

Using separate ConfigMaps for fine-grained control:

spec:
  configMapRefs:
    namedConf: "prod-named-conf"
    namedConfOptions: "prod-options"

Using only custom options, auto-generating main config:

spec:
  configMapRefs:
    namedConfOptions: "my-custom-options"
  # namedConf not specified - will be auto-generated

global

Type: object Required: No

BIND9 configuration options that override cluster-level global configuration.

global.recursion

Type: boolean Required: No Default: false

Enable recursive DNS queries. Should be false for authoritative servers.

spec:
  global:
    recursion: false

Warning: Enabling recursion on public-facing authoritative servers is a security risk.

global.allowQuery

Type: array of strings Required: No Default: [“0.0.0.0/0”]

IP addresses or CIDR blocks allowed to query this server.

spec:
  global:
    allowQuery:
      - "0.0.0.0/0"        # Allow all (public DNS)
      - "10.0.0.0/8"       # Private network
      - "192.168.1.0/24"   # Specific subnet

global.allowTransfer

Type: array of strings Required: No Default: []

IP addresses or CIDR blocks allowed to perform zone transfers (AXFR/IXFR).

spec:
  global:
    allowTransfer:
      - "10.0.1.10"        # Specific secondary server
      - "10.0.1.11"        # Another secondary

Security Note: Restrict zone transfers to trusted secondary servers only.

global.dnssec

Type: object Required: No

DNSSEC configuration for signing zones and validating responses.

global.dnssec.enabled

Type: boolean Required: No Default: false

Enable DNSSEC signing for zones.

spec:
  global:
    dnssec:
      enabled: true
global.dnssec.validation

Type: boolean Required: No Default: false

Enable DNSSEC validation for recursive queries.

spec:
  global:
    dnssec:
      enabled: true
      validation: true

global.forwarders

Type: array of strings Required: No Default: []

DNS servers to forward queries to (for recursive mode).

spec:
  global:
    recursion: true
    forwarders:
      - "8.8.8.8"
      - "8.8.4.4"

global.listenOn

Type: array of strings Required: No Default: [“any”]

IPv4 addresses to listen on.

spec:
  global:
    listenOn:
      - "any"              # All IPv4 interfaces
      - "10.0.1.10"        # Specific IP

global.listenOnV6

Type: array of strings Required: No Default: [“any”]

IPv6 addresses to listen on.

spec:
  global:
    listenOnV6:
      - "any"              # All IPv6 interfaces
      - "2001:db8::1"      # Specific IPv6

Status Fields

conditions

Type: array of objects

Standard Kubernetes conditions indicating resource state.

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSuccess
      message: "Instance is ready"
      lastTransitionTime: "2024-01-15T10:30:00Z"

Condition Types:

  • Ready - Instance is ready for use
  • Available - Instance is serving DNS queries
  • Progressing - Instance is being reconciled
  • Degraded - Instance is partially functional
  • Failed - Instance reconciliation failed

observedGeneration

Type: integer

The generation of the resource that was last reconciled.

status:
  observedGeneration: 5

replicas

Type: integer

Total number of replicas configured.

status:
  replicas: 3

readyReplicas

Type: integer

Number of replicas that are ready and serving traffic.

status:
  readyReplicas: 3

Complete Example

Primary DNS Instance

# First create the Bind9Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.2.0/24"
    dnssec:
      enabled: true

---
# Then create the Bind9Instance referencing the cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
  labels:
    dns-role: primary
    environment: production
spec:
  clusterRef: production-dns  # References cluster above
  role: primary  # Required: primary or secondary
  replicas: 2
  # Inherits version and global config from cluster

Secondary DNS Instance

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  namespace: dns-system
  labels:
    dns-role: secondary
    environment: production
spec:
  clusterRef: production-dns  # References same cluster as primary
  role: secondary  # Required: primary or secondary
  replicas: 2
  # Override global config for secondary role
  global:
    allowTransfer: []  # No zone transfers from secondary
    dnssec:
      enabled: false
      validation: true

Recursive Resolver

# Separate cluster for resolvers
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: resolver-cluster
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: true
    allowQuery:
      - "10.0.0.0/8"  # Internal network only
    forwarders:
      - "8.8.8.8"
      - "1.1.1.1"
    dnssec:
      enabled: false
      validation: true

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: resolver
  namespace: dns-system
  labels:
    dns-role: resolver
spec:
  clusterRef: resolver-cluster
  role: primary  # Required: primary or secondary
  replicas: 3
  # Inherits recursive global config from cluster

DNSZone Specification

Complete specification for the DNSZone Custom Resource Definition.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: string
  namespace: string
spec:
  zoneName: string
  clusterRef: string        # References Bind9Cluster
  soaRecord:
    primaryNs: string
    adminEmail: string
    serial: integer
    refresh: integer
    retry: integer
    expire: integer
    negativeTtl: integer
  ttl: integer

Spec Fields

zoneName

Type: string Required: Yes

The DNS zone name (domain name).

spec:
  zoneName: "example.com"

Requirements:

  • Must be a valid DNS domain name
  • Maximum 253 characters
  • Can be forward or reverse zone

Examples:

  • “example.com”
  • “subdomain.example.com”
  • “1.0.10.in-addr.arpa” (reverse zone)

clusterRef

Type: string Required: Yes

Name of the Bind9Cluster that will manage this zone.

spec:
  clusterRef: production-dns  # References Bind9Cluster named "production-dns"

How It Works:

  • Controller finds Bind9Cluster with this name
  • Discovers all Bind9Instance resources referencing this cluster
  • Identifies primary instances for zone hosting
  • Loads RNDC keys from cluster configuration
  • Creates zone on primary instances using rndc addzone command
  • Configures zone transfers to secondary instances

Validation:

  • Referenced Bind9Cluster must exist in same namespace
  • Controller validates reference at admission time

soaRecord

Type: object Required: Yes

Start of Authority record defining zone parameters.

spec:
  soaRecord:
    primaryNs: "ns1.example.com."
    adminEmail: "admin.example.com."  # Note: @ replaced with .
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

soaRecord.primaryNs

Type: string Required: Yes

Primary nameserver for the zone.

soaRecord:
  primaryNs: "ns1.example.com."

Requirements:

  • Must be a fully qualified domain name (FQDN)
  • Must end with a dot (.)
  • Pattern: ^[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*\.$

soaRecord.adminEmail

Type: string Required: Yes

Email address of zone administrator in DNS format.

soaRecord:
  adminEmail: "admin.example.com."  # Represents admin@example.com

Format:

  • Replace @ with . in email address
  • Must end with a dot (.)
  • Example: admin@example.com → admin.example.com.
  • Pattern: ^[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*\.$

soaRecord.serial

Type: integer (64-bit) Required: Yes Range: 0 to 4,294,967,295

Zone serial number for change tracking.

soaRecord:
  serial: 2024010101

Best Practices:

  • Use format: YYYYMMDDnn (year, month, day, revision)
  • Increment on every change
  • Secondaries use this to detect updates

Examples:

  • 2024010101 - January 1, 2024, first revision
  • 2024010102 - January 1, 2024, second revision

soaRecord.refresh

Type: integer (32-bit) Required: Yes Range: 1 to 2,147,483,647

How often (in seconds) secondary servers should check for updates.

soaRecord:
  refresh: 3600  # 1 hour

Typical Values:

  • 3600 (1 hour) - Standard
  • 7200 (2 hours) - Less frequent updates
  • 900 (15 minutes) - Frequent updates

soaRecord.retry

Type: integer (32-bit) Required: Yes Range: 1 to 2,147,483,647

How long (in seconds) to wait before retrying a failed refresh.

soaRecord:
  retry: 600  # 10 minutes

Best Practice: Should be less than refresh value

soaRecord.expire

Type: integer (32-bit) Required: Yes Range: 1 to 2,147,483,647

How long (in seconds) secondary servers should keep serving zone data after primary becomes unreachable.

soaRecord:
  expire: 604800  # 1 week

Typical Values:

  • 604800 (1 week) - Standard
  • 1209600 (2 weeks) - Extended
  • 86400 (1 day) - Short-lived zones

soaRecord.negativeTtl

Type: integer (32-bit) Required: Yes Range: 0 to 2,147,483,647

How long (in seconds) to cache negative responses (NXDOMAIN).

soaRecord:
  negativeTtl: 86400  # 24 hours

Typical Values:

  • 86400 (24 hours) - Standard
  • 3600 (1 hour) - Shorter caching
  • 300 (5 minutes) - Very short for dynamic zones

ttl

Type: integer (32-bit) Required: No Default: 3600 Range: 0 to 2,147,483,647

Default Time To Live for records in this zone (in seconds).

spec:
  ttl: 3600  # 1 hour

Common Values:

  • 3600 (1 hour) - Standard
  • 300 (5 minutes) - Frequently changing zones
  • 86400 (24 hours) - Stable zones

Status Fields

conditions

Type: array of objects

Standard Kubernetes conditions.

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Synchronized
      message: "Zone created for cluster: primary-dns"
      lastTransitionTime: "2024-01-15T10:30:00Z"

Condition Types:

  • Ready - Zone is created and serving
  • Synced - Zone is synchronized with BIND9
  • Failed - Zone creation or update failed

observedGeneration

Type: integer

The generation last reconciled.

status:
  observedGeneration: 3

recordCount

Type: integer

Number of DNS records in this zone.

status:
  recordCount: 42

Complete Examples

Simple Primary Zone

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: primary-dns
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

Production Zone with Custom TTL

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-example-com
  namespace: dns-system
spec:
  zoneName: api.example.com
  clusterRef: production-dns
  ttl: 300  # 5 minute default TTL for faster updates
  soaRecord:
    primaryNs: ns1.api.example.com.
    adminEmail: ops.example.com.
    serial: 2024010101
    refresh: 1800   # Check every 30 minutes
    retry: 300      # Retry after 5 minutes
    expire: 604800
    negativeTtl: 300  # Short negative cache

Reverse DNS Zone

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: reverse-zone
  namespace: dns-system
spec:
  zoneName: 1.0.10.in-addr.arpa
  clusterRef: primary-dns
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

Multi-Region Setup

# East Region Zone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-east
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: dns-east  # References east instance
  soaRecord:
    primaryNs: ns1.east.example.com.
    adminEmail: admin.example.com.
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

---
# West Region Zone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-west
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: dns-west  # References west instance
  soaRecord:
    primaryNs: ns1.west.example.com.
    adminEmail: admin.example.com.
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

Zone Creation Flow

When you create a DNSZone resource:

  1. Admission - Kubernetes validates the resource schema
  2. Controller watches - Bindy controller detects the new zone
  3. Cluster lookup - Finds Bind9Cluster referenced by clusterRef
  4. Instance discovery - Finds all Bind9Instance resources referencing the cluster
  5. Primary identification - Identifies primary instances (with role: primary)
  6. RNDC key load - Retrieves RNDC keys from cluster configuration
  7. RNDC connection - Connects to primary instance pods via RNDC
  8. Zone creation - Executes rndc addzone {zoneName} ... on primary instances
  9. Zone transfer setup - Configures zone transfers to secondary instances
  10. Status update - Updates DNSZone status to Ready

DNS Record Specifications

Complete specifications for all DNS record types.

Common Fields

All DNS record types share these common fields:

zone / zoneRef

Type: string Required: Exactly one of zone or zoneRef must be specified

Reference to the parent DNSZone resource. Use one of the following:

zone field - Matches against DNSZone.spec.zoneName (the actual DNS zone name):

spec:
  zone: "example.com"  # Matches DNSZone with spec.zoneName: example.com

zoneRef field - Direct reference to DNSZone.metadata.name (the Kubernetes resource name, recommended for production):

spec:
  zoneRef: "example-com"  # Matches DNSZone with metadata.name: example-com

Important: You must specify exactly one of zone or zoneRef - not both, not neither.

See Referencing DNS Zones for detailed comparison and best practices.

name

Type: string Required: Yes

The record name within the zone.

spec:
  name: "www"  # Creates www.example.com
  name: "@"    # Creates record at zone apex (example.com)

ttl

Type: integer Required: No Default: Inherited from zone

Time To Live in seconds.

spec:
  ttl: 300  # 5 minutes

A Record (IPv4 Address)

Maps hostnames to IPv4 addresses.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example-com
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "www"
  ipv4Address: "192.0.2.1"
  ttl: 300

Fields

ipv4Address

Type: string Required: Yes

IPv4 address in dotted decimal notation.

spec:
  ipv4Address: "192.0.2.1"

Example: Multiple A Records (Round Robin)

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example-com-1
spec:
  zoneRef: "example-com"
  name: "www"
  ipv4Address: "192.0.2.1"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example-com-2
spec:
  zoneRef: "example-com"
  name: "www"
  ipv4Address: "192.0.2.2"

AAAA Record (IPv6 Address)

Maps hostnames to IPv6 addresses.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-example-com-v6
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "www"
  ipv6Address: "2001:db8::1"
  ttl: 300

Fields

ipv6Address

Type: string Required: Yes

IPv6 address in colon-separated hexadecimal notation.

spec:
  ipv6Address: "2001:db8::1"

Formats:

  • Full: “2001:0db8:0000:0000:0000:0000:0000:0001”
  • Compressed: “2001:db8::1”

Example: Dual Stack (IPv4 + IPv6)

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-v4
spec:
  zoneRef: "example-com"
  name: "www"
  ipv4Address: "192.0.2.1"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-v6
spec:
  zoneRef: "example-com"
  name: "www"
  ipv6Address: "2001:db8::1"

CNAME Record (Canonical Name)

Creates an alias from one hostname to another.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: www-alias
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "www"
  target: "server.example.com."
  ttl: 3600

Fields

target

Type: string Required: Yes

Target hostname (FQDN recommended).

spec:
  target: "server.example.com."

Restrictions

  • Cannot be created at zone apex (@)
  • Cannot coexist with other record types for same name
  • Target should be fully qualified (end with dot)

Example: CDN Alias

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cdn-alias
spec:
  zoneRef: "example-com"
  name: "cdn"
  target: "d123456.cloudfront.net."

MX Record (Mail Exchange)

Specifies mail servers for the domain.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-primary
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "@"
  priority: 10
  mailServer: "mail.example.com."
  ttl: 3600

Fields

priority

Type: integer Required: Yes

Priority (preference) value. Lower values are preferred.

spec:
  priority: 10  # Primary mail server
  priority: 20  # Backup mail server

mailServer

Type: string Required: Yes

Hostname of mail server (FQDN recommended).

spec:
  mailServer: "mail.example.com."

Example: Primary and Backup Mail Servers

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-primary
spec:
  zoneRef: "example-com"
  name: "@"
  priority: 10
  mailServer: "mail1.example.com."
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-backup
spec:
  zoneRef: "example-com"
  name: "@"
  priority: 20
  mailServer: "mail2.example.com."

TXT Record (Text)

Stores arbitrary text data, commonly used for verification and policies.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf-record
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "@"
  text:
    - "v=spf1 mx -all"
  ttl: 3600

Fields

text

Type: array of strings Required: Yes

Text values. Multiple strings are concatenated.

spec:
  text:
    - "v=spf1 mx -all"

Example: SPF, DKIM, and DMARC

---
# SPF Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf
spec:
  zoneRef: "example-com"
  name: "@"
  text:
    - "v=spf1 mx include:_spf.google.com ~all"
---
# DKIM Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dkim
spec:
  zoneRef: "example-com"
  name: "default._domainkey"
  text:
    - "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQC..."
---
# DMARC Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dmarc
spec:
  zoneRef: "example-com"
  name: "_dmarc"
  text:
    - "v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com"

NS Record (Name Server)

Delegates a subdomain to different nameservers.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: subdomain-delegation
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "subdomain"
  nameserver: "ns1.subdomain.example.com."
  ttl: 3600

Fields

nameserver

Type: string Required: Yes

Nameserver hostname (FQDN recommended).

spec:
  nameserver: "ns1.subdomain.example.com."

Example: Subdomain Delegation

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: sub-ns1
spec:
  zoneRef: "example-com"
  name: "subdomain"
  nameserver: "ns1.subdomain.example.com."
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: sub-ns2
spec:
  zoneRef: "example-com"
  name: "subdomain"
  nameserver: "ns2.subdomain.example.com."

SRV Record (Service)

Specifies location of services.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-service
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "_sip._tcp"
  priority: 10
  weight: 60
  port: 5060
  target: "sip.example.com."
  ttl: 3600

Fields

priority

Type: integer Required: Yes

Priority for target selection. Lower values are preferred.

spec:
  priority: 10

weight

Type: integer Required: Yes

Relative weight for same-priority targets.

spec:
  weight: 60  # 60% of traffic
  weight: 40  # 40% of traffic

port

Type: integer Required: Yes

Port number where service is available.

spec:
  port: 5060

target

Type: string Required: Yes

Hostname providing the service.

spec:
  target: "sip.example.com."

Example: Load Balanced Service

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-primary
spec:
  zoneRef: "example-com"
  name: "_service._tcp"
  priority: 10
  weight: 60
  port: 8080
  target: "server1.example.com."
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-secondary
spec:
  zoneRef: "example-com"
  name: "_service._tcp"
  priority: 10
  weight: 40
  port: 8080
  target: "server2.example.com."

CAA Record (Certificate Authority Authorization)

Restricts which CAs can issue certificates for the domain.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-letsencrypt
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "@"
  flags: 0
  tag: "issue"
  value: "letsencrypt.org"
  ttl: 3600

Fields

flags

Type: integer Required: Yes

Flags byte. Typically 0 (non-critical) or 128 (critical).

spec:
  flags: 0

tag

Type: string Required: Yes

Property tag.

Valid Tags:

  • “issue” - Authorize CA to issue certificates
  • “issuewild” - Authorize CA to issue wildcard certificates
  • “iodef” - URL for violation reports
spec:
  tag: "issue"

value

Type: string Required: Yes

Property value (CA domain or URL).

spec:
  value: "letsencrypt.org"

Example: Multiple CAA Records

---
# Allow Let's Encrypt for regular certs
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-issue
spec:
  zoneRef: "example-com"
  name: "@"
  flags: 0
  tag: "issue"
  value: "letsencrypt.org"
---
# Allow Let's Encrypt for wildcard certs
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-issuewild
spec:
  zoneRef: "example-com"
  name: "@"
  flags: 0
  tag: "issuewild"
  value: "letsencrypt.org"
---
# Violation reporting
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-iodef
spec:
  zoneRef: "example-com"
  name: "@"
  flags: 0
  tag: "iodef"
  value: "mailto:security@example.com"

Status Conditions

This document describes the standardized status conditions used across all Bindy CRDs.

Condition Types

All Bindy custom resources (Bind9Instance, DNSZone, and all DNS record types) use the following standardized condition types:

Ready

  • Description: Indicates whether the resource is fully operational and ready to serve its intended purpose
  • Common Use: Primary condition type used by all reconcilers
  • Status Values:
    • True: Resource is ready and operational
    • False: Resource is not ready (error or in progress)
    • Unknown: Status cannot be determined

Available

  • Description: Indicates whether the resource is available for use
  • Common Use: Used to distinguish between “ready” and “available” when resources may be ready but not yet serving traffic
  • Status Values:
    • True: Resource is available
    • False: Resource is not available
    • Unknown: Availability cannot be determined

Progressing

  • Description: Indicates whether the resource is currently being worked on
  • Common Use: During initial creation or updates
  • Status Values:
    • True: Resource is being created or updated
    • False: Resource is not currently progressing
    • Unknown: Progress status cannot be determined

Degraded

  • Description: Indicates that the resource is functioning but in a degraded state
  • Common Use: When some replicas are down but service continues, or when non-critical features are unavailable
  • Status Values:
    • True: Resource is degraded
    • False: Resource is not degraded
    • Unknown: Degradation status cannot be determined

Failed

  • Description: Indicates that the resource has failed and cannot fulfill its purpose
  • Common Use: Permanent failures that require intervention
  • Status Values:
    • True: Resource has failed
    • False: Resource has not failed
    • Unknown: Failure status cannot be determined

Condition Structure

All conditions follow this structure:

status:
  conditions:
    - type: Ready              # One of: Ready, Available, Progressing, Degraded, Failed
      status: "True"           # One of: "True", "False", "Unknown"
      reason: Ready            # Machine-readable reason (typically same as type)
      message: "Bind9Instance configured with 2 replicas"  # Human-readable message
      lastTransitionTime: "2024-11-26T10:00:00Z"          # RFC3339 timestamp
  observedGeneration: 1        # Generation last observed by controller
  # Resource-specific fields (replicas, recordCount, etc.)

Current Usage

Bind9Instance

  • Uses Ready condition type
  • Status True when Deployment, Service, and ConfigMap are successfully created
  • Status False when resource creation fails
  • Additional status fields:
    • replicas: Total number of replicas
    • readyReplicas: Number of ready replicas

Bind9Cluster

  • Uses Ready condition type with granular reasons
  • Condition reasons:
    • AllInstancesReady: All instances in the cluster are ready
    • SomeInstancesNotReady: Some instances are not ready (cluster partially functional)
    • NoInstancesReady: No instances are ready (cluster not functional)
  • Additional status fields:
    • instanceCount: Total number of instances
    • readyInstances: Number of ready instances
    • instances: List of instance names

DNSZone

  • Uses Progressing, Degraded, and Ready condition types with granular reasons
  • Reconciliation Flow:
    1. Progressing/PrimaryReconciling: Before configuring primary instances
    2. Progressing/PrimaryReconciled: After successful primary configuration
    3. Progressing/SecondaryReconciling: Before configuring secondary instances
    4. Progressing/SecondaryReconciled: After successful secondary configuration
    5. Ready/ReconcileSucceeded: When all phases complete successfully
  • Error Conditions:
    • Degraded/PrimaryFailed: Primary reconciliation failed (fatal error)
    • Degraded/SecondaryFailed: Secondary reconciliation failed (primaries still work, non-fatal)
  • Additional status fields:
    • recordCount: Number of records in the zone
    • secondaryIps: IP addresses of configured secondary servers
    • observedGeneration: Last observed generation

DNS Records (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)

  • Use Progressing, Degraded, and Ready condition types with granular reasons
  • Reconciliation Flow:
    1. Progressing/RecordReconciling: Before configuring record on endpoints
    2. Ready/ReconcileSucceeded: When record is successfully configured on all endpoints
  • Error Conditions:
    • Degraded/RecordFailed: Record configuration failed (includes error details)
  • Status message includes count of configured endpoints (e.g., “Record configured on 3 endpoint(s)”)
  • Additional status fields:
    • observedGeneration: Last observed generation

Best Practices

  1. Always set the condition type: Use one of the five standardized types
  2. Include timestamps: Set lastTransitionTime when condition status changes
  3. Provide clear messages: The message field should be human-readable and actionable
  4. Use appropriate reasons: The reason field should be machine-readable and consistent
  5. Update observedGeneration: Always update to match the resource’s current generation
  6. Multiple conditions: Resources can have multiple conditions simultaneously (e.g., Ready: True and Degraded: True)

Examples

Successful Bind9Instance

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Ready
      message: "Bind9Instance configured with 2 replicas"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  replicas: 2
  readyReplicas: 2

DNSZone - Progressing (Primary Reconciliation)

status:
  conditions:
    - type: Progressing
      status: "True"
      reason: PrimaryReconciling
      message: "Configuring zone on primary instances"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  recordCount: 0

DNSZone - Progressing (Secondary Reconciliation)

status:
  conditions:
    - type: Progressing
      status: "True"
      reason: SecondaryReconciling
      message: "Configured on 2 primary server(s), now configuring secondaries"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1
  recordCount: 0
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"

DNSZone - Successfully Reconciled

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Configured on 2 primary server(s) and 3 secondary server(s)"
      lastTransitionTime: "2024-11-26T10:00:02Z"
  observedGeneration: 1
  recordCount: 5
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"
    - "10.42.0.7"

DNSZone - Degraded (Secondary Failure)

status:
  conditions:
    - type: Degraded
      status: "True"
      reason: SecondaryFailed
      message: "Configured on 2 primary server(s), but secondary configuration failed: connection timeout"
      lastTransitionTime: "2024-11-26T10:00:02Z"
  observedGeneration: 1
  recordCount: 5
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"

DNSZone - Failed (Primary Failure)

status:
  conditions:
    - type: Degraded
      status: "True"
      reason: PrimaryFailed
      message: "Failed to configure zone on primaries: No Bind9Instances matched selector"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  recordCount: 0

DNS Record - Progressing

status:
  conditions:
    - type: Progressing
      status: "True"
      reason: RecordReconciling
      message: "Configuring A record on zone endpoints"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1

DNS Record - Successfully Configured

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

DNS Record - Failed

status:
  conditions:
    - type: Degraded
      status: "True"
      reason: RecordFailed
      message: "Failed to configure record: Zone not found on primary servers"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

Bind9Cluster - Partially Ready

status:
  conditions:
    - type: Ready
      status: "False"
      reason: SomeInstancesNotReady
      message: "2/3 instances ready"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  instanceCount: 3
  readyInstances: 2
  instances:
    - production-dns-primary-0
    - production-dns-primary-1
    - production-dns-secondary-0

Validation

All condition types are enforced via CRD validation. Attempting to use a condition type not in the enum will result in a validation error:

$ kubectl apply -f invalid-condition.yaml
Error from server (Invalid): error when creating "invalid-condition.yaml":
Bind9Instance.bindy.firestoned.io "test" is invalid:
status.conditions[0].type: Unsupported value: "CustomType":
supported values: "Ready", "Available", "Progressing", "Degraded", "Failed"

Configuration Examples

Complete configuration examples for common Bindy deployment scenarios.

Overview

This section provides ready-to-use YAML configurations for various deployment scenarios:

Quick Reference

Minimal Configuration

Minimal viable configuration for testing:

# Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dns
  namespace: dns-system
  labels:
    dns-role: primary
spec:
  replicas: 1
---
# DNSZone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: "example.com"
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: "ns1.example.com."
    adminEmail: "admin@example.com"
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
---
# A Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www
  namespace: dns-system
spec:
  zone: "example-com"
  name: "www"
  ipv4Address: "192.0.2.1"

Common Patterns

Primary/Secondary Setup

# Primary
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary
  labels:
    dns-role: primary
spec:
  replicas: 2
  config:
    allowTransfer:
      - "10.0.2.0/24"  # Secondary network
---
# Secondary
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary
  labels:
    dns-role: secondary
spec:
  replicas: 2
---
# Zone on Primary
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-primary
spec:
  zoneName: "example.com"
  zoneType: "primary"
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: "ns1.example.com."
    adminEmail: "admin@example.com"
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
---
# Zone on Secondary
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-secondary
spec:
  zoneName: "example.com"
  zoneType: "secondary"
  instanceSelector:
    matchLabels:
      dns-role: secondary
  secondaryConfig:
    primaryServers:
      - "10.0.1.10"
      - "10.0.1.11"

DNSSEC Enabled

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dnssec-instance
spec:
  replicas: 2
  config:
    dnssec:
      enabled: true
      validation: true

Custom Container Image

Using a custom or private container image:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: custom-image-cluster
  namespace: dns-system
spec:
  # Default image for all instances in this cluster
  image:
    image: "my-registry.example.com/bind9:custom-9.18"
    imagePullPolicy: "Always"
    imagePullSecrets:
      - my-registry-secret
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: custom-dns
  namespace: dns-system
spec:
  clusterRef: custom-image-cluster
  replicas: 2
  # Instance inherits custom image from cluster

Instance-Specific Custom Image

Override cluster image for specific instance:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: prod-cluster
  namespace: dns-system
spec:
  image:
    image: "internetsystemsconsortium/bind9:9.18"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: canary-dns
  namespace: dns-system
spec:
  clusterRef: prod-cluster
  replicas: 1
  # Override cluster image for canary testing
  image:
    image: "internetsystemsconsortium/bind9:9.19"
    imagePullPolicy: "Always"

Custom Configuration Files

Using custom ConfigMaps for BIND9 configuration:

# Create custom ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-named-conf
  namespace: dns-system
data:
  named.conf: |
    // Custom BIND9 configuration
    include "/etc/bind/named.conf.options";
    include "/etc/bind/zones/named.conf.zones";

    logging {
      channel query_log {
        file "/var/log/named/queries.log" versions 5 size 10m;
        severity info;
        print-time yes;
        print-category yes;
      };
      category queries { query_log; };
      category lame-servers { null; };
    };
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-options
  namespace: dns-system
data:
  named.conf.options: |
    options {
      directory "/var/cache/bind";
      recursion no;
      allow-query { any; };
      allow-transfer { 10.0.2.0/24; };
      dnssec-validation auto;
      listen-on { any; };
      listen-on-v6 { any; };
      max-cache-size 256M;
      max-cache-ttl 3600;
    };
---
# Reference custom ConfigMaps
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: custom-config-dns
  namespace: dns-system
spec:
  replicas: 2
  configMapRefs:
    namedConf: "my-custom-named-conf"
    namedConfOptions: "my-custom-options"

Cluster-Level Custom ConfigMaps

Share custom configuration across all instances:

apiVersion: v1
kind: ConfigMap
metadata:
  name: shared-options
  namespace: dns-system
data:
  named.conf.options: |
    options {
      directory "/var/cache/bind";
      recursion no;
      allow-query { any; };
      dnssec-validation auto;
    };
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: shared-config-cluster
  namespace: dns-system
spec:
  configMapRefs:
    namedConfOptions: "shared-options"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: instance-1
  namespace: dns-system
spec:
  clusterRef: shared-config-cluster
  replicas: 2
  # Inherits configMapRefs from cluster
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: instance-2
  namespace: dns-system
spec:
  clusterRef: shared-config-cluster
  replicas: 2
  # Also inherits same configMapRefs from cluster

Split Horizon DNS

# Internal DNS
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: internal-dns
  labels:
    dns-view: internal
spec:
  config:
    allowQuery:
      - "10.0.0.0/8"
---
# External DNS
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: external-dns
  labels:
    dns-view: external
spec:
  config:
    allowQuery:
      - "0.0.0.0/0"

Resource Organization

Namespace Structure

Recommended namespace organization:

# Separate namespaces by environment
dns-system-prod      # Production DNS
dns-system-staging   # Staging DNS
dns-system-dev       # Development DNS

Label Strategy

Recommended labels:

metadata:
  labels:
    # Core labels
    app.kubernetes.io/name: bindy
    app.kubernetes.io/component: dns-server
    app.kubernetes.io/part-of: dns-infrastructure

    # Custom labels
    dns-role: primary              # primary, secondary, resolver
    environment: production         # production, staging, dev
    region: us-east-1              # Geographic region
    zone-type: authoritative       # authoritative, recursive

Naming Conventions

Recommended naming:

# Bind9Instance: <role>-<region>
name: primary-us-east-1

# DNSZone: <domain-with-dashes>
name: example-com

# Records: <name>-<type>-<identifier>
name: www-a-record
name: mail-mx-primary

Testing Configurations

Local Development (kind/minikube)

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dev-dns
  namespace: dns-system
spec:
  replicas: 1
  config:
    recursion: true
    forwarders:
      - "8.8.8.8"
    allowQuery:
      - "0.0.0.0/0"

CI/CD Testing

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: ci-dns
  namespace: ci-testing
  labels:
    ci-test: "true"
spec:
  replicas: 1
  config:
    recursion: false
    allowQuery:
      - "10.0.0.0/8"

Troubleshooting Examples

Debug Configuration

Enable verbose logging:

apiVersion: v1
kind: ConfigMap
metadata:
  name: bindy-config
data:
  RUST_LOG: "debug"
  RECONCILE_INTERVAL: "60"

Dry Run Testing

Test configuration without applying:

kubectl apply --dry-run=client -f dns-config.yaml
kubectl apply --dry-run=server -f dns-config.yaml

Validation

Validate resources:

# Check instance status
kubectl get bind9instances -A

# Check zone status
kubectl get dnszones -A

# Check all DNS records
kubectl get arecords,aaaarecords,cnamerecords,mxrecords,txtrecords -A

Complete Examples

For complete, production-ready configurations see:

Simple Setup Example

Complete configuration for a basic single-instance DNS setup.

Overview

This example demonstrates:

  • Single Bind9Instance
  • One DNS zone (example.com)
  • Common DNS records (A, AAAA, CNAME, MX, TXT)
  • Suitable for testing and development

Prerequisites

  • Kubernetes cluster (kind, minikube, or cloud)
  • kubectl configured
  • Bindy operator installed

Configuration

Complete YAML

Save as simple-dns.yaml:

---
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: dns-system

---
# Bind9Instance - Single DNS Server
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: simple-dns
  namespace: dns-system
  labels:
    app: bindy
    dns-role: primary
    environment: development
spec:
  replicas: 1
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer: []
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# DNSZone - example.com
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: "example.com"
  zoneType: "primary"
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: "ns1.example.com."
    adminEmail: "admin@example.com"
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

---
# A Record - Nameserver
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns1-a-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "ns1"
  ipv4Address: "192.0.2.1"
  ttl: 3600

---
# A Record - Web Server
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-a-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "www"
  ipv4Address: "192.0.2.10"
  ttl: 300

---
# AAAA Record - Web Server (IPv6)
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-aaaa-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "www"
  ipv6Address: "2001:db8::10"
  ttl: 300

---
# A Record - Mail Server
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: mail-a-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "mail"
  ipv4Address: "192.0.2.20"
  ttl: 3600

---
# MX Record - Mail Exchange
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "@"
  priority: 10
  mailServer: "mail.example.com."
  ttl: 3600

---
# TXT Record - SPF
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "@"
  text:
    - "v=spf1 mx -all"
  ttl: 3600

---
# TXT Record - DMARC
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dmarc-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "_dmarc"
  text:
    - "v=DMARC1; p=none; rua=mailto:dmarc@example.com"
  ttl: 3600

---
# CNAME Record - API Alias
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: api-cname-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "api"
  target: "www.example.com."
  ttl: 3600

Deployment

1. Install CRDs

kubectl apply -k deploy/crds/

2. Deploy Bindy Operator

kubectl apply -f deploy/controller/deployment.yaml

3. Apply Configuration

kubectl apply -f simple-dns.yaml

4. Verify Deployment

# Check Bind9Instance
kubectl get bind9instances -n dns-system
kubectl describe bind9instance simple-dns -n dns-system

# Check DNSZone
kubectl get dnszones -n dns-system
kubectl describe dnszone example-com -n dns-system

# Check DNS Records
kubectl get arecords,aaaarecords,cnamerecords,mxrecords,txtrecords -n dns-system

# Check pods
kubectl get pods -n dns-system

# Check logs
kubectl logs -n dns-system -l app=bindy

Testing

DNS Queries

Get the DNS service IP:

DNS_IP=$(kubectl get svc -n dns-system simple-dns -o jsonpath='{.spec.clusterIP}')

Test DNS resolution:

# A record
dig @${DNS_IP} www.example.com A

# AAAA record
dig @${DNS_IP} www.example.com AAAA

# MX record
dig @${DNS_IP} example.com MX

# TXT record
dig @${DNS_IP} example.com TXT

# CNAME record
dig @${DNS_IP} api.example.com CNAME

Expected responses:

; www.example.com A
www.example.com.    300    IN    A    192.0.2.10

; www.example.com AAAA
www.example.com.    300    IN    AAAA    2001:db8::10

; example.com MX
example.com.        3600   IN    MX    10 mail.example.com.

; example.com TXT
example.com.        3600   IN    TXT   "v=spf1 mx -all"

; api.example.com CNAME
api.example.com.    3600   IN    CNAME www.example.com.

Port Forward for External Testing

# Forward DNS port to localhost
kubectl port-forward -n dns-system svc/simple-dns 5353:53

# Test from local machine
dig @localhost -p 5353 www.example.com

Monitoring

Check Status

# Instance status
kubectl get bind9instance simple-dns -n dns-system -o yaml | grep -A 10 status

# Zone status
kubectl get dnszone example-com -n dns-system -o yaml | grep -A 10 status

# Record status
kubectl get arecord www-a-record -n dns-system -o yaml | grep -A 10 status

View Logs

# Controller logs
kubectl logs -n dns-system deployment/bindy

# BIND9 logs
kubectl logs -n dns-system -l app=bindy,dns-role=primary

Updating Configuration

Add New Record

cat <<EOF | kubectl apply -f -
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: app-a-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "app"
  ipv4Address: "192.0.2.30"
  ttl: 300
EOF

Update SOA Serial

kubectl edit dnszone example-com -n dns-system

# Update serial field:
# serial: 2024010102

Scale Instance

kubectl patch bind9instance simple-dns -n dns-system \
  --type merge \
  --patch '{"spec":{"replicas":2}}'

Cleanup

Remove All Resources

kubectl delete -f simple-dns.yaml

Remove Namespace

kubectl delete namespace dns-system

Next Steps

Troubleshooting

Pods Not Starting

# Check pod events
kubectl describe pod -n dns-system -l app=bindy

# Check controller logs
kubectl logs -n dns-system deployment/bindy

DNS Not Resolving

# Check zone status
kubectl get dnszone example-com -n dns-system -o yaml

# Check BIND9 logs
kubectl logs -n dns-system -l app=bindy,dns-role=primary

# Verify zone file
kubectl exec -n dns-system -it <pod-name> -- cat /var/lib/bind/zones/example.com.zone

Record Not Appearing

# Check record status
kubectl get arecord www-a-record -n dns-system -o yaml

# Check zone record count
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.recordCount}'

Production Setup Example

Production-ready configuration with high availability, monitoring, and security.

Overview

This example demonstrates:

  • Primary/Secondary HA setup
  • Multiple replicas with pod anti-affinity
  • Resource limits and requests
  • PodDisruptionBudgets
  • DNSSEC enabled
  • Monitoring and logging
  • Production-grade security

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      Production DNS                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   Primary Instances (2 replicas)                            │
│   ┌──────────────┐  ┌──────────────┐                       │
│   │   Primary-1  │  │   Primary-2  │                       │
│   │  (us-east-1a)│  │  (us-east-1b)│                       │
│   └──────┬───────┘  └──────┬───────┘                       │
│          │                  │                               │
│          └──────────┬───────┘                               │
│                     │ Zone Transfer (AXFR/IXFR)            │
│          ┌──────────┴───────┐                               │
│          │                  │                               │
│   ┌──────▼───────┐  ┌──────▼───────┐                       │
│   │ Secondary-1  │  │ Secondary-2  │                       │
│   │ (us-west-2a) │  │ (us-west-2b) │                       │
│   └──────────────┘  └──────────────┘                       │
│                                                              │
│   Secondary Instances (2 replicas)                          │
└─────────────────────────────────────────────────────────────┘

Complete Configuration

Save as production-dns.yaml:

---
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: dns-system-prod
  labels:
    environment: production

---
# ConfigMap for Controller Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: bindy-config
  namespace: dns-system-prod
data:
  RUST_LOG: "info"
  RECONCILE_INTERVAL: "300"

---
# Primary Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system-prod
  labels:
    app: bindy
    dns-role: primary
    environment: production
    component: dns-server
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.2.0/24"  # Secondary instance subnet
    dnssec:
      enabled: true
      validation: false
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# Secondary Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  namespace: dns-system-prod
  labels:
    app: bindy
    dns-role: secondary
    environment: production
    component: dns-server
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    dnssec:
      enabled: false
      validation: true
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# PodDisruptionBudget for Primary
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: primary-dns-pdb
  namespace: dns-system-prod
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: bindy
      dns-role: primary

---
# PodDisruptionBudget for Secondary
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: secondary-dns-pdb
  namespace: dns-system-prod
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: bindy
      dns-role: secondary

---
# DNSZone - Primary
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-primary
  namespace: dns-system-prod
spec:
  zoneName: "example.com"
  zoneType: "primary"
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: "ns1.example.com."
    adminEmail: "dns-admin@example.com"
    serial: 2024010101
    refresh: 900       # 15 minutes - production refresh
    retry: 300         # 5 minutes
    expire: 604800     # 1 week
    negativeTtl: 300   # 5 minutes
  ttl: 300  # 5 minutes default TTL

---
# DNSZone - Secondary
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
  namespace: dns-system-prod
spec:
  zoneName: "example.com"
  zoneType: "secondary"
  instanceSelector:
    matchLabels:
      dns-role: secondary
  secondaryConfig:
    primaryServers:
      - "10.0.1.10"
      - "10.0.1.11"
  ttl: 300

---
# Production DNS Records
# Nameservers
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns1-primary
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "ns1"
  ipv4Address: "192.0.2.1"
  ttl: 86400  # 24 hours for NS records

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns2-secondary
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "ns2"
  ipv4Address: "192.0.2.2"
  ttl: 86400

---
# Load Balanced Web Servers (Round Robin)
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-lb-1
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "www"
  ipv4Address: "192.0.2.10"
  ttl: 60  # 1 minute for load balanced IPs

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-lb-2
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "www"
  ipv4Address: "192.0.2.11"
  ttl: 60

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-lb-3
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "www"
  ipv4Address: "192.0.2.12"
  ttl: 60

---
# Dual Stack for www
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-v6-1
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "www"
  ipv6Address: "2001:db8::10"
  ttl: 60

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-v6-2
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "www"
  ipv6Address: "2001:db8::11"
  ttl: 60

---
# Mail Infrastructure
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: mail1
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "mail1"
  ipv4Address: "192.0.2.20"
  ttl: 3600

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: mail2
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "mail2"
  ipv4Address: "192.0.2.21"
  ttl: 3600

---
# MX Records - Primary and Backup
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-primary
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  priority: 10
  mailServer: "mail1.example.com."
  ttl: 3600

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-backup
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  priority: 20
  mailServer: "mail2.example.com."
  ttl: 3600

---
# SPF Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  text:
    - "v=spf1 mx ip4:192.0.2.20/32 ip4:192.0.2.21/32 -all"
  ttl: 3600

---
# DKIM Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dkim
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "default._domainkey"
  text:
    - "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQC..."
  ttl: 3600

---
# DMARC Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dmarc
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "_dmarc"
  text:
    - "v=DMARC1; p=quarantine; pct=100; rua=mailto:dmarc-reports@example.com; ruf=mailto:dmarc-forensics@example.com"
  ttl: 3600

---
# CAA Records
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-issue
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  flags: 0
  tag: "issue"
  value: "letsencrypt.org"
  ttl: 86400

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-issuewild
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  flags: 0
  tag: "issuewild"
  value: "letsencrypt.org"
  ttl: 86400

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-iodef
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  flags: 0
  tag: "iodef"
  value: "mailto:security@example.com"
  ttl: 86400

---
# Service Records
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-sip-tcp
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "_sip._tcp"
  priority: 10
  weight: 60
  port: 5060
  target: "sip1.example.com."
  ttl: 3600

---
# CDN CNAME
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cdn
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "cdn"
  target: "d123456.cloudfront.net."
  ttl: 3600

Deployment

1. Prerequisites

# Create namespace
kubectl create namespace dns-system-prod

# Label nodes for DNS pods (optional but recommended)
kubectl label nodes node1 dns-zone=primary
kubectl label nodes node2 dns-zone=primary
kubectl label nodes node3 dns-zone=secondary
kubectl label nodes node4 dns-zone=secondary

2. Deploy

kubectl apply -f production-dns.yaml

3. Verify

# Check all instances
kubectl get bind9instances -n dns-system-prod
kubectl get dnszones -n dns-system-prod
kubectl get pods -n dns-system-prod -o wide

# Check PodDisruptionBudgets
kubectl get pdb -n dns-system-prod

# Verify HA distribution
kubectl get pods -n dns-system-prod -o custom-columns=\
NAME:.metadata.name,\
NODE:.spec.nodeName,\
ROLE:.metadata.labels.dns-role

Monitoring

Prometheus Metrics

apiVersion: v1
kind: Service
metadata:
  name: bindy-metrics
  namespace: dns-system-prod
  labels:
    app: bindy
spec:
  ports:
    - name: metrics
      port: 9090
      targetPort: 9090
  selector:
    app: bindy

ServiceMonitor (for Prometheus Operator)

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: bindy-dns
  namespace: dns-system-prod
spec:
  selector:
    matchLabels:
      app: bindy
  endpoints:
    - port: metrics
      interval: 30s

Backup and Disaster Recovery

Backup Zones

#!/bin/bash
# backup-zones.sh

NAMESPACE="dns-system-prod"
BACKUP_DIR="./dns-backups/$(date +%Y%m%d)"

mkdir -p "$BACKUP_DIR"

# Backup all zones
kubectl get dnszones -n $NAMESPACE -o yaml > "$BACKUP_DIR/zones.yaml"

# Backup all records
kubectl get arecords,aaaarecords,cnamerecords,mxrecords,txtrecords,nsrecords,srvrecords,caarecords \
  -n $NAMESPACE -o yaml > "$BACKUP_DIR/records.yaml"

echo "Backup completed: $BACKUP_DIR"

Restore

kubectl apply -f dns-backups/20240115/zones.yaml
kubectl apply -f dns-backups/20240115/records.yaml

Security Hardening

Network Policies

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dns-allow-queries
  namespace: dns-system-prod
spec:
  podSelector:
    matchLabels:
      app: bindy
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53

Pod Security Standards

apiVersion: v1
kind: Namespace
metadata:
  name: dns-system-prod
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Performance Tuning

Resource Limits

spec:
  resources:
    requests:
      memory: "512Mi"
      cpu: "500m"
    limits:
      memory: "1Gi"
      cpu: "1000m"

HorizontalPodAutoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: primary-dns-hpa
  namespace: dns-system-prod
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: primary-dns
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Testing

Load Testing

# Using dnsperf
dnsperf -s <DNS_IP> -d queries.txt -c 100 -l 60

# queries.txt format:
# www.example.com A
# mail1.example.com A
# example.com MX

Failover Testing

# Delete primary pod to test failover
kubectl delete pod -n dns-system-prod -l dns-role=primary --force

# Monitor DNS continues to serve
dig @<DNS_IP> www.example.com

Multi-Region Setup Example

Geographic distribution for global DNS resilience and performance.

Overview

This example demonstrates:

  • Primary instances in multiple regions
  • Secondary instances for redundancy
  • Zone replication across regions
  • Anycast for geographic load balancing
  • Cross-region monitoring

Architecture

┌────────────────────────────────────────────────────────────────────┐
│                        Global DNS Infrastructure                    │
└────────────────────────────────────────────────────────────────────┘

  Region 1: us-east-1           Region 2: us-west-2         Region 3: eu-west-1
┌─────────────────────┐      ┌─────────────────────┐     ┌─────────────────────┐
│  Primary Instances  │      │ Secondary Instances │     │ Secondary Instances │
│                     │      │                     │     │                     │
│  ┌────┐  ┌────┐   │◄─────┤  ┌────┐  ┌────┐    │◄────┤  ┌────┐  ┌────┐    │
│  │Pod1│  │Pod2│   │ AXFR │  │Pod1│  │Pod2│    │AXFR │  │Pod1│  │Pod2│    │
│  └────┘  └────┘   │      │  └────┘  └────┘    │     │  └────┘  └────┘    │
│                     │      │                     │     │                     │
│  DNSSEC: Enabled    │      │  DNSSEC: Verify    │     │  DNSSEC: Verify    │
│  Replicas: 2        │      │  Replicas: 2        │     │  Replicas: 2        │
└─────────────────────┘      └─────────────────────┘     └─────────────────────┘
         │                            │                            │
         └────────────────────────────┴────────────────────────────┘
                                      │
                              Anycast IP: 192.0.2.1
                        (Routes to nearest region)

Region 1: us-east-1 (Primary)

Save as region-us-east-1.yaml:

---
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: dns-system
  labels:
    region: us-east-1
    role: primary

---
# Primary Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-us-east-1
  namespace: dns-system
  labels:
    app: bindy
    dns-role: primary
    region: us-east-1
    environment: production
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.1.0.0/16"  # us-west-2 CIDR
      - "10.2.0.0/16"  # eu-west-1 CIDR
    dnssec:
      enabled: true
      validation: false
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# PodDisruptionBudget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: primary-dns-pdb
  namespace: dns-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      dns-role: primary
      region: us-east-1

---
# Primary DNSZone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-primary
  namespace: dns-system
spec:
  zoneName: "example.com"
  zoneType: "primary"
  instanceSelector:
    matchLabels:
      dns-role: primary
      region: us-east-1
  soaRecord:
    primaryNs: "ns1.us-east-1.example.com."
    adminEmail: "dns-admin@example.com"
    serial: 2024010101
    refresh: 900
    retry: 300
    expire: 604800
    negativeTtl: 300
  ttl: 300

---
# Nameserver Records
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns1-us-east-1
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "ns1.us-east-1"
  ipv4Address: "192.0.2.1"
  ttl: 86400

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns2-us-west-2
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "ns2.us-west-2"
  ipv4Address: "192.0.2.2"
  ttl: 86400

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns3-eu-west-1
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "ns3.eu-west-1"
  ipv4Address: "192.0.2.3"
  ttl: 86400

---
# Regional Web Servers
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-us-east-1
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "www.us-east-1"
  ipv4Address: "192.0.2.10"
  ttl: 60

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-us-west-2
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "www.us-west-2"
  ipv4Address: "192.0.2.20"
  ttl: 60

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-eu-west-1
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "www.eu-west-1"
  ipv4Address: "192.0.2.30"
  ttl: 60

---
# GeoDNS using SRV records for service discovery
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-web-us-east
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "_http._tcp.us-east-1"
  priority: 10
  weight: 100
  port: 80
  target: "www.us-east-1.example.com."
  ttl: 300

Region 2: us-west-2 (Secondary)

Save as region-us-west-2.yaml:

---
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: dns-system
  labels:
    region: us-west-2
    role: secondary

---
# Secondary Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-us-west-2
  namespace: dns-system
  labels:
    app: bindy
    dns-role: secondary
    region: us-west-2
    environment: production
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    dnssec:
      enabled: false
      validation: true
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# PodDisruptionBudget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: secondary-dns-pdb
  namespace: dns-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      dns-role: secondary
      region: us-west-2

---
# Secondary DNSZone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
  namespace: dns-system
spec:
  zoneName: "example.com"
  zoneType: "secondary"
  instanceSelector:
    matchLabels:
      dns-role: secondary
      region: us-west-2
  secondaryConfig:
    primaryServers:
      - "192.0.2.1"   # Primary in us-east-1
      - "192.0.2.2"
  ttl: 300

Region 3: eu-west-1 (Secondary)

Save as region-eu-west-1.yaml:

---
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: dns-system
  labels:
    region: eu-west-1
    role: secondary

---
# Secondary Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-eu-west-1
  namespace: dns-system
  labels:
    app: bindy
    dns-role: secondary
    region: eu-west-1
    environment: production
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    dnssec:
      enabled: false
      validation: true
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# PodDisruptionBudget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: secondary-dns-pdb
  namespace: dns-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      dns-role: secondary
      region: eu-west-1

---
# Secondary DNSZone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
  namespace: dns-system
spec:
  zoneName: "example.com"
  zoneType: "secondary"
  instanceSelector:
    matchLabels:
      dns-role: secondary
      region: eu-west-1
  secondaryConfig:
    primaryServers:
      - "192.0.2.1"   # Primary in us-east-1
      - "192.0.2.2"
  ttl: 300

Deployment

1. Deploy to Each Region

# us-east-1
kubectl apply -f region-us-east-1.yaml --context us-east-1

# us-west-2
kubectl apply -f region-us-west-2.yaml --context us-west-2

# eu-west-1
kubectl apply -f region-eu-west-1.yaml --context eu-west-1

2. Verify Replication

# Check zone transfer from primary
kubectl exec -n dns-system -it <primary-pod> -- \
  dig @localhost example.com AXFR

# Verify secondary received zone
kubectl exec -n dns-system -it <secondary-pod> -- \
  dig @localhost example.com SOA

3. Configure Anycast (Infrastructure Level)

This requires network infrastructure support:

# Example using MetalLB for on-premises
apiVersion: v1
kind: Service
metadata:
  name: dns-anycast
  namespace: dns-system
  annotations:
    metallb.universe.tf/address-pool: anycast-pool
spec:
  type: LoadBalancer
  loadBalancerIP: 192.0.2.1  # Same IP in all regions
  selector:
    app: bindy
  ports:
    - protocol: UDP
      port: 53
      targetPort: 53

Cross-Region Monitoring

Prometheus Federation

# Global Prometheus Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s
    
    scrape_configs:
      # us-east-1
      - job_name: 'dns-us-east-1'
        static_configs:
          - targets: ['prometheus.us-east-1.example.com:9090']
        metric_relabel_configs:
          - source_labels: [__name__]
            regex: 'dns_.*'
            action: keep
      
      # us-west-2
      - job_name: 'dns-us-west-2'
        static_configs:
          - targets: ['prometheus.us-west-2.example.com:9090']
        metric_relabel_configs:
          - source_labels: [__name__]
            regex: 'dns_.*'
            action: keep
      
      # eu-west-1
      - job_name: 'dns-eu-west-1'
        static_configs:
          - targets: ['prometheus.eu-west-1.example.com:9090']
        metric_relabel_configs:
          - source_labels: [__name__]
            regex: 'dns_.*'
            action: keep

Health Checks

#!/bin/bash
# health-check-multi-region.sh

REGIONS=("us-east-1" "us-west-2" "eu-west-1")
QUERY="www.example.com"

for region in "${REGIONS[@]}"; do
  echo "Checking $region..."
  
  # Get DNS service IP
  DNS_IP=$(kubectl get svc -n dns-system --context $region \
    -o jsonpath='{.items[0].status.loadBalancer.ingress[0].ip}')
  
  # Test query
  if dig @$DNS_IP $QUERY +short > /dev/null; then
    echo "✓ $region: OK"
  else
    echo "✗ $region: FAILED"
  fi
done

Disaster Recovery

Regional Failover

# Promote secondary in us-west-2 to primary
kubectl patch bind9instance secondary-us-west-2 \
  -n dns-system --context us-west-2 \
  --type merge \
  --patch '{"metadata":{"labels":{"dns-role":"primary"}}}'

# Update zone to primary
kubectl patch dnszone example-com-secondary \
  -n dns-system --context us-west-2 \
  --type merge \
  --patch '{"spec":{"zoneType":"primary"}}'

Backup Strategy

#!/bin/bash
# backup-all-regions.sh

REGIONS=("us-east-1" "us-west-2" "eu-west-1")
BACKUP_DIR="./multi-region-backups/$(date +%Y%m%d)"

mkdir -p "$BACKUP_DIR"

for region in "${REGIONS[@]}"; do
  echo "Backing up $region..."
  
  kubectl get dnszones,arecords,aaaarecords,cnamerecords,mxrecords,txtrecords \
    -n dns-system --context $region -o yaml \
    > "$BACKUP_DIR/$region.yaml"
done

echo "Backup completed: $BACKUP_DIR"

Performance Testing

Global Latency Test

#!/bin/bash
# test-global-latency.sh

REGIONS=(
  "us-east-1:192.0.2.1"
  "us-west-2:192.0.2.2"
  "eu-west-1:192.0.2.3"
)

for region_ip in "${REGIONS[@]}"; do
  region="${region_ip%%:*}"
  ip="${region_ip##*:}"
  
  echo "Testing $region ($ip)..."
  
  # Measure query time
  time dig @$ip www.example.com +short
done

Load Distribution

# Using dnsperf across regions
for region in us-east-1 us-west-2 eu-west-1; do
  dnsperf -s $DNS_IP -d queries.txt -c 50 -l 30 -Q 1000 | \
    tee results-$region.txt
done

Cost Optimization

Regional Scaling

# HPA for each region based on local load
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: dns-hpa-us-east-1
  namespace: dns-system
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: primary-us-east-1
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Compliance and Data Residency

Regional Data Isolation

# EU-specific zone for GDPR compliance
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: eu-example-com
  namespace: dns-system
  labels:
    compliance: gdpr
spec:
  zoneName: "eu.example.com"
  zoneType: "primary"
  instanceSelector:
    matchLabels:
      region: eu-west-1
  soaRecord:
    primaryNs: "ns1.eu-west-1.example.com."
    adminEmail: "dpo@example.com"
    serial: 2024010101
    refresh: 900
    retry: 300
    expire: 604800
    negativeTtl: 300

API Documentation (rustdoc)

The complete API documentation is generated from Rust source code and is available separately.

Viewing API Documentation

Online

Visit the API Reference section of the documentation site.

Locally

Build and view the API documentation:

# Build API docs
cargo doc --no-deps --all-features

# Open in browser
cargo doc --no-deps --all-features --open

Or build the complete documentation (user guide + API):

make docs-serve
# Navigate to http://localhost:3000/rustdoc/bindy/index.html

What’s in the API Documentation

The rustdoc API documentation includes:

  • Module Documentation - All public modules and their organization
  • Struct Definitions - Complete CRD type definitions (Bind9Instance, DNSZone, etc.)
  • Function Signatures - All public functions with parameter types and return values
  • Examples - Code examples showing how to use the API
  • Type Documentation - Detailed information about all public types
  • Trait Implementations - All trait implementations for types

Key Modules

  • bindy::crd - Custom Resource Definitions
  • bindy::reconcilers - Controller reconciliation logic
  • bindy::bind9 - BIND9 zone file management
  • bindy::bind9_resources - Kubernetes resource builders

When the documentation is built, you can access:

  • Main API Index: rustdoc/bindy/index.html
  • CRD Module: rustdoc/bindy/crd/index.html
  • Reconcilers: rustdoc/bindy/reconcilers/index.html

Changelog

All notable changes to Bindy will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Fixed

  • DNSZone tight reconciliation loop - Added status change detection to prevent unnecessary status updates and reconciliation cycles (2025-12-01)

Added

  • Comprehensive documentation with mdBook and rustdoc
  • GitHub Pages deployment workflow
  • Status update optimization documentation in performance guide

[0.1.0] - 2024-01-01

Added

  • Initial release of Bindy
  • Bind9Instance CRD for managing BIND9 DNS server instances
  • DNSZone CRD with label selector support
  • DNS record CRDs: A, AAAA, CNAME, MX, TXT, NS, SRV, CAA
  • Reconciliation controllers for all resource types
  • BIND9 zone file generation
  • Status subresources for all CRDs
  • RBAC configuration
  • Docker container support
  • Comprehensive test suite
  • CI/CD with GitHub Actions
  • Integration tests with Kind

Features

  • High-performance Rust implementation
  • Async/await with Tokio runtime
  • Label-based instance targeting
  • Primary and secondary DNS support
  • Multi-region deployment support
  • Full status reporting
  • Kubernetes 1.24+ support

License

Bindy is licensed under the MIT License.

SPDX-License-Identifier: MIT

Copyright (c) 2025 Erick Bourgeois, firestoned

MIT License

Copyright (c) 2025 Erick Bourgeois, firestoned

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

What This Means for You

The MIT License is one of the most permissive open source licenses. Here’s what it allows:

✅ You Can

  • Use commercially - Use Bindy in your commercial products and services
  • Modify - Change the code to fit your needs
  • Distribute - Share the original or your modified version
  • Sublicense - Include Bindy in proprietary software
  • Private use - Use Bindy for private/internal purposes without releasing your modifications

⚠️ Requirements

  • Include the license - Include the copyright notice and license text in substantial portions of the software
  • State changes - Document any modifications you make (recommended best practice)

❌ Limitations

  • No warranty - The software is provided “as is” without warranty of any kind
  • No liability - The authors are not liable for any damages arising from the use of the software

SPDX License Identifiers

All source code files in this project include SPDX license identifiers for machine-readable license information:

#![allow(unused)]
fn main() {
// Copyright (c) 2025 Erick Bourgeois, firestoned
// SPDX-License-Identifier: MIT
}

This makes it easy for automated tools to:

  • Scan the codebase for license compliance
  • Generate Software Bill of Materials (SBOM)
  • Verify license compatibility

Learn more about SPDX at https://spdx.dev/

Software Bill of Materials (SBOM)

Bindy provides SBOM files in CycloneDX format with every release. These include:

  • Binary SBOMs for each platform (Linux, macOS, Windows)
  • Docker image SBOM
  • Complete dependency tree with license information

SBOMs are available as release assets and can be used for:

  • Supply chain security
  • Vulnerability scanning
  • License compliance auditing
  • Dependency tracking

Third-Party Licenses

Bindy depends on various open-source libraries. All dependencies are permissively licensed and compatible with the MIT License.

Key Dependencies

LibraryLicensePurpose
kube-rsApache 2.0 / MITKubernetes client library
tokioMITAsync runtime
serdeApache 2.0 / MITSerialization framework
tracingMITStructured logging
anyhowApache 2.0 / MITError handling
thiserrorApache 2.0 / MITError derivation

Generating License Reports

For a complete list of all dependencies and their licenses:

# Install cargo-license tool
cargo install cargo-license

# Generate license report
cargo license

# Generate detailed license report with full license text
cargo license --json > licenses.json

You can also use cargo-about for more detailed license auditing:

cargo install cargo-about
cargo about generate about.hbs > licenses.html

Container Image Licenses

The Docker images for Bindy include:

  • Base Image: Alpine Linux (MIT License)
  • BIND9: ISC License (permissive, BSD-style)
  • Bindy Binary: MIT License

All components are open source and permissively licensed.

Contributing

By contributing to Bindy, you agree that:

  1. Your contributions will be licensed under the MIT License
  2. You have the right to submit the contributions
  3. You grant the project maintainers a perpetual, worldwide, non-exclusive, royalty-free license to use your contributions

See the Contributing Guidelines for more information on how to contribute.

License Compatibility

The MIT License is compatible with most other open source licenses, including:

  • ✅ Apache License 2.0
  • ✅ BSD licenses (2-clause, 3-clause)
  • ✅ GPL v2 and v3 (one-way compatible - MIT code can be included in GPL projects)
  • ✅ ISC License
  • ✅ Other MIT-licensed code

This makes Bindy easy to integrate into various projects and environments.

Questions About Licensing

If you have questions about:

  • Using Bindy in your project
  • License compliance
  • Contributing to Bindy
  • Third-party dependencies

Please open a GitHub Discussion or contact the maintainers.

Additional Resources