Introduction

Bindy is a high-performance Kubernetes controller written in Rust that manages BIND9 DNS infrastructure through Custom Resource Definitions (CRDs). It enables you to manage DNS zones and records as native Kubernetes resources, bringing the declarative Kubernetes paradigm to DNS management.

What is Bindy?

Bindy watches for DNS-related Custom Resources in your Kubernetes cluster and automatically generates and manages BIND9 zone configurations. It replaces traditional manual DNS management with a declarative, GitOps-friendly approach.

Key Features

High Performance - Native Rust implementation with async/await and zero-copy operations
RNDC Protocol - Native BIND9 management via Remote Name Daemon Control (RNDC) with TSIG authentication
Label Selectors - Target specific BIND9 instances using Kubernetes label selectors
Dynamic Zone Management - Automatically create and manage DNS zones using RNDC commands
Multi-Record Types - Support for A, AAAA, CNAME, MX, TXT, NS, SRV, and CAA records
Declarative DNS - Manage DNS as Kubernetes resources with full GitOps support
Security First - TSIG-authenticated RNDC communication, non-root containers, RBAC-ready
Status Tracking - Complete status subresources for all resources
Primary/Secondary Support - Built-in support for primary and secondary DNS architectures with zone transfers

Why Bindy?

Traditional DNS management involves:

Manual editing of zone files
SSH access to DNS servers
No audit trail or version control
Difficult disaster recovery
Complex multi-region setups

Bindy transforms this by:

Managing DNS as Kubernetes resources
Full GitOps workflow support
Native RNDC protocol for direct BIND9 control
Built-in audit trail via Kubernetes events
Simple disaster recovery (backup your CRDs)
Seamless multi-region DNS distribution with zone transfers

Who Should Use Bindy?

Bindy is ideal for:

Platform Engineers building internal DNS infrastructure
DevOps Teams managing DNS alongside their Kubernetes workloads
SREs requiring automated, auditable DNS management
Organizations running self-hosted BIND9 DNS servers
Multi-region Deployments needing distributed DNS infrastructure

Quick Example

Here’s how simple it is to create a DNS zone with records:

# Create a DNS zone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
spec:
  zoneName: example.com
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin@example.com
    serial: 2024010101
  ttl: 3600

---
# Add an A record
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
spec:
  zone: example-com
  name: www
  ipv4Address: "192.0.2.1"
  ttl: 300

Apply it to your cluster:

kubectl apply -f dns-config.yaml

Bindy automatically:

Finds matching BIND9 instances using pod discovery
Connects to BIND9 via RNDC protocol (port 953)
Creates zones and records using native RNDC commands
Tracks status and conditions in real-time

Next Steps

Installation - Get started with Bindy
Quick Start - Deploy your first DNS zone
RNDC-Based Architecture - Learn about the RNDC protocol architecture
Architecture Overview - Understand how Bindy works
API Reference - Complete API documentation

Performance Characteristics

Startup Time: <1 second
Memory Usage: ~50MB baseline
Zone Creation Latency: <500ms per zone (via RNDC)
Record Addition Latency: <200ms per record (via RNDC)
RNDC Command Execution: <100ms typical
Controller Overhead: Negligible CPU when idle

Project Status

Bindy is actively developed and used in production environments. The project follows semantic versioning and maintains backward compatibility within major versions.

Current version: v0.1.0

Support & Community

GitHub Issues: Report bugs or request features
GitHub Discussions: Ask questions and share ideas
Documentation: You’re reading it!

License

Bindy is open-source software licensed under the MIT License.

Installation

This section guides you through installing Bindy in your Kubernetes cluster.

Overview

Installing Bindy involves these steps:

Prerequisites - Ensure your environment meets the requirements
Install CRDs - Deploy Custom Resource Definitions
Create RBAC - Set up service accounts and permissions
Deploy Controller - Install the Bindy controller
Create BIND9 Instances - Deploy your DNS servers

Installation Methods

Standard Installation

The standard installation uses kubectl to apply YAML manifests:

# Create namespace
kubectl create namespace dns-system

# Install CRDs (use kubectl create to avoid annotation size limits)
kubectl create -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/crds/

# Install RBAC
kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/rbac/

# Deploy controller
kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/controller/deployment.yaml

Development Installation

For development or testing, you can build and deploy from source:

# Clone the repository
git clone https://github.com/firestoned/bindy.git
cd bindy

# Build the controller
cargo build --release

# Build Docker image
docker build -t bindy:dev .

# Deploy with your custom image
kubectl apply -f deploy/

Verification

After installation, verify that all components are running:

# Check CRDs are installed
kubectl get crd | grep bindy.firestoned.io

# Check controller is running
kubectl get pods -n dns-system

# Check controller logs
kubectl logs -n dns-system -l app=bind9-controller

You should see output similar to:

NAME                                READY   STATUS    RESTARTS   AGE
bind9-controller-7d4b8c4f9b-x7k2m   1/1     Running   0          1m

Next Steps

Quick Start - Deploy your first DNS zone
Prerequisites - Detailed system requirements
Installing CRDs - Understanding the Custom Resources
Deploying the Controller - Controller configuration options

Prerequisites

Before installing Bindy, ensure your environment meets these requirements.

Kubernetes Cluster

Kubernetes Version: 1.24 or later
Access Level: Cluster admin access (for CRD and RBAC installation)
Namespace: Ability to create namespaces (recommended: dns-system)

Supported Kubernetes Distributions

Bindy has been tested on:

Kubernetes (vanilla)
k0s
MKE
k0RDENT
Amazon EKS
Google GKE
Azure AKS
Red Hat OpenShift
k3s
kind (for development/testing)

Client Tools

Required

kubectl: 1.24+ - Install kubectl

Optional (for development)

Rust: 1.70+ - Install Rust
Cargo: Included with Rust
Docker: For building images - Install Docker

Cluster Resources

Minimum Requirements

CPU: 100m per controller pod
Memory: 128Mi per controller pod
Storage:
- Minimal for controller (configuration only)
- StorageClass: Required for persistent zone data (optional but recommended)

Recommended for Production

CPU: 500m per controller pod (2 replicas)
Memory: 512Mi per controller pod
High Availability: 3 controller replicas across different nodes

BIND9 Infrastructure

Bindy manages existing BIND9 servers. You’ll need:

BIND9 version 9.16 or later (9.18+ recommended)
Network connectivity from Bindy controller to BIND9 pods
Shared volume for zone files (ConfigMap, PVC, or similar)

Network Requirements

Controller to API Server

Outbound HTTPS (443) to Kubernetes API server
Required for watching resources and updating status

Controller to BIND9 Pods

Access to BIND9 configuration volumes
Typical setup uses Kubernetes ConfigMaps or PersistentVolumes

BIND9 to Network

UDP/TCP port 53 for DNS queries
Port 953 for RNDC (if using remote name daemon control)
Zone transfer ports (configured in BIND9)

Permissions

Cluster-Level Permissions Required

The person installing Bindy needs:

# Ability to create CRDs
- apiGroups: ["apiextensions.k8s.io"]
  resources: ["customresourcedefinitions"]
  verbs: ["create", "get", "list"]

# Ability to create ClusterRoles and ClusterRoleBindings
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["clusterroles", "clusterrolebindings"]
  verbs: ["create", "get", "list"]

Namespace Permissions Required

For the DNS system namespace:

Create ServiceAccounts
Create Deployments
Create ConfigMaps
Create Services

Storage Provisioner

For persistent zone data storage across pod restarts, you need a StorageClass configured in your cluster.

Production Environments

Use your cloud provider’s StorageClass:

AWS: EBS (gp3 or gp2)
GCP: Persistent Disk (pd-standard or pd-ssd)
Azure: Azure Disk (managed-premium or managed)
On-Premises: NFS, Ceph, or other storage solutions

Verify a default StorageClass exists:

kubectl get storageclass

Development/Testing (Kind, k3s, local clusters)

For local development, install the local-path provisioner:

# Install local-path provisioner
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.28/deploy/local-path-storage.yaml

# Wait for provisioner to be ready
kubectl wait --for=condition=available --timeout=60s \
  deployment/local-path-provisioner -n local-path-storage

# Check if local-path StorageClass was created
if kubectl get storageclass local-path &>/dev/null; then
  # Set local-path as default if no default exists
  kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
else
  # Create a default StorageClass using local-path provisioner
  cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: default
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
EOF
fi

# Verify installation
kubectl get storageclass

Expected output (either local-path or default will be marked as default):

NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  1m

Or:

NAME                PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
default (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  1m

Note: The local-path provisioner stores data on the node’s local disk. It’s not suitable for production but works well for development and testing.

Optional Components

For Production Deployments

Monitoring: Prometheus for metrics collection
Logging: Elasticsearch/Loki for log aggregation
GitOps: ArgoCD or Flux for declarative management
Backup: Velero for disaster recovery

For Development

kind: Local Kubernetes for testing
tilt: For rapid development cycles
k9s: Terminal UI for Kubernetes

Verification

Check your cluster meets the requirements:

# Check Kubernetes version
kubectl version --short

# Check you have cluster-admin access
kubectl auth can-i create customresourcedefinitions

# Check available resources
kubectl top nodes

# Verify connectivity
kubectl cluster-info

Expected output:

Client Version: v1.28.0
Server Version: v1.27.3

yes

Next Steps

Once your environment meets these prerequisites:

Install CRDs
Deploy the Controller
Quick Start Guide

Quick Start

Get Bindy running in 5 minutes with this quick start guide.

Step 1: Install Storage Provisioner (Optional)

For persistent zone data storage, install a storage provisioner. For Kind clusters or local development:

# Install local-path provisioner
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.28/deploy/local-path-storage.yaml

# Wait for provisioner to be ready
kubectl wait --for=condition=available --timeout=60s \
  deployment/local-path-provisioner -n local-path-storage

# Set as default StorageClass (or create one if it doesn't exist)
if kubectl get storageclass local-path &>/dev/null; then
  kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
else
  # Create default StorageClass if local-path wasn't created
  cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: default
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
EOF
fi

# Verify StorageClass is available
kubectl get storageclass

Note: For production clusters, use your cloud provider’s StorageClass (AWS EBS, GCP PD, Azure Disk, etc.)

Step 2: Install Bindy

# Create namespace
kubectl create namespace dns-system

# Install CRDs (use kubectl create to avoid annotation size limits)
kubectl create -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/crds/

# Install RBAC
kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/rbac/

# Deploy controller
kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/controller/deployment.yaml

# Wait for controller to be ready
kubectl wait --for=condition=available --timeout=300s \
  deployment/bind9-controller -n dns-system

Step 3: Create a BIND9 Cluster

First, create a cluster configuration that defines shared settings:

Create a file bind9-cluster.yaml:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"

⚠️ Warning: There are NO defaults for allowQuery and allowTransfer. If you don’t specify these fields, BIND9’s default behavior applies (no queries or transfers allowed). Always explicitly configure these fields for your security requirements.

Apply it:

kubectl apply -f bind9-cluster.yaml

Optional: Add Persistent Storage

To persist zone data across pod restarts, you can add PersistentVolumeClaims to your Bind9Cluster or Bind9Instance.

First, create a PVC for zone data storage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bind9-zones-pvc
  namespace: dns-system
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  # Uses default StorageClass if not specified
  # storageClassName: local-path

Then update your Bind9Cluster to use the PVC:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
  # Add persistent storage for zones
  volumes:
    - name: zones
      persistentVolumeClaim:
        claimName: bind9-zones-pvc
  volumeMounts:
    - name: zones
      mountPath: /var/cache/bind

Or add storage to a specific Bind9Instance:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: production-dns
  role: primary  # Required: primary or secondary
  replicas: 1
  # Instance-specific storage (overrides cluster-level)
  volumes:
    - name: zones
      persistentVolumeClaim:
        claimName: bind9-primary-zones-pvc
  volumeMounts:
    - name: zones
      mountPath: /var/cache/bind

Note: When using PVCs with accessMode: ReadWriteOnce, each replica needs its own PVC since the volume can only be mounted by one pod at a time. For multi-replica setups, use ReadWriteMany if your storage class supports it, or create separate PVCs per instance.

Step 4: Create a BIND9 Instance

Now create an instance that references the cluster:

Create a file bind9-instance.yaml:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: production-dns  # References the Bind9Cluster
  role: primary  # Required: primary or secondary
  replicas: 1

Apply it:

kubectl apply -f bind9-instance.yaml

Step 5: Create a DNS Zone

Create a file dns-zone.yaml:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: production-dns  # References the Bind9Cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.  # Note: @ replaced with .
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

Apply it:

kubectl apply -f dns-zone.yaml

Step 6: Add DNS Records

Create a file dns-records.yaml:

Note: DNS records reference zones using zoneRef, which is the Kubernetes resource name of the DNSZone (e.g., example-com for a DNSZone named example-com).

# Web server A record
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: www
  ipv4Address: "192.0.2.1"
  ttl: 300

---
# Blog CNAME record
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: blog-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: blog
  target: www.example.com.
  ttl: 300

---
# Mail server MX record
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail.example.com.
  ttl: 3600

---
# SPF TXT record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: "@"
  text:
    - "v=spf1 include:_spf.example.com ~all"
  ttl: 3600

Apply them:

kubectl apply -f dns-records.yaml

Step 7: Verify Your DNS Configuration

Check the status of your resources:

# Check BIND9 cluster
kubectl get bind9clusters -n dns-system

# Check BIND9 instance
kubectl get bind9instances -n dns-system

# Check DNS zone
kubectl get dnszones -n dns-system

# Check DNS records
kubectl get arecords,cnamerecords,mxrecords,txtrecords -n dns-system

# View detailed status
kubectl describe dnszone example-com -n dns-system

You should see output like:

NAME          ZONE          STATUS   AGE
example-com   example.com   Ready    1m

Step 8: Test DNS Resolution

If your BIND9 instance is exposed (via LoadBalancer or NodePort):

# Get the BIND9 service IP
kubectl get svc -n dns-system

# Test DNS query (replace <BIND9-IP> with actual IP)
dig @<BIND9-IP> www.example.com
dig @<BIND9-IP> blog.example.com
dig @<BIND9-IP> example.com MX
dig @<BIND9-IP> example.com TXT

What’s Next?

You’ve successfully deployed Bindy and created your first DNS zone with records!

Learn More

RNDC-Based Architecture - Understand the RNDC protocol architecture
Architecture Overview - Understand how Bindy works
Multi-Region Setup - Deploy across multiple regions
Status Conditions - Monitor resource health

Common Next Steps

Add Secondary DNS Instances for high availability
Configure Zone Transfers between primary and secondary
Set up Monitoring to track DNS performance
Integrate with GitOps for automated deployments
Configure DNSSEC for enhanced security

Production Checklist

Before going to production:

Deploy multiple controller replicas for HA
Set up primary and secondary DNS instances
Configure resource limits and requests
Enable monitoring and alerting
Set up backup for CRD definitions
Configure RBAC properly
Review security settings
Test disaster recovery procedures

Troubleshooting

If something doesn’t work:

Check controller logs:

kubectl logs -n dns-system -l app=bind9-controller -f

Check resource status:

kubectl describe dnszone example-com -n dns-system

Verify CRDs are installed:

kubectl get crd | grep bindy.firestoned.io

See the Troubleshooting Guide for more help.

Installing CRDs

Custom Resource Definitions (CRDs) extend Kubernetes with new resource types for DNS management.

What are CRDs?

CRDs define the schema for custom resources in Kubernetes. Bindy uses CRDs to represent:

BIND9 clusters (cluster-level configuration)
BIND9 instances (individual DNS server deployments)
DNS zones
DNS records (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)

Installation

Install all Bindy CRDs:

kubectl create -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/crds/

Or install from local files:

cd bindy
kubectl create -f deploy/crds/

Important: Use kubectl create instead of kubectl apply to avoid the 256KB annotation size limit that can occur with large CRDs like Bind9Instance.

Updating Existing CRDs

To update CRDs that are already installed:

kubectl replace --force -f deploy/crds/

The --force flag deletes and recreates the CRDs, which is necessary to avoid annotation size limits.

Verify Installation

Check that all CRDs are installed:

kubectl get crd | grep bindy.firestoned.io

Expected output:

aaaarecords.bindy.firestoned.io         2024-01-01T00:00:00Z
arecords.bindy.firestoned.io            2024-01-01T00:00:00Z
bind9clusters.bindy.firestoned.io       2024-01-01T00:00:00Z
bind9instances.bindy.firestoned.io      2024-01-01T00:00:00Z
caarecords.bindy.firestoned.io          2024-01-01T00:00:00Z
cnamerecords.bindy.firestoned.io        2024-01-01T00:00:00Z
dnszones.bindy.firestoned.io            2024-01-01T00:00:00Z
mxrecords.bindy.firestoned.io           2024-01-01T00:00:00Z
nsrecords.bindy.firestoned.io           2024-01-01T00:00:00Z
srvrecords.bindy.firestoned.io          2024-01-01T00:00:00Z
txtrecords.bindy.firestoned.io          2024-01-01T00:00:00Z

CRD Details

For detailed specifications of each CRD, see:

Bind9Instance Spec
DNSZone Spec
Record Specs

Deploying the Controller

The Bindy controller watches for DNS resources and manages BIND9 configurations.

Prerequisites

Before deploying the controller:

CRDs must be installed
RBAC must be configured
Namespace must exist (dns-system recommended)

Installation

Create Namespace

kubectl create namespace dns-system

Install RBAC

kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/rbac/

This creates:

ServiceAccount for the controller
ClusterRole with required permissions
ClusterRoleBinding to bind them together

Deploy Controller

kubectl apply -f https://raw.githubusercontent.com/firestoned/bindy/main/deploy/controller/deployment.yaml

Wait for Readiness

kubectl wait --for=condition=available --timeout=300s \
  deployment/bind9-controller -n dns-system

Verify Deployment

Check controller pod status:

kubectl get pods -n dns-system -l app=bind9-controller

Expected output:

NAME                                READY   STATUS    RESTARTS   AGE
bind9-controller-7d4b8c4f9b-x7k2m   1/1     Running   0          1m

Check controller logs:

kubectl logs -n dns-system -l app=bind9-controller -f

You should see:

{"timestamp":"2024-01-01T00:00:00Z","level":"INFO","message":"Starting Bindy controller"}
{"timestamp":"2024-01-01T00:00:01Z","level":"INFO","message":"Watching DNSZone resources"}
{"timestamp":"2024-01-01T00:00:01Z","level":"INFO","message":"Watching DNS record resources"}

Configuration

Environment Variables

Configure the controller via environment variables:

Variable	Default	Description
`RUST_LOG`	`info`	Log level (error, warn, info, debug, trace)
`BIND9_ZONES_DIR`	`/etc/bind/zones`	Directory for zone files
`RECONCILE_INTERVAL`	`300`	Reconciliation interval in seconds

Edit the deployment to customize:

env:
  - name: RUST_LOG
    value: "debug"
  - name: BIND9_ZONES_DIR
    value: "/var/lib/bind/zones"

Resource Limits

For production, set appropriate resource limits:

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

High Availability

Run multiple replicas with leader election:

spec:
  replicas: 3

Troubleshooting

Controller Not Starting

Check pod events:

kubectl describe pod -n dns-system -l app=bind9-controller

Check if CRDs are installed:

kubectl get crd | grep bindy.firestoned.io

Check RBAC permissions:

kubectl auth can-i list dnszones --as=system:serviceaccount:dns-system:bind9-controller

High Memory Usage

If the controller uses excessive memory:

Reduce log level: RUST_LOG=warn
Increase resource limits
Check for memory leaks in logs

Next Steps

Quick Start Guide - Create your first DNS zone
Configuration - Advanced configuration
Monitoring - Set up monitoring

Basic Concepts

This section introduces the core concepts behind Bindy and how it manages DNS infrastructure in Kubernetes.

The Kubernetes Way

Bindy follows Kubernetes patterns and idioms:

Declarative Configuration - You declare what DNS records should exist, Bindy makes it happen
Custom Resources - DNS zones and records are Kubernetes resources
Controllers - Bindy watches resources and reconciles state
Labels and Selectors - Target specific BIND9 instances using labels
Status Subresources - Track the health and state of DNS resources

Core Resources

Bindy introduces these Custom Resource Definitions (CRDs):

Infrastructure Resources

Bind9Cluster - Cluster-level configuration (version, shared config, TSIG keys, ACLs)
Bind9Instance - Individual BIND9 DNS server deployment (inherits from cluster)

DNS Resources

DNSZone - Defines a DNS zone with SOA record (references a cluster)
DNS Records - Individual DNS record types:
- ARecord (IPv4)
- AAAARecord (IPv6)
- CNAMERecord (Canonical Name)
- MXRecord (Mail Exchange)
- TXTRecord (Text)
- NSRecord (Name Server)
- SRVRecord (Service)
- CAARecord (Certificate Authority Authorization)

How It Works

graph TB
    subgraph k8s["Kubernetes API"]
        zone["DNSZone"]
        arecord["ARecord"]
        mx["MXRecord"]
        txt["TXTRecord"]
        more["..."]
    end

    controller["Bindy Controller<br/>• Watches CRDs<br/>• Reconciles state<br/>• RNDC client<br/>• TSIG authentication"]

    bind9["BIND9 Instances<br/>• rndc daemon (port 953)<br/>• Primary servers<br/>• Secondary servers<br/>• Dynamic zones<br/>• DNS queries (port 53)"]

    zone --> controller
    arecord --> controller
    mx --> controller
    txt --> controller
    more --> controller

    controller -->|"RNDC Protocol<br/>(Port 953/TCP)<br/>TSIG/HMAC-SHA256"| bind9

    style k8s fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style controller fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style bind9 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px

Reconciliation Loop

Watch - Controller watches for changes to DNS resources
Discover - Finds BIND9 instance pods via Kubernetes API
Authenticate - Loads RNDC key from Kubernetes Secret
Execute - Sends RNDC commands to BIND9 (addzone, reload, etc.)
Verify - BIND9 executes command and returns success/error
Status - Reports success or failure via status conditions

RNDC Protocol

Bindy uses the native BIND9 Remote Name Daemon Control (RNDC) protocol for managing DNS zones and servers. This provides:

Direct Control - Native BIND9 management without intermediate files
Real-time Operations - Immediate feedback on success or failure
Atomic Commands - Operations succeed or fail atomically
Secure Communication - TSIG authentication with HMAC-SHA256

RNDC Commands

Common RNDC operations used by Bindy:

addzone <zone> - Dynamically add a new zone
delzone <zone> - Remove a zone
reload <zone> - Reload zone data
notify <zone> - Trigger zone transfer to secondaries
zonestatus <zone> - Query zone status
retransfer <zone> - Force zone transfer from primary

TSIG Authentication

All RNDC communication is secured using TSIG (Transaction Signature):

Authentication - Verifies command source is authorized
Integrity - Prevents command tampering
Replay Protection - Timestamp validation prevents replay attacks
Key Storage - RNDC keys stored in Kubernetes Secrets
Per-Instance Keys - Each BIND9 instance has unique HMAC-SHA256 key

Cluster References

Instead of label selectors, zones now reference a specific BIND9 cluster:

# DNS Zone references a cluster
spec:
  zoneName: example.com
  clusterRef: my-dns-cluster  # References Bind9Instance name

This simplifies:

Zone placement - Direct reference to cluster
Pod discovery - Find instances by cluster name
RNDC key lookup - Keys named {clusterRef}-rndc-key

Resource Relationships

graph BT
    records["DNS Records<br/>(A, CNAME, MX, etc.)"]
    zone["DNSZone<br/>(has clusterRef)"]
    instance["Bind9Instance<br/>(has clusterRef)"]
    cluster["Bind9Cluster<br/>(cluster config)"]

    records -->|"references<br/>zone field"| zone
    zone -->|"references<br/>clusterRef"| instance
    instance -->|"references<br/>clusterRef"| cluster

    style records fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style zone fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style instance fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style cluster fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px

Three-Tier Hierarchy

Bind9Cluster - Cluster-level configuration
- Shared BIND9 version
- Common config (recursion, DNSSEC, forwarders)
- TSIG keys for zone transfers
- ACL definitions
Bind9Instance - Instance deployment
- References a Bind9Cluster via clusterRef
- Can override cluster config
- Has RNDC key for management
- Manages pods and services
DNSZone - DNS zone definition
- References a Bind9Instance via clusterRef
- Contains SOA record
- Applied to instance via RNDC
DNS Records - Individual records
- Reference a DNSZone by name
- Added to zone via RNDC (planned: nsupdate)

RNDC Key Secret Relationship

graph TD
    instance["Bind9Instance:<br/>my-dns-instance"]
    secret["Secret:<br/>my-dns-instance-rndc-key"]
    data["data:<br/>key-name: my-dns-instance<br/>algorithm: hmac-sha256<br/>secret: base64-encoded-key"]

    instance -->|creates/expects| secret
    secret --> data

    style instance fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style secret fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style data fill:#fce4ec,stroke:#880e4f,stroke-width:2px

The controller uses this Secret to authenticate RNDC commands to the BIND9 instance.

Status and Conditions

All resources report their status:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Synchronized
      message: Zone created for 2 instances
      lastTransitionTime: 2024-01-01T00:00:00Z
  observedGeneration: 1
  matchedInstances: 2

Status conditions follow Kubernetes conventions:

Type - What aspect (Ready, Synced, etc.)
Status - True, False, or Unknown
Reason - Machine-readable reason code
Message - Human-readable description

Next Steps

RNDC-Based Architecture - Learn about the RNDC protocol architecture
Architecture Overview - Deep dive into Bindy’s architecture
Custom Resource Definitions - Detailed CRD specifications
Bind9Instance - Learn about BIND9 instance resources
DNSZone - Learn about DNS zone resources
DNS Records - Learn about DNS record types

Architecture Overview

This page provides a detailed overview of Bindy’s architecture and design principles.

High-Level Architecture

graph TB
    subgraph k8s["Kubernetes Cluster"]
        subgraph crds["Custom Resource Definitions"]
            crd1["Bind9Instance"]
            crd2["DNSZone"]
            crd3["ARecord, MXRecord, ..."]
        end

        subgraph controller["Bindy Controller (Rust)"]
            reconciler1["Instance<br/>Reconciler"]
            reconciler2["Zone<br/>Reconciler"]
            reconciler3["Records<br/>Reconciler"]
            zonegen["Zone File Generator"]
        end

        subgraph bind9["BIND9 Instances"]
            primary["Primary DNS<br/>(us-east)"]
            secondary1["Secondary DNS<br/>(us-west)"]
            secondary2["Secondary DNS<br/>(eu)"]
        end
    end

    clients["Clients<br/>• Apps<br/>• Services<br/>• External"]

    crds -->|watches| controller
    controller -->|configures| bind9
    primary -->|AXFR| secondary1
    secondary1 -->|AXFR| secondary2
    bind9 -->|"DNS queries<br/>(UDP/TCP 53)"| clients

    style k8s fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style crds fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style controller fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style bind9 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style clients fill:#fce4ec,stroke:#880e4f,stroke-width:2px

Components

Bindy Controller

The controller is written in Rust using the kube-rs library. It consists of:

1. Reconcilers

Each reconciler handles a specific resource type:

Bind9Instance Reconciler - Manages BIND9 instance lifecycle
- Creates StatefulSets for BIND9 pods
- Configures services and networking
- Updates instance status
Bind9Cluster Reconciler - Manages cluster-level configuration
- Manages finalizers for cascade deletion
- Creates and reconciles managed instances
- Propagates global configuration to instances
- Tracks cluster-wide status
DNSZone Reconciler - Manages DNS zones
- Evaluates label selectors
- Generates zone files
- Updates zone configuration
- Reports matched instances
Record Reconcilers - Manage individual DNS records
- One reconciler per record type (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)
- Validates record specifications
- Appends records to zone files
- Updates record status

2. Zone File Generator

Generates BIND9-compatible zone files from Kubernetes resources:

#![allow(unused)]
fn main() {
// Simplified example
pub fn generate_zone_file(zone: &DNSZone, records: Vec<DNSRecord>) -> String {
    let mut zone_file = String::new();

    // SOA record
    zone_file.push_str(&format_soa_record(&zone.spec.soa_record));

    // NS records
    for ns in &zone.spec.name_servers {
        zone_file.push_str(&format_ns_record(ns));
    }

    // Individual records
    for record in records {
        zone_file.push_str(&format_record(record));
    }

    zone_file
}
}

Custom Resource Definitions (CRDs)

CRDs define the schema for DNS resources:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: dnszones.bindy.firestoned.io
spec:
  group: bindy.firestoned.io
  names:
    kind: DNSZone
    plural: dnszones
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true

BIND9 Instances

BIND9 servers managed by Bindy:

Deployed as Kubernetes StatefulSets
Configuration via ConfigMaps
Zone files mounted from ConfigMaps or PVCs
Support for primary and secondary architectures

Data Flow

Zone Creation Flow

User creates DNSZone resource
```
kubectl apply -f dnszone.yaml
```

Controller watches and receives event

#![allow(unused)]
fn main() {
// Watch stream receives create event
stream.next().await
}

DNSZone reconciler evaluates selector

#![allow(unused)]
fn main() {
// Find matching Bind9Instances
let instances = find_matching_instances(&zone.spec.instance_selector).await?;
}

Generate zone file for each instance

#![allow(unused)]
fn main() {
// Create zone configuration
let zone_file = generate_zone_file(&zone, &records)?;
}

Update BIND9 configuration

#![allow(unused)]
fn main() {
// Apply ConfigMap with zone file
update_bind9_config(&instance, &zone_file).await?;
}

Update DNSZone status

#![allow(unused)]
fn main() {
// Report success
update_status(&zone, conditions, matched_instances).await?;
}

Managed Instance Creation Flow

When a Bind9Cluster specifies replica counts, the controller automatically creates instances:

flowchart TD
    A[Bind9Cluster Created] --> B{Has primary.replicas?}
    B -->|Yes| C[Create primary-0, primary-1, ...]
    B -->|No| D{Has secondary.replicas?}
    C --> D
    D -->|Yes| E[Create secondary-0, secondary-1, ...]
    D -->|No| F[No instances created]
    E --> G[Add management labels]
    G --> H[Instances inherit cluster config]

User creates Bind9Cluster with replicas

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
spec:
  primary:
    replicas: 2
  secondary:
    replicas: 3

Bind9Cluster reconciler evaluates replica counts

#![allow(unused)]
fn main() {
let primary_replicas = cluster.spec.primary.as_ref()
    .and_then(|p| p.replicas).unwrap_or(0);
}

Create missing instances with management labels

#![allow(unused)]
fn main() {
let mut labels = BTreeMap::new();
labels.insert("bindy.firestoned.io/managed-by", "Bind9Cluster");
labels.insert("bindy.firestoned.io/cluster", &cluster_name);
labels.insert("bindy.firestoned.io/role", "primary");
}

Instances inherit cluster configuration

#![allow(unused)]
fn main() {
let instance_spec = Bind9InstanceSpec {
    cluster_ref: cluster_name.clone(),
    version: cluster.spec.version.clone(),
    config: None,  // Inherit from cluster
    // ...
};
}

Self-healing: Recreate deleted instances
- Controller detects missing managed instances
- Automatically recreates them with same configuration

Cascade Deletion Flow

When a Bind9Cluster is deleted, all its instances are automatically cleaned up:

flowchart TD
    A[kubectl delete bind9cluster] --> B[Deletion timestamp set]
    B --> C{Finalizer present?}
    C -->|Yes| D[Controller detects deletion]
    D --> E[Find all instances with clusterRef]
    E --> F[Delete each instance]
    F --> G{All deleted?}
    G -->|Yes| H[Remove finalizer]
    G -->|No| I[Retry deletion]
    H --> J[Cluster deleted]
    I --> F

User deletes Bind9Cluster

kubectl delete bind9cluster production-dns

Finalizer prevents immediate deletion

#![allow(unused)]
fn main() {
if cluster.metadata.deletion_timestamp.is_some() {
    // Cleanup before allowing deletion
    delete_cluster_instances(&client, &namespace, &name).await?;
}
}

Find and delete all referencing instances

#![allow(unused)]
fn main() {
let instances: Vec<_> = all_instances.into_iter()
    .filter(|i| i.spec.cluster_ref == cluster_name)
    .collect();

for instance in instances {
    api.delete(&instance_name, &DeleteParams::default()).await?;
}
}

Remove finalizer once cleanup complete

#![allow(unused)]
fn main() {
let mut finalizers = cluster.metadata.finalizers.unwrap_or_default();
finalizers.retain(|f| f != FINALIZER_NAME);
}

Record Addition Flow

User creates DNS record resource
Controller receives event
Record reconciler validates zone reference
Append record to existing zone file
Reload BIND9 configuration
Update record status

Zone Transfer Configuration Flow

For primary/secondary DNS architectures, zones must be configured with zone transfer settings:

flowchart TD
    A[DNSZone Reconciliation] --> B[Discover Secondary Pods]
    B --> C{Secondary IPs Found?}
    C -->|Yes| D[Configure zone with<br/>also-notify & allow-transfer]
    C -->|No| E[Configure zone<br/>without transfers]
    D --> F[Store IPs in<br/>DNSZone.status.secondaryIps]
    E --> F
    F --> G[Next Reconciliation]
    G --> H[Compare Current vs Stored IPs]
    H --> I{IPs Changed?}
    I -->|Yes| J[Delete & Recreate Zones]
    I -->|No| K[No Action]
    J --> B
    K --> G

Implementation Details:

Secondary Discovery - On every reconciliation:

#![allow(unused)]
fn main() {
// Find all Bind9Instance resources with role=secondary for this cluster
let instance_api: Api<Bind9Instance> = Api::namespaced(client.clone(), namespace);
let lp = ListParams::default().labels(&format!("cluster={cluster_name},role=secondary"));
let instances = instance_api.list(&lp).await?;

// Collect IPs from running pods
for instance in instances {
    let pod_ips = get_pod_ips(&client, namespace, &instance).await?;
    secondary_ips.extend(pod_ips);
}
}

Zone Transfer Configuration - Pass secondary IPs to zone creation:

#![allow(unused)]
fn main() {
let zone_config = ZoneConfig {
    // ... other fields ...
    also_notify: Some(secondary_ips.clone()),
    allow_transfer: Some(secondary_ips.clone()),
};
}

Change Detection - Compare IPs on each reconciliation:

#![allow(unused)]
fn main() {
// Get stored IPs from status
let stored_ips = dnszone.status.as_ref()
    .and_then(|s| s.secondary_ips.as_ref());

// Compare sorted lists
let secondaries_changed = match stored_ips {
    Some(stored) => {
        let mut stored = stored.clone();
        let mut current = current_secondary_ips.clone();
        stored.sort();
        current.sort();
        stored != current
    }
    None => !current_secondary_ips.is_empty(),
};

// Recreate zones if IPs changed
if secondaries_changed {
    delete_dnszone(client.clone(), dnszone.clone(), zone_manager).await?;
    add_dnszone(client.clone(), dnszone.clone(), zone_manager).await?;
}
}

Status Tracking - Store current IPs for future comparison:

#![allow(unused)]
fn main() {
let new_status = DNSZoneStatus {
    conditions: vec![ready_condition],
    observed_generation: dnszone.metadata.generation,
    record_count: Some(total_records),
    secondary_ips: Some(current_secondary_ips),  // Store for next reconciliation
};
}

Why This Matters:

Self-healing: When secondary pods are rescheduled/restarted and get new IPs, zones automatically update
No manual intervention: Primary zones always have correct secondary IPs for zone transfers
Automatic recovery: Zone transfers resume within one reconciliation period (~5-10 minutes) after IP changes
Minimal overhead: Leverages existing reconciliation loop, no additional watchers needed

Concurrency Model

Bindy uses Rust’s async/await with Tokio runtime:

#[tokio::main]
async fn main() -> Result<()> {
    // Spawn multiple reconcilers concurrently
    tokio::try_join!(
        run_bind9instance_controller(),
        run_dnszone_controller(),
        run_record_controllers(),
    )?;
    Ok(())
}

Benefits:

Concurrent reconciliation - Multiple resources reconciled simultaneously
Non-blocking I/O - Efficient API server communication
Low memory footprint - Async tasks use minimal memory
High throughput - Handle thousands of DNS records efficiently

Resource Watching

The controller uses Kubernetes watch API with reflector caching:

#![allow(unused)]
fn main() {
let api: Api<DNSZone> = Api::all(client);
let watcher = watcher(api, ListParams::default());

// Reflector caches resources locally
let store = reflector::store::Writer::default();
let reader = store.as_reader();
let reflector = reflector(store, watcher);

// Process events
while let Some(event) = stream.try_next().await? {
    match event {
        Applied(zone) => reconcile_zone(zone).await?,
        Deleted(zone) => cleanup_zone(zone).await?,
        Restarted(_) => refresh_all().await?,
    }
}
}

Error Handling

Multi-layer error handling strategy:

Validation Errors - Caught early, reported in status
Reconciliation Errors - Retried with exponential backoff
Fatal Errors - Logged and cause controller restart
Status Reporting - All errors visible in resource status

#![allow(unused)]
fn main() {
match reconcile_zone(&zone).await {
    Ok(_) => update_status(Ready, "Synchronized"),
    Err(e) => {
        log::error!("Failed to reconcile zone: {}", e);
        update_status(NotReady, e.to_string());
        // Requeue for retry
        Err(e)
    }
}
}

Performance Optimizations

1. Incremental Updates

Only regenerate zone files when records change, not on every reconciliation.

2. Caching

Local cache of BIND9 instances to avoid repeated API calls.

3. Batch Processing

Group related updates to minimize BIND9 reloads.

4. Zero-Copy Operations

Use string slicing and references to avoid unnecessary allocations.

5. Compiled Binary

Rust compilation produces optimized native code with no runtime overhead.

Security Architecture

RBAC

Controller uses least-privilege service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: bind9-controller
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bind9-controller
rules:
  - apiGroups: ["bindy.firestoned.io"]
    resources: ["dnszones", "arecords", ...]
    verbs: ["get", "list", "watch", "update"]

Non-Root Containers

Controller runs as non-root user:

USER 65532:65532

Network Policies

Limit controller network access:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bind9-controller
spec:
  podSelector:
    matchLabels:
      app: bind9-controller
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 443  # API server only

Scalability

Horizontal Scaling - Operator Leader Election

Multiple controller replicas use Kubernetes Lease-based leader election for high availability:

sequenceDiagram
    participant O1 as Operator Instance 1
    participant O2 as Operator Instance 2
    participant L as Kubernetes Lease
    participant K as Kubernetes API

    O1->>L: Acquire lease
    L-->>O1: Lease granted
    O1->>K: Start reconciliation
    O2->>L: Try acquire lease
    L-->>O2: Lease already held
    O2->>O2: Wait in standby

    Note over O1: Instance fails
    O2->>L: Acquire lease
    L-->>O2: Lease granted
    O2->>K: Start reconciliation

Implementation:

#![allow(unused)]
fn main() {
// Create lease manager with configuration
let lease_manager = LeaseManagerBuilder::new(client.clone(), &lease_name)
    .with_namespace(&lease_namespace)
    .with_identity(&identity)
    .with_duration(Duration::from_secs(15))
    .with_grace(Duration::from_secs(2))
    .build()
    .await?;

// Watch leadership status
let (leader_rx, lease_handle) = lease_manager.watch().await;

// Run controllers with leader monitoring
tokio::select! {
    result = monitor_leadership(leader_rx) => {
        warn!("Leadership lost! Stopping all controllers...");
    }
    result = run_all_controllers() => {
        // Normal controller execution
    }
}
}

Failover characteristics:

Lease duration: 15 seconds (configurable)
Automatic failover: ~15 seconds if leader fails
Zero data loss: New leader resumes from Kubernetes state
Multiple replicas: Support for 2-5+ operator instances

Resource Limits

Recommended production configuration:

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

Can handle:

1000+ DNS zones
10,000+ DNS records
<100ms average reconciliation time

Next Steps

Custom Resource Definitions - CRD specifications
Controller Design - Implementation details
Performance Tuning - Optimization strategies

Technical Architecture

System Overview

graph TB
    subgraph k8s["Kubernetes Cluster"]
        subgraph namespace["DNS System Namespace (dns-system)"]
            subgraph controller["Rust Controller Pod"]
                subgraph eventloop["Main Event Loop<br/>(runs concurrently via Tokio)"]
                    dnszone_ctrl["DNSZone Controller"]
                    arecord_ctrl["ARecord Controller"]
                    txt_ctrl["TXTRecord Controller"]
                    cname_ctrl["CNAMERecord Controller"]
                end

                subgraph reconcilers["Reconcilers"]
                    rec_dnszone["reconcile_dnszone()"]
                    rec_a["reconcile_a_record()"]
                    rec_txt["reconcile_txt_record()"]
                    rec_cname["reconcile_cname_record()"]
                end

                subgraph manager["BIND9 Manager"]
                    create_zone["create_zone_file()"]
                    add_a["add_a_record()"]
                    add_txt["add_txt_record()"]
                    delete_zone["delete_zone()"]
                end
            end

            subgraph bind9["BIND9 Instance Pods (scaled)"]
                zones["/etc/bind/zones/db.example.com<br/>/etc/bind/zones/db.internal.local<br/>..."]
            end
        end

        subgraph etcd["Custom Resources (in etcd)"]
            instances["• Bind9Instance (primary-dns, secondary-dns)"]
            dnszones["• DNSZone (example-com, internal-local)"]
            arecords["• ARecord (www, api, db, ...)"]
            txtrecords["• TXTRecord (spf, dmarc, ...)"]
            cnamerecords["• CNAMERecord (blog, cache, ...)"]
        end
    end

    eventloop --> reconcilers
    reconcilers --> manager
    manager --> bind9

    style k8s fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style namespace fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style controller fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style eventloop fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style reconcilers fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    style manager fill:#ffe0b2,stroke:#e65100,stroke-width:2px
    style bind9 fill:#f1f8e9,stroke:#33691e,stroke-width:2px
    style etcd fill:#e0f2f1,stroke:#004d40,stroke-width:2px

Control Flow

1. DNSZone Creation Flow

User creates DNSZone
    ↓
Kubernetes API Server stores in etcd
    ↓
Watch event triggered
    ↓
Controller receives event (via kube-rs runtime)
    ↓
reconcile_dnszone_wrapper() called
    ↓
reconcile_dnszone() logic:
  1. Extract DNSZone spec
  2. Evaluate instanceSelector against Bind9Instance labels
  3. Find matching instances (e.g., 2 matching)
  4. Call zone_manager.create_zone_file()
  5. Zone file created in /etc/bind/zones/db.example.com
  6. Update DNSZone status with "Ready" condition
    ↓
Status Update (via API)
    ↓
Done, requeue after 5 minutes

2. Record Creation Flow

User creates ARecord
    ↓
Kubernetes API Server stores in etcd
    ↓
Watch event triggered
    ↓
Controller receives event
    ↓
reconcile_a_record_wrapper() called
    ↓
reconcile_a_record() logic:
  1. Extract ARecord spec (zone, name, ip, ttl)
  2. Call zone_manager.add_a_record()
  3. Record appended to zone file
  4. Update ARecord status with "Ready" condition
    ↓
Status Update (via API)
    ↓
Done, requeue after 5 minutes

Concurrency Model

graph TB
    subgraph runtime["Main Tokio Runtime"]
        dnszone_task["DNSZone Controller Task<br/>(watches DNSZone resources)"]
        arecord_task["ARecord Controller Task<br/>(concurrent)"]
        txt_task["TXTRecord Controller Task<br/>(concurrent)"]
        cname_task["CNAME Controller Task<br/>(concurrent)"]

        dnszone_task --> arecord_task
        arecord_task --> txt_task
        txt_task --> cname_task
    end

    note["All tasks run concurrently via Tokio's<br/>thread pool without blocking each other."]

    style runtime fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style dnszone_task fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style arecord_task fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style txt_task fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style cname_task fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    style note fill:#fffde7,stroke:#f57f17,stroke-width:1px

Data Structures

CRD Type Hierarchy

trait CustomResource (from kube-derive)
    │
    ├─→ Bind9Instance
    │       └─ spec: Bind9InstanceSpec
    │       └─ status: Bind9InstanceStatus
    │
    ├─→ DNSZone
    │       └─ spec: DNSZoneSpec
    │       │        ├─ zone_name: String
    │       │        ├─ instance_selector: LabelSelector
    │       │        └─ soa_record: SOARecord
    │       └─ status: DNSZoneStatus
    │
    ├─→ ARecord
    │       └─ spec: ARecordSpec
    │       └─ status: RecordStatus
    │
    ├─→ TXTRecord
    │       └─ spec: TXTRecordSpec
    │       └─ status: RecordStatus
    │
    └─→ CNAMERecord
            └─ spec: CNAMERecordSpec
            └─ status: RecordStatus

Label Selector

LabelSelector
    ├─ match_labels: Option<BTreeMap<String, String>>
    │       └─ "dns-role": "primary"
    │       └─ "environment": "production"
    │
    └─ match_expressions: Option<Vec<LabelSelectorRequirement>>
            ├─ key: "dns-role"
            │  operator: "In"
            │  values: ["primary", "secondary"]
            │
            └─ key: "environment"
               operator: "In"
               values: ["production", "staging"]

Zone File Generation

Input: DNSZone resource
    │
    ├─ zone_name: "example.com"
    ├─ soa_record:
    │   ├─ primary_ns: "ns1.example.com."
    │   ├─ admin_email: "admin@example.com"
    │   ├─ serial: 2024010101
    │   ├─ refresh: 3600
    │   ├─ retry: 600
    │   ├─ expire: 604800
    │   └─ negative_ttl: 86400
    │
    └─ ttl: 3600

Processing:
    1. Create file: /etc/bind/zones/db.example.com
    2. Write SOA record header
    3. Add NS record for primary
    4. Set default TTL

Output: /etc/bind/zones/db.example.com
    │
    ├─ $TTL 3600
    ├─ @ IN SOA ns1.example.com. admin.example.com. (
    │       2024010101  ; serial
    │       3600        ; refresh
    │       600         ; retry
    │       604800      ; expire
    │       86400 )     ; minimum
    ├─ @ IN NS ns1.example.com.
    │
    └─ (waiting for record additions)

Then for each ARecord, TXTRecord, etc:
    Append:
    www 300 IN A 192.0.2.1
    @ 3600 IN TXT "v=spf1 include:_spf.example.com ~all"
    blog 300 IN CNAME www.example.com.

Error Handling Strategy

Reconciliation Error
    │
    ├─→ Log error with context
    ├─→ Update resource status with error condition
    ├─→ Return error to controller
    │
    └─→ Error Policy Handler:
        ├─ If transient (file not found, etc.)
        │   └─ Requeue after 30 seconds (exponential backoff possible)
        │
        └─ If persistent (validation error, etc.)
            └─ Log and skip (manual intervention needed)

Dependencies Flow

main.rs
    ├─→ crd.rs (type definitions)
    │   ├─ Bind9Instance
    │   ├─ DNSZone
    │   ├─ ARecord
    │   ├─ TXTRecord
    │   ├─ CNAMERecord
    │   └─ LabelSelector
    │
    ├─→ bind9.rs (zone management)
    │   └─ Bind9Manager
    │
    ├─→ reconcilers/
    │   ├─ dnszone.rs
    │   │   ├─ reconcile_dnszone()
    │   │   ├─ delete_dnszone()
    │   │   └─ update_status()
    │   │
    │   └─ records.rs
    │       ├─ reconcile_a_record()
    │       ├─ reconcile_txt_record()
    │       └─ reconcile_cname_record()
    │
    └─→ Tokio (async runtime)
        └─ kube-rs (Kubernetes client)

Performance Characteristics

Memory Layout

Rust Controller (typical): ~50MB
    ├─ Binary loaded: ~20MB
    ├─ Tokio runtime: ~10MB
    ├─ In-flight reconciliations: ~5MB
    ├─ Caches/buffers: ~5MB
    └─ Misc overhead: ~10MB

vs Python Operator: ~250MB+
    ├─ Python interpreter: ~50MB
    ├─ Dependencies: ~100MB
    ├─ Kopf framework: ~50MB
    └─ Runtime data: ~50MB+

Latency Profile

Operation                    Rust         Python
─────────────────────────────────────────────────
Create DNSZone              <100ms       500-1000ms
Add A Record                <50ms        200-500ms
Evaluate label selector     <20ms        100-300ms
Update status              <30ms        150-300ms
Controller startup         <1s          5-10s
Full zone reconciliation   <500ms       2-5s

Scalability

With Rust Controller:
    • 10 zones: <1s reconciliation
    • 100 zones: <5s reconciliation
    • 1000 records: <10s total reconciliation
    • Handles hundreds of events/sec

vs Python Operator:
    • 10 zones: 5-10s reconciliation
    • 100 zones: 50-100s reconciliation
    • 1000 records: 30-60s total reconciliation
    • Struggles with >10 events/sec

RBAC Requirements

cluster-role: bind9-controller
    │
    ├─ [get, list, watch] on dnszones
    ├─ [get, list, watch] on arecords
    ├─ [get, list, watch] on txtrecords
    ├─ [get, list, watch] on cnamerecords
    ├─ [get, list, watch] on bind9instances
    │
    └─ [update, patch] on [*/status]
        └─ (for updating status subresources)

State Management

Kubernetes etcd (Source of Truth)
    │
    ├─→ Store DNSZone resources
    ├─→ Store Record resources
    ├─→ Store status conditions
    │
    └─→ Controller watches via kube-rs
        │
        ├─→ Detects changes
        ├─→ Triggers reconciliation
        ├─→ Generates zone files
        │
        └─→ BIND9 pod reads zone files
            ├─→ Loads into memory
            └─→ Serves DNS queries

Extension Points

Current Implementation:
    • DNSZone → Zone file creation
    • ARecord → A record addition
    • TXTRecord → TXT record addition
    • CNAMERecord → CNAME record addition

Future Extensions (easy to add):
    • AAAARecord → IPv6 support
    • MXRecord → Mail record support
    • NSRecord → Nameserver support
    • SRVRecord → Service record support
    • Health endpoints → Liveness/readiness
    • Metrics → Prometheus integration
    • Webhooks → Custom validation
    • Finalizers → Graceful cleanup

This architecture provides a clean, performant, and extensible foundation for managing DNS infrastructure in Kubernetes.

HTTP API Sidecar Architecture

This page provides a detailed overview of Bindy’s architecture that uses an HTTP API sidecar (bindcar) to manage BIND9 instances. The sidecar executes RNDC commands locally within the pod, providing a modern RESTful interface for DNS management.

High-Level Architecture

graph TB
    subgraph k8s["Kubernetes Cluster"]
        subgraph crds["Custom Resource Definitions (CRDs)"]
            cluster["Bind9Cluster<br/>(cluster config)"]
            instance["Bind9Instance"]
            zone["DNSZone"]
            records["ARecord, AAAARecord,<br/>TXTRecord, MXRecord, etc."]

            cluster --> instance
            instance --> zone
            zone --> records
        end

        subgraph controller["Bindy Controller (Rust)"]
            rec1["Bind9Cluster<br/>Reconciler"]
            rec2["Bind9Instance<br/>Reconciler"]
            rec3["DNSZone<br/>Reconciler"]
            rec4["DNS Record<br/>Reconcilers"]
            manager["Bind9Manager (RNDC Client)<br/>• add_zone() • reload_zone()<br/>• delete_zone() • notify_zone()<br/>• zone_status() • freeze_zone()"]
        end

        subgraph bind9["BIND9 Instances (Pods)"]
            subgraph primary_pod["Primary Pod (bind9-primary)"]
                primary_bind["BIND9 Container<br/>• rndc daemon (localhost:953)<br/>• DNS (port 53)<br/>Dynamic zones:<br/>- example.com<br/>- internal.local"]
                primary_api["Bindcar API Sidecar<br/>• HTTP API (port 80→8080)<br/>• ServiceAccount auth<br/>• Local RNDC client<br/>• Zone file management"]

                primary_api -->|"rndc localhost:953"| primary_bind
            end

            subgraph secondary_pod["Secondary Pod (bind9-secondary)"]
                secondary_bind["BIND9 Container<br/>• rndc daemon (localhost:953)<br/>• DNS (port 53)<br/>Transferred zones:<br/>- example.com<br/>- internal.local"]
                secondary_api["Bindcar API Sidecar<br/>• HTTP API (port 80→8080)<br/>• ServiceAccount auth<br/>• Local RNDC client"]

                secondary_api -->|"rndc localhost:953"| secondary_bind
            end
        end

        secrets["RNDC Keys (Secrets)<br/>• bind9-primary-rndc-key<br/>• bind9-secondary-rndc-key<br/>(HMAC-SHA256)"]
        volumes["Shared Volumes<br/>• /var/cache/bind (zone files)<br/>• /etc/bind/keys (RNDC keys, read-only for API)"]
    end

    clients["DNS Clients<br/>• Applications<br/>• Services<br/>• External users"]

    crds -->|"watches<br/>(Kubernetes API)"| controller
    controller -->|"HTTP API<br/>(REST/JSON)<br/>Port 80/TCP"| bind9
    volumes -.->|"mounts"| primary_pod
    volumes -.->|"mounts"| secondary_pod
    primary_bind -->|"AXFR/IXFR"| secondary_bind
    secondary_bind -.->|"IXFR"| primary_bind
    bind9 -->|"DNS Queries<br/>(UDP/TCP 53)"| clients
    secrets -.->|"authenticates"| primary_api
    secrets -.->|"authenticates"| secondary_api

    style k8s fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style crds fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style controller fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style bind9 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style secrets fill:#ffe0b2,stroke:#e65100,stroke-width:2px
    style clients fill:#fce4ec,stroke:#880e4f,stroke-width:2px

Key Architectural Changes from File-Based Approach

Old Architecture (File-Based)

Controller generated zone files
Files written to ConfigMaps
ConfigMaps mounted into BIND9 pods
Manual rndc reload triggered after file changes
Complex synchronization between ConfigMaps and BIND9 state

New Architecture (RNDC Protocol + Cluster Hierarchy)

Three-tier resource model: Bind9Cluster → Bind9Instance → DNSZone
Controller uses native RNDC protocol
Direct communication with BIND9 via port 953
Commands executed in real-time: addzone, delzone, reload
No file manipulation or ConfigMap management
BIND9 manages zone files internally with dynamic updates
Atomic operations with immediate feedback
Cluster-level config sharing (version, TSIG keys, ACLs)

Three-Tier Resource Model

1. Bind9Cluster (Cluster Configuration)

Defines shared configuration for a logical group of BIND9 instances:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
spec:
  version: "9.18"
  config:
    recursion: false
    dnssec:
      enabled: true
      validation: true
    allowQuery:
      - any
    allowTransfer:
      - 10.0.0.0/8
  rndcSecretRefs:
    - name: transfer-key
      algorithm: hmac-sha256
      secret: base64-encoded-key
  acls:
    internal:
      - 10.0.0.0/8
      - 172.16.0.0/12

2. Bind9Instance (Instance Deployment)

References a cluster and deploys BIND9 pods:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dns-primary
spec:
  clusterRef: production-dns  # References Bind9Cluster
  role: primary
  replicas: 2

The instance inherits configuration from the cluster but can override specific settings.

3. DNSZone (Zone Definition)

References an instance and creates zones via RNDC:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
spec:
  zoneName: example.com
  clusterRef: dns-primary  # References Bind9Instance
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.
    serial: 2024010101

RNDC Protocol Communication

┌──────────────────────┐                 ┌──────────────────────┐
│  Bindy Controller    │                 │   BIND9 Instance     │
│                      │                 │   (Primary)          │
│  ┌────────────────┐  │                 │                      │
│  │ Bind9Manager   │  │                 │  ┌────────────────┐  │
│  │                │  │   TCP Port 953  │  │  rndc daemon   │  │
│  │ RndcClient     │──┼────────────────▶│  │                │  │
│  │  • Server URL  │  │  TSIG Auth      │  │  Validates:    │  │
│  │  • Algorithm   │  │  HMAC-SHA256    │  │  • Key name    │  │
│  │  • Secret Key  │  │                 │  │  • Signature   │  │
│  │                │  │                 │  │  • Timestamp   │  │
│  └────────────────┘  │                 │  └────────────────┘  │
│         │            │                 │         │            │
│         │ Commands:  │                 │         │            │
│         │            │                 │         ▼            │
│    addzone zone {    │                 │  ┌────────────────┐  │
│      type master;    │                 │  │ BIND9 named    │  │
│      file "x.zone";  │────────────────▶│  │                │  │
│    };                │                 │  │ • Creates zone │  │
│                      │◀────────────────│  │ • Loads into   │  │
│    Success/Error     │    Response     │  │   memory       │  │
│                      │                 │  │ • Writes file  │  │
│                      │                 │  └────────────────┘  │
└──────────────────────┘                 └──────────────────────┘

RNDC Authentication Flow

┌────────────────────────────────────────────────────────────────┐
│  1. Controller Retrieves RNDC Key from Kubernetes Secret      │
│                                                                │
│  Secret: bind9-primary-rndc-key                              │
│    data:                                                      │
│      key-name: "bind9-primary"                               │
│      algorithm: "hmac-sha256"                                │
│      secret: "base64-encoded-256-bit-key"                    │
└────────────────────────────────────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────────┐
│  2. Create RndcClient Instance                                │
│                                                                │
│  let client = RndcClient::new(                                │
│      "bind9-primary.dns-system.svc.cluster.local:953",       │
│      "hmac-sha256",                                           │
│      "base64-secret-key"                                      │
│  );                                                           │
└────────────────────────────────────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────────┐
│  3. Execute RNDC Command with TSIG Authentication             │
│                                                                │
│  TSIG Signature = HMAC-SHA256(                                │
│      key: secret,                                             │
│      data: command + timestamp + nonce                        │
│  )                                                            │
│                                                                │
│  Request packet:                                              │
│    • Command: "addzone example.com { type master; ... }"     │
│    • TSIG record with signature                              │
│    • Timestamp                                                │
└────────────────────────────────────────────────────────────────┘
                         │
                         ▼
┌────────────────────────────────────────────────────────────────┐
│  4. BIND9 Validates Request                                   │
│                                                                │
│  • Looks up key "bind9-primary" in rndc.key file             │
│  • Verifies HMAC-SHA256 signature matches                    │
│  • Checks timestamp is within acceptable window              │
│  • Executes command if valid                                 │
│  • Returns success/error with TSIG-signed response           │
└────────────────────────────────────────────────────────────────┘

Data Flow: Zone Creation

User creates DNSZone resource
    │
    │ kubectl apply -f dnszone.yaml
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Kubernetes API Server stores DNSZone in etcd            │
└─────────────────────────────────────────────────────────┘
    │
    │ Watch event
    ▼
┌─────────────────────────────────────────────────────────┐
│ Bindy Controller receives event                         │
│   • DNSZone watcher triggers                            │
│   • Event: Applied(dnszone)                             │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ reconcile_dnszone() called                              │
│   1. Extract namespace and name                         │
│   2. Get zone spec (zone_name, cluster_ref, etc.)      │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Find PRIMARY pod for cluster                            │
│   • List pods with labels:                              │
│     app=bind9, instance={cluster_ref}                   │
│   • Select first running pod                            │
│   • Build server address:                               │
│     "{cluster_ref}.{namespace}.svc.cluster.local:953"   │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Load RNDC key from Secret                               │
│   • Secret name: "{cluster_ref}-rndc-key"              │
│   • Parse key-name, algorithm, secret                   │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Execute RNDC addzone command                            │
│   zone_manager.add_zone(                                │
│       zone_name: "example.com",                         │
│       zone_type: "master",                              │
│       zone_file: "/var/lib/bind/example.com.zone",     │
│       server: "bind9-primary...:953",                   │
│       key_data: RndcKeyData { ... }                     │
│   )                                                     │
└─────────────────────────────────────────────────────────┘
    │
    │ RNDC Protocol (Port 953)
    ▼
┌─────────────────────────────────────────────────────────┐
│ BIND9 Instance executes command                         │
│   • Creates zone configuration                          │
│   • Allocates memory for zone                           │
│   • Creates zone file /var/lib/bind/example.com.zone   │
│   • Loads zone into memory                              │
│   • Starts serving DNS queries for zone                 │
│   • Returns success response                            │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Update DNSZone status                                   │
│   status:                                               │
│     conditions:                                         │
│       - type: Ready                                     │
│         status: "True"                                  │
│         message: "Zone created for cluster: ..."        │
└─────────────────────────────────────────────────────────┘

Data Flow: Record Addition

User creates ARecord resource
    │
    │ kubectl apply -f arecord.yaml
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Kubernetes API Server stores ARecord in etcd            │
└─────────────────────────────────────────────────────────┘
    │
    │ Watch event
    ▼
┌─────────────────────────────────────────────────────────┐
│ Bindy Controller receives event                         │
│   • ARecord watcher triggers                            │
│   • Event: Applied(arecord)                             │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ reconcile_a_record() called                             │
│   1. Extract namespace and name                         │
│   2. Get spec (zone, name, ipv4_address, ttl)          │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Find cluster from zone                                   │
│   • List DNSZone resources in namespace                 │
│   • Find zone matching spec.zone                        │
│   • Extract zone.spec.cluster_ref                       │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Load RNDC key and build server address                  │
│   • Load "{cluster_ref}-rndc-key" Secret               │
│   • Server: "{cluster_ref}.{namespace}.svc:953"        │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Add record via RNDC (PLACEHOLDER - Future nsupdate)     │
│   zone_manager.add_a_record(                            │
│       zone: "example.com",                              │
│       name: "www",                                      │
│       ipv4: "192.0.2.1",                               │
│       ttl: Some(300),                                   │
│       server: "bind9-primary...:953",                   │
│       key_data: RndcKeyData { ... }                     │
│   )                                                     │
│                                                         │
│ NOTE: Currently logs intent. Full implementation will   │
│       use nsupdate protocol for dynamic DNS updates.    │
└─────────────────────────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────────────────────────┐
│ Update ARecord status                                   │
│   status:                                               │
│     conditions:                                         │
│       - type: Ready                                     │
│         status: "True"                                  │
│         message: "A record created"                     │
└─────────────────────────────────────────────────────────┘

RNDC Commands Supported

The Bind9Manager provides the following RNDC operations:

Zone Management

┌────────────────────┬─────────────────────────────────────────┐
│ Operation          │ RNDC Command                            │
├────────────────────┼─────────────────────────────────────────┤
│ add_zone()         │ addzone <zone> { type <type>;           │
│                    │                   file "<file>"; };     │
│                    │                                         │
│ delete_zone()      │ delzone <zone>                          │
│                    │                                         │
│ reload_zone()      │ reload <zone>                           │
│                    │                                         │
│ reload_all_zones() │ reload                                  │
│                    │                                         │
│ retransfer_zone()  │ retransfer <zone>                       │
│                    │                                         │
│ notify_zone()      │ notify <zone>                           │
│                    │                                         │
│ freeze_zone()      │ freeze <zone>                           │
│                    │                                         │
│ thaw_zone()        │ thaw <zone>                             │
│                    │                                         │
│ zone_status()      │ zonestatus <zone>                       │
│                    │                                         │
│ server_status()    │ status                                  │
└────────────────────┴─────────────────────────────────────────┘

Record Management (Planned)

Currently implemented as placeholders:
  • add_a_record()      (will use nsupdate protocol)
  • add_aaaa_record()   (will use nsupdate protocol)
  • add_txt_record()    (will use nsupdate protocol)
  • add_cname_record()  (will use nsupdate protocol)
  • add_mx_record()     (will use nsupdate protocol)
  • add_ns_record()     (will use nsupdate protocol)
  • add_srv_record()    (will use nsupdate protocol)
  • add_caa_record()    (will use nsupdate protocol)

Note: RNDC protocol doesn't support individual record operations.
These will be implemented using the nsupdate protocol for dynamic
DNS updates, or via zone file manipulation + reload.

Pod Discovery and Networking

┌────────────────────────────────────────────────────────────┐
│ Controller discovers BIND9 pods using labels:              │
│                                                            │
│   Pod labels:                                             │
│     app: bind9                                            │
│     instance: {cluster_ref}                               │
│                                                            │
│   Controller searches:                                    │
│     List pods where app=bind9 AND instance={cluster_ref}  │
│                                                            │
│   Service DNS:                                            │
│     {cluster_ref}.{namespace}.svc.cluster.local:953      │
│                                                            │
│   Example:                                                │
│     bind9-primary.dns-system.svc.cluster.local:953       │
└────────────────────────────────────────────────────────────┘

Zone Transfers (AXFR/IXFR)

Primary Instance                    Secondary Instance
┌─────────────────┐                ┌─────────────────┐
│ example.com     │                │                 │
│ Serial: 2024010│                │                 │
│                 │   1. NOTIFY    │                 │
│                 │───────────────▶│                 │
│                 │                │                 │
│                 │   2. SOA Query │                 │
│                 │◀───────────────│  Checks serial  │
│                 │                │                 │
│                 │   3. AXFR/IXFR │                 │
│                 │◀───────────────│  Serial outdated│
│                 │                │                 │
│  Sends full     │   Zone data    │                 │
│  zone (AXFR) or │───────────────▶│  Updates zone   │
│  delta (IXFR)   │                │  Serial: 2024010│
│                 │                │                 │
└─────────────────┘                └─────────────────┘

Triggered by:
  • zone_manager.notify_zone()
  • zone_manager.retransfer_zone()
  • BIND9 automatic refresh timers (SOA refresh value)

Components Deep Dive

1. Bind9Manager

Rust struct that wraps the rndc crate for BIND9 management:

#![allow(unused)]
fn main() {
pub struct Bind9Manager;

impl Bind9Manager {
    pub fn new() -> Self { Self }

    // RNDC key generation
    pub fn generate_rndc_key() -> RndcKeyData { ... }
    pub fn create_rndc_secret_data(key_data: &RndcKeyData) -> BTreeMap<String, String> { ... }
    pub fn parse_rndc_secret_data(data: &BTreeMap<String, Vec<u8>>) -> Result<RndcKeyData> { ... }

    // Core RNDC operations
    async fn exec_rndc_command(&self, server: &str, key_data: &RndcKeyData, command: &str) -> Result<String> { ... }

    // Zone management
    pub async fn add_zone(&self, zone_name: &str, zone_type: &str, zone_file: &str, server: &str, key_data: &RndcKeyData) -> Result<()> { ... }
    pub async fn delete_zone(&self, zone_name: &str, server: &str, key_data: &RndcKeyData) -> Result<()> { ... }
    pub async fn reload_zone(&self, zone_name: &str, server: &str, key_data: &RndcKeyData) -> Result<()> { ... }
    pub async fn notify_zone(&self, zone_name: &str, server: &str, key_data: &RndcKeyData) -> Result<()> { ... }
}
}

2. RndcKeyData

Struct for RNDC authentication:

#![allow(unused)]
fn main() {
pub struct RndcKeyData {
    pub name: String,      // Key name (e.g., "bind9-primary")
    pub algorithm: String, // HMAC algorithm (e.g., "hmac-sha256")
    pub secret: String,    // Base64-encoded secret key
}
}

3. Reconcilers

Zone reconciler using RNDC:

#![allow(unused)]
fn main() {
pub async fn reconcile_dnszone(
    client: Client,
    dnszone: DNSZone,
    zone_manager: &Bind9Manager,
) -> Result<()> {
    // 1. Find PRIMARY pod
    let primary_pod = find_primary_pod(&client, &namespace, &cluster_ref).await?;

    // 2. Load RNDC key
    let key_data = load_rndc_key(&client, &namespace, &cluster_ref).await?;

    // 3. Build server address
    let server = format!("{}.{}.svc.cluster.local:953", cluster_ref, namespace);

    // 4. Add zone via RNDC
    zone_manager.add_zone(&zone_name, "master", &zone_file, &server, &key_data).await?;

    // 5. Update status
    update_status(&client, &dnszone, "Ready", "True", "Zone created").await?;

    Ok(())
}
}

Security Architecture

TSIG Authentication

┌────────────────────────────────────────────────────────────┐
│ TSIG (Transaction Signature) provides:                     │
│                                                            │
│  1. Authentication - Verifies command source               │
│  2. Integrity - Prevents command tampering                 │
│  3. Replay protection - Timestamp validation               │
│                                                            │
│ Algorithm: HMAC-SHA256 (256-bit keys)                     │
│ Key Storage: Kubernetes Secrets (base64-encoded)          │
│ Key Generation: Random 256-bit keys per instance          │
└────────────────────────────────────────────────────────────┘

Network Security

┌────────────────────────────────────────────────────────────┐
│ • RNDC traffic on port 953/TCP (not exposed externally)   │
│ • DNS queries on port 53/UDP+TCP (exposed via Service)    │
│ • All RNDC communication within cluster network           │
│ • No external RNDC access (ClusterIP services only)       │
│ • NetworkPolicies can restrict RNDC access to controller  │
└────────────────────────────────────────────────────────────┘

RBAC Requirements

# Controller needs access to:
- Secrets (get, list) - for RNDC keys
- Pods (get, list) - for pod discovery
- Services (get, list) - for DNS resolution
- DNSZone, ARecord, etc. (get, list, watch, update status)

Performance Characteristics

Latency

Operation                    Old (File-based)    New (RNDC)
─────────────────────────────────────────────────────────────
Create DNSZone              2-5 seconds          <500ms
Add DNS Record              1-3 seconds          <200ms
Delete DNSZone              2-4 seconds          <500ms
Zone reload                 1-2 seconds          <300ms
Status check                N/A                  <100ms

Benefits of RNDC Protocol

✓ Atomic operations - Commands succeed or fail atomically
✓ Real-time feedback - Immediate success/error responses
✓ No ConfigMap overhead - No intermediate Kubernetes resources
✓ Direct control - Native BIND9 management interface
✓ Better error messages - BIND9 provides detailed errors
✓ Zone status queries - Can check zone state anytime
✓ Freeze/thaw support - Control dynamic updates precisely
✓ Notify support - Trigger zone transfers on demand

Future Enhancements

1. nsupdate Protocol Integration

Implement dynamic DNS updates for individual records:
  • Use nsupdate protocol alongside RNDC
  • Add/update/delete individual A, AAAA, TXT, etc. records
  • No full zone reload needed for record changes
  • Even lower latency for record operations

2. Zone Transfer Monitoring

Monitor AXFR/IXFR operations:
  • Track transfer status
  • Report transfer errors
  • Automatic retry on failures

3. Health Checks

Periodic health checks using RNDC:
  • server_status() - overall server health
  • zone_status() - per-zone health
  • Update CRD status with health information

Next Steps

BIND9 Integration Deep Dive - Implementation details
DNSZone Spec - DNSZone resource reference
Operations Guide - Production configuration

Architecture Diagrams

Comprehensive visual diagrams showing Bindy’s architecture, components, and data flows.

System Architecture

graph TB
    subgraph "Kubernetes Cluster"
        subgraph "Custom Resources"
            BC[Bind9Cluster]
            BI[Bind9Instance]
            DZ[DNSZone]
            AR[ARecord]
            CR[CNAMERecord]
            MR[MXRecord]
            TR[TXTRecord]
        end

        subgraph "Bindy Controller (Rust)"
            WA[Watch API<br/>kube-rs]

            subgraph "Reconcilers"
                BCR[Bind9Cluster<br/>Reconciler]
                BIR[Bind9Instance<br/>Reconciler]
                DZR[DNSZone<br/>Reconciler]
                RR[Record<br/>Reconcilers]
            end

            subgraph "Core Components"
                BM[Bind9Manager<br/>RNDC Client]
                RES[Resource<br/>Builders]
            end
        end

        subgraph "Kubernetes Resources"
            DEP[Deployments]
            CM[ConfigMaps]
            SEC[Secrets]
            SVC[Services]
        end

        subgraph "BIND9 Pods"
            P1[Primary DNS<br/>us-east]
            P2[Secondary DNS<br/>us-west]
            P3[Secondary DNS<br/>eu]
        end
    end

    subgraph "External"
        CLI[DNS Clients]
    end

    %% Custom Resource relationships
    BC -.inherits.-> BI
    BI -.references.-> DZ
    DZ -.contains.-> AR
    DZ -.contains.-> CR
    DZ -.contains.-> MR
    DZ -.contains.-> TR

    %% Watch relationships
    BC --> WA
    BI --> WA
    DZ --> WA
    AR --> WA
    CR --> WA
    MR --> WA
    TR --> WA

    %% Reconciler routing
    WA --> BCR
    WA --> BIR
    WA --> DZR
    WA --> RR

    %% Component interactions
    BCR --> RES
    BIR --> RES
    DZR --> BM
    RR --> BM

    %% K8s resource creation
    RES --> DEP
    RES --> CM
    RES --> SEC
    RES --> SVC

    %% RNDC communication
    BM -.RNDC:953.-> P1
    BM -.RNDC:953.-> P2
    BM -.RNDC:953.-> P3

    %% DNS deployment
    DEP --> P1
    DEP --> P2
    DEP --> P3
    CM --> P1
    CM --> P2
    CM --> P3
    SEC --> P1

    %% Zone transfers
    P1 -.AXFR/IXFR.-> P2
    P1 -.AXFR/IXFR.-> P3

    %% DNS queries
    CLI -.DNS:53.-> P1
    CLI -.DNS:53.-> P2
    CLI -.DNS:53.-> P3

    style BC fill:#e1f5ff
    style BI fill:#e1f5ff
    style DZ fill:#e1f5ff
    style AR fill:#fff4e1
    style CR fill:#fff4e1
    style MR fill:#fff4e1
    style TR fill:#fff4e1
    style WA fill:#f0f0f0
    style BCR fill:#d4e8d4
    style BIR fill:#d4e8d4
    style DZR fill:#d4e8d4
    style RR fill:#d4e8d4
    style BM fill:#ffd4d4
    style RES fill:#ffd4d4

Rust Component Architecture

graph TB
    subgraph "Main Process"
        MAIN[main.rs<br/>Tokio Runtime]
    end

    subgraph "CRD Definitions (src/crd.rs)"
        CRD_BC[Bind9Cluster]
        CRD_BI[Bind9Instance]
        CRD_DZ[DNSZone]
        CRD_REC[Record Types<br/>A, AAAA, CNAME,<br/>MX, NS, TXT,<br/>SRV, CAA]
    end

    subgraph "Reconcilers (src/reconcilers/)"
        RECON_BC[bind9cluster.rs]
        RECON_BI[bind9instance.rs]
        RECON_DZ[dnszone.rs]
        RECON_REC[records.rs]
    end

    subgraph "BIND9 Management (src/bind9/)"
        BM_MGR[Bind9Manager]
        BM_KEY[RndcKeyData]
        BM_CMD[Zone Operations<br/>HTTP API & RNDC<br/>addzone, delzone,<br/>reload, freeze,<br/>thaw, notify]
    end

    subgraph "Resource Builders (src/bind9_resources.rs)"
        RB_DEP[build_deployment]
        RB_CM[build_configmap]
        RB_SVC[build_service]
        RB_VOL[build_volumes]
        RB_POD[build_podspec]
    end

    subgraph "External Dependencies"
        KUBE[kube-rs<br/>Kubernetes Client]
        RNDC[rndc-rs<br/>RNDC Protocol]
        TOKIO[tokio<br/>Async Runtime]
        SERDE[serde<br/>Serialization]
    end

    %% Main process spawns reconcilers
    MAIN --> RECON_BC
    MAIN --> RECON_BI
    MAIN --> RECON_DZ
    MAIN --> RECON_REC

    %% Reconcilers use CRD types
    RECON_BC -.uses.-> CRD_BC
    RECON_BI -.uses.-> CRD_BI
    RECON_DZ -.uses.-> CRD_DZ
    RECON_REC -.uses.-> CRD_REC

    %% Reconcilers call managers
    RECON_BI --> RB_DEP
    RECON_BI --> RB_CM
    RECON_BI --> RB_SVC
    RECON_DZ --> BM_MGR
    RECON_REC --> BM_MGR

    %% Resource builders use components
    RB_DEP --> RB_POD
    RB_DEP --> RB_VOL
    RB_CM --> RB_VOL

    %% BIND9 manager components
    BM_MGR --> BM_KEY
    BM_MGR --> BM_CMD

    %% External dependencies
    MAIN --> TOKIO
    RECON_BC --> KUBE
    RECON_BI --> KUBE
    RECON_DZ --> KUBE
    RECON_REC --> KUBE
    BM_CMD --> RNDC
    CRD_BC --> SERDE
    CRD_BI --> SERDE
    CRD_DZ --> SERDE
    CRD_REC --> SERDE

    style MAIN fill:#e1f5ff
    style CRD_BC fill:#d4e8d4
    style CRD_BI fill:#d4e8d4
    style CRD_DZ fill:#d4e8d4
    style CRD_REC fill:#d4e8d4
    style RECON_BC fill:#fff4e1
    style RECON_BI fill:#fff4e1
    style RECON_DZ fill:#fff4e1
    style RECON_REC fill:#fff4e1
    style BM_MGR fill:#ffd4d4
    style BM_KEY fill:#ffd4d4
    style BM_CMD fill:#ffd4d4
    style RB_DEP fill:#e8d4f8
    style RB_CM fill:#e8d4f8
    style RB_SVC fill:#e8d4f8
    style RB_VOL fill:#e8d4f8
    style RB_POD fill:#e8d4f8

DNS Record Creation Data Flow

sequenceDiagram
    participant User
    participant K8sAPI as Kubernetes API
    participant Watch as Watch Stream
    participant RecRec as Record Reconciler
    participant ZoneRec as DNSZone Reconciler
    participant BindMgr as Bind9Manager
    participant Primary as Primary BIND9
    participant Secondary as Secondary BIND9
    participant Client as DNS Client

    Note over User,Client: Record Creation Flow

    User->>K8sAPI: kubectl apply -f arecord.yaml
    K8sAPI->>K8sAPI: Validate CRD schema
    K8sAPI->>K8sAPI: Store in etcd
    K8sAPI-->>User: ARecord created

    K8sAPI->>Watch: Event: ARecord Added
    Watch->>RecRec: Trigger reconciliation

    RecRec->>K8sAPI: Get referenced DNSZone
    K8sAPI-->>RecRec: DNSZone details

    RecRec->>K8sAPI: Get Bind9Instance (via clusterRef)
    K8sAPI-->>RecRec: Bind9Instance details

    RecRec->>K8sAPI: Get RNDC Secret
    K8sAPI-->>RecRec: RNDC key data

    RecRec->>BindMgr: Call add_a_record()
    Note over BindMgr: Currently placeholder<br/>Will use nsupdate
    BindMgr-->>RecRec: Ok(())

    RecRec->>BindMgr: Call reload_zone(zone_name)
    BindMgr->>Primary: RNDC reload zone
    activate Primary
    Primary->>Primary: Reload zone file
    Primary-->>BindMgr: Success
    deactivate Primary
    BindMgr-->>RecRec: Zone reloaded

    RecRec->>K8sAPI: Update ARecord status
    K8sAPI-->>RecRec: Status updated

    Note over Primary,Secondary: Zone Transfer (AXFR/IXFR)

    Primary->>Secondary: NOTIFY (zone updated)
    activate Secondary
    Secondary->>Primary: SOA query (check serial)
    Primary-->>Secondary: SOA record

    alt Serial increased
        Secondary->>Primary: IXFR/AXFR request
        Primary-->>Secondary: Zone transfer
        Secondary->>Secondary: Update zone
    else Serial unchanged
        Secondary->>Secondary: No update needed
    end
    deactivate Secondary

    Note over Client,Secondary: DNS Query

    Client->>Secondary: DNS query (www.example.com A?)
    activate Secondary
    Secondary->>Secondary: Lookup in zone
    Secondary-->>Client: Answer: 192.0.2.1
    deactivate Secondary

Zone Creation and Synchronization Flow

stateDiagram-v2
    [*] --> ZoneCreated: User creates DNSZone

    ZoneCreated --> Validating: Controller watches event

    Validating --> ValidatingInstance: Validate zone spec
    ValidatingInstance --> ValidatingCluster: Find Bind9Instance
    ValidatingCluster --> GeneratingConfig: Find Bind9Cluster

    GeneratingConfig --> CreatingRNDCKey: Generate zone config
    CreatingRNDCKey --> StoringSecret: Generate RNDC key
    StoringSecret --> AddingZone: Store in Secret

    AddingZone --> ConnectingRNDC: Call rndc addzone
    ConnectingRNDC --> ExecutingCommand: Connect via port 953
    ExecutingCommand --> VerifyingZone: Execute addzone command

    VerifyingZone --> Ready: Verify zone exists
    Ready --> [*]: Update status to Ready

    ValidatingInstance --> Failed: Instance not found
    ValidatingCluster --> Failed: Cluster not found
    AddingZone --> Failed: RNDC command failed
    ConnectingRNDC --> Failed: Connection failed

    Failed --> [*]: Update status conditions

    note right of GeneratingConfig
        Creates zone with:
        - SOA record
        - Default TTL
        - Zone file path
    end note

    note right of AddingZone
        Uses RNDC protocol:
        addzone example.com
        '{ type master;
           file "zones/example.com"; }'
    end note

Primary to Secondary Zone Transfer Flow

sequenceDiagram
    participant Ctl as Bindy Controller
    participant Pri as Primary BIND9<br/>(us-east)
    participant Sec1 as Secondary BIND9<br/>(us-west)
    participant Sec2 as Secondary BIND9<br/>(eu)

    Note over Ctl,Sec2: Initial Zone Setup

    Ctl->>Pri: RNDC addzone example.com
    activate Pri
    Pri->>Pri: Create zone file
    Pri-->>Ctl: Zone added
    deactivate Pri

    Ctl->>Sec1: RNDC addzone example.com (type secondary)
    activate Sec1
    Sec1->>Sec1: Configure as secondary
    Sec1-->>Ctl: Zone added as secondary
    deactivate Sec1

    Ctl->>Sec2: RNDC addzone example.com (type secondary)
    activate Sec2
    Sec2->>Sec2: Configure as secondary
    Sec2-->>Ctl: Zone added as secondary
    deactivate Sec2

    Note over Pri,Sec2: Initial Zone Transfer

    Sec1->>Pri: SOA query (get serial)
    Pri-->>Sec1: SOA serial=2024010101
    Sec1->>Pri: AXFR request (full transfer)
    Pri-->>Sec1: Complete zone data
    Sec1->>Sec1: Write zone file

    Sec2->>Pri: SOA query (get serial)
    Pri-->>Sec2: SOA serial=2024010101
    Sec2->>Pri: AXFR request (full transfer)
    Pri-->>Sec2: Complete zone data
    Sec2->>Sec2: Write zone file

    Note over Ctl,Sec2: Record Update

    Ctl->>Ctl: User adds new ARecord
    Ctl->>Pri: Update zone + reload
    activate Pri
    Pri->>Pri: Update zone file
    Pri->>Pri: Increment serial to 2024010102
    Pri-->>Ctl: Zone reloaded
    deactivate Pri

    Note over Pri,Sec2: NOTIFY and Incremental Transfer

    Pri->>Sec1: NOTIFY (zone updated)
    Pri->>Sec2: NOTIFY (zone updated)

    activate Sec1
    Sec1->>Pri: SOA query (check serial)
    Pri-->>Sec1: SOA serial=2024010102
    Sec1->>Sec1: Compare: 2024010102 > 2024010101
    Sec1->>Pri: IXFR request (incremental)
    Pri-->>Sec1: Only changed records
    Sec1->>Sec1: Apply changes
    Sec1-->>Pri: ACK
    deactivate Sec1

    activate Sec2
    Sec2->>Pri: SOA query (check serial)
    Pri-->>Sec2: SOA serial=2024010102
    Sec2->>Sec2: Compare: 2024010102 > 2024010101
    Sec2->>Pri: IXFR request (incremental)
    Pri-->>Sec2: Only changed records
    Sec2->>Sec2: Apply changes
    Sec2-->>Pri: ACK
    deactivate Sec2

    Note over Pri,Sec2: All zones synchronized

Reconciliation Loop

flowchart TD
    Start([Watch Event Received]) --> CheckType{Event Type?}

    CheckType -->|Added/Modified| GetResource[Get Resource from API]
    CheckType -->|Deleted| Cleanup[Run Cleanup Logic]
    CheckType -->|Restarted| RefreshAll[Refresh All Resources]

    GetResource --> CheckGen{observedGeneration<br/>== metadata.generation?}
    CheckGen -->|Yes| SkipRecon[Skip: Already reconciled]
    CheckGen -->|No| ValidateSpec[Validate Spec]

    ValidateSpec --> CheckValid{Valid?}
    CheckValid -->|No| UpdateFailed[Update Status: Failed]
    CheckValid -->|Yes| Reconcile[Execute Reconciliation]

    Reconcile --> ReconcileResult{Success?}
    ReconcileResult -->|Yes| UpdateReady[Update Status: Ready]
    ReconcileResult -->|No| CheckRetry{Retryable?}

    CheckRetry -->|Yes| Requeue[Requeue with backoff]
    CheckRetry -->|No| UpdateError[Update Status: Error]

    UpdateReady --> UpdateGen[Update observedGeneration]
    UpdateError --> Requeue
    UpdateFailed --> End

    UpdateGen --> End([Done])
    Cleanup --> End
    RefreshAll --> End
    SkipRecon --> End
    Requeue --> End

    style Start fill:#e1f5ff
    style End fill:#e1f5ff
    style Reconcile fill:#d4e8d4
    style UpdateReady fill:#d4f8d4
    style UpdateError fill:#f8d4d4
    style UpdateFailed fill:#f8d4d4
    style CheckType fill:#fff4e1
    style CheckGen fill:#fff4e1
    style CheckValid fill:#fff4e1
    style ReconcileResult fill:#fff4e1
    style CheckRetry fill:#fff4e1

RNDC Protocol Communication

sequenceDiagram
    participant BM as Bind9Manager<br/>(Rust)
    participant RC as RNDC Client<br/>(rndc-rs)
    participant Net as TCP Socket<br/>:953
    participant BIND as BIND9 Server<br/>(rndc daemon)

    Note over BM,BIND: RNDC Key Setup (One-time)

    BM->>BM: generate_rndc_key()
    BM->>BM: Create HMAC-SHA256 key
    BM->>BM: Store in K8s Secret

    Note over BM,BIND: RNDC Command Execution

    BM->>RC: new(server, algorithm, secret)
    RC->>RC: Parse RNDC key
    RC->>RC: Prepare TSIG signature

    BM->>RC: rndc_command("reload zone")
    RC->>Net: Connect to server:953
    Net->>BIND: TCP handshake

    RC->>RC: Create RNDC message
    RC->>RC: Sign with HMAC-SHA256
    RC->>Net: Send signed message
    Net->>BIND: Forward RNDC message

    activate BIND
    BIND->>BIND: Verify TSIG signature
    BIND->>BIND: Execute: reload zone
    BIND->>BIND: Reload zone file
    BIND->>Net: Response + TSIG
    deactivate BIND

    Net->>RC: Receive response
    RC->>RC: Verify response TSIG
    RC->>RC: Parse result
    RC-->>BM: Ok(result.text)

    alt Authentication Failed
        BIND-->>Net: Error: TSIG verification failed
        Net-->>RC: Error response
        RC-->>BM: Err("RNDC authentication failed")
    end

    alt Command Failed
        BIND-->>Net: Error: Zone not found
        Net-->>RC: Error response
        RC-->>BM: Err("Zone not found")
    end

Multi-Cluster Deployment

graph TB
    subgraph "Cluster: us-east-1"
        BC1[Bind9Cluster:<br/>production-dns]
        BI1[Bind9Instance:<br/>primary-dns]
        DZ1[DNSZone:<br/>example.com]
        P1[Primary BIND9<br/>172.16.1.10]

        BC1 -.-> BI1
        BI1 -.-> DZ1
        DZ1 --> P1
    end

    subgraph "Cluster: us-west-2"
        BC2[Bind9Cluster:<br/>production-dns]
        BI2[Bind9Instance:<br/>secondary-dns-west]
        DZ2[DNSZone:<br/>example.com]
        S1[Secondary BIND9<br/>172.16.2.10]

        BC2 -.-> BI2
        BI2 -.-> DZ2
        DZ2 --> S1
    end

    subgraph "Cluster: eu-central-1"
        BC3[Bind9Cluster:<br/>production-dns]
        BI3[Bind9Instance:<br/>secondary-dns-eu]
        DZ3[DNSZone:<br/>example.com]
        S2[Secondary BIND9<br/>172.16.3.10]

        BC3 -.-> BI3
        BI3 -.-> DZ3
        DZ3 --> S2
    end

    P1 -.AXFR/IXFR.-> S1
    P1 -.AXFR/IXFR.-> S2

    LB[Global Load Balancer<br/>GeoDNS]

    LB -.US Traffic.-> P1
    LB -.US Traffic.-> S1
    LB -.EU Traffic.-> S2

    style BC1 fill:#e1f5ff
    style BC2 fill:#e1f5ff
    style BC3 fill:#e1f5ff
    style BI1 fill:#d4e8d4
    style BI2 fill:#d4e8d4
    style BI3 fill:#d4e8d4
    style P1 fill:#ffd4d4
    style S1 fill:#fff4e1
    style S2 fill:#fff4e1
    style LB fill:#f0f0f0

Architecture Overview - Detailed text description
RNDC Architecture - RNDC protocol details
Technical Architecture - Implementation specifics
CRD Specifications - Custom resource definitions

Custom Resource Definitions

Bindy extends Kubernetes with these Custom Resource Definitions (CRDs).

Infrastructure CRDs

Bind9Cluster

Represents cluster-level configuration shared across multiple BIND9 instances.

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
    dnssec:
      enabled: true
  rndcSecretRefs:
    - name: transfer-key
      algorithm: hmac-sha256
      secret: "base64-encoded-secret"

Learn more: Bind9Cluster concept documentation

Bind9Instance

Represents a BIND9 DNS server instance that references a Bind9Cluster.

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: production-dns  # References Bind9Cluster
  replicas: 2

Learn more about Bind9Instance

DNS CRDs

DNSZone

Defines a DNS zone with SOA record and references a Bind9Instance.

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: primary-dns  # References Bind9Instance
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.  # Note: @ replaced with .
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

Learn more about DNSZone

DNS Record Types

Bindy supports all common DNS record types:

ARecord - IPv4 addresses
AAAARecord - IPv6 addresses
CNAMERecord - Canonical name aliases
MXRecord - Mail exchange
TXTRecord - Text records (SPF, DKIM, etc.)
NSRecord - Nameserver delegation
SRVRecord - Service discovery
CAARecord - Certificate authority authorization

Learn more about DNS Records

Resource Hierarchy

The three-tier resource model:

Bind9Cluster (cluster config)
    ↑
    │ referenced by clusterRef
    │
Bind9Instance (instance deployment)
    ↑
    │ referenced by clusterRef
    │
DNSZone (zone definition)
    ↑
    │ referenced by zone field
    │
DNS Records (A, CNAME, MX, etc.)

Common Fields

All Bindy CRDs share these common fields:

Metadata

metadata:
  name: resource-name
  namespace: dns-system
  labels:
    key: value
  annotations:
    key: value

Status Subresource

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Synchronized
      message: Resource is synchronized
      lastTransitionTime: "2024-01-01T00:00:00Z"
  observedGeneration: 1

API Group and Versions

All Bindy CRDs belong to the bindy.firestoned.io API group:

Current version: v1alpha1
API stability: Alpha (subject to breaking changes)

Next Steps

Bind9Instance Details
DNSZone Details
DNS Record Details
RNDC-Based Architecture
API Reference

Bind9Cluster

The Bind9Cluster resource represents a logical DNS cluster - a collection of related BIND9 instances with shared configuration.

Overview

A Bind9Cluster defines cluster-level configuration that can be inherited by multiple Bind9Instance resources:

Shared BIND9 version and container image
Common configuration (recursion, ACLs, etc.)
Custom ConfigMap references for BIND9 configuration files
TSIG keys for authenticated zone transfers
Access Control Lists (ACLs)

Example

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
  rndcSecretRefs:
    - name: transfer-key
      algorithm: hmac-sha256
      secret: "base64-encoded-secret"
  acls:
    internal:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
    external:
      - "0.0.0.0/0"
status:
  conditions:
    - type: Ready
      status: "True"
      reason: ClusterConfigured
      message: "Cluster configured successfully"
  instanceCount: 4
  readyInstances: 4

Specification

Optional Fields

spec.version - BIND9 version for all instances in the cluster
spec.image - Container image configuration for all instances
- image - Full container image reference (registry/repo:tag)
- imagePullPolicy - Image pull policy (Always, IfNotPresent, Never)
- imagePullSecrets - List of secret names for private registries
spec.configMapRefs - Custom ConfigMap references for BIND9 configuration
- namedConf - Name of ConfigMap containing named.conf
- namedConfOptions - Name of ConfigMap containing named.conf.options
spec.global - Shared BIND9 configuration
- recursion - Enable/disable recursion globally
- allowQuery - List of CIDR ranges allowed to query
- allowTransfer - List of CIDR ranges allowed zone transfers
- dnssec - DNSSEC configuration
- forwarders - DNS forwarders
- listenOn - IPv4 addresses to listen on
- listenOnV6 - IPv6 addresses to listen on
spec.primary - Primary instance configuration
- replicas - Number of primary instances to create (managed instances)
spec.secondary - Secondary instance configuration
- replicas - Number of secondary instances to create (managed instances)
spec.tsigKeys - TSIG keys for authenticated zone transfers
- name - Key name
- algorithm - HMAC algorithm (hmac-sha256, hmac-sha512, etc.)
- secret - Base64-encoded shared secret
spec.acls - Named ACL definitions that instances can reference

Cluster vs Instance

The relationship between Bind9Cluster and Bind9Instance:

# Cluster defines shared configuration
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: prod-cluster
spec:
  version: "9.18"
  global:
    recursion: false
  acls:
    internal:
      - "10.0.0.0/8"

---
# Instance references the cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  labels:
    cluster: prod-cluster
    dns-role: primary
spec:
  clusterRef: prod-cluster
  role: primary
  replicas: 2
  # Instance-specific config can override cluster defaults
  config:
    allowQuery:
      - acl:internal  # Reference the cluster's ACL

TSIG Keys

TSIG (Transaction SIGnature) keys provide authenticated zone transfers:

spec:
  rndcSecretRefs:
    - name: primary-secondary-key
      algorithm: hmac-sha256
      secret: "K8x...base64...=="
    - name: backup-key
      algorithm: hmac-sha512
      secret: "L9y...base64...=="

These keys are used by:

Primary instances for authenticated zone transfers to secondaries
Secondary instances to authenticate when requesting zone transfers
Dynamic DNS updates (if enabled)

Access Control Lists (ACLs)

ACLs define reusable network access policies:

spec:
  acls:
    # Internal networks
    internal:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
      - "192.168.0.0/16"

    # External clients
    external:
      - "0.0.0.0/0"

    # Secondary DNS servers
    secondaries:
      - "10.0.1.10"
      - "10.0.2.10"
      - "10.0.3.10"

Instances can then reference these ACLs:

# In Bind9Instance spec
config:
  allowQuery:
    - acl:external
  allowTransfer:
    - acl:secondaries

Status

The controller updates status to reflect cluster state:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ClusterConfigured
      message: "Cluster configured with 4 instances"
  instanceCount: 4      # Total instances in cluster
  readyInstances: 4     # Instances reporting ready
  observedGeneration: 1

Managed Instances

Bind9Cluster can automatically create and manage Bind9Instance resources based on the spec.primary.replicas and spec.secondary.replicas fields.

Automatic Scaling

The operator automatically scales instances up and down based on the replica counts in the cluster spec:

Scale-Up: When you increase replica counts, the operator creates missing instances Scale-Down: When you decrease replica counts, the operator deletes excess instances (highest-indexed first)

When you specify replica counts in the cluster spec, the operator automatically creates the corresponding instances:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  primary:
    replicas: 2  # Creates 2 primary instances
  secondary:
    replicas: 3  # Creates 3 secondary instances
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

This cluster definition will automatically create 5 Bind9Instance resources:

production-dns-primary-0
production-dns-primary-1
production-dns-secondary-0
production-dns-secondary-1
production-dns-secondary-2

Management Labels

All managed instances are labeled with:

bindy.firestoned.io/managed-by: "Bind9Cluster" - Identifies cluster-managed instances
bindy.firestoned.io/cluster: "<cluster-name>" - Links instance to parent cluster
bindy.firestoned.io/role: "primary"|"secondary" - Indicates instance role

And annotated with:

bindy.firestoned.io/instance-index: "<index>" - Sequential index for the instance

Example of a managed instance:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: production-dns-primary-0
  namespace: dns-system
  labels:
    bindy.firestoned.io/managed-by: "Bind9Cluster"
    bindy.firestoned.io/cluster: "production-dns"
    bindy.firestoned.io/role: "primary"
  annotations:
    bindy.firestoned.io/instance-index: "0"
spec:
  clusterRef: production-dns
  role: Primary
  replicas: 1
  version: "9.18"
  # Configuration inherited from cluster's spec.global

Configuration Inheritance

Managed instances automatically inherit configuration from the cluster:

BIND9 version (spec.version)
Container image (spec.image)
ConfigMap references (spec.configMapRefs)
Volumes and volume mounts
Global configuration (spec.global)

Self-Healing

The Bind9Cluster controller provides comprehensive self-healing for managed instances:

Instance-Level Self-Healing:

If a managed instance (Bind9Instance CRD) is deleted (manually or accidentally), the controller automatically recreates it during the next reconciliation cycle

Resource-Level Self-Healing:

If any child resource is deleted, the controller automatically triggers recreation:
- ConfigMap - BIND9 configuration files
- Secret - RNDC key for remote control
- Service - DNS traffic routing (TCP/UDP port 53)
- Deployment - BIND9 pods

This ensures complete desired state is maintained even if individual Kubernetes resources are manually deleted or corrupted.

Example self-healing scenario:

# Manually delete a ConfigMap
kubectl delete configmap production-dns-primary-0-config -n dns-system

# During next reconciliation (~10 seconds), the controller:
# 1. Detects missing ConfigMap
# 2. Triggers Bind9Instance reconciliation
# 3. Recreates ConfigMap with correct configuration
# 4. BIND9 pod automatically remounts updated ConfigMap

Example scaling scenario:

# Initial cluster with 2 primary instances
kubectl apply -f - <<EOF
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  primary:
    replicas: 2
EOF

# Controller creates: production-dns-primary-0, production-dns-primary-1

# Scale up to 4 primaries
kubectl patch bind9cluster production-dns -n dns-system --type=merge -p '{"spec":{"primary":{"replicas":4}}}'

# Controller creates: production-dns-primary-2, production-dns-primary-3

# Scale down to 3 primaries
kubectl patch bind9cluster production-dns -n dns-system --type=merge -p '{"spec":{"primary":{"replicas":3}}}'

# Controller deletes: production-dns-primary-3 (highest index first)

Manual vs Managed Instances

You can mix managed and manual instances:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: mixed-cluster
spec:
  version: "9.18"
  primary:
    replicas: 2  # Managed instances
  # No secondary replicas - create manually
---
# Manual instance with custom configuration
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: custom-secondary
spec:
  clusterRef: mixed-cluster
  role: Secondary
  replicas: 1
  # Custom configuration overrides
  config:
    allowQuery:
      - "192.168.1.0/24"

Lifecycle Management

Cascade Deletion

When a Bind9Cluster is deleted, the operator automatically deletes all instances that reference it via spec.clusterRef. This ensures clean removal of all cluster resources.

Finalizer: bindy.firestoned.io/bind9cluster-finalizer

The cluster resource uses a finalizer to ensure proper cleanup before deletion:

# Delete the cluster
kubectl delete bind9cluster production-dns

# The operator will:
# 1. Detect deletion timestamp
# 2. Find all instances with clusterRef: production-dns
# 3. Delete each instance
# 4. Remove finalizer
# 5. Allow cluster deletion to complete

Example deletion logs:

INFO Deleting Bind9Cluster production-dns
INFO Found 5 instances to delete
INFO Deleted instance production-dns-primary-0
INFO Deleted instance production-dns-primary-1
INFO Deleted instance production-dns-secondary-0
INFO Deleted instance production-dns-secondary-1
INFO Deleted instance production-dns-secondary-2
INFO Removed finalizer from cluster
INFO Cluster deletion complete

Important Warnings

⚠️ Deleting a Bind9Cluster will delete ALL instances that reference it, including:

Managed instances (created by spec.primary.replicas and spec.secondary.replicas)
Manual instances (created separately but referencing the cluster via spec.clusterRef)

To preserve instances during cluster deletion, remove the spec.clusterRef field from instances first:

# Remove clusterRef from an instance to preserve it
kubectl patch bind9instance my-instance --type=json -p='[{"op": "remove", "path": "/spec/clusterRef"}]'

# Now safe to delete the cluster without affecting this instance
kubectl delete bind9cluster production-dns

Troubleshooting Stuck Deletions

If a cluster is stuck in Terminating state:

# Check for finalizers
kubectl get bind9cluster production-dns -o jsonpath='{.metadata.finalizers}'

# Check operator logs
kubectl logs -n dns-system deployment/bindy -f

# If operator is not running, manually remove finalizer (last resort)
kubectl patch bind9cluster production-dns -p '{"metadata":{"finalizers":null}}' --type=merge

Use Cases

Multi-Region DNS Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: global-dns
spec:
  version: "9.18"
  global:
    recursion: false
    dnssec:
      enabled: true
      validation: true
  rndcSecretRefs:
    - name: region-sync-key
      algorithm: hmac-sha256
      secret: "..."
  acls:
    us-east:
      - "10.1.0.0/16"
    us-west:
      - "10.2.0.0/16"
    eu-west:
      - "10.3.0.0/16"

Development Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: true  # Allow recursion for dev
    allowQuery:
      - "0.0.0.0/0"
    forwarders:
      - "8.8.8.8"
      - "8.8.4.4"
  acls:
    dev-team:
      - "192.168.1.0/24"

Custom Image Cluster

Use a custom container image across all instances:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: custom-image-cluster
  namespace: dns-system
spec:
  version: "9.18"
  # Custom image with organization-specific patches
  image:
    image: "my-registry.example.com/bind9:9.18-custom"
    imagePullPolicy: "IfNotPresent"
    imagePullSecrets:
      - docker-registry-secret
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

All Bind9Instances referencing this cluster will inherit the custom image configuration unless they override it.

Custom ConfigMap Cluster

Share custom BIND9 configuration files across all instances:

apiVersion: v1
kind: ConfigMap
metadata:
  name: shared-bind9-options
  namespace: dns-system
data:
  named.conf.options: |
    options {
      directory "/var/cache/bind";
      recursion no;
      allow-query { any; };
      allow-transfer { 10.0.2.0/24; };
      dnssec-validation auto;

      # Custom logging
      querylog yes;

      # Rate limiting
      rate-limit {
        responses-per-second 10;
        window 5;
      };
    };
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: custom-config-cluster
  namespace: dns-system
spec:
  version: "9.18"
  configMapRefs:
    namedConfOptions: "shared-bind9-options"

All instances in this cluster will use the custom configuration, while named.conf is auto-generated.

Best Practices

One cluster per environment - Separate clusters for production, staging, development
Consistent TSIG keys - Use the same keys across all instances in a cluster
Version pinning - Specify exact BIND9 versions to avoid unexpected updates
ACL organization - Define ACLs at cluster level for consistency
DNSSEC - Enable DNSSEC at the cluster level for all zones
Image management - Define container images at cluster level for consistency; override at instance level only for canary testing
ConfigMap strategy - Use cluster-level ConfigMaps for shared configuration; use instance-level ConfigMaps for instance-specific customizations
Image pull secrets - Configure imagePullSecrets at cluster level to avoid duplicating secrets across instances

Next Steps

Bind9Instance - Learn about DNS instances
DNSZone - Learn about DNS zones
Multi-Region Setup - Deploy across multiple regions

Bind9GlobalCluster

The Bind9GlobalCluster CRD defines a cluster-scoped logical grouping of BIND9 DNS server instances for platform-managed infrastructure.

Overview

Bind9GlobalCluster is a cluster-scoped resource (no namespace) designed for platform teams to provide shared DNS infrastructure accessible from all namespaces in the cluster.

Key Characteristics

Cluster-Scoped: No namespace - visible cluster-wide
Platform-Managed: Typically managed by platform/infrastructure teams
Shared Infrastructure: DNSZones in any namespace can reference it
High Availability: Designed for production workloads
RBAC: Requires ClusterRole + ClusterRoleBinding

Relationship with Bind9Cluster

Bindy provides two cluster types:

Feature	Bind9Cluster	Bind9GlobalCluster
Scope	Namespace-scoped	Cluster-scoped
Managed By	Development teams	Platform teams
Visibility	Single namespace	All namespaces
RBAC	Role + RoleBinding	ClusterRole + ClusterRoleBinding
Zone Reference	`clusterRef`	`globalClusterRef`
Use Case	Dev/test, team isolation	Production, shared infrastructure

Shared Configuration: Both cluster types use the same Bind9ClusterCommonSpec for configuration, ensuring consistency.

Spec Structure

The Bind9GlobalClusterSpec uses the same configuration fields as Bind9Cluster through a shared spec:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
  # No namespace - cluster-scoped
spec:
  # BIND9 version
  version: "9.18"

  # Primary instance configuration
  primary:
    replicas: 3
    service:
      type: LoadBalancer
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

  # Secondary instance configuration
  secondary:
    replicas: 2

  # Global BIND9 configuration
  global:
    options:
      - "recursion no"
      - "allow-transfer { none; }"
      - "notify yes"

  # Access control lists
  acls:
    trusted:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
    secondaries:
      - "10.10.1.0/24"

  # Volumes for persistent storage
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: dns-zone-storage

  volumeMounts:
    - name: zone-data
      mountPath: /var/cache/bind

For detailed field descriptions, see the Bind9Cluster Spec Reference - all fields are identical.

Status

The status subresource tracks the overall health of the global cluster:

status:
  # Cluster-level conditions
  conditions:
    - type: Ready
      status: "True"
      reason: AllReady
      message: "All 5 instances are ready"
      lastTransitionTime: "2025-01-10T12:00:00Z"

  # Instance tracking (namespace/name format for global clusters)
  instances:
    - "production/primary-dns-0"
    - "production/primary-dns-1"
    - "production/primary-dns-2"
    - "staging/secondary-dns-0"
    - "staging/secondary-dns-1"

  # Generation tracking
  observedGeneration: 3

  # Instance counts
  instanceCount: 5
  readyInstances: 5

Key Difference from Bind9Cluster: Instance names include namespace prefix (namespace/name) since instances can be in any namespace.

Usage Patterns

Pattern 1: Platform-Managed Production DNS

Scenario: Platform team provides shared DNS for all production workloads.

# Platform team creates global cluster (ClusterRole required)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: shared-production-dns
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 3
  global:
    options:
      - "recursion no"
      - "allow-transfer { none; }"
---
# Application team references global cluster (Role in their namespace)
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone
  namespace: api-service  # Application namespace
spec:
  zoneName: api.example.com
  globalClusterRef: shared-production-dns  # References cluster-scoped cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: dns-admin.example.com.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
---
# Different application, same global cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: web-zone
  namespace: web-frontend  # Different namespace
spec:
  zoneName: www.example.com
  globalClusterRef: shared-production-dns  # Same global cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: dns-admin.example.com.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

Pattern 2: Multi-Region Global Clusters

Scenario: Geo-distributed DNS with regional global clusters.

# US East region
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-us-east
  labels:
    region: us-east-1
    tier: production
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
  secondary:
    replicas: 2
  acls:
    region-networks:
      - "10.0.0.0/8"
---
# EU West region
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-eu-west
  labels:
    region: eu-west-1
    tier: production
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 2
  acls:
    region-networks:
      - "10.128.0.0/9"
---
# Application chooses regional cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone-us
  namespace: api-service
spec:
  zoneName: api.us.example.com
  globalClusterRef: dns-us-east  # US region
  soaRecord: { /* ... */ }
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone-eu
  namespace: api-service
spec:
  zoneName: api.eu.example.com
  globalClusterRef: dns-eu-west  # EU region
  soaRecord: { /* ... */ }

Pattern 3: Tiered DNS Service

Scenario: Platform offers different DNS service tiers.

# Premium tier - high availability
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-premium
  labels:
    tier: premium
    sla: "99.99"
spec:
  version: "9.18"
  primary:
    replicas: 5
    service:
      type: LoadBalancer
  secondary:
    replicas: 5
  global:
    options:
      - "minimal-responses yes"
      - "recursive-clients 10000"
---
# Standard tier - balanced cost/availability
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-standard
  labels:
    tier: standard
    sla: "99.9"
spec:
  version: "9.18"
  primary:
    replicas: 3
  secondary:
    replicas: 2
---
# Economy tier - minimal resources
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-economy
  labels:
    tier: economy
    sla: "99.0"
spec:
  version: "9.18"
  primary:
    replicas: 2
  secondary:
    replicas: 1

RBAC Requirements

Platform Team (ClusterRole)

Platform teams need ClusterRole to manage global clusters:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: platform-dns-admin
rules:
# Manage global clusters
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9globalclusters"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View global cluster status
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9globalclusters/status"]
  verbs: ["get", "list", "watch"]

# Manage instances across namespaces (for global clusters)
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9instances"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: platform-team-dns
subjects:
- kind: Group
  name: platform-team
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: platform-dns-admin
  apiGroup: rbac.authorization.k8s.io

Application Teams (Role)

Application teams only need namespace-scoped permissions:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dns-zone-admin
  namespace: api-service
rules:
# Manage DNS zones and records in this namespace
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones"
    - "arecords"
    - "mxrecords"
    - "txtrecords"
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View resource status
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones/status"
    - "arecords/status"
  verbs: ["get", "list", "watch"]

# Note: No permissions for Bind9GlobalCluster needed
# Application teams only manage DNSZones, not the cluster itself
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: api-team-dns
  namespace: api-service
subjects:
- kind: Group
  name: api-team
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dns-zone-admin
  apiGroup: rbac.authorization.k8s.io

Instance Management

Creating Instances for Global Clusters

Instances can be created in any namespace and reference the global cluster:

# Instance in production namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns-0
  namespace: production
spec:
  cluster_ref: shared-production-dns  # References global cluster
  role: primary
  replicas: 1
---
# Instance in staging namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns-0
  namespace: staging
spec:
  cluster_ref: shared-production-dns  # Same global cluster
  role: secondary
  replicas: 1

Status Tracking: The global cluster status includes instances from all namespaces:

status:
  instances:
    - "production/primary-dns-0"  # namespace/name format
    - "staging/secondary-dns-0"
  instanceCount: 2
  readyInstances: 2

Configuration Inheritance

How Configuration Flows to Deployments

When you update a Bind9GlobalCluster, the configuration automatically propagates down to all managed Deployment resources. This ensures consistency across your entire DNS infrastructure.

Configuration Precedence

Configuration is resolved with the following precedence (highest to lowest):

Bind9Instance - Instance-specific overrides
Bind9Cluster - Namespace-scoped cluster defaults
Bind9GlobalCluster - Cluster-scoped global defaults
System defaults - Built-in fallback values

Example:

# Bind9GlobalCluster defines global defaults
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
spec:
  version: "9.18"  # Global default version
  image:
    image: "internetsystemsconsortium/bind9:9.18"  # Global default image
  global:
    bindcarConfig:
      image: "ghcr.io/company/bindcar:v1.2.0"  # Global bindcar image

---
# Bind9Instance can override specific fields
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-0
  namespace: production
spec:
  clusterRef: production-dns
  role: Primary
  # version: "9.20"  # Would override global version if specified
  # Uses global version "9.18" and global bindcar image

Propagation Flow

When you update Bind9GlobalCluster.spec.common.global.bindcarConfig.image, the change propagates automatically:

sequenceDiagram
    participant User
    participant GC as Bind9GlobalCluster<br/>Reconciler
    participant BC as Bind9Cluster<br/>Reconciler
    participant BI as Bind9Instance<br/>Reconciler
    participant Deploy as Deployment

    User->>GC: Update bindcarConfig.image
    Note over GC: metadata.generation increments
    GC->>GC: Detect spec change
    GC->>BC: PATCH Bind9Cluster with new spec
    Note over BC: metadata.generation increments
    BC->>BC: Detect spec change
    BC->>BI: PATCH Bind9Instance with new spec
    Note over BI: metadata.generation increments
    BI->>BI: Detect spec change
    BI->>BI: Fetch Bind9GlobalCluster config
    BI->>BI: resolve_deployment_config():<br/>instance > cluster > global_cluster
    BI->>Deploy: UPDATE Deployment with new image
    Deploy->>Deploy: Rolling update pods

Inherited Configuration Fields

The following fields are inherited from Bind9GlobalCluster to Deployment:

Field	Example	Description
image	`spec.common.image`	Container image configuration
version	`spec.common.version`	BIND9 version tag
volumes	`spec.common.volumes`	Pod volumes (PVCs, ConfigMaps, etc.)
volumeMounts	`spec.common.volumeMounts`	Container volume mounts
bindcarConfig	`spec.common.global.bindcarConfig`	API sidecar configuration
configMapRefs	`spec.common.configMapRefs`	Custom ConfigMap references

Complete Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
spec:
  version: "9.18"

  # Image configuration - inherited by all instances
  image:
    image: "ghcr.io/mycompany/bind9:9.18-custom"
    imagePullPolicy: Always
    imagePullSecrets:
      - name: ghcr-credentials

  # API sidecar configuration - inherited by all instances
  global:
    bindcarConfig:
      image: "ghcr.io/mycompany/bindcar:v1.2.0"
      port: 8080

  # Volumes - inherited by all instances
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: dns-zones-pvc
    - name: custom-config
      configMap:
        name: bind9-custom-config

  volumeMounts:
    - name: zone-data
      mountPath: /var/cache/bind
    - name: custom-config
      mountPath: /etc/bind/custom

All instances referencing this global cluster will inherit these configurations in their Deployment resources.

Verifying Configuration Propagation

To verify configuration is inherited correctly:

# 1. Check Bind9GlobalCluster spec
kubectl get bind9globalcluster production-dns -o yaml | grep -A 5 bindcarConfig

# 2. Check Bind9Instance spec (should be empty if using global config)
kubectl get bind9instance primary-0 -n production -o yaml | grep -A 5 bindcarConfig

# 3. Check Deployment - should show global cluster's bindcar image
kubectl get deployment primary-0 -n production -o yaml | grep "image:" | grep bindcar

Expected Output:

# Deployment should use global cluster's bindcar image
containers:
  - name: bindcar
    image: ghcr.io/mycompany/bindcar:v1.2.0  # From Bind9GlobalCluster

Reconciliation

Controller Behavior

The Bind9GlobalCluster reconciler:

Lists instances across ALL namespaces

#![allow(unused)]
fn main() {
let instances_api: Api<Bind9Instance> = Api::all(client.clone());
let all_instances = instances_api.list(&lp).await?;
}

Filters instances by cluster_ref matching the global cluster name

#![allow(unused)]
fn main() {
let instances: Vec<_> = all_instances
    .items
    .into_iter()
    .filter(|inst| inst.spec.cluster_ref == global_cluster_name)
    .collect();
}

Calculates cluster status
- Counts total and ready instances
- Aggregates health conditions
- Formats instance names as namespace/name
Updates status
- Sets observedGeneration
- Updates Ready condition
- Lists all instances with namespace prefix

Generation Tracking

The reconciler uses standard Kubernetes generation tracking:

metadata:
  generation: 5  # Incremented on spec changes

status:
  observedGeneration: 5  # Updated after reconciliation

Reconciliation occurs only when metadata.generation != status.observedGeneration (spec changed).

Comparison with Bind9Cluster

Similarities

✓ Identical configuration fields (Bind9ClusterCommonSpec)
✓ Same reconciliation logic for health tracking
✓ Status subresource with conditions
✓ Generation-based reconciliation
✓ Finalizer-based cleanup

Differences

Aspect	Bind9Cluster	Bind9GlobalCluster
Scope	Namespace-scoped	Cluster-scoped (no namespace)
API Used	`Api::namespaced()`	`Api::all()`
Instance Listing	Same namespace only	All namespaces
Instance Names	`name`	`namespace/name`
RBAC	Role + RoleBinding	ClusterRole + ClusterRoleBinding
Zone Reference Field	`spec.clusterRef`	`spec.globalClusterRef`
Kubectl Get	`kubectl get bind9cluster -n <namespace>`	`kubectl get bind9globalcluster`

Best Practices

1. Use for Production Workloads

Global clusters are ideal for production:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
  labels:
    environment: production
    managed-by: platform-team
spec:
  version: "9.18"
  primary:
    replicas: 3  # High availability
    service:
      type: LoadBalancer
  secondary:
    replicas: 3

2. Separate Global Clusters by Environment

# Production cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-production
  labels:
    environment: production
spec: { /* production config */ }
---
# Staging cluster (also global, but separate)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-staging
  labels:
    environment: staging
spec: { /* staging config */ }

3. Label for Organization

Use labels to categorize global clusters:

metadata:
  name: dns-us-east-prod
  labels:
    region: us-east-1
    environment: production
    tier: premium
    team: platform
    cost-center: infrastructure

4. Monitor Status Across Namespaces

# View global cluster status
kubectl get bind9globalcluster dns-production

# See instances across all namespaces
kubectl get bind9globalcluster dns-production -o jsonpath='{.status.instances}'

# Check instance distribution
kubectl get bind9instance -A -l cluster=dns-production

5. Use with DNSZone Namespace Isolation

Remember: DNSZones are always namespace-scoped, even when referencing global clusters:

# DNSZone in namespace-a
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: zone-a
  namespace: namespace-a
spec:
  zoneName: app-a.example.com
  globalClusterRef: shared-dns
  # Records in namespace-a can ONLY reference this zone
---
# DNSZone in namespace-b
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: zone-b
  namespace: namespace-b
spec:
  zoneName: app-b.example.com
  globalClusterRef: shared-dns
  # Records in namespace-b can ONLY reference this zone

Troubleshooting

Viewing Global Clusters

# List all global clusters
kubectl get bind9globalclusters

# Describe a specific global cluster
kubectl describe bind9globalcluster production-dns

# View status
kubectl get bind9globalcluster production-dns -o yaml

Common Issues

Issue: Application team cannot create global cluster

Solution: Check RBAC - requires ClusterRole, not Role

kubectl auth can-i create bind9globalclusters --as=user@example.com

Issue: Instances not showing in status

Solution: Verify instance cluster_ref matches global cluster name

kubectl get bind9instance -A -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}: {.spec.cluster_ref}{"\n"}{end}'

Issue: DNSZone cannot find global cluster

Solution: Check globalClusterRef field (not clusterRef)

spec:
  globalClusterRef: production-dns  # ✓ Correct
  # clusterRef: production-dns      # ✗ Wrong - for namespace-scoped

Next Steps

Multi-Tenancy Guide - RBAC setup and examples
Choosing a Cluster Type - Decision guide
Bind9Cluster Reference - Namespace-scoped alternative
Architecture Overview - Dual-cluster model

Bind9Instance

The Bind9Instance resource represents a BIND9 DNS server deployment in Kubernetes.

Overview

A Bind9Instance defines:

Number of replicas
BIND9 version and container image
Configuration options (or custom ConfigMap references)
Network settings
Labels for targeting
Optional cluster reference for inheriting shared configuration

Example

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
  labels:
    dns-role: primary
    environment: production
    datacenter: us-east
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
status:
  conditions:
    - type: Ready
      status: "True"
      reason: Running
      message: "2 replicas running"
  readyReplicas: 2
  currentVersion: "9.18"

Specification

Optional Fields

All fields are optional. If no clusterRef is specified, default values are used.

spec.clusterRef - Reference to a Bind9Cluster for inheriting shared configuration
spec.replicas - Number of BIND9 pods (default: 1)
spec.version - BIND9 version to deploy (default: “9.18”, or inherit from cluster)
spec.image - Container image configuration (inherits from cluster if not specified)
- image - Full container image reference
- imagePullPolicy - Image pull policy (Always, IfNotPresent, Never)
- imagePullSecrets - List of secret names for private registries
spec.configMapRefs - Custom ConfigMap references (inherits from cluster if not specified)
- namedConf - ConfigMap name containing named.conf
- namedConfOptions - ConfigMap name containing named.conf.options
spec.config - BIND9 configuration options (inherits from cluster if not specified)
- recursion - Enable/disable recursion (default: false)
- allowQuery - List of CIDR ranges allowed to query
- allowTransfer - List of CIDR ranges allowed to transfer zones
- dnssec - DNSSEC configuration
- forwarders - DNS forwarders
- listenOn - IPv4 addresses to listen on
- listenOnV6 - IPv6 addresses to listen on

Configuration Inheritance

When a Bind9Instance references a Bind9Cluster via clusterRef:

Instance-level settings take precedence
If not specified at instance level, cluster settings are used
If not specified at cluster level, defaults are used

Labels and Selectors

Labels on Bind9Instance resources are used by DNSZone resources to target specific instances:

# Instance with labels
metadata:
  labels:
    dns-role: primary
    region: us-east
    environment: production

# Zone selecting this instance
spec:
  instanceSelector:
    matchLabels:
      dns-role: primary
      region: us-east

Status

The controller updates status to reflect the instance state:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Running
  readyReplicas: 2
  currentVersion: "9.18"

Use Cases

Primary DNS Instance

metadata:
  labels:
    dns-role: primary
spec:
  replicas: 2
  config:
    allowTransfer:
      - "10.0.0.0/8"  # Allow secondaries to transfer

Secondary DNS Instance

metadata:
  labels:
    dns-role: secondary
spec:
  replicas: 2
  config:
    allowTransfer: []  # No transfers from secondary

Instance with Custom Image

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: custom-image-dns
  namespace: dns-system
spec:
  replicas: 2
  image:
    image: "my-registry.example.com/bind9:9.18-patched"
    imagePullPolicy: "Always"
    imagePullSecrets:
      - my-registry-secret
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

Instance with Custom ConfigMaps

apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-dns-config
  namespace: dns-system
data:
  named.conf.options: |
    options {
      directory "/var/cache/bind";
      recursion no;
      allow-query { any; };

      # Custom rate limiting
      rate-limit {
        responses-per-second 10;
      };
    };
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: custom-config-dns
  namespace: dns-system
spec:
  replicas: 2
  configMapRefs:
    namedConfOptions: "custom-dns-config"

Instance Inheriting from Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: prod-cluster
  namespace: dns-system
spec:
  version: "9.18"
  image:
    image: "internetsystemsconsortium/bind9:9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: prod-instance-1
  namespace: dns-system
spec:
  clusterRef: prod-cluster
  replicas: 2
  # Inherits version, image, and config from cluster

Canary Instance with Override

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: canary-instance
  namespace: dns-system
spec:
  clusterRef: prod-cluster  # Inherits most settings from cluster
  replicas: 1
  # Override image for canary testing
  image:
    image: "internetsystemsconsortium/bind9:9.19-beta"
    imagePullPolicy: "Always"

Next Steps

DNSZone - Learn about DNS zones
Primary Instances - Deploy primary DNS
Secondary Instances - Deploy secondary DNS

DNSZone

The DNSZone resource defines a DNS zone with its SOA record and references a specific BIND9 cluster.

Overview

A DNSZone represents:

Zone name (e.g., example.com)
SOA (Start of Authority) record
Cluster reference to a Bind9Instance
Default TTL for records

The zone is created on the referenced BIND9 cluster using the RNDC protocol.

Example

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: my-dns-cluster  # References Bind9Instance name
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.  # Note: @ replaced with .
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600
status:
  conditions:
    - type: Ready
      status: "True"
      reason: Synchronized
      message: "Zone created for cluster: my-dns-cluster"
  observedGeneration: 1

Specification

Required Fields

spec.zoneName - The DNS zone name (e.g., example.com)
spec.clusterRef - Name of the Bind9Instance to host this zone
spec.soaRecord - Start of Authority record configuration

SOA Record Fields

primaryNs - Primary nameserver (must end with .)
adminEmail - Zone administrator email (@ replaced with ., must end with .)
serial - Zone serial number (typically YYYYMMDDNN format)
refresh - Refresh interval in seconds (how often secondaries check for updates)
retry - Retry interval in seconds (retry delay after failed refresh)
expire - Expiry time in seconds (when to stop serving if primary unreachable)
negativeTtl - Negative caching TTL (cache duration for NXDOMAIN responses)

Optional Fields

spec.ttl - Default TTL for records in seconds (default: 3600)

How Zones Are Created

When you create a DNSZone resource:

Controller discovers pods - Finds BIND9 pods with label instance={clusterRef}
Loads RNDC key - Retrieves Secret named {clusterRef}-rndc-key
Connects via RNDC - Establishes connection to {clusterRef}.{namespace}.svc.cluster.local:953
Executes addzone - Runs rndc addzone command with zone configuration
BIND9 creates zone - BIND9 creates the zone file and starts serving the zone
Updates status - Controller updates DNSZone status to Ready

Cluster References

Zones reference a specific BIND9 cluster by name:

spec:
  clusterRef: my-dns-cluster

This references a Bind9Instance resource:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: my-dns-cluster  # Referenced by DNSZone
  namespace: dns-system
spec:
  role: primary
  replicas: 2

RNDC Key Discovery

The controller automatically finds the RNDC key using the cluster reference:

DNSZone.spec.clusterRef = "my-dns-cluster"
    ↓
Secret name = "my-dns-cluster-rndc-key"
    ↓
RNDC authentication to: my-dns-cluster.dns-system.svc.cluster.local:953

Status

The controller reports zone status with granular condition types that provide real-time visibility into the reconciliation process.

Status During Reconciliation

# Phase 1: Configuring primary instances
status:
  conditions:
    - type: Progressing
      status: "True"
      reason: PrimaryReconciling
      message: "Configuring zone on primary instances"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1

# Phase 2: Primary success, configuring secondaries
status:
  conditions:
    - type: Progressing
      status: "True"
      reason: SecondaryReconciling
      message: "Configured on 2 primary server(s), now configuring secondaries"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"

Status After Successful Reconciliation

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Configured on 2 primary server(s) and 3 secondary server(s)"
      lastTransitionTime: "2024-11-26T10:00:02Z"
  observedGeneration: 1
  recordCount: 5
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"
    - "10.42.0.7"

Status After Partial Failure (Degraded)

status:
  conditions:
    - type: Degraded
      status: "True"
      reason: SecondaryFailed
      message: "Configured on 2 primary server(s), but secondary configuration failed: connection timeout"
      lastTransitionTime: "2024-11-26T10:00:02Z"
  observedGeneration: 1
  recordCount: 5
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"

Condition Types

DNSZone uses the following condition types:

Progressing - Zone is being configured
- PrimaryReconciling: Configuring on primary instances
- PrimaryReconciled: Primary configuration successful
- SecondaryReconciling: Configuring on secondary instances
- SecondaryReconciled: Secondary configuration successful
Ready - Zone fully configured and operational
- ReconcileSucceeded: All primaries and secondaries configured successfully
Degraded - Partial or complete failure
- PrimaryFailed: Primary configuration failed (zone not functional)
- SecondaryFailed: Secondary configuration failed (primaries work, but secondaries unavailable)

Benefits of Granular Status

Real-time visibility - See which reconciliation phase is running
Better debugging - Know exactly which phase failed (primary vs secondary)
Graceful degradation - Secondary failures don’t break the zone (primaries still work)
Accurate counts - Status shows exact number of configured servers

Use Cases

Simple Zone

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: simple-com
spec:
  zoneName: simple.com
  clusterRef: primary-dns
  soaRecord:
    primaryNs: ns1.simple.com.
    adminEmail: admin.simple.com.
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

Production Zone with Custom TTL

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-example-com
spec:
  zoneName: api.example.com
  clusterRef: production-dns
  ttl: 300  # 5 minute default TTL for faster updates
  soaRecord:
    primaryNs: ns1.api.example.com.
    adminEmail: ops.example.com.
    serial: 2024010101
    refresh: 1800   # Check every 30 minutes
    retry: 300      # Retry after 5 minutes
    expire: 604800
    negativeTtl: 300  # Short negative cache

Next Steps

DNS Records - Add records to zones
RNDC-Based Architecture - Learn how RNDC protocol works
Bind9Instance - Learn about BIND9 instance resources
Creating Zones - Zone management guide

DNS Records

Bindy supports all common DNS record types as Custom Resources.

Supported Record Types

ARecord - IPv4 address mapping
AAAARecord - IPv6 address mapping
CNAMERecord - Canonical name (alias)
MXRecord - Mail exchange
TXTRecord - Text data
NSRecord - Nameserver delegation
SRVRecord - Service location
CAARecord - Certificate authority authorization

Common Fields

All DNS record types share these fields:

metadata:
  name: record-name
  namespace: dns-system
spec:
  # Zone reference (use ONE of these):
  zone: example.com          # Match against DNSZone spec.zoneName
  # OR
  zoneRef: example-com       # Direct reference to DNSZone metadata.name

  name: record-name          # DNS name (@ for zone apex)
  ttl: 300                   # Time to live (optional)

Zone Referencing

DNS records can reference their parent zone using two different methods:

zone field - Searches for a DNSZone by matching spec.zoneName
- Value: The actual DNS zone name (e.g., example.com)
- The controller searches all DNSZones in the namespace for matching spec.zoneName
- More intuitive but requires a list operation
zoneRef field - Direct reference to a DNSZone resource
- Value: The Kubernetes resource name (e.g., example-com)
- The controller directly retrieves the DNSZone by metadata.name
- More efficient (no search required)
- Recommended for production use

Important: You must specify exactly one of zone or zoneRef (not both).

Example: Zone vs ZoneRef

Given this DNSZone:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com        # Kubernetes resource name
  namespace: dns-system
spec:
  zoneName: example.com    # Actual DNS zone name
  clusterRef: primary-dns
  # ... soa_record, etc.

You can reference it using either method:

Method 1: Using zone (matches spec.zoneName)

spec:
  zone: example.com  # Matches DNSZone spec.zoneName
  name: www

Method 2: Using zoneRef (matches metadata.name)

spec:
  zoneRef: example-com  # Matches DNSZone metadata.name
  name: www

ARecord (IPv4)

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
spec:
  zone: example-com
  name: www
  ipv4Address: "192.0.2.1"
  ttl: 300

Learn more about A Records

AAAARecord (IPv6)

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-example-ipv6
spec:
  zone: example-com
  name: www
  ipv6Address: "2001:db8::1"
  ttl: 300

Learn more about AAAA Records

CNAMERecord (Alias)

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: blog-example
spec:
  zone: example-com
  name: blog
  target: www.example.com.
  ttl: 300

Learn more about CNAME Records

MXRecord (Mail Exchange)

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-example
spec:
  zone: example-com
  name: "@"
  priority: 10
  mailServer: mail.example.com.
  ttl: 3600

Learn more about MX Records

TXTRecord (Text)

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf-example
spec:
  zone: example-com
  name: "@"
  text:
    - "v=spf1 include:_spf.example.com ~all"
  ttl: 3600

Learn more about TXT Records

NSRecord (Nameserver)

apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: delegate-subdomain
spec:
  zone: example-com
  name: subdomain
  nameserver: ns1.subdomain.example.com.
  ttl: 3600

Learn more about NS Records

SRVRecord (Service)

apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-service
spec:
  zone: example-com
  name: _sip._tcp
  priority: 10
  weight: 60
  port: 5060
  target: sipserver.example.com.
  ttl: 3600

Learn more about SRV Records

CAARecord (Certificate Authority)

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: letsencrypt-caa
spec:
  zone: example-com
  name: "@"
  flags: 0
  tag: issue
  value: letsencrypt.org
  ttl: 3600

Learn more about CAA Records

Record Status

All DNS record types use granular status conditions to provide real-time visibility into the record configuration process.

Status During Configuration

status:
  conditions:
    - type: Progressing
      status: "True"
      reason: RecordReconciling
      message: "Configuring A record on zone endpoints"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1

Status After Successful Configuration

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

Status After Failure

status:
  conditions:
    - type: Degraded
      status: "True"
      reason: RecordFailed
      message: "Failed to configure record: Zone not found on primary servers"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

Condition Types

All DNS record types use the following condition types:

Progressing - Record is being configured
- RecordReconciling: Before adding record to zone endpoints
Ready - Record successfully configured
- ReconcileSucceeded: Record configured on all endpoints (message includes endpoint count)
Degraded - Configuration failure
- RecordFailed: Failed to configure record (includes error details)

Benefits

Real-time progress - See when records are being configured
Better debugging - Know immediately if/why a record failed
Accurate reporting - Status shows exact number of endpoints configured
Consistent across types - All 8 record types use the same status pattern

Record Management

Referencing Zones

All records reference a DNSZone via the zone field:

spec:
  zone: example-com  # Must match DNSZone metadata.name

Zone Apex Records

Use @ for zone apex records:

spec:
  name: "@"  # Represents the zone itself

Subdomain Records

Use the subdomain name:

spec:
  name: www        # www.example.com
  name: api.v2     # api.v2.example.com

Next Steps

Managing DNS Records - Complete record management guide
A Records - IPv4 address records
CNAME Records - Alias records
MX Records - Mail server records

Architecture Overview

This guide explains the Bindy architecture, focusing on the dual-cluster model that enables multi-tenancy and flexible deployment patterns.

Architecture Principles
Cluster Models
Resource Hierarchy
Reconciliation Flow
Multi-Tenancy Model
Namespace Isolation

Architecture Principles

Bindy follows Kubernetes controller pattern best practices:

Declarative Configuration: Users declare desired state via CRDs, controllers reconcile to match
Level-Based Reconciliation: Controllers continuously ensure actual state matches desired state
Status Subresources: All CRDs expose status for observability
Finalizers: Proper cleanup of dependent resources before deletion
Generation Tracking: Reconcile only when spec changes (using metadata.generation)

Cluster Models

Bindy provides two cluster models to support different organizational patterns:

Namespace-Scoped Clusters (`Bind9Cluster`)

Use Case: Development teams manage their own DNS infrastructure within their namespace.

graph TB
    subgraph "Namespace: dev-team-alpha"
        Cluster[Bind9Cluster<br/>dev-team-dns]
        Zone1[DNSZone<br/>app.example.com]
        Zone2[DNSZone<br/>test.local]
        Record1[ARecord<br/>www]
        Record2[MXRecord<br/>mail]

        Cluster --> Zone1
        Cluster --> Zone2
        Zone1 --> Record1
        Zone2 --> Record2
    end

    style Cluster fill:#e1f5ff
    style Zone1 fill:#fff4e1
    style Zone2 fill:#fff4e1
    style Record1 fill:#f0f0f0
    style Record2 fill:#f0f0f0

Characteristics:

Isolated to a single namespace
Teams manage their own DNS independently
RBAC scoped to namespace (Role/RoleBinding)
Cannot be referenced from other namespaces

YAML Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-team-dns
  namespace: dev-team-alpha
spec:
  version: "9.18"
  primary:
    replicas: 1
  secondary:
    replicas: 1

Cluster-Scoped Clusters (`Bind9GlobalCluster`)

Use Case: Platform teams provide shared DNS infrastructure accessible from all namespaces.

graph TB
    subgraph "Cluster-Scoped (no namespace)"
        GlobalCluster[Bind9GlobalCluster<br/>shared-production-dns]
    end

    subgraph "Namespace: production"
        Zone1[DNSZone<br/>api.example.com]
        Record1[ARecord<br/>api]
    end

    subgraph "Namespace: staging"
        Zone2[DNSZone<br/>staging.example.com]
        Record2[ARecord<br/>app]
    end

    GlobalCluster -.-> Zone1
    GlobalCluster -.-> Zone2
    Zone1 --> Record1
    Zone2 --> Record2

    style GlobalCluster fill:#c8e6c9
    style Zone1 fill:#fff4e1
    style Zone2 fill:#fff4e1
    style Record1 fill:#f0f0f0
    style Record2 fill:#f0f0f0

Characteristics:

Cluster-wide visibility (no namespace)
Platform team manages centralized DNS
RBAC requires ClusterRole/ClusterRoleBinding
DNSZones in any namespace can reference it

YAML Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: shared-production-dns
  # No namespace - cluster-scoped resource
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 2

Resource Hierarchy

The complete resource hierarchy shows how components relate:

graph TD
    subgraph "Cluster-Scoped Resources"
        GlobalCluster[Bind9GlobalCluster]
    end

    subgraph "Namespace-Scoped Resources"
        Cluster[Bind9Cluster]
        Zone[DNSZone]
        Instance[Bind9Instance]
        Records[DNS Records<br/>A, AAAA, CNAME, MX, etc.]
    end

    GlobalCluster -.globalClusterRef.-> Zone
    Cluster --clusterRef--> Zone

    Cluster --cluster_ref--> Instance
    GlobalCluster -.cluster_ref.-> Instance

    Zone --> Records

    style GlobalCluster fill:#c8e6c9
    style Cluster fill:#e1f5ff
    style Zone fill:#fff4e1
    style Instance fill:#ffe1e1
    style Records fill:#f0f0f0

Key Relationships

DNSZone → Cluster References:
- spec.clusterRef: References namespace-scoped Bind9Cluster (same namespace)
- spec.globalClusterRef: References cluster-scoped Bind9GlobalCluster
- Mutual Exclusivity: Exactly one must be specified
Bind9Instance → Cluster Reference:
- spec.cluster_ref: Can reference either Bind9Cluster or Bind9GlobalCluster
- Controller auto-detects cluster type
DNS Records → Zone Reference:
- spec.zone: Zone name lookup (searches in same namespace)
- spec.zoneRef: Direct DNSZone resource name (same namespace)
- Namespace Isolation: Records can ONLY reference zones in their own namespace

Reconciliation Flow

DNSZone Reconciliation

sequenceDiagram
    participant K8s as Kubernetes API
    participant Controller as DNSZone Controller
    participant Cluster as Bind9Cluster/GlobalCluster
    participant Instances as Bind9Instances
    participant BIND9 as BIND9 Pods

    K8s->>Controller: DNSZone created/updated
    Controller->>Controller: Check metadata.generation vs status.observedGeneration
    alt Spec unchanged
        Controller->>K8s: Skip reconciliation (status-only update)
    else Spec changed
        Controller->>Controller: Validate clusterRef XOR globalClusterRef
        Controller->>Cluster: Get cluster by clusterRef or globalClusterRef
        Controller->>Instances: List instances by cluster reference
        Controller->>BIND9: Update zone files via Bindcar API
        Controller->>K8s: Update status (observedGeneration, conditions)
    end

Bind9GlobalCluster Reconciliation

sequenceDiagram
    participant K8s as Kubernetes API
    participant Controller as GlobalCluster Controller
    participant Instances as Bind9Instances (all namespaces)

    K8s->>Controller: Bind9GlobalCluster created/updated
    Controller->>Controller: Check generation changed
    Controller->>Instances: List all instances across all namespaces
    Controller->>Controller: Filter instances by cluster_ref
    Controller->>Controller: Calculate cluster status
    Note over Controller: - Count ready instances<br/>- Aggregate conditions<br/>- Format instance names as namespace/name
    Controller->>K8s: Update status with aggregated health

Multi-Tenancy Model

Bindy supports multi-tenancy through two organizational patterns:

Platform Team Pattern

Platform teams manage cluster-wide DNS infrastructure:

graph TB
    subgraph "Platform Team (ClusterRole)"
        PlatformAdmin[Platform Admin]
    end

    subgraph "Cluster-Scoped"
        GlobalCluster[Bind9GlobalCluster<br/>production-dns]
    end

    subgraph "Namespace: app-a"
        Zone1[DNSZone<br/>app-a.example.com]
        Instance1[Bind9Instance<br/>primary-app-a]
    end

    subgraph "Namespace: app-b"
        Zone2[DNSZone<br/>app-b.example.com]
        Instance2[Bind9Instance<br/>primary-app-b]
    end

    PlatformAdmin -->|manages| GlobalCluster
    GlobalCluster -.->|referenced by| Zone1
    GlobalCluster -.->|referenced by| Zone2
    GlobalCluster -->|references| Instance1
    GlobalCluster -->|references| Instance2

    style PlatformAdmin fill:#ff9800
    style GlobalCluster fill:#c8e6c9
    style Zone1 fill:#fff4e1
    style Zone2 fill:#fff4e1

RBAC Setup:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: platform-dns-admin
rules:
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9globalclusters"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: platform-team-dns
subjects:
- kind: Group
  name: platform-team
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: platform-dns-admin
  apiGroup: rbac.authorization.k8s.io

Development Team Pattern

Development teams manage namespace-scoped DNS:

graph TB
    subgraph "Namespace: dev-team-alpha (Role)"
        DevAdmin[Dev Team Admin]
        Cluster[Bind9Cluster<br/>dev-dns]
        Zone[DNSZone<br/>dev.example.com]
        Records[DNS Records]
        Instance[Bind9Instance]
    end

    DevAdmin -->|manages| Cluster
    DevAdmin -->|manages| Zone
    DevAdmin -->|manages| Records
    Cluster --> Instance
    Cluster --> Zone
    Zone --> Records

    style DevAdmin fill:#2196f3
    style Cluster fill:#e1f5ff
    style Zone fill:#fff4e1

RBAC Setup:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dns-admin
  namespace: dev-team-alpha
rules:
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9clusters", "dnszones", "arecords", "cnamerecords", "mxrecords", "txtrecords"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-dns
  namespace: dev-team-alpha
subjects:
- kind: Group
  name: dev-team-alpha
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dns-admin
  apiGroup: rbac.authorization.k8s.io

Namespace Isolation

Security Principle: DNSZones and records are always namespace-scoped, even when referencing cluster-scoped resources.

graph TB
    subgraph "Cluster-Scoped"
        GlobalCluster[Bind9GlobalCluster<br/>shared-dns]
    end

    subgraph "Namespace: team-a"
        ZoneA[DNSZone<br/>team-a.example.com]
        RecordA[ARecord<br/>www]
    end

    subgraph "Namespace: team-b"
        ZoneB[DNSZone<br/>team-b.example.com]
        RecordB[ARecord<br/>api]
    end

    GlobalCluster -.-> ZoneA
    GlobalCluster -.-> ZoneB
    ZoneA --> RecordA
    ZoneB --> RecordB

    RecordA -.X|blocked|ZoneB
    RecordB -.X|blocked|ZoneA

    style GlobalCluster fill:#c8e6c9
    style ZoneA fill:#fff4e1
    style ZoneB fill:#fff4e1

Isolation Rules:

Records can ONLY reference zones in their own namespace
- Controller uses Api::namespaced() to enforce this
- Cross-namespace references are impossible
DNSZones are namespace-scoped
- Even when referencing Bind9GlobalCluster
- Each team manages their own zones
RBAC controls zone management
- Platform team: ClusterRole for Bind9GlobalCluster
- Dev teams: Role for DNSZone and records in their namespace

Example - Record Isolation:

# team-a namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www
  namespace: team-a
spec:
  zoneRef: team-a-zone  # ✅ References zone in same namespace
  name: www
  ipv4Address: "192.0.2.1"
---
# This would FAIL - cannot reference zone in another namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www
  namespace: team-a
spec:
  zoneRef: team-b-zone  # ❌ References zone in team-b namespace - BLOCKED
  name: www
  ipv4Address: "192.0.2.1"

Decision Tree: Choosing a Cluster Model

Use this decision tree to determine which cluster model fits your use case:

graph TD
    Start{Who manages<br/>DNS infrastructure?}
    Start -->|Platform Team| PlatformCheck{Shared across<br/>namespaces?}
    Start -->|Development Team| DevCheck{Isolated to<br/>namespace?}

    PlatformCheck -->|Yes| Global[Use Bind9GlobalCluster<br/>cluster-scoped]
    PlatformCheck -->|No| Cluster[Use Bind9Cluster<br/>namespace-scoped]

    DevCheck -->|Yes| Cluster
    DevCheck -->|No| Global

    Global --> GlobalDetails[✓ ClusterRole required<br/>✓ Accessible from all namespaces<br/>✓ Centralized management<br/>✓ Production workloads]

    Cluster --> ClusterDetails[✓ Role required namespace<br/>✓ Isolated to namespace<br/>✓ Team autonomy<br/>✓ Dev/test workloads]

    style Global fill:#c8e6c9
    style Cluster fill:#e1f5ff
    style GlobalDetails fill:#e8f5e9
    style ClusterDetails fill:#e3f2fd

Next Steps

Multi-Tenancy Guide - Detailed RBAC setup and examples
Choosing a Cluster Type - Decision guide for cluster selection
Quickstart Guide - Getting started with both cluster types

Multi-Tenancy Guide

This guide explains how to set up multi-tenancy with Bindy using the dual-cluster model, RBAC configuration, and namespace isolation.

Overview
Tenancy Models
Platform Team Setup
Development Team Setup
RBAC Configuration
Security Best Practices
Example Scenarios

Overview

Bindy supports multi-tenancy through two complementary approaches:

Platform-Managed DNS: Centralized DNS infrastructure managed by platform teams
Tenant-Managed DNS: Isolated DNS infrastructure managed by development teams

Both can coexist in the same cluster, providing flexibility for different organizational needs.

Key Principles

Namespace Isolation: DNSZones and records are always namespace-scoped
RBAC-Based Access: Kubernetes RBAC controls who can manage DNS resources
Cluster Model Flexibility: Choose namespace-scoped or cluster-scoped clusters based on needs
No Cross-Namespace Access: Records cannot reference zones in other namespaces

Tenancy Models

Model 1: Platform-Managed DNS

Use Case: Platform team provides shared DNS infrastructure for all applications.

graph TB
    subgraph "Platform Team ClusterRole"
        PlatformAdmin[Platform Admin]
    end

    subgraph "Cluster-Scoped Resources"
        GlobalCluster[Bind9GlobalCluster<br/>production-dns]
    end

    subgraph "Application Team A Namespace"
        ZoneA[DNSZone<br/>app-a.example.com]
        RecordsA[DNS Records]
    end

    subgraph "Application Team B Namespace"
        ZoneB[DNSZone<br/>app-b.example.com]
        RecordsB[DNS Records]
    end

    PlatformAdmin -->|manages| GlobalCluster
    GlobalCluster -.globalClusterRef.-> ZoneA
    GlobalCluster -.globalClusterRef.-> ZoneB
    ZoneA --> RecordsA
    ZoneB --> RecordsB

    style PlatformAdmin fill:#ff9800
    style GlobalCluster fill:#c8e6c9
    style ZoneA fill:#fff4e1
    style ZoneB fill:#fff4e1

Characteristics:

Platform team manages Bind9GlobalCluster (requires ClusterRole)
Application teams manage DNSZone and records in their namespace (requires Role)
Shared DNS infrastructure, distributed zone management
Suitable for production workloads

Model 2: Tenant-Managed DNS

Use Case: Development teams run isolated DNS infrastructure for testing/dev.

graph TB
    subgraph "Team A Namespace + Role"
        AdminA[Team A Admin]
        ClusterA[Bind9Cluster<br/>team-a-dns]
        ZoneA[DNSZone<br/>dev-a.local]
        RecordsA[DNS Records]
    end

    subgraph "Team B Namespace + Role"
        AdminB[Team B Admin]
        ClusterB[Bind9Cluster<br/>team-b-dns]
        ZoneB[DNSZone<br/>dev-b.local]
        RecordsB[DNS Records]
    end

    AdminA -->|manages| ClusterA
    AdminA -->|manages| ZoneA
    AdminA -->|manages| RecordsA
    ClusterA --> ZoneA
    ZoneA --> RecordsA

    AdminB -->|manages| ClusterB
    AdminB -->|manages| ZoneB
    AdminB -->|manages| RecordsB
    ClusterB --> ZoneB
    ZoneB --> RecordsB

    style AdminA fill:#2196f3
    style AdminB fill:#2196f3
    style ClusterA fill:#e1f5ff
    style ClusterB fill:#e1f5ff

Characteristics:

Each team manages their own Bind9Cluster (namespace-scoped Role)
Complete isolation between teams
Teams have full autonomy over DNS configuration
Suitable for development/testing environments

Platform Team Setup

Step 1: Create ClusterRole for Platform DNS Management

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: platform-dns-admin
rules:
# Manage cluster-scoped global clusters
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9globalclusters"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View global cluster status
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9globalclusters/status"]
  verbs: ["get", "list", "watch"]

# Manage bind9 instances across all namespaces (for global clusters)
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9instances"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View instance status
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9instances/status"]
  verbs: ["get", "list", "watch"]

Step 2: Bind ClusterRole to Platform Team

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: platform-team-dns-admin
subjects:
- kind: Group
  name: platform-team  # Your IdP/OIDC group name
  apiGroup: rbac.authorization.k8s.io
# Alternative: Bind to specific users
# - kind: User
#   name: alice@example.com
#   apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: platform-dns-admin
  apiGroup: rbac.authorization.k8s.io

Step 3: Create Bind9GlobalCluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: shared-production-dns
  # No namespace - cluster-scoped
spec:
  version: "9.18"

  # Primary instances configuration
  primary:
    replicas: 3
    service:
      type: LoadBalancer
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"

  # Secondary instances configuration
  secondary:
    replicas: 2

  # Global BIND9 configuration
  global:
    options:
      - "recursion no"
      - "allow-transfer { none; }"
      - "notify yes"

  # Access control lists
  acls:
    trusted:
      - "10.0.0.0/8"
      - "172.16.0.0/12"

Step 4: Grant Application Teams DNS Zone Management

Create a Role in each application namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dns-zone-admin
  namespace: app-team-a
rules:
# Manage DNS zones and records
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones"
    - "arecords"
    - "aaaarecords"
    - "cnamerecords"
    - "mxrecords"
    - "txtrecords"
    - "nsrecords"
    - "srvrecords"
    - "caarecords"
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View resource status
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones/status"
    - "arecords/status"
    - "cnamerecords/status"
    - "mxrecords/status"
    - "txtrecords/status"
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-team-a-dns
  namespace: app-team-a
subjects:
- kind: Group
  name: app-team-a
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dns-zone-admin
  apiGroup: rbac.authorization.k8s.io

Step 5: Application Teams Create DNSZones

Application teams can now create zones in their namespace:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: app-a-zone
  namespace: app-team-a
spec:
  zoneName: app-a.example.com
  globalClusterRef: shared-production-dns  # References platform cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: dns-admin.example.com.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

Development Team Setup

Step 1: Create Namespace for Team

apiVersion: v1
kind: Namespace
metadata:
  name: dev-team-alpha
  labels:
    team: dev-team-alpha
    environment: development

Step 2: Create Role for Full DNS Management

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dns-full-admin
  namespace: dev-team-alpha
rules:
# Manage namespace-scoped clusters
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9clusters"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# Manage instances
- apiGroups: ["bindy.firestoned.io"]
  resources: ["bind9instances"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# Manage zones and records
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones"
    - "arecords"
    - "aaaarecords"
    - "cnamerecords"
    - "mxrecords"
    - "txtrecords"
    - "nsrecords"
    - "srvrecords"
    - "caarecords"
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# View status for all resources
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "bind9clusters/status"
    - "bind9instances/status"
    - "dnszones/status"
    - "arecords/status"
  verbs: ["get", "list", "watch"]

Step 3: Bind Role to Development Team

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-alpha-dns
  namespace: dev-team-alpha
subjects:
- kind: Group
  name: dev-team-alpha
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: dns-full-admin
  apiGroup: rbac.authorization.k8s.io

Step 4: Development Team Creates Infrastructure

# Namespace-scoped cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-dns
  namespace: dev-team-alpha
spec:
  version: "9.18"
  primary:
    replicas: 1
  secondary:
    replicas: 1
---
# DNS zone referencing namespace-scoped cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: dev-zone
  namespace: dev-team-alpha
spec:
  zoneName: dev.local
  clusterRef: dev-dns  # References namespace-scoped cluster
  soaRecord:
    primaryNs: ns1.dev.local.
    adminEmail: admin.dev.local.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 300
  ttl: 300
---
# DNS record
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: test-server
  namespace: dev-team-alpha
spec:
  zoneRef: dev-zone
  name: test-server
  ipv4Address: "10.244.1.100"
  ttl: 60

RBAC Configuration

ClusterRole vs Role Decision Matrix

Resource	Scope	RBAC Type	Who Gets It
`Bind9GlobalCluster`	Cluster-scoped	ClusterRole + ClusterRoleBinding	Platform team
`Bind9Cluster`	Namespace-scoped	Role + RoleBinding	Development teams
`Bind9Instance`	Namespace-scoped	Role + RoleBinding	Teams managing instances
`DNSZone`	Namespace-scoped	Role + RoleBinding	Application teams
DNS Records	Namespace-scoped	Role + RoleBinding	Application teams

Example RBAC Hierarchy

graph TD
    subgraph "Cluster-Level RBAC"
        CR1[ClusterRole:<br/>platform-dns-admin]
        CRB1[ClusterRoleBinding:<br/>platform-team]
    end

    subgraph "Namespace-Level RBAC"
        R1[Role: dns-full-admin<br/>namespace: dev-team-alpha]
        RB1[RoleBinding: dev-team-alpha-dns]

        R2[Role: dns-zone-admin<br/>namespace: app-team-a]
        RB2[RoleBinding: app-team-a-dns]
    end

    CR1 --> CRB1
    R1 --> RB1
    R2 --> RB2

    CRB1 -.->|grants| PlatformTeam[platform-team group]
    RB1 -.->|grants| DevTeam[dev-team-alpha group]
    RB2 -.->|grants| AppTeam[app-team-a group]

    style CR1 fill:#ffccbc
    style R1 fill:#c5e1a5
    style R2 fill:#c5e1a5

Minimal Permissions for Application Teams

If application teams only need to manage DNS records (not clusters):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dns-record-editor
  namespace: app-team-a
rules:
# Only manage DNS zones and records
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones"
    - "arecords"
    - "cnamerecords"
    - "mxrecords"
    - "txtrecords"
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

# Read-only access to status
- apiGroups: ["bindy.firestoned.io"]
  resources:
    - "dnszones/status"
    - "arecords/status"
  verbs: ["get", "list", "watch"]

Security Best Practices

1. Namespace Isolation

Enforce strict namespace boundaries:

Records cannot reference zones in other namespaces
This is enforced by the controller using Api::namespaced()
No configuration needed - isolation is automatic

# team-a namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www
  namespace: team-a
spec:
  zoneRef: team-a-zone  # ✅ Same namespace
  name: www
  ipv4Address: "192.0.2.1"
---
# This FAILS - cross-namespace reference blocked
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www
  namespace: team-a
spec:
  zoneRef: team-b-zone  # ❌ Different namespace - BLOCKED
  name: www
  ipv4Address: "192.0.2.1"

2. Least Privilege RBAC

Grant minimum necessary permissions:

# ✅ GOOD - Specific permissions
rules:
- apiGroups: ["bindy.firestoned.io"]
  resources: ["dnszones", "arecords"]
  verbs: ["get", "list", "create", "update"]

# ❌ BAD - Overly broad permissions
rules:
- apiGroups: ["bindy.firestoned.io"]
  resources: ["*"]
  verbs: ["*"]

3. Separate Platform and Tenant Roles

Keep platform and tenant permissions separate:

Role Type	Manages	Scope
Platform DNS Admin	`Bind9GlobalCluster`	Cluster-wide
Tenant Cluster Admin	`Bind9Cluster`, `Bind9Instance`	Namespace
Tenant Zone Admin	`DNSZone`, Records	Namespace
Tenant Record Editor	Records only	Namespace

4. Audit and Monitoring

Enable audit logging for DNS changes:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log all changes to Bindy resources
- level: RequestResponse
  resources:
  - group: bindy.firestoned.io
    resources:
    - bind9globalclusters
    - bind9clusters
    - dnszones
    - arecords
    - mxrecords
  verbs: ["create", "update", "patch", "delete"]

5. NetworkPolicies for BIND9 Pods

Restrict network access to DNS pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bind9-network-policy
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app: bind9
  policyTypes:
  - Ingress
  ingress:
  # Allow DNS queries on port 53
  - from:
    - podSelector: {}  # All pods in namespace
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow Bindcar API access (internal only)
  - from:
    - podSelector:
        matchLabels:
          app: bindy-controller
    ports:
    - protocol: TCP
      port: 8080

Example Scenarios

Scenario 1: Multi-Region Production DNS

Requirement: Platform team manages production DNS across multiple regions.

# Platform creates global cluster per region
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns-us-east
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 3
  acls:
    trusted:
      - "10.0.0.0/8"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns-eu-west
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 3
  acls:
    trusted:
      - "10.128.0.0/9"
---
# App teams create zones in their namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone-us
  namespace: api-service
spec:
  zoneName: api.example.com
  globalClusterRef: production-dns-us-east
  soaRecord: { /* ... */ }
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone-eu
  namespace: api-service
spec:
  zoneName: api.eu.example.com
  globalClusterRef: production-dns-eu-west
  soaRecord: { /* ... */ }

Scenario 2: Development Team Sandboxes

Requirement: Each dev team has isolated DNS for testing.

# Dev Team Alpha namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: alpha-dns
  namespace: dev-alpha
spec:
  version: "9.18"
  primary:
    replicas: 1
  secondary:
    replicas: 1
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: alpha-zone
  namespace: dev-alpha
spec:
  zoneName: alpha.test.local
  clusterRef: alpha-dns
  soaRecord: { /* ... */ }
---
# Dev Team Beta namespace (completely isolated)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: beta-dns
  namespace: dev-beta
spec:
  version: "9.18"
  primary:
    replicas: 1
  secondary:
    replicas: 1
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: beta-zone
  namespace: dev-beta
spec:
  zoneName: beta.test.local
  clusterRef: beta-dns
  soaRecord: { /* ... */ }

Scenario 3: Hybrid - Platform + Tenant DNS

Requirement: Production uses platform DNS, dev teams use their own.

# Platform manages production global cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 2
---
# Production app references global cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: app-prod
  namespace: production
spec:
  zoneName: app.example.com
  globalClusterRef: production-dns  # Platform-managed
  soaRecord: { /* ... */ }
---
# Dev team manages their own cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-dns
  namespace: development
spec:
  version: "9.18"
  primary:
    replicas: 1
---
# Dev app references namespace-scoped cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: app-dev
  namespace: development
spec:
  zoneName: app.dev.local
  clusterRef: dev-dns  # Team-managed
  soaRecord: { /* ... */ }

Next Steps

Architecture Overview - Understand the dual-cluster model
Choosing a Cluster Type - Decision guide
Quickstart Guide - Get started with examples

Choosing a Cluster Type

This guide helps you decide between Bind9Cluster (namespace-scoped) and Bind9GlobalCluster (cluster-scoped) for your DNS infrastructure.

Quick Decision Matrix

Factor	Bind9Cluster	Bind9GlobalCluster
Scope	Single namespace	Cluster-wide
Who Manages	Development teams	Platform teams
RBAC	Role + RoleBinding	ClusterRole + ClusterRoleBinding
Visibility	Namespace-only	All namespaces
Use Case	Dev/test environments	Production infrastructure
Zone References	`clusterRef`	`globalClusterRef`
Isolation	Complete isolation between teams	Shared infrastructure
Cost	Higher (per-namespace overhead)	Lower (shared resources)

Decision Tree

graph TD
    Start[Need DNS Infrastructure?]
    Start --> Q1{Who should<br/>manage it?}

    Q1 -->|Platform Team| Q2{Shared across<br/>multiple namespaces?}
    Q1 -->|Development Team| Q3{Need isolation<br/>from other teams?}

    Q2 -->|Yes| Global[Bind9GlobalCluster]
    Q2 -->|No| Cluster[Bind9Cluster]

    Q3 -->|Yes| Cluster
    Q3 -->|No| Global

    Global --> GlobalUse[Platform-managed<br/>Production DNS<br/>Shared infrastructure]
    Cluster --> ClusterUse[Team-managed<br/>Dev/Test DNS<br/>Isolated infrastructure]

    style Global fill:#c8e6c9
    style Cluster fill:#e1f5ff
    style GlobalUse fill:#a5d6a7
    style ClusterUse fill:#90caf9

When to Use Bind9Cluster (Namespace-Scoped)

Ideal For:

✅ Development and Testing Environments

Teams need isolated DNS for development
Frequent DNS configuration changes
Short-lived environments

✅ Multi-Tenant Platforms

Each tenant gets their own namespace
Complete isolation between tenants
Teams manage their own DNS independently

✅ Team Autonomy

Development teams need full control
No dependency on platform team
Self-service DNS management

✅ Learning and Experimentation

Safe environment to learn BIND9
Can delete and recreate easily
No impact on other teams

Example Use Cases:

1. Development Team Sandbox

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-dns
  namespace: dev-team-alpha
spec:
  version: "9.18"
  primary:
    replicas: 1  # Minimal resources for dev
  secondary:
    replicas: 1
  global:
    options:
      - "recursion yes"  # Allow recursion for dev
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: test-zone
  namespace: dev-team-alpha
spec:
  zoneName: test.local
  clusterRef: dev-dns  # Namespace-scoped reference
  soaRecord:
    primaryNs: ns1.test.local.
    adminEmail: admin.test.local.
    serial: 2025010101
    refresh: 300  # Fast refresh for dev
    retry: 60
    expire: 3600
    negativeTtl: 60
  ttl: 60  # Low TTL for frequent changes

2. CI/CD Ephemeral Environments

# Each PR creates isolated DNS infrastructure
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: pr-{{PR_NUMBER}}-dns
  namespace: ci-pr-{{PR_NUMBER}}
  labels:
    pr-number: "{{PR_NUMBER}}"
    environment: ephemeral
spec:
  version: "9.18"
  primary:
    replicas: 1
  # Minimal config for short-lived environment

3. Multi-Tenant SaaS Platform

# Each customer gets isolated DNS in their namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: customer-dns
  namespace: customer-{{CUSTOMER_ID}}
  labels:
    customer-id: "{{CUSTOMER_ID}}"
spec:
  version: "9.18"
  primary:
    replicas: 1
  secondary:
    replicas: 1
  # Customer-specific ACLs and configuration
  acls:
    customer-networks:
      - "{{CUSTOMER_CIDR}}"

Characteristics:

✓ Pros:

Complete isolation between teams
No cross-namespace dependencies
Teams have full autonomy
Easy to delete and recreate
No ClusterRole permissions needed

✗ Cons:

Higher resource overhead (per-namespace clusters)
Cannot share DNS infrastructure across namespaces
Each team must manage their own BIND9 instances
Duplication of configuration

When to Use Bind9GlobalCluster (Cluster-Scoped)

Ideal For:

✅ Production Infrastructure

Centralized DNS for production workloads
High availability requirements
Shared across multiple applications

✅ Platform Team Management

Platform team provides DNS as a service
Centralized governance and compliance
Consistent configuration across environments

✅ Resource Efficiency

Share DNS infrastructure across namespaces
Reduce operational overhead
Lower total cost of ownership

✅ Enterprise Requirements

Audit logging and compliance
Centralized monitoring and alerting
Disaster recovery and backups

Example Use Cases:

1. Production DNS Infrastructure

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
  # No namespace - cluster-scoped
spec:
  version: "9.18"

  # High availability configuration
  primary:
    replicas: 3
    service:
      type: LoadBalancer
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
        service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"

  secondary:
    replicas: 3

  # Production-grade configuration
  global:
    options:
      - "recursion no"
      - "allow-transfer { none; }"
      - "notify yes"
      - "minimal-responses yes"

  # Access control
  acls:
    trusted:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
    secondaries:
      - "10.10.1.0/24"

  # Persistent storage for zone files
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: dns-zone-storage
  volumeMounts:
    - name: zone-data
      mountPath: /var/cache/bind

Application teams reference the global cluster:

# Application in any namespace can use the global cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone
  namespace: api-service  # Different namespace
spec:
  zoneName: api.example.com
  globalClusterRef: production-dns  # References cluster-scoped cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: dns-admin.example.com.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600
---
# Another application in a different namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: web-zone
  namespace: web-frontend  # Different namespace
spec:
  zoneName: www.example.com
  globalClusterRef: production-dns  # Same global cluster
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: dns-admin.example.com.
    serial: 2025010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

2. Multi-Region DNS

# Regional global clusters for geo-distributed DNS
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-us-east
  labels:
    region: us-east-1
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 2
  acls:
    region-networks:
      - "10.0.0.0/8"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-eu-west
  labels:
    region: eu-west-1
spec:
  version: "9.18"
  primary:
    replicas: 3
    service:
      type: LoadBalancer
  secondary:
    replicas: 2
  acls:
    region-networks:
      - "10.128.0.0/9"

3. Platform DNS as a Service

# Platform team provides multiple tiers of DNS service
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-premium
  labels:
    tier: premium
    sla: "99.99"
spec:
  version: "9.18"
  primary:
    replicas: 5  # High availability
    service:
      type: LoadBalancer
  secondary:
    replicas: 5
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: dns-standard
  labels:
    tier: standard
    sla: "99.9"
spec:
  version: "9.18"
  primary:
    replicas: 3
  secondary:
    replicas: 2

Characteristics:

✓ Pros:

Shared infrastructure across namespaces
Lower total resource usage
Centralized management and governance
Consistent configuration
Platform team controls DNS

✗ Cons:

Requires ClusterRole permissions
Platform team must manage it
Less autonomy for application teams
Single point of management (not necessarily failure)

Hybrid Approach

You can use both cluster types in the same Kubernetes cluster:

graph TB
    subgraph "Production Workloads"
        GlobalCluster[Bind9GlobalCluster<br/>production-dns]
        ProdZone1[DNSZone: api.example.com<br/>namespace: api-prod]
        ProdZone2[DNSZone: www.example.com<br/>namespace: web-prod]
    end

    subgraph "Development Namespace A"
        ClusterA[Bind9Cluster<br/>dev-dns-a]
        DevZoneA[DNSZone: dev-a.local<br/>namespace: dev-a]
    end

    subgraph "Development Namespace B"
        ClusterB[Bind9Cluster<br/>dev-dns-b]
        DevZoneB[DNSZone: dev-b.local<br/>namespace: dev-b]
    end

    GlobalCluster -.globalClusterRef.-> ProdZone1
    GlobalCluster -.globalClusterRef.-> ProdZone2
    ClusterA --> DevZoneA
    ClusterB --> DevZoneB

    style GlobalCluster fill:#c8e6c9
    style ClusterA fill:#e1f5ff
    style ClusterB fill:#e1f5ff

Example Configuration:

# Platform team manages production DNS globally
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: production-dns
spec:
  version: "9.18"
  primary:
    replicas: 3
---
# Dev teams manage their own DNS per namespace
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dev-dns
  namespace: dev-team-a
spec:
  version: "9.18"
  primary:
    replicas: 1
---
# Production app references global cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: prod-zone
  namespace: production
spec:
  zoneName: app.example.com
  globalClusterRef: production-dns
  soaRecord: { /* ... */ }
---
# Dev app references namespace-scoped cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: dev-zone
  namespace: dev-team-a
spec:
  zoneName: app.dev.local
  clusterRef: dev-dns
  soaRecord: { /* ... */ }

Common Scenarios

Scenario 1: Startup/Small Team

Recommendation: Start with Bind9Cluster (namespace-scoped)

Why:

Simpler RBAC (no ClusterRole needed)
Faster iteration and experimentation
Easy to recreate if configuration is wrong
Lower learning curve

Migration Path: When you grow, migrate production to Bind9GlobalCluster while keeping dev on Bind9Cluster.

Scenario 2: Enterprise with Platform Team

Recommendation: Use Bind9GlobalCluster for production, Bind9Cluster for dev

Why:

Platform team provides production DNS as a service
Development teams have autonomy in their namespaces
Clear separation of responsibilities
Resource efficiency at scale

Scenario 3: Multi-Tenant SaaS

Recommendation: Use Bind9Cluster per tenant namespace

Why:

Complete isolation between customers
Tenant-specific configuration
Easier to delete customer data (namespace deletion)
No risk of cross-tenant data leaks

Scenario 4: CI/CD with Ephemeral Environments

Recommendation: Use Bind9Cluster per environment

Why:

Isolated DNS per PR/branch
Easy cleanup when PR closes
No impact on other environments
Fast provisioning

Migration Between Cluster Types

From Bind9Cluster to Bind9GlobalCluster

Steps:

Create Bind9GlobalCluster:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: shared-dns
spec:
  # Copy configuration from Bind9Cluster
  version: "9.18"
  primary:
    replicas: 3
  secondary:
    replicas: 2

Update DNSZone References:

# Before
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: my-zone
  namespace: my-namespace
spec:
  zoneName: example.com
  clusterRef: my-cluster  # namespace-scoped

# After
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: my-zone
  namespace: my-namespace
spec:
  zoneName: example.com
  globalClusterRef: shared-dns  # cluster-scoped

Update RBAC (if needed):
- Application teams no longer need permissions for bind9clusters
- Only need permissions for dnszones and records

Delete Old Bind9Cluster:

kubectl delete bind9cluster my-cluster -n my-namespace

From Bind9GlobalCluster to Bind9Cluster

Steps:

Create Bind9Cluster in Target Namespace:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: team-dns
  namespace: my-namespace
spec:
  # Copy configuration from global cluster
  version: "9.18"
  primary:
    replicas: 2

Update DNSZone References:

# Before
spec:
  globalClusterRef: shared-dns

# After
spec:
  clusterRef: team-dns

Update RBAC (if needed):
- Team needs permissions for bind9clusters in their namespace

Summary

Choose This	If You Need
Bind9Cluster	Team autonomy, complete isolation, dev/test environments
Bind9GlobalCluster	Shared infrastructure, platform management, production DNS
Both (Hybrid)	Production on global, dev on namespace-scoped

Key Takeaway: There’s no “wrong” choice - select based on your organizational structure and requirements. Many organizations use both cluster types for different purposes.

Next Steps

Architecture Overview - Understand the dual-cluster model
Multi-Tenancy Guide - RBAC setup and examples
Quickstart Guide - Get started with examples

Creating DNS Infrastructure

This section guides you through setting up your DNS infrastructure using Bindy. A typical DNS setup consists of:

Primary DNS Instances: Authoritative DNS servers that host the master copies of your zones
Secondary DNS Instances: Replica servers that receive zone transfers from primaries
Multi-Region Setup: Geographically distributed DNS servers for redundancy

Overview

Bindy uses Kubernetes Custom Resources to define DNS infrastructure. The Bind9Instance resource creates and manages BIND9 DNS server deployments, including:

BIND9 Deployment pods
ConfigMaps for BIND9 configuration
Services for DNS traffic (TCP/UDP port 53)

Infrastructure Components

Bind9Instance

A Bind9Instance represents a single BIND9 DNS server deployment. You can create multiple instances for:

High availability - Multiple replicas of the same instance
Role separation - Separate primary and secondary instances
Geographic distribution - Instances in different regions or availability zones

Planning Your Infrastructure

Before creating instances, consider:

Zone Hosting Strategy
- Which zones will be primary vs. secondary?
- How will zones be distributed across instances?
Redundancy Requirements
- How many replicas per instance?
- How many geographic locations?
Label Strategy
- How will you select instances for zones?
- Common labels: dns-role, region, environment

Next Steps

Create Primary DNS Instances
Create Secondary DNS Instances
Setup Multi-Region Infrastructure

Primary DNS Instances

Primary DNS instances are authoritative DNS servers that host the master copies of your DNS zones. They are the source of truth for DNS data and handle zone updates.

Creating a Primary Instance

Here’s a basic example of a primary DNS instance:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
  labels:
    dns-role: primary
    environment: production
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"  # Allow zone transfers to secondary servers
    dnssec:
      enabled: true
      validation: true

Apply it with:

kubectl apply -f primary-instance.yaml

Configuration Options

Replicas

The replicas field controls how many BIND9 pods to run:

spec:
  replicas: 2  # Run 2 pods for high availability

BIND9 Version

Specify the BIND9 version to use:

spec:
  version: "9.18"  # Use BIND 9.18

Query Access Control

Control who can query your DNS server:

spec:
  config:
    allowQuery:
      - "0.0.0.0/0"      # Allow queries from anywhere
      - "10.0.0.0/8"     # Or restrict to specific networks

Zone Transfer Control

Restrict zone transfers to authorized servers (typically secondaries):

spec:
  config:
    allowTransfer:
      - "10.0.0.0/8"     # Allow transfers to secondary network
      - "192.168.1.0/24" # Or specific secondary server network

DNSSEC Configuration

Enable DNSSEC signing and validation:

spec:
  config:
    dnssec:
      enabled: true      # Enable DNSSEC signing
      validation: true   # Enable DNSSEC validation

Recursion

Primary authoritative servers should disable recursion:

spec:
  config:
    recursion: false  # Disable recursion for authoritative servers

Labels

Use labels to organize and select instances:

metadata:
  labels:
    dns-role: primary        # Indicates this is a primary server
    environment: production  # Environment designation
    region: us-east-1       # Geographic location

These labels are used by DNSZone resources to select which instances should host their zones.

Verifying Deployment

Check the instance status:

kubectl get bind9instances -n dns-system
kubectl describe bind9instance primary-dns -n dns-system

Check the created resources:

# View the deployment
kubectl get deployment -n dns-system -l instance=primary-dns

# View the pods
kubectl get pods -n dns-system -l instance=primary-dns

# View the service
kubectl get service -n dns-system -l instance=primary-dns

Testing DNS Resolution

Once deployed, test DNS queries:

# Get the service IP
SERVICE_IP=$(kubectl get svc -n dns-system primary-dns -o jsonpath='{.spec.clusterIP}')

# Test DNS query
dig @$SERVICE_IP example.com

Next Steps

Create DNS Zones to host on this instance
Setup Secondary Instances for redundancy
Configure Multi-Region Setup for geographic distribution

Secondary DNS Instances

Secondary DNS instances receive zone data from primary servers via zone transfers (AXFR/IXFR). They provide redundancy and load distribution for DNS queries.

Creating a Secondary Instance

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  namespace: dns-system
  labels:
    dns-role: secondary
    environment: production
spec:
  replicas: 1
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

Apply with:

kubectl apply -f secondary-instance.yaml

Key Differences from Primary

No Zone Transfers Allowed

Secondary servers typically don’t allow zone transfers:

spec:
  config:
    allowTransfer: []  # Empty or omitted - no transfers from secondary

Read-Only Zones

Secondaries receive zone data from primaries and cannot be updated directly. All zone modifications must be made on the primary server.

Label for Selection

Use the role: secondary label to enable automatic zone transfer configuration:

metadata:
  labels:
    role: secondary      # Required for automatic discovery
    cluster: production  # Required - must match cluster name

Important: The role: secondary label is required for Bindy to automatically discover secondary instances and configure zone transfers on primary zones.

Configuring Secondary Zones

When creating a DNSZone resource for secondary zones, use the secondary type and specify primary servers:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
  namespace: dns-system
spec:
  zoneName: example.com
  type: secondary
  instanceSelector:
    matchLabels:
      dns-role: secondary
  secondaryConfig:
    primaryServers:
      - "10.0.1.10"  # IP of primary DNS server
      - "10.0.1.11"  # Additional primary for redundancy

Automatic Zone Transfer Configuration

New in v0.1.0: Bindy automatically configures zone transfers from primaries to secondaries!

When you create primary DNSZone resources, Bindy automatically:

Discovers secondary instances using the role=secondary label
Configures zone transfers on primary zones with also-notify and allow-transfer
Tracks secondary IPs in DNSZone.status.secondaryIps
Detects IP changes when secondary pods restart or are rescheduled
Auto-updates zones when secondary IPs change (within 5-10 minutes)

Example:

# Create secondary instance with proper labels
cat <<EOF | kubectl apply -f -
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  namespace: dns-system
  labels:
    role: secondary          # Required for discovery
    cluster: production      # Must match cluster name
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
EOF

# Create primary zone - zone transfers auto-configured!
cat <<EOF | kubectl apply -f -
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: production  # Matches cluster label
  # ... other config ...
EOF

# Verify automatic configuration
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.secondaryIps}'
# Output: ["10.244.1.5","10.244.2.8"]

Self-Healing: When secondary pods restart and get new IPs:

Bindy detects the change within one reconciliation cycle (~5-10 minutes)
Primary zones are automatically updated with new secondary IPs
Zone transfers resume automatically with no manual intervention

Verifying Zone Transfers

Check that zones are being transferred:

# Check zone files on secondary
kubectl exec -n dns-system deployment/secondary-dns -- ls -la /var/lib/bind/zones/

# Check BIND9 logs for transfer messages
kubectl logs -n dns-system -l instance=secondary-dns | grep "transfer of"

# Verify secondary IPs are configured on primary zones
kubectl get dnszone -n dns-system -o yaml | yq '.items[].status.secondaryIps'

Best Practices

Use Multiple Secondaries

Deploy secondary instances in different locations:

# Secondary in different AZ/region
metadata:
  labels:
    dns-role: secondary
    region: us-west-1

Configure NOTIFY

Primary servers send NOTIFY messages to secondaries when zones change. Ensure network connectivity allows these notifications.

Monitor Transfer Status

Watch for failed transfers in logs:

kubectl logs -n dns-system -l instance=secondary-dns --tail=100 | grep -i transfer

Network Requirements

Secondaries must be able to:

Receive zone transfers from primaries (TCP port 53)
Receive NOTIFY messages from primaries (UDP port 53)
Respond to DNS queries from clients (UDP/TCP port 53)

Ensure Kubernetes network policies and firewall rules allow this traffic.

Next Steps

Configure Multi-Region Setup with geographically distributed secondaries
Create Secondary Zones that transfer from primaries
Monitor DNS Infrastructure

Multi-Region Setup

Distribute your DNS infrastructure across multiple regions or availability zones for maximum availability and performance.

Architecture Overview

A multi-region DNS setup typically includes:

Primary instances in one or more regions
Secondary instances in multiple geographic locations
Zone distribution across all instances using label selectors

Creating Regional Instances

Primary in Region 1

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-us-east
  namespace: dns-system
  labels:
    dns-role: primary
    region: us-east-1
    environment: production
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
    dnssec:
      enabled: true

Secondary in Region 2

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-us-west
  namespace: dns-system
  labels:
    dns-role: secondary
    region: us-west-2
    environment: production
spec:
  replicas: 1
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

Secondary in Region 3

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-eu-west
  namespace: dns-system
  labels:
    dns-role: secondary
    region: eu-west-1
    environment: production
spec:
  replicas: 1
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

Distributing Zones Across Regions

Create zones that target all regions:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  type: primary
  instanceSelector:
    matchExpressions:
      - key: environment
        operator: In
        values:
          - production
      - key: dns-role
        operator: In
        values:
          - primary
          - secondary
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin@example.com
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

This zone will be deployed to all instances matching the selector (all production primary and secondary instances).

Deployment Strategy

Option 1: Primary-Secondary Model

One region hosts primary instances
All other regions host secondary instances
Zone transfers flow from primary to secondaries

graph LR
    region1["Region 1 (us-east-1)<br/>Primary Instances<br/>(Master zones)"]
    region2["Region 2 (us-west-2)<br/>Secondary Instances<br/>(Slave zones)"]
    region3["Region 3 (eu-west-1)<br/>Secondary Instances<br/>(Slave zones)"]

    region1 -->|Zone Transfer| region2
    region2 -->|Zone Transfer| region3

    style region1 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style region2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style region3 fill:#e1f5ff,stroke:#01579b,stroke-width:2px

Option 2: Multi-Primary Model

Multiple regions host primary instances
Different zones can have primaries in different regions
Use careful labeling to route zones to appropriate primaries

Network Considerations

Zone Transfer Network

Ensure network connectivity for zone transfers:

Primaries must reach secondaries on TCP port 53
Use VPN, peering, or allow public transfer with IP restrictions

Client Query Routing

Use one of:

GeoDNS - Route clients to nearest regional instance
Anycast - Same IP announced from multiple locations
Load Balancer - Distribute across regional endpoints

Failover Strategy

Automatic Failover

Kubernetes handles pod-level failures automatically:

spec:
  replicas: 2  # Multiple replicas for pod-level HA

Regional Failover

For regional failures:

Clients automatically query secondary instances in other regions
Zone data remains available via zone transfers
Updates queue until primary region recovers

Manual Failover

To manually promote a secondary to primary:

Update DNSZone to change primary servers
Update instance labels if needed
Verify zone transfers are working correctly

Monitoring Multi-Region Setup

Check instance distribution:

# View all instances and their regions
kubectl get bind9instances -n dns-system -L region

# Check zone distribution
kubectl describe dnszone example-com -n dns-system

Monitor zone transfers:

# Check transfer logs on secondaries
kubectl logs -n dns-system -l dns-role=secondary | grep "transfer of"

Best Practices

Use Odd Number of Regions: 3 or 5 regions for better quorum
Distribute Replicas: Spread replicas across availability zones
Monitor Latency: Watch zone transfer times between regions
Test Failover: Regularly test regional failover scenarios
Automate Updates: Use GitOps for consistent multi-region deployments

Next Steps

Configure Monitoring for multi-region health
Set Up DNSSEC across all regions
Implement Disaster Recovery procedures

Managing DNS Zones

DNS zones are the containers for DNS records. In Bindy, zones are defined using the DNSZone custom resource.

Zone Types

Primary Zones

Primary (master) zones contain the authoritative data:

Zone data is created and managed on the primary
Changes are made by creating/updating DNS record resources
Can be transferred to secondary servers

Secondary Zones

Secondary (slave) zones receive data from primary servers:

Zone data is received via AXFR/IXFR transfers
Read-only - cannot be modified directly
Automatically updated when primary changes

Zone Lifecycle

Create Bind9Instance resources to host zones
Create DNSZone resource with instance selector
Add DNS records (A, CNAME, MX, etc.)
Monitor status to ensure zone is active

Instance Selection

Zones are deployed to Bind9Instances using label selectors:

spec:
  instanceSelector:
    matchLabels:
      dns-role: primary
      environment: production

This deploys the zone to all instances matching both labels.

SOA Record

Every primary zone requires an SOA (Start of Authority) record:

spec:
  soaRecord:
    primaryNs: ns1.example.com.      # Primary nameserver
    adminEmail: admin@example.com    # Admin email (@ becomes .)
    serial: 2024010101               # Zone serial number
    refresh: 3600                    # Refresh interval
    retry: 600                       # Retry interval
    expire: 604800                   # Expiration time
    negativeTtl: 86400              # Negative caching TTL

Zone Configuration

TTL (Time To Live)

Set the default TTL for records in the zone:

spec:
  ttl: 3600  # 1 hour default TTL

Individual records can override this with their own TTL values.

Zone Status

Check zone status:

kubectl get dnszone -n dns-system
kubectl describe dnszone example-com -n dns-system

Status conditions indicate:

Whether the zone is ready
Which instances are hosting the zone
Any errors or warnings

Common Operations

Listing Zones

# List all zones
kubectl get dnszones -n dns-system

# Show zones with custom columns
kubectl get dnszones -n dns-system -o custom-columns=NAME:.metadata.name,ZONE:.spec.zoneName,TYPE:.spec.type

Viewing Zone Details

kubectl describe dnszone example-com -n dns-system

Updating Zones

Edit the zone configuration:

kubectl edit dnszone example-com -n dns-system

Or apply an updated YAML file:

kubectl apply -f zone.yaml

Deleting Zones

kubectl delete dnszone example-com -n dns-system

This removes the zone from all instances but doesn’t delete the instance itself.

Next Steps

Create Primary Zones
Understanding Label Selectors
Zone Configuration Options
Add DNS Records

Creating Zones

Learn how to create DNS zones in Bindy using the RNDC protocol.

Zone Architecture

Zones in Bindy follow a three-tier model:

Bind9Cluster - Cluster-level configuration (version, shared config, TSIG keys)
Bind9Instance - Individual BIND9 server deployment (references a cluster)
DNSZone - DNS zone (references an instance via clusterRef)

Prerequisites

Before creating a zone, ensure you have:

A Bind9Cluster resource deployed
A Bind9Instance resource deployed (referencing the cluster)
The instance is ready and running

Creating a Primary Zone

First, ensure you have a cluster and instance:

# Step 1: Create a Bind9Cluster (if not already created)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"

---
# Step 2: Create a Bind9Instance (if not already created)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: production-dns  # References the Bind9Cluster above
  role: primary
  replicas: 1

---
# Step 3: Create the DNSZone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: primary-dns  # References the Bind9Instance above
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.  # Note: @ replaced with .
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

How It Works

When you create a DNSZone:

Controller discovers pods - Finds BIND9 pods with label instance=primary-dns
Loads RNDC key - Retrieves Secret named primary-dns-rndc-key
Connects via RNDC - Establishes connection to primary-dns.dns-system.svc.cluster.local:953
Executes addzone - Runs rndc addzone example.com command
BIND9 creates zone - BIND9 creates the zone and starts serving it
Updates status - Controller updates DNSZone status to Ready

Verifying Zone Creation

Check the zone status:

kubectl get dnszones -n dns-system
kubectl describe dnszone example-com -n dns-system

Expected output:

Name:         example-com
Namespace:    dns-system
Labels:       <none>
Annotations:  <none>
API Version:  bindy.firestoned.io/v1alpha1
Kind:         DNSZone
Spec:
  Cluster Ref:  primary-dns
  Zone Name:    example.com
Status:
  Conditions:
    Type:    Ready
    Status:  True
    Reason:  Synchronized
    Message: Zone created for cluster: primary-dns

Next Steps

Add DNS Records to your zone
Configure Zone Transfers for secondaries
Learn about the RNDC Protocol

Cluster References

Bindy uses direct cluster references instead of label selectors for targeting DNS zones to BIND9 instances.

Overview

In Bindy’s three-tier architecture, resources reference each other directly by name:

Bind9Cluster ← clusterRef ← Bind9Instance
       ↑
   clusterRef ← DNSZone ← zoneRef ← DNS Records

This provides:

Explicit targeting - Clear, direct references instead of label matching
Simpler configuration - No complex selector logic
Better validation - References can be validated at admission time
Easier troubleshooting - Direct relationships are easier to understand

Cluster Reference Model

Bind9Cluster (Top-Level)

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false

Bind9Instance References Bind9Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: production-dns  # Direct reference to Bind9Cluster name
  role: primary  # Required: primary or secondary
  replicas: 2

DNSZone References Bind9Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: production-dns  # Direct reference to Bind9Cluster name
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.

How References Work

When you create a DNSZone with clusterRef: production-dns:

Controller finds the Bind9Cluster - Looks up Bind9Cluster named production-dns
Discovers instances - Finds all Bind9Instance resources referencing this cluster
Identifies primaries - Selects instances with role: primary
Loads RNDC keys - Retrieves RNDC keys from cluster configuration
Connects via RNDC - Connects to primary instance pods via RNDC
Creates zone - Executes rndc addzone command on primary instances

Example: Multi-Region Setup

East Region

# East Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dns-cluster-east
  namespace: dns-system
spec:
  version: "9.18"

---
# East Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dns-east
  namespace: dns-system
spec:
  clusterRef: dns-cluster-east
  role: primary  # Required: primary or secondary
  replicas: 2

---
# Zone on East Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-east
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: dns-cluster-east  # Targets east cluster

West Region

# West Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: dns-cluster-west
  namespace: dns-system
spec:
  version: "9.18"

---
# West Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dns-west
  namespace: dns-system
spec:
  clusterRef: dns-cluster-west
  role: primary  # Required: primary or secondary
  replicas: 2

---
# Zone on West Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-west
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: dns-cluster-west  # Targets west cluster

Benefits Over Label Selectors

Simpler Configuration

Old approach (label selectors):

# Had to set labels on instance
labels:
  dns-role: primary
  region: us-east

# Had to use selector in zone
instanceSelector:
  matchLabels:
    dns-role: primary
    region: us-east

New approach (cluster references):

# Just reference by name
clusterRef: primary-dns

Better Validation

References can be validated at admission time
Typos are caught immediately
No ambiguity about which instance will host the zone

Clearer Relationships

# See exactly which instance hosts a zone
kubectl get dnszone example-com -o jsonpath='{.spec.clusterRef}'

# See which cluster an instance belongs to
kubectl get bind9instance primary-dns -o jsonpath='{.spec.clusterRef}'

Migrating from Label Selectors

If you have old DNSZone resources using instanceSelector, migrate them:

Before:

spec:
  zoneName: example.com
  instanceSelector:
    matchLabels:
      dns-role: primary

After:

spec:
  zoneName: example.com
  clusterRef: production-dns  # Direct reference to cluster name

Next Steps

Creating Zones - Learn how to create zones with cluster references
Multi-Region Setup - Deploy zones across multiple regions
RNDC-Based Architecture - Understand the RNDC protocol

Zone Configuration

Advanced zone configuration options.

Default TTL

Set the default TTL for all records in the zone:

spec:
  ttl: 3600  # 1 hour

SOA Record Details

spec:
  soaRecord:
    primaryNs: ns1.example.com.    # Primary nameserver FQDN (must end with .)
    adminEmail: admin@example.com  # Admin email (@ replaced with . in zone file)
    serial: 2024010101             # Serial number (YYYYMMDDnn format recommended)
    refresh: 3600                  # How often secondaries check for updates (seconds)
    retry: 600                     # How long to wait before retry after failed refresh
    expire: 604800                 # When to stop answering if no refresh (1 week)
    negativeTtl: 86400             # TTL for negative responses (NXDOMAIN)

Secondary Zone Configuration

For secondary zones, specify primary servers:

spec:
  type: secondary
  secondaryConfig:
    primaryServers:
      - "10.0.1.10"
      - "10.0.1.11"

Managing DNS Records

DNS records are the actual data in your zones - IP addresses, mail servers, text data, etc.

Record Types

Bindy supports all common DNS record types:

A Records - IPv4 addresses
AAAA Records - IPv6 addresses
CNAME Records - Canonical name (alias)
MX Records - Mail exchange servers
TXT Records - Text data (SPF, DKIM, DMARC, verification)
NS Records - Nameserver delegation
SRV Records - Service location
CAA Records - Certificate authority authorization

Record Structure

All records share common fields:

apiVersion: bindy.firestoned.io/v1alpha1
kind: <RecordType>
metadata:
  name: <unique-name>
  namespace: dns-system
spec:
  # Zone reference - use ONE of these:
  zone: <zone-name>            # Match against DNSZone spec.zoneName
  # OR
  zoneRef: <zone-resource-name> # Direct reference to DNSZone metadata.name

  name: <record-name>          # Name within the zone
  ttl: <optional-ttl>          # Override zone default TTL
  # ... record-specific fields

Referencing DNS Zones

DNS records must reference an existing DNSZone. There are two ways to reference a zone:

Method 1: Using `zone` Field (Zone Name Lookup)

The zone field searches for a DNSZone by matching its spec.zoneName:

spec:
  zone: example.com  # Matches DNSZone with spec.zoneName: example.com
  name: www

How it works:

The controller lists all DNSZones in the namespace
Searches for one with spec.zoneName matching the provided value
More intuitive - you specify the actual DNS zone name

When to use:

Quick testing and development
When you’re not sure of the resource name
When readability is more important than performance

Method 2: Using `zoneRef` Field (Direct Reference)

The zoneRef field directly references a DNSZone by its Kubernetes resource name:

spec:
  zoneRef: example-com  # Matches DNSZone with metadata.name: example-com
  name: www

How it works:

The controller directly retrieves the DNSZone by metadata.name
No search required - single API call
More efficient

When to use:

Production environments (recommended)
Large namespaces with many zones
When performance matters
Infrastructure-as-code with known resource names

Choosing Between `zone` and `zoneRef`

Criteria	`zone`	`zoneRef`
Performance	Slower (list + search)	Faster (direct get)
Readability	More intuitive	Less obvious
Use Case	Development/testing	Production
API Calls	Multiple	Single
Best For	Humans writing YAML	Automation/templates

Important: You must specify exactly one of zone or zoneRef - not both, not neither.

Example: Same Record, Two Methods

Given this DNSZone:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com        # Kubernetes resource name
  namespace: dns-system
spec:
  zoneName: example.com    # Actual DNS zone name
  clusterRef: primary-dns
  # ...

Create an A record using either method:

Using zone (matches spec.zoneName):

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zone: example.com     # ← Actual zone name
  name: www
  ipv4Address: "192.0.2.1"

Using zoneRef (matches metadata.name):

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zoneRef: example-com  # ← Resource name
  name: www
  ipv4Address: "192.0.2.1"

Both create the same DNS record: www.example.com → 192.0.2.1

Creating Records

After choosing your zone reference method, specify the record details:

spec:
  zoneRef: example-com  # Recommended for production
  name: www             # Creates www.example.com
  ipv4Address: "192.0.2.1"
  ttl: 300             # Optional - overrides zone default

Next Steps

A Records - IPv4 addresses
AAAA Records - IPv6 addresses
CNAME Records - Aliases
MX Records - Mail servers
TXT Records - Text data
NS Records - Delegation
SRV Records - Services
CAA Records - Certificate authority

A Records (IPv4)

A records map domain names to IPv4 addresses.

Creating an A Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: www
  ipv4Address: "192.0.2.1"
  ttl: 300

This creates www.example.com -> 192.0.2.1.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details on choosing between zone and zoneRef.

Root Record

For the zone apex (example.com):

spec:
  zoneRef: example-com
  name: "@"
  ipv4Address: "192.0.2.1"

Multiple A Records

Create multiple records for the same name for load balancing:

kubectl apply -f - <<EOF
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-1
spec:
  zoneRef: example-com
  name: www
  ipv4Address: "192.0.2.1"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-2
spec:
  zoneRef: example-com
  name: www
  ipv4Address: "192.0.2.2"

AAAA Records (IPv6)

AAAA records map domain names to IPv6 addresses. They are the IPv6 equivalent of A records.

Creating an AAAA Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-example-ipv6
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: www
  ipv6Address: "2001:db8::1"
  ttl: 300

This creates www.example.com -> 2001:db8::1.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details on choosing between zone and zoneRef.

Root Record

For the zone apex (example.com):

spec:
  zoneRef: example-com
  name: "@"
  ipv6Address: "2001:db8::1"

Multiple AAAA Records

Create multiple records for the same name for load balancing:

kubectl apply -f - <<EOF
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-ipv6-1
spec:
  zoneRef: example-com
  name: www
  ipv6Address: "2001:db8::1"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-ipv6-2
spec:
  zoneRef: example-com
  name: www
  ipv6Address: "2001:db8::2"
EOF

DNS clients will receive both addresses (round-robin load balancing).

Dual-Stack Configuration

For dual-stack (IPv4 + IPv6) configuration, create both A and AAAA records:

# IPv4
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-ipv4
spec:
  zoneRef: example-com
  name: www
  ipv4Address: "192.0.2.1"
  ttl: 300
---
# IPv6
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-ipv6
spec:
  zoneRef: example-com
  name: www
  ipv6Address: "2001:db8::1"
  ttl: 300

Clients will use IPv6 if available, falling back to IPv4 otherwise.

IPv6 Address Formats

IPv6 addresses support various formats:

# Full format
ipv6Address: "2001:0db8:0000:0000:0000:0000:0000:0001"

# Compressed format (recommended)
ipv6Address: "2001:db8::1"

# Link-local address
ipv6Address: "fe80::1"

# Loopback
ipv6Address: "::1"

# IPv4-mapped IPv6
ipv6Address: "::ffff:192.0.2.1"

Common Use Cases

Web Server

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: web-ipv6
spec:
  zoneRef: example-com
  name: www
  ipv6Address: "2001:db8:1::443"
  ttl: 300

API Endpoint

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: api-ipv6
spec:
  zoneRef: example-com
  name: api
  ipv6Address: "2001:db8:2::443"
  ttl: 60  # Short TTL for faster updates

Mail Server

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: mail-ipv6
spec:
  zoneRef: example-com
  name: mail
  ipv6Address: "2001:db8:3::25"
  ttl: 3600

Best Practices

Use compressed format - 2001:db8::1 instead of 2001:0db8:0000:0000:0000:0000:0000:0001
Dual-stack when possible - Provide both A and AAAA records for compatibility
Match TTLs - Use the same TTL for A and AAAA records of the same name
Test IPv6 connectivity - Ensure your infrastructure supports IPv6 before advertising AAAA records

Status Monitoring

Check the status of your AAAA record:

kubectl get aaaarecord www-ipv6 -o yaml

Look for the status.conditions field:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

Troubleshooting

Record not resolving

Check record status:

kubectl get aaaarecord www-ipv6 -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'

Verify zone exists:
```
kubectl get dnszone example-com
```

Test DNS resolution:

dig AAAA www.example.com @<dns-server-ip>

Invalid IPv6 address

The controller validates IPv6 addresses. Ensure your address is in valid format:

Use compressed notation: 2001:db8::1
Do not mix uppercase/lowercase unnecessarily
Ensure all segments are valid hexadecimal

Next Steps

DNS Records Overview - Complete guide to all record types
MX Records - Mail exchange records
TXT Records - Text records for SPF, DKIM, etc.
Monitoring DNS - Monitor your DNS infrastructure

CNAME Records

CNAME (Canonical Name) records create aliases to other domain names.

Creating a CNAME Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: blog-example-com
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: blog
  target: www.example.com.  # Must end with a dot
  ttl: 300

This creates blog.example.com -> www.example.com.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details on choosing between zone and zoneRef.

Important CNAME Rules

Target Must Be Fully Qualified

The target field must be a fully qualified domain name (FQDN) ending with a dot:

# ✅ Correct
target: www.example.com.

# ❌ Incorrect - missing trailing dot
target: www.example.com

No CNAME at Zone Apex

CNAME records cannot be created at the zone apex (@):

# ❌ Not allowed - RFC 1034/1035 violation
spec:
  zoneRef: example-com
  name: "@"
  target: www.example.com.

For the zone apex, use A Records or AAAA Records instead.

No Other Records for Same Name

If a CNAME exists for a name, no other record types can exist for that same name (RFC 1034):

# ❌ Not allowed - www already has a CNAME
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: www-alias
spec:
  zoneRef: example-com
  name: www
  target: server.example.com.
---
# ❌ This will conflict with the CNAME above
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-a-record
spec:
  zoneRef: example-com
  name: www  # Same name as CNAME - not allowed
  ipv4Address: "192.0.2.1"

Common Use Cases

Aliasing to External Services

Point to external services like CDNs or cloud providers:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cdn-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: cdn
  target: d111111abcdef8.cloudfront.net.
  ttl: 3600

Subdomain Aliases

Create aliases for subdomains:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: shop-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: shop
  target: www.example.com.
  ttl: 300

This creates shop.example.com -> www.example.com.

Internal Service Discovery

Point to internal Kubernetes services:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cache-internal
  namespace: dns-system
spec:
  zoneRef: internal-local
  name: cache
  target: db.internal.local.
  ttl: 300

www to Non-www Redirect

Create a www alias:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zoneRef: example-com
  name: www
  target: example.com.
  ttl: 300

Note: This only works if example.com has an A or AAAA record, not another CNAME.

Field Reference

Field	Type	Required	Description
`zone`	string	Either `zone` or `zoneRef`	DNS zone name (e.g., “example.com”)
`zoneRef`	string	Either `zone` or `zoneRef`	Reference to DNSZone metadata.name
`name`	string	Yes	Record name within the zone (cannot be “@”)
`target`	string	Yes	Target FQDN ending with a dot
`ttl`	integer	No	Time To Live in seconds (default: zone TTL)

TTL Behavior

If ttl is not specified, the zone’s default TTL is used:

# Uses zone default TTL
spec:
  zoneRef: example-com
  name: blog
  target: www.example.com.

# Explicit TTL override
spec:
  zoneRef: example-com
  name: blog
  target: www.example.com.
  ttl: 600  # 10 minutes

Troubleshooting

CNAME Loop Detection

Avoid creating CNAME loops:

# ❌ Creates a loop
# a.example.com -> b.example.com
# b.example.com -> a.example.com
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cname-a
spec:
  zoneRef: example-com
  name: a
  target: b.example.com.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cname-b
spec:
  zoneRef: example-com
  name: b
  target: a.example.com.  # ❌ Loop!

Missing Trailing Dot

If your CNAME doesn’t resolve correctly, check for the trailing dot:

# Check the BIND9 zone file
kubectl exec -n dns-system bindy-primary-0 -- cat /etc/bind/zones/example.com.zone

# Should show:
# blog.example.com.  300  IN  CNAME  www.example.com.

If you see relative names, the target is missing the trailing dot:

# ❌ Wrong - becomes blog.example.com -> www.example.com.example.com
blog.example.com.  300  IN  CNAME  www.example.com

MX Records (Mail Exchange)

MX records specify the mail servers responsible for accepting email on behalf of a domain. Each MX record includes a priority value that determines the order in which mail servers are contacted.

Creating an MX Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-example
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: "@"             # Zone apex - mail for @example.com
  priority: 10
  mailServer: mail.example.com.  # Must end with a dot (FQDN)
  ttl: 3600

This configures mail delivery for example.com to mail.example.com with priority 10.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details on choosing between zone and zoneRef.

FQDN Requirement

CRITICAL: The mailServer field MUST end with a dot (.) to indicate a fully qualified domain name (FQDN).

# ✅ CORRECT
mailServer: mail.example.com.

# ❌ WRONG - will be treated as relative to zone
mailServer: mail.example.com

Priority Values

Lower priority values are preferred. Mail servers with the lowest priority are contacted first.

Single Mail Server

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-primary
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail.example.com.

Multiple Mail Servers (Failover)

# Primary mail server (lowest priority)
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-primary
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail1.example.com.
  ttl: 3600
---
# Backup mail server
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-backup
spec:
  zoneRef: example-com
  name: "@"
  priority: 20
  mailServer: mail2.example.com.
  ttl: 3600

Sending servers will try mail1.example.com first (priority 10), falling back to mail2.example.com (priority 20) if the primary is unavailable.

Load Balancing

Equal priority values enable round-robin load balancing:

# Server 1
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-1
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail1.example.com.
---
# Server 2 (same priority)
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-2
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail2.example.com.

Both servers share the load equally.

Subdomain Mail

Configure mail for a subdomain:

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: support-mail
spec:
  zoneRef: example-com
  name: support  # Email: user@support.example.com
  priority: 10
  mailServer: mail-support.example.com.

Common Configurations

Google Workspace (formerly G Suite)

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-google-1
spec:
  zoneRef: example-com
  name: "@"
  priority: 1
  mailServer: aspmx.l.google.com.
  ttl: 3600
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-google-2
spec:
  zoneRef: example-com
  name: "@"
  priority: 5
  mailServer: alt1.aspmx.l.google.com.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-google-3
spec:
  zoneRef: example-com
  name: "@"
  priority: 5
  mailServer: alt2.aspmx.l.google.com.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-google-4
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: alt3.aspmx.l.google.com.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-google-5
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: alt4.aspmx.l.google.com.

Microsoft 365

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-microsoft
spec:
  zoneRef: example-com
  name: "@"
  priority: 0
  mailServer: example-com.mail.protection.outlook.com.  # Replace 'example-com' with your domain
  ttl: 3600

Self-Hosted Mail Server

# Primary MX
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-primary
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail.example.com.
---
# Corresponding A record for mail server
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: mail-server
spec:
  zoneRef: example-com
  name: mail
  ipv4Address: "203.0.113.10"

Best Practices

Always use FQDNs - End mailServer values with a dot (.)
Set appropriate TTLs - Use longer TTLs (3600-86400) for stable mail configurations
Configure backups - Use multiple MX records with different priorities for redundancy
Test mail delivery - Verify mail flow after DNS changes
Coordinate with SPF/DKIM - Update TXT records when adding mail servers

Required Supporting Records

MX records need corresponding A/AAAA records for the mail servers:

# MX record points to mail.example.com
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-main
spec:
  zoneRef: example-com
  name: "@"
  priority: 10
  mailServer: mail.example.com.
---
# A record for mail.example.com
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: mail-server-ipv4
spec:
  zoneRef: example-com
  name: mail
  ipv4Address: "203.0.113.10"
---
# AAAA record for IPv6
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: mail-server-ipv6
spec:
  zoneRef: example-com
  name: mail
  ipv6Address: "2001:db8::10"

Status Monitoring

Check the status of your MX record:

kubectl get mxrecord mx-primary -o yaml

Look for the status.conditions field:

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

Troubleshooting

Mail not being delivered

Check MX record status:

kubectl get mxrecord mx-primary -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'

Verify DNS propagation:
```
dig MX example.com @<dns-server-ip>
```
Test from external servers:
```
nslookup -type=MX example.com 8.8.8.8
```
Check mail server A/AAAA records exist:
```
dig A mail.example.com
```

Common Mistakes

Missing trailing dot - mail.example.com instead of mail.example.com.
No A/AAAA record - MX points to a hostname that doesn’t resolve
Wrong priority - Higher priority when you meant lower (remember: lower = preferred)
Relative vs absolute - Without trailing dot, name is treated as relative to zone

Testing Mail Configuration

Test MX lookup

# Query MX records
dig MX example.com

# Expected output shows priority and mail server
;; ANSWER SECTION:
example.com.  3600  IN  MX  10 mail.example.com.
example.com.  3600  IN  MX  20 mail2.example.com.

Test mail server connectivity

# Test SMTP connection
telnet mail.example.com 25

# Or using openssl for TLS
openssl s_client -starttls smtp -connect mail.example.com:25

Next Steps

TXT Records - Configure SPF, DKIM, DMARC for mail authentication
A Records - Create A records for mail servers
DNS Records Overview - Complete guide to all record types
Monitoring DNS - Monitor your DNS infrastructure

TXT Records (Text)

TXT records store arbitrary text data in DNS. They’re commonly used for domain verification, email security (SPF, DKIM, DMARC), and other service configurations.

Creating a TXT Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: verification-txt
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: "@"
  text: "v=spf1 include:_spf.example.com ~all"
  ttl: 3600

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details.

Common Use Cases

SPF (Sender Policy Framework)

Authorize mail servers to send email on behalf of your domain:

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf-record
spec:
  zoneRef: example-com
  name: "@"
  text: "v=spf1 mx include:_spf.google.com ~all"
  ttl: 3600

Common SPF mechanisms:

mx - Allow servers in MX records
a - Allow A/AAAA records of domain
ip4:192.0.2.0/24 - Allow specific IPv4 range
include:domain.com - Include another domain’s SPF policy
~all - Soft fail (recommended)
-all - Hard fail (strict)

DKIM (Domain Keys Identified Mail)

Publish DKIM public keys:

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dkim-selector
spec:
  zoneRef: example-com
  name: default._domainkey  # selector._domainkey format
  text: "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBA..."
  ttl: 3600

DMARC (Domain-based Message Authentication)

Set email authentication policy:

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dmarc-policy
spec:
  zoneRef: example-com
  name: _dmarc
  text: "v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com"
  ttl: 3600

DMARC policies:

p=none - Monitor only (recommended for testing)
p=quarantine - Treat failures as spam
p=reject - Reject failures outright

Domain Verification

Verify domain ownership for services:

# Google verification
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: google-verification
spec:
  zoneRef: example-com
  name: "@"
  text: "google-site-verification=1234567890abcdef"
---
# Microsoft verification
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: ms-verification
spec:
  zoneRef: example-com
  name: "@"
  text: "MS=ms12345678"

Service-Specific Records

Atlassian Domain Verification

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: atlassian-verify
spec:
  zoneRef: example-com
  name: "@"
  text: "atlassian-domain-verification=abc123"

Stripe Domain Verification

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: stripe-verify
spec:
  zoneRef: example-com
  name: "_stripe-verification"
  text: "stripe-verification=xyz789"

Multiple TXT Values

Some records require multiple TXT strings. Create separate records:

# SPF record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: txt-spf
spec:
  zoneRef: example-com
  name: "@"
  text: "v=spf1 include:_spf.google.com ~all"
---
# Domain verification (same name, different value)
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: txt-verify
spec:
  zoneRef: example-com
  name: "@"
  text: "google-site-verification=abc123"

Both records will exist under the same DNS name.

String Formatting

Long Strings

DNS TXT records have a 255-character limit per string. For longer values, the DNS server automatically splits them:

spec:
  text: "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQC..."  # Can be long

Special Characters

Quote strings containing spaces or special characters:

spec:
  text: "This string contains spaces"
  text: "key=value; another-key=another value"

Best Practices

Keep TTLs moderate - 3600 (1 hour) is typical for TXT records
Test before deploying - Verify SPF/DKIM/DMARC records with online tools
Monitor DMARC reports - Set up rua and ruf addresses to receive reports
Start with soft policies - Use ~all for SPF and p=none for DMARC initially
Document record purposes - Use clear resource names

Status Monitoring

kubectl get txtrecord spf-record -o yaml

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
  observedGeneration: 1

Troubleshooting

Test TXT record

# Query TXT records
dig TXT example.com

# Test SPF
dig TXT example.com | grep spf

# Test DKIM
dig TXT default._domainkey.example.com

# Test DMARC
dig TXT _dmarc.example.com

Online Validation Tools

Common Issues

SPF too long - Limit DNS lookups to 10 (use include wisely)
DKIM not found - Verify selector name matches mail server configuration
DMARC syntax error - Validate with online tools before deploying

Next Steps

MX Records - Configure mail servers
DNS Records Overview - Complete guide to all record types
Monitoring DNS - Monitor your DNS infrastructure

NS Records (Name Server)

NS records delegate a subdomain to a different set of nameservers. This is essential for subdomain delegation and zone distribution.

Creating an NS Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: subdomain-ns
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: sub              # Subdomain to delegate
  nameserver: ns1.subdomain-host.com.  # Must end with dot (FQDN)
  ttl: 3600

This delegates sub.example.com to ns1.subdomain-host.com.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details.

Subdomain Delegation

Delegate a subdomain to external nameservers:

# Primary nameserver
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: dev-ns1
spec:
  zoneRef: example-com
  name: dev
  nameserver: ns1.hosting-provider.com.
---
# Secondary nameserver
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: dev-ns2
spec:
  zoneRef: example-com
  name: dev
  nameserver: ns2.hosting-provider.com.

Now dev.example.com is managed by the hosting provider’s DNS servers.

Common Use Cases

Multi-Cloud Delegation

# Delegate subdomain to AWS Route 53
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: aws-ns1
spec:
  zoneRef: example-com
  name: aws
  nameserver: ns-123.awsdns-12.com.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: aws-ns2
spec:
  zoneRef: example-com
  name: aws
  nameserver: ns-456.awsdns-45.net.

Environment Separation

# Production environment
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: prod-ns1
spec:
  zoneRef: example-com
  name: prod
  nameserver: ns-prod1.example.com.
---
# Staging environment
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: staging-ns1
spec:
  zoneRef: example-com
  name: staging
  nameserver: ns-staging1.example.com.

FQDN Requirement

CRITICAL: The nameserver field MUST end with a dot (.):

# ✅ CORRECT
nameserver: ns1.example.com.

# ❌ WRONG
nameserver: ns1.example.com

Glue Records

When delegating to nameservers within the delegated zone, you need glue records (A/AAAA):

# NS delegation
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: sub-ns
spec:
  zoneRef: example-com
  name: sub
  nameserver: ns1.sub.example.com.  # Nameserver is within delegated zone
---
# Glue record (A record for the nameserver)
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: sub-ns-glue
spec:
  zoneRef: example-com
  name: ns1.sub
  ipv4Address: "203.0.113.10"

Best Practices

Use multiple NS records - Always specify at least 2 nameservers for redundancy
FQDNs only - Always end nameserver values with a dot
Match TTLs - Use consistent TTLs across NS records for the same subdomain
Glue records - Provide A/AAAA records when NS points within delegated zone
Test delegation - Verify subdomain resolution after delegation

Status Monitoring

kubectl get nsrecord subdomain-ns -o yaml

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
  observedGeneration: 1

Troubleshooting

Test NS delegation

# Query NS records
dig NS sub.example.com

# Test resolution through delegated nameservers
dig @ns1.subdomain-host.com www.sub.example.com

Common Issues

Missing glue records - Circular dependency if NS points within delegated zone
Wrong FQDN - Missing trailing dot causes relative name
Single nameserver - No redundancy if one server fails

Next Steps

DNS Records Overview - Complete guide to all record types
A Records - Create glue records for nameservers
Monitoring DNS - Monitor your DNS infrastructure

SRV Records (Service Location)

SRV records specify the location of services, including hostname and port number. They’re used for service discovery in protocols like SIP, XMPP, LDAP, and Minecraft.

Creating an SRV Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: xmpp-server
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  service: xmpp-client  # Service name (without leading underscore)
  proto: tcp            # Protocol: tcp or udp
  name: "@"             # Domain (use @ for zone apex)
  priority: 10
  weight: 50
  port: 5222
  target: xmpp.example.com.  # Must end with dot (FQDN)
  ttl: 3600

This creates _xmpp-client._tcp.example.com pointing to xmpp.example.com:5222.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details.

SRV Record Format

The DNS name format is: _service._proto.name.domain

service: Service name (e.g., xmpp-client, sip, ldap)
proto: Protocol (tcp or udp)
name: Subdomain or @ for zone apex
priority: Lower values are preferred (like MX records)
weight: For load balancing among equal priorities (0-65535)
port: Service port number
target: Hostname providing the service (FQDN with trailing dot)

Common Services

XMPP (Jabber)

# Client connections
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: xmpp-client
spec:
  zoneRef: example-com
  service: xmpp-client
  proto: tcp
  name: "@"
  priority: 5
  weight: 0
  port: 5222
  target: xmpp.example.com.
---
# Server-to-server
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: xmpp-server
spec:
  zoneRef: example-com
  service: xmpp-server
  proto: tcp
  name: "@"
  priority: 5
  weight: 0
  port: 5269
  target: xmpp.example.com.

SIP (VoIP)

# SIP over TCP
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-tcp
spec:
  zoneRef: example-com
  service: sip
  proto: tcp
  name: "@"
  priority: 10
  weight: 50
  port: 5060
  target: sip.example.com.
---
# SIP over UDP
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-udp
spec:
  zoneRef: example-com
  service: sip
  proto: udp
  name: "@"
  priority: 10
  weight: 50
  port: 5060
  target: sip.example.com.

LDAP

apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: ldap-service
spec:
  zoneRef: example-com
  service: ldap
  proto: tcp
  name: "@"
  priority: 0
  weight: 100
  port: 389
  target: ldap.example.com.

Minecraft Server

apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: minecraft
spec:
  zoneRef: example-com
  service: minecraft
  proto: tcp
  name: "@"
  priority: 0
  weight: 5
  port: 25565
  target: mc.example.com.

Priority and Weight

Failover with Priority

# Primary server (priority 10)
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-primary
spec:
  zoneRef: example-com
  service: sip
  proto: tcp
  name: "@"
  priority: 10
  weight: 0
  port: 5060
  target: sip1.example.com.
---
# Backup server (priority 20)
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-backup
spec:
  zoneRef: example-com
  service: sip
  proto: tcp
  name: "@"
  priority: 20
  weight: 0
  port: 5060
  target: sip2.example.com.

Load Balancing with Weight

# Server 1 (weight 70 = 70% of traffic)
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-1
spec:
  zoneRef: example-com
  service: xmpp-client
  proto: tcp
  name: "@"
  priority: 10
  weight: 70
  port: 5222
  target: xmpp1.example.com.
---
# Server 2 (weight 30 = 30% of traffic)
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-2
spec:
  zoneRef: example-com
  service: xmpp-client
  proto: tcp
  name: "@"
  priority: 10
  weight: 30
  port: 5222
  target: xmpp2.example.com.

FQDN Requirement

CRITICAL: The target field MUST end with a dot (.):

# ✅ CORRECT
target: server.example.com.

# ❌ WRONG
target: server.example.com

Required Supporting Records

SRV records need corresponding A/AAAA records for targets:

# SRV record
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: service-srv
spec:
  zoneRef: example-com
  service: myservice
  proto: tcp
  name: "@"
  priority: 10
  weight: 0
  port: 8080
  target: server.example.com.
---
# A record for target
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: server
spec:
  zoneRef: example-com
  name: server
  ipv4Address: "203.0.113.50"

Best Practices

Always use FQDNs - End target values with a dot
Multiple servers - Use priority/weight for redundancy and load balancing
Match protocols - Create both TCP and UDP records if service supports both
Test clients - Verify client applications can discover services via SRV
Document services - Clearly name resources for maintainability

Status Monitoring

kubectl get srvrecord xmpp-server -o yaml

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
  observedGeneration: 1

Troubleshooting

Test SRV record

# Query SRV record
dig SRV _xmpp-client._tcp.example.com

# Expected output shows priority, weight, port, and target
;; ANSWER SECTION:
_xmpp-client._tcp.example.com. 3600 IN SRV 5 0 5222 xmpp.example.com.

Common Issues

Service not auto-discovered - Verify client supports SRV lookups
Missing A/AAAA for target - Target hostname must resolve
Wrong service/proto names - Must match what client expects (check docs)

Next Steps

A Records - Create records for SRV targets
DNS Records Overview - Complete guide to all record types
Monitoring DNS - Monitor your DNS infrastructure

CAA Records (Certificate Authority Authorization)

CAA records specify which Certificate Authorities (CAs) are authorized to issue SSL/TLS certificates for your domain. This helps prevent unauthorized certificate issuance.

Creating a CAA Record

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: letsencrypt-caa
  namespace: dns-system
spec:
  zoneRef: example-com  # References DNSZone metadata.name (recommended)
  name: "@"             # Apply to entire domain
  flags: 0              # Typically 0 (non-critical)
  tag: issue            # Tag: issue, issuewild, or iodef
  value: letsencrypt.org
  ttl: 3600

This authorizes Let’s Encrypt to issue certificates for example.com.

Note: You can also use zone: example.com (matching DNSZone.spec.zoneName) instead of zoneRef. See Referencing DNS Zones for details.

CAA Tags

issue

Authorizes a CA to issue certificates for the domain:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-issue
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: letsencrypt.org  # Authorize Let's Encrypt

issuewild

Authorizes a CA to issue wildcard certificates:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-wildcard
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issuewild
  value: letsencrypt.org  # Allow wildcard certificates

iodef

Specifies URL/email for reporting policy violations:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-iodef-email
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: iodef
  value: mailto:security@example.com
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-iodef-url
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: iodef
  value: https://example.com/caa-report

Common Configurations

Let’s Encrypt

# Standard certificates
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-le-issue
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: letsencrypt.org
---
# Wildcard certificates
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-le-wildcard
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issuewild
  value: letsencrypt.org

DigiCert

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-digicert
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: digicert.com

AWS Certificate Manager

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-aws
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: amazon.com
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-aws-wildcard
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issuewild
  value: amazon.com

Multiple CAs

Authorize multiple Certificate Authorities:

# Let's Encrypt
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-letsencrypt
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: letsencrypt.org
---
# DigiCert
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-digicert
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: digicert.com

Deny All Issuance

Prevent any CA from issuing certificates:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-deny-all
spec:
  zoneRef: example-com
  name: "@"
  flags: 0
  tag: issue
  value: ";"  # Semicolon means no CA is authorized

Flags

0 - Non-critical (default, recommended)
128 - Critical - CA MUST understand all CAA properties or refuse issuance

Most deployments use flags: 0.

Subdomain CAA Records

Apply CAA policy to specific subdomains:

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-staging
spec:
  zoneRef: example-com
  name: staging  # staging.example.com
  flags: 0
  tag: issue
  value: letsencrypt.org  # Only Let's Encrypt for staging

Best Practices

Start with permissive policies - Allow your current CA before enforcing restrictions
Test thoroughly - Verify certificate renewal works after adding CAA
Use iodef - Configure reporting to catch unauthorized issuance attempts
Document authorized CAs - Maintain list of approved CAs in your security policy
Regular audits - Review CAA records periodically

Certificate Authority Values

Common CA values for the issue and issuewild tags:

Let’s Encrypt: letsencrypt.org
DigiCert: digicert.com
AWS ACM: amazon.com
GlobalSign: globalsign.com
Sectigo (Comodo): sectigo.com
GoDaddy: godaddy.com
Google Trust Services: pki.goog

Check your CA’s documentation for the correct value.

Status Monitoring

kubectl get caarecord letsencrypt-caa -o yaml

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
  observedGeneration: 1

Troubleshooting

Test CAA records

# Query CAA records
dig CAA example.com

# Expected output
;; ANSWER SECTION:
example.com. 3600 IN CAA 0 issue "letsencrypt.org"
example.com. 3600 IN CAA 0 issuewild "letsencrypt.org"

Certificate Issuance Failures

If certificate issuance fails after adding CAA:

Verify CA is authorized:
```
dig CAA example.com
```
Check for typos in CA value
Ensure both issue and issuewild are configured if using wildcards
Test with online tools:
- SSLMate CAA Test
- DigiCert CAA Check

Common Mistakes

Wrong CA value - Each CA has a specific value (check their docs)
Missing issuewild - Wildcard certificates need separate authorization
Critical flag - Using flags: 128 can cause issues if CA doesn’t understand all tags

Security Benefits

Prevent unauthorized issuance - CAs must check CAA before issuing
Incident detection - iodef tag provides violation notifications
Defense in depth - Additional layer beyond domain validation
Compliance - Many security standards recommend CAA records

Next Steps

TXT Records - Configure domain verification
DNS Records Overview - Complete guide to all record types
Monitoring DNS - Monitor your DNS infrastructure

Configuration

Configure the Bindy DNS operator and BIND9 instances for your environment.

Controller Configuration

The Bindy controller is configured through environment variables set in the deployment.

See Environment Variables for details on all available configuration options.

BIND9 Instance Configuration

Configure BIND9 instances through the Bind9Instance custom resource:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
spec:
  clusterRef: my-cluster
  role: primary
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.0.0/8"
    dnssec:
      enabled: true
      validation: true

Configuration Options

Container Image Configuration

Customize the BIND9 container image and pull configuration:

spec:
  # At instance level (overrides cluster)
  image:
    image: "my-registry.example.com/bind9:custom"
    imagePullPolicy: "Always"
    imagePullSecrets:
      - my-registry-secret

Or configure at the cluster level for all instances:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: my-cluster
spec:
  # Default image configuration for all instances
  image:
    image: "internetsystemsconsortium/bind9:9.18"
    imagePullPolicy: "IfNotPresent"
    imagePullSecrets:
      - shared-pull-secret

Fields:

image: Full container image reference (e.g., registry/image:tag)
imagePullPolicy: Always, IfNotPresent, or Never
imagePullSecrets: List of secret names for private registries

Custom Configuration Files

Use custom ConfigMaps for BIND9 configuration:

spec:
  # Reference custom ConfigMaps
  configMapRefs:
    namedConf: "my-custom-named-conf"
    namedConfOptions: "my-custom-options"
    namedConfZones: "my-custom-zones"  # Optional: for zone definitions

Create your custom ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-named-conf
  namespace: dns-system
data:
  named.conf: |
    // Custom BIND9 configuration
    include "/etc/bind/named.conf.options";
    include "/etc/bind/zones/named.conf.zones";

    logging {
      channel custom_log {
        file "/var/log/named/queries.log" versions 3 size 5m;
        severity info;
      };
      category queries { custom_log; };
    };

Zones Configuration File:

If you need to provide a custom zones file (e.g., for pre-configured zones), create a ConfigMap with named.conf.zones:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-zones
  namespace: dns-system
data:
  named.conf.zones: |
    // Zone definitions
    zone "example.com" {
      type primary;
      file "/etc/bind/zones/example.com.zone";
    };

    zone "internal.local" {
      type primary;
      file "/etc/bind/zones/internal.local.zone";
    };

Then reference it in your Bind9Instance:

spec:
  configMapRefs:
    namedConfZones: "my-custom-zones"

Default Behavior:

If configMapRefs is not specified, Bindy auto-generates configuration from the config block
If custom ConfigMaps are provided, they take precedence
The namedConfZones ConfigMap is optional - only include it if you need to pre-configure zones
If no namedConfZones is provided, no zones file will be included (zones can be added dynamically via RNDC)

Recursion

Control whether the DNS server performs recursive queries:

spec:
  config:
    recursion: false  # Disable for authoritative servers

For authoritative DNS servers, recursion should be disabled.

Query Access Control

Specify which networks can query the DNS server:

spec:
  config:
    allowQuery:
      - "0.0.0.0/0"        # Allow from anywhere (public DNS)
      - "10.0.0.0/8"       # Private network only
      - "192.168.1.0/24"   # Specific subnet

Zone Transfer Access Control

Restrict zone transfers to authorized servers:

spec:
  config:
    allowTransfer:
      - "10.0.1.0/24"      # Secondary DNS network
      - "192.168.100.5"    # Specific secondary server

DNSSEC Configuration

Enable DNSSEC signing and validation:

spec:
  config:
    dnssec:
      enabled: true        # Enable DNSSEC signing
      validation: true     # Enable DNSSEC validation

RBAC Configuration

Configure Role-Based Access Control for the operator.

See RBAC for detailed RBAC setup.

Resource Limits

Set CPU and memory limits for BIND9 pods.

See Resource Limits for resource configuration.

Configuration Best Practices

Separate Primary and Secondary - Use different instances for primary and secondary roles
Limit Zone Transfers - Only allow transfers to known secondaries
Enable DNSSEC - Use DNSSEC for production zones
Set Appropriate Replicas - Use 2+ replicas for high availability
Use Labels - Organize instances with meaningful labels

Next Steps

Environment Variables - Controller configuration
RBAC Setup - Permissions and service accounts
Resource Limits - CPU and memory configuration

Environment Variables

Configure the Bindy controller using environment variables.

Controller Environment Variables

RUST_LOG

Control logging level:

env:
  - name: RUST_LOG
    value: "info"  # Options: error, warn, info, debug, trace

Levels:

error - Only errors
warn - Warnings and errors
info - Informational messages (default)
debug - Detailed debugging
trace - Very detailed tracing

RUST_LOG_FORMAT

Control logging output format:

env:
  - name: RUST_LOG_FORMAT
    value: "text"  # Options: text, json

Formats:

text - Human-readable compact text format (default)
json - Structured JSON format for log aggregation tools

Use JSON format for:

Kubernetes production deployments
Log aggregation systems (Loki, ELK, Splunk)
Centralized logging and monitoring
Automated log parsing and analysis

Example JSON output:

{
  "timestamp": "2025-11-30T10:00:00.123456Z",
  "level": "INFO",
  "message": "Starting BIND9 DNS Controller",
  "file": "main.rs",
  "line": 80,
  "threadName": "bindy-controller"
}

RECONCILE_INTERVAL

Set how often to reconcile resources (in seconds):

env:
  - name: RECONCILE_INTERVAL
    value: "300"  # 5 minutes

NAMESPACE

Limit operator to specific namespace:

env:
  - name: NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace

Omit to watch all namespaces (requires ClusterRole).

Example Deployment Configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: bindy
  namespace: dns-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bindy
  template:
    metadata:
      labels:
        app: bindy
    spec:
      serviceAccountName: bindy
      containers:
      - name: controller
        image: ghcr.io/firestoned/bindy:latest
        env:
        - name: RUST_LOG
          value: "info"
        - name: RUST_LOG_FORMAT
          value: "json"
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace

Best Practices

Use info level in production - Balance between visibility and noise
Enable debug for troubleshooting - Temporarily increase to debug level
Use JSON format in production - Enable structured logging for better log aggregation
Use text format for development - More readable for local debugging
Set reconcile interval appropriately - Don’t set too low to avoid API pressure
Use namespace scoping - Scope to specific namespace if not managing cluster-wide DNS

RBAC (Role-Based Access Control)

Configure Kubernetes RBAC for the Bindy controller.

Required Permissions

The Bindy controller needs permissions to:

Manage Bind9Instance, DNSZone, and DNS record resources
Create and manage Deployments, Services, ConfigMaps, and ServiceAccounts
Update resource status fields
Create events for logging

ClusterRole

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: bindy-role
rules:
  # Bindy CRDs
  - apiGroups: ["bindy.firestoned.io"]
    resources:
      - "bind9instances"
      - "bind9instances/status"
      - "dnszones"
      - "dnszones/status"
      - "arecords"
      - "arecords/status"
      - "aaaarecords"
      - "aaaarecords/status"
      - "cnamerecords"
      - "cnamerecords/status"
      - "mxrecords"
      - "mxrecords/status"
      - "txtrecords"
      - "txtrecords/status"
      - "nsrecords"
      - "nsrecords/status"
      - "srvrecords"
      - "srvrecords/status"
      - "caarecords"
      - "caarecords/status"
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  
  # Kubernetes resources
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  - apiGroups: [""]
    resources: ["services", "configmaps", "serviceaccounts"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]

  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch"]

ServiceAccount

apiVersion: v1
kind: ServiceAccount
metadata:
  name: bindy
  namespace: dns-system

ClusterRoleBinding

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: bindy-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: bindy-role
subjects:
- kind: ServiceAccount
  name: bindy
  namespace: dns-system

Namespace-Scoped RBAC

For namespace-scoped deployments, use Role instead of ClusterRole:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: bindy-role
  namespace: dns-system
rules:
  # Same rules as ClusterRole
  - apiGroups: ["bindy.firestoned.io"]
    resources: ["bind9instances", "dnszones", "*records"]
    verbs: ["*"]
  
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["*"]
  
  - apiGroups: [""]
    resources: ["services", "configmaps"]
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: bindy-rolebinding
  namespace: dns-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: bindy-role
subjects:
- kind: ServiceAccount
  name: bindy
  namespace: dns-system

Applying RBAC

# Apply all RBAC resources
kubectl apply -f deploy/rbac/

# Verify ServiceAccount
kubectl get serviceaccount bindy -n dns-system

# Verify ClusterRole
kubectl get clusterrole bindy-role

# Verify ClusterRoleBinding
kubectl get clusterrolebinding bindy-rolebinding

Security Best Practices

Least Privilege - Only grant necessary permissions
Namespace Scoping - Use namespace-scoped roles when possible
Separate ServiceAccounts - Don’t reuse default ServiceAccount
Audit Regularly - Review permissions periodically
Use Pod Security Policies - Restrict pod capabilities

Troubleshooting RBAC

Check if controller has required permissions:

# Check what the ServiceAccount can do
kubectl auth can-i list dnszones \
  --as=system:serviceaccount:dns-system:bindy

# Describe the ClusterRoleBinding
kubectl describe clusterrolebinding bindy-rolebinding

# Check controller logs for permission errors
kubectl logs -n dns-system deployment/bindy | grep -i forbidden

Resource Limits

Configure CPU and memory limits for BIND9 pods.

Setting Resource Limits

Configure resources in the Bind9Instance spec:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
spec:
  replicas: 2
  resources:
    requests:
      cpu: "100m"
      memory: "128Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"

Recommended Values

Small Deployment (Few zones)

resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"

Medium Deployment (Multiple zones)

resources:
  requests:
    cpu: "200m"
    memory: "256Mi"
  limits:
    cpu: "1000m"
    memory: "1Gi"

Large Deployment (Many zones, high traffic)

resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "2000m"
    memory: "2Gi"

Best Practices

Set both requests and limits - Ensures predictable performance
Start conservative - Begin with lower values and adjust based on monitoring
Monitor usage - Use metrics to right-size resources
Leave headroom - Don’t max out limits
Consider query volume - High-traffic DNS needs more resources

Monitoring Resource Usage

# View pod resource usage
kubectl top pods -n dns-system -l app=bind9

# Describe pod to see limits
kubectl describe pod -n dns-system <pod-name>

Monitoring

Monitor the health and performance of your Bindy DNS infrastructure.

Status Conditions

All Bindy resources report their status using standardized conditions:

# Check Bind9Instance status
kubectl get bind9instance primary-dns -n dns-system -o jsonpath='{.status.conditions}'

# Check DNSZone status
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.conditions}'

See Status Conditions for detailed condition types.

Logging

View controller and BIND9 logs:

# Controller logs
kubectl logs -n dns-system deployment/bindy

# BIND9 instance logs
kubectl logs -n dns-system -l instance=primary-dns

# Follow logs
kubectl logs -n dns-system deployment/bindy -f

See Logging for log configuration.

Metrics

Monitor resource usage and performance:

# Pod resource usage
kubectl top pods -n dns-system

# Node resource usage
kubectl top nodes

See Metrics for detailed metrics.

Health Checks

BIND9 pods include liveness and readiness probes:

livenessProbe:
  exec:
    command: ["dig", "@localhost", "version.bind", "txt", "chaos"]
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  exec:
    command: ["dig", "@localhost", "version.bind", "txt", "chaos"]
  initialDelaySeconds: 5
  periodSeconds: 5

Check probe status:

kubectl describe pod -n dns-system <bind9-pod-name>

Monitoring Tools

Prometheus

Scrape metrics from BIND9 using bind_exporter:

# Add exporter sidecar to Bind9Instance
# (Future enhancement)

Grafana

Create dashboards for:

Query rate and latency
Zone transfer status
Resource usage
Error rates

Alerts

Set up alerts for:

Pod crashes or restarts
Failed zone transfers
High query latency
Resource exhaustion
DNSSEC validation failures

Next Steps

Status Conditions - Understanding resource status
Logging - Log configuration and analysis
Metrics - Detailed metrics collection
Troubleshooting - Debugging issues

Status Conditions

This document describes the standardized status conditions used across all Bindy CRDs.

Condition Types

All Bindy custom resources (Bind9Instance, DNSZone, and all DNS record types) use the following standardized condition types:

Ready

Description: Indicates whether the resource is fully operational and ready to serve its intended purpose
Common Use: Primary condition type used by all reconcilers
Status Values:
- True: Resource is ready and operational
- False: Resource is not ready (error or in progress)
- Unknown: Status cannot be determined

Available

Description: Indicates whether the resource is available for use
Common Use: Used to distinguish between “ready” and “available” when resources may be ready but not yet serving traffic
Status Values:
- True: Resource is available
- False: Resource is not available
- Unknown: Availability cannot be determined

Progressing

Description: Indicates whether the resource is currently being worked on
Common Use: During initial creation or updates
Status Values:
- True: Resource is being created or updated
- False: Resource is not currently progressing
- Unknown: Progress status cannot be determined

Degraded

Description: Indicates that the resource is functioning but in a degraded state
Common Use: When some replicas are down but service continues, or when non-critical features are unavailable
Status Values:
- True: Resource is degraded
- False: Resource is not degraded
- Unknown: Degradation status cannot be determined

Failed

Description: Indicates that the resource has failed and cannot fulfill its purpose
Common Use: Permanent failures that require intervention
Status Values:
- True: Resource has failed
- False: Resource has not failed
- Unknown: Failure status cannot be determined

Condition Structure

All conditions follow this structure:

status:
  conditions:
    - type: Ready              # One of: Ready, Available, Progressing, Degraded, Failed
      status: "True"           # One of: "True", "False", "Unknown"
      reason: Ready            # Machine-readable reason (typically same as type)
      message: "Bind9Instance configured with 2 replicas"  # Human-readable message
      lastTransitionTime: "2024-11-26T10:00:00Z"          # RFC3339 timestamp
  observedGeneration: 1        # Generation last observed by controller
  # Resource-specific fields (replicas, recordCount, etc.)

Current Usage

Bind9Instance

Uses Ready condition type
Status True when Deployment, Service, and ConfigMap are successfully created
Status False when resource creation fails
Additional status fields:
- replicas: Total number of replicas
- readyReplicas: Number of ready replicas

DNSZone

Uses Ready condition type
Status True when zone file is created and instances are matched
Status False when zone creation fails
Additional status fields:
- recordCount: Number of records in the zone
- observedGeneration: Last observed generation

DNS Records (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)

All use Ready condition type
Status True when record is successfully added to zone
Status False when record creation fails
Additional status fields:
- observedGeneration: Last observed generation

Best Practices

Always set the condition type: Use one of the five standardized types
Include timestamps: Set lastTransitionTime when condition status changes
Provide clear messages: The message field should be human-readable and actionable
Use appropriate reasons: The reason field should be machine-readable and consistent
Update observedGeneration: Always update to match the resource’s current generation
Multiple conditions: Resources can have multiple conditions simultaneously (e.g., Ready: True and Degraded: True)

Examples

Successful Bind9Instance

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Ready
      message: "Bind9Instance configured with 2 replicas"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  replicas: 2
  readyReplicas: 2

Failed DNSZone

status:
  conditions:
    - type: Ready
      status: "False"
      reason: Failed
      message: "No Bind9Instances matched selector"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  recordCount: 0

Progressing Deployment

status:
  conditions:
    - type: Progressing
      status: "True"
      reason: Progressing
      message: "Deployment is rolling out"
      lastTransitionTime: "2024-11-26T10:00:00Z"
    - type: Ready
      status: "False"
      reason: Progressing
      message: "Waiting for deployment to complete"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 2
  replicas: 2
  readyReplicas: 1

Validation

All condition types are enforced via CRD validation. Attempting to use a condition type not in the enum will result in a validation error:

$ kubectl apply -f invalid-condition.yaml
Error from server (Invalid): error when creating "invalid-condition.yaml":
Bind9Instance.bindy.firestoned.io "test" is invalid:
status.conditions[0].type: Unsupported value: "CustomType":
supported values: "Ready", "Available", "Progressing", "Degraded", "Failed"

Logging

Configure and analyze logs from the Bindy controller and BIND9 instances.

Controller Logging

Log Levels

Set log level via RUST_LOG environment variable:

env:
  - name: RUST_LOG
    value: "info"  # error, warn, info, debug, trace

Log Format

Set log output format via RUST_LOG_FORMAT environment variable:

env:
  - name: RUST_LOG_FORMAT
    value: "json"  # text or json (default: text)

Text format (default):

Human-readable compact format
Ideal for development and local debugging
Includes timestamps, file locations, and line numbers

JSON format:

Structured JSON output
Recommended for production Kubernetes deployments
Easy integration with log aggregation tools (Loki, ELK, Splunk)
Enables programmatic log parsing and analysis

Viewing Controller Logs

# View recent logs
kubectl logs -n dns-system deployment/bindy --tail=100

# Follow logs in real-time
kubectl logs -n dns-system deployment/bindy -f

# Filter by log level
kubectl logs -n dns-system deployment/bindy | grep ERROR

# Search for specific resource
kubectl logs -n dns-system deployment/bindy | grep "example-com"

BIND9 Instance Logging

BIND9 instances are configured by default to log to stderr, making logs available through standard Kubernetes logging commands.

Default Logging Configuration

Bindy automatically configures BIND9 with the following logging channels:

stderr_log: All logs directed to stderr for container-native logging
Severity: Info level by default (configurable)
Categories: Default, queries, security, zone transfers (xfer-in/xfer-out)
Format: Includes timestamps, categories, and severity levels

Viewing BIND9 Logs

# Logs from all BIND9 pods
kubectl logs -n dns-system -l app=bind9

# Logs from specific instance
kubectl logs -n dns-system -l instance=primary-dns

# Follow logs
kubectl logs -n dns-system -l instance=primary-dns -f --tail=50

Common Log Messages

Successful Zone Load:

zone example.com/IN: loaded serial 2024010101

Zone Transfer:

transfer of 'example.com/IN' from 10.0.1.10#53: Transfer completed

Query Logging (if enabled):

client @0x7f... 192.0.2.1#53210: query: www.example.com IN A

Log Aggregation

Using Fluentd/Fluent Bit

Collect logs to centralized logging:

# Example Fluent Bit DaemonSet configuration
# Automatically collects pod logs

Using Loki

Store and query logs with Grafana Loki:

# Query logs for DNS zone
{namespace="dns-system", app="bind9"} |= "example.com"

# Query for errors
{namespace="dns-system"} |= "ERROR"

Structured Logging

JSON Format

Enable JSON logging with RUST_LOG_FORMAT=json:

env:
  - name: RUST_LOG_FORMAT
    value: "json"

Example JSON output:

{
  "timestamp": "2025-11-30T10:00:00.123456Z",
  "level": "INFO",
  "message": "Reconciling DNSZone: dns-system/example-com",
  "file": "dnszone.rs",
  "line": 142,
  "threadName": "bindy-controller"
}

Text Format

Default human-readable format (RUST_LOG_FORMAT=text or unset):

2025-11-30T10:00:00.123456Z dnszone.rs:142 INFO bindy-controller Reconciling DNSZone: dns-system/example-com

Log Retention

Configure log retention based on your needs:

Development: 7 days
Production: 30-90 days
Compliance: As required by regulations

Troubleshooting with Logs

Find Failed Reconciliations

kubectl logs -n dns-system deployment/bindy | grep "ERROR\|Failed"

Track Zone Transfer Issues

kubectl logs -n dns-system -l dns-role=secondary | grep "transfer"

Monitor Resource Creation

kubectl logs -n dns-system deployment/bindy | grep "Creating\|Updating"

Best Practices

Use appropriate log levels - info for production, debug for troubleshooting
Use JSON format in production - Enable structured logging for better integration with log aggregation tools
Use text format for development - More readable for local debugging and development
Centralize logs - Use log aggregation for easier analysis
Set up log rotation - Prevent disk space issues
Create alerts - Alert on ERROR level logs
Regular review - Periodically review logs for issues

Example Production Configuration

env:
  - name: RUST_LOG
    value: "info"
  - name: RUST_LOG_FORMAT
    value: "json"

Example Development Configuration

env:
  - name: RUST_LOG
    value: "debug"
  - name: RUST_LOG_FORMAT
    value: "text"

Changing Log Levels at Runtime

This guide explains how to change the controller’s log level without modifying code or redeploying the application.

Overview

The Bindy controller’s log level is configured via a ConfigMap (bindy-config), which allows runtime changes without code modifications. This is especially useful for:

Troubleshooting: Temporarily enable debug logging to investigate issues
Performance: Reduce log verbosity in production (info or warn)
Compliance: Meet PCI-DSS 3.4 requirements (no sensitive data in production logs)

Default Log Levels

Environment	Log Level	Log Format	Rationale
Production	`info`	`json`	PCI-DSS compliant, structured logging for SIEM
Staging	`info`	`json`	Production-like logging
Development	`debug`	`text`	Human-readable, detailed logging

Changing Log Level

Method 1: Update ConfigMap (Recommended)

# Change log level to debug
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-level": "debug"}}'

# Restart controller pods to apply changes
kubectl rollout restart deployment/bindy -n dns-system

# Verify new log level
kubectl logs -n dns-system -l app=bindy --tail=20

Available Log Levels:

error - Only errors (critical issues)
warn - Warnings and errors
info - Normal operations (default for production)
debug - Detailed reconciliation steps (troubleshooting)
trace - Extremely verbose (rarely needed)

Method 2: Direct Deployment Patch (Temporary)

For temporary debugging without ConfigMap changes:

# Enable debug logging (overrides ConfigMap)
kubectl set env deployment/bindy RUST_LOG=debug -n dns-system

# Revert to ConfigMap value
kubectl set env deployment/bindy RUST_LOG- -n dns-system

Warning: This method bypasses the ConfigMap and is lost on next deployment. Use for quick debugging only.

Changing Log Format

# Change to JSON format (production)
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-format": "json"}}'

# Change to text format (development)
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-format": "text"}}'

# Restart to apply
kubectl rollout restart deployment/bindy -n dns-system

Log Formats:

json - Structured JSON logs (recommended for production, SIEM integration)
text - Human-readable logs (recommended for development)

Verifying Log Level Changes

# Check current ConfigMap values
kubectl get configmap bindy-config -n dns-system -o yaml

# Check environment variables in running pod
kubectl exec -n dns-system deployment/bindy -- printenv | grep RUST_LOG

# View recent logs to confirm verbosity
kubectl logs -n dns-system -l app=bindy --tail=100

Production Log Level Best Practices

✅ DO:

Use info level in production - Balances visibility with performance
Use json format in production - Enables structured logging and SIEM integration
Temporarily enable debug for troubleshooting - Use ConfigMap, document in incident log
Revert to info after troubleshooting - Debug logs impact performance

❌ DON’T:

Leave debug enabled in production - Performance impact, log volume explosion
Use trace level - Extremely verbose, only for deep troubleshooting
Hardcode log levels in deployment - Use ConfigMap for runtime changes

Audit Debug Logs for Sensitive Data

Before enabling debug logging in production, verify no sensitive data is logged:

# Audit debug logs for secrets, passwords, keys
kubectl logs -n dns-system -l app=bindy --tail=1000 | \
  grep -iE '(password|secret|key|token|credential)'

# If sensitive data found, fix in code before enabling debug

PCI-DSS 3.4 Requirement: Mask or remove PAN (Primary Account Number) from all logs.

Bindy Compliance: Controller does not handle payment card data directly, but RNDC keys and DNS zone data are considered sensitive.

Troubleshooting Scenarios

Scenario 1: Controller Not Reconciling Zones

# Enable debug logging
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-level": "debug"}}'

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

# Watch logs for reconciliation details
kubectl logs -n dns-system -l app=bindy --follow

# Look for errors in reconciliation loop
kubectl logs -n dns-system -l app=bindy | grep -i error

Scenario 2: High Log Volume (Performance Issue)

# Reduce log level to warn
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-level": "warn"}}'

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

# Verify reduced log volume
kubectl logs -n dns-system -l app=bindy --tail=100

Scenario 3: SIEM Integration (Structured Logging)

# Ensure JSON format for SIEM
kubectl patch configmap bindy-config -n dns-system \
  --patch '{"data": {"log-format": "json"}}'

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

# Verify JSON output
kubectl logs -n dns-system -l app=bindy --tail=10 | jq .

Log Level Change Procedures (Compliance)

For compliance audits (SOX 404, PCI-DSS), document log level changes:

Change Request Template

# Log Level Change Request

**Date:** 2025-12-18
**Requester:** [Your Name]
**Approver:** [Security Team Lead]
**Environment:** Production

**Current State:**
- Log Level: info
- Log Format: json

**Requested Change:**
- Log Level: debug
- Log Format: json
- Duration: 2 hours (for troubleshooting)

**Justification:**
Investigating slow DNS zone reconciliation (Incident INC-12345)

**Rollback Plan:**
Revert to info level after 2 hours or when issue is resolved

**Approved by:** [Security Team Lead Signature]

Metrics

Monitor performance and health metrics for Bindy DNS infrastructure.

Operator Metrics

Bindy exposes Prometheus-compatible metrics on port 8080 at /metrics. These metrics provide comprehensive observability into the operator’s behavior and resource management.

Accessing Metrics

The metrics endpoint is exposed on all operator pods:

# Port forward to the operator
kubectl port-forward -n dns-system deployment/bindy-controller 8080:8080

# View metrics
curl http://localhost:8080/metrics

Available Metrics

All metrics use the namespace prefix bindy_firestoned_io_.

Reconciliation Metrics

bindy_firestoned_io_reconciliations_total (Counter) Total number of reconciliation attempts by resource type and outcome.

Labels:

resource_type: Kind of resource (Bind9Cluster, Bind9Instance, DNSZone, ARecord, AAAARecord, TXTRecord, CNAMERecord, MXRecord, NSRecord, SRVRecord, CAARecord)
status: Outcome (success, error, requeue)

# Reconciliation success rate
rate(bindy_firestoned_io_reconciliations_total{status="success"}[5m])

# Error rate by resource type
rate(bindy_firestoned_io_reconciliations_total{status="error"}[5m])

bindy_firestoned_io_reconciliation_duration_seconds (Histogram) Duration of reconciliation operations in seconds.

Labels:

resource_type: Kind of resource

Buckets: 0.001, 0.01, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 30.0, 60.0

# Average reconciliation duration
rate(bindy_firestoned_io_reconciliation_duration_seconds_sum[5m])
/ rate(bindy_firestoned_io_reconciliation_duration_seconds_count[5m])

# 95th percentile latency
histogram_quantile(0.95, bindy_firestoned_io_reconciliation_duration_seconds_bucket)

bindy_firestoned_io_requeues_total (Counter) Total number of requeue operations.

Labels:

resource_type: Kind of resource
reason: Reason for requeue (error, rate_limit, dependency_wait)

# Requeue rate by reason
rate(bindy_firestoned_io_requeues_total[5m])

Resource Lifecycle Metrics

bindy_firestoned_io_resources_created_total (Counter) Total number of resources created.

Labels:

resource_type: Kind of resource

bindy_firestoned_io_resources_updated_total (Counter) Total number of resources updated.

Labels:

resource_type: Kind of resource

bindy_firestoned_io_resources_deleted_total (Counter) Total number of resources deleted.

Labels:

resource_type: Kind of resource

bindy_firestoned_io_resources_active (Gauge) Currently active resources being tracked.

Labels:

resource_type: Kind of resource

# Resource creation rate
rate(bindy_firestoned_io_resources_created_total[5m])

# Active resources by type
bindy_firestoned_io_resources_active

Error Metrics

bindy_firestoned_io_errors_total (Counter) Total number of errors by resource type and category.

Labels:

resource_type: Kind of resource
error_type: Category (api_error, validation_error, network_error, timeout, reconcile_error)

# Error rate by type
rate(bindy_firestoned_io_errors_total[5m])

# Errors by resource type
sum(rate(bindy_firestoned_io_errors_total[5m])) by (resource_type)

Leader Election Metrics

bindy_firestoned_io_leader_elections_total (Counter) Total number of leader election events.

Labels:

status: Event type (acquired, lost, renewed)

bindy_firestoned_io_leader_status (Gauge) Current leader election status (1 = leader, 0 = follower).

Labels:

pod_name: Name of the pod

# Current leader
bindy_firestoned_io_leader_status == 1

# Leader election rate
rate(bindy_firestoned_io_leader_elections_total[5m])

Performance Metrics

bindy_firestoned_io_generation_observation_lag_seconds (Histogram) Lag between resource spec generation change and controller observation.

Labels:

resource_type: Kind of resource

Buckets: 0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 30.0, 60.0, 120.0

# Average observation lag
rate(bindy_firestoned_io_generation_observation_lag_seconds_sum[5m])
/ rate(bindy_firestoned_io_generation_observation_lag_seconds_count[5m])

Prometheus Configuration

The operator deployment includes Prometheus scrape annotations:

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "8080"
  prometheus.io/path: "/metrics"

Prometheus will automatically discover and scrape these metrics if configured with Kubernetes service discovery.

Example Queries

# Reconciliation success rate (last 5 minutes)
sum(rate(bindy_firestoned_io_reconciliations_total{status="success"}[5m]))
/ sum(rate(bindy_firestoned_io_reconciliations_total[5m]))

# DNSZone reconciliation p95 latency
histogram_quantile(0.95,
  sum(rate(bindy_firestoned_io_reconciliation_duration_seconds_bucket{resource_type="DNSZone"}[5m])) by (le)
)

# Error rate by resource type (last hour)
topk(10,
  sum(rate(bindy_firestoned_io_errors_total[1h])) by (resource_type)
)

# Active resources per type
sum(bindy_firestoned_io_resources_active) by (resource_type)

# Requeue backlog
sum(rate(bindy_firestoned_io_requeues_total[5m])) by (resource_type, reason)

Grafana Dashboard

Import the Bindy operator dashboard (coming soon) or create custom panels using the queries above.

Recommended panels:

Reconciliation Rate - Total reconciliations/sec by resource type
Reconciliation Latency - P50, P95, P99 latencies
Error Rate - Errors/sec by resource type and error category
Active Resources - Gauge showing current active resources
Leader Status - Current leader pod and election events
Resource Lifecycle - Created/Updated/Deleted rates

Resource Metrics

Pod Metrics

View CPU and memory usage:

# All DNS pods
kubectl top pods -n dns-system

# Specific instance
kubectl top pods -n dns-system -l instance=primary-dns

# Sort by CPU
kubectl top pods -n dns-system --sort-by=cpu

# Sort by memory
kubectl top pods -n dns-system --sort-by=memory

Node Metrics

# Node resource usage
kubectl top nodes

# Detailed node info
kubectl describe node <node-name>

DNS Query Metrics

Using BIND9 Statistics

Enable BIND9 statistics channel (future enhancement):

spec:
  config:
    statisticsChannels:
      - address: "127.0.0.1"
        port: 8053

Query Counters

Monitor query rate and types:

Total queries received
Queries by record type (A, AAAA, MX, etc.)
Successful vs failed queries
NXDOMAIN responses

Performance Metrics

Query Latency

Measure DNS query response time:

# Test query latency
time dig @<dns-server-ip> example.com

# Multiple queries for average
for i in {1..10}; do time dig @<dns-server-ip> example.com +short; done

Zone Transfer Metrics

Monitor zone transfer performance:

Transfer duration
Transfer size
Transfer failures
Lag between primary and secondary

Kubernetes Metrics

Resource Utilization

# View resource requests vs limits
kubectl describe pod -n dns-system <pod-name> | grep -A5 "Limits:\|Requests:"

Pod Health

# Pod status and restarts
kubectl get pods -n dns-system -o wide

# Events
kubectl get events -n dns-system --sort-by='.lastTimestamp'

Prometheus Integration

BIND9 Exporter

Deploy bind_exporter as sidecar (future enhancement):

containers:
- name: bind-exporter
  image: prometheuscommunity/bind-exporter:latest
  args:
    - "--bind.stats-url=http://localhost:8053"
  ports:
    - name: metrics
      containerPort: 9119

Service Monitor

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: bindy-metrics
spec:
  selector:
    matchLabels:
      app: bind9
  endpoints:
  - port: metrics
    interval: 30s

Key Metrics to Monitor

Query Rate - Queries per second
Query Latency - Response time
Error Rate - Failed queries percentage
Cache Hit Ratio - Cache effectiveness
Zone Transfer Status - Success/failure of transfers
Resource Usage - CPU and memory utilization
Pod Health - Running vs desired replicas

Grafana Dashboards

Create dashboards for:

DNS Overview

Total query rate
Average latency
Error rate
Top queried domains

Instance Health

Pod status
CPU/memory usage
Restart count
Network I/O

Zone Management

Zones count
Records per zone
Zone transfer status
Serial numbers

Alerting Thresholds

Recommended alert thresholds:

Metric	Warning	Critical
CPU Usage	> 70%	> 90%
Memory Usage	> 70%	> 90%
Query Latency	> 100ms	> 500ms
Error Rate	> 1%	> 5%
Pod Restarts	> 3/hour	> 10/hour

Best Practices

Baseline metrics - Establish normal operating ranges
Set appropriate alerts - Avoid alert fatigue
Monitor trends - Look for gradual degradation
Capacity planning - Use metrics to plan scaling
Regular review - Review dashboards weekly

Troubleshooting

Diagnose and resolve common issues with Bindy DNS operator.

Quick Diagnosis

Check Overall Health

# Check all resources
kubectl get all -n dns-system

# Check CRDs
kubectl get bind9instances,dnszones,arecords -A

# Check events
kubectl get events -n dns-system --sort-by='.lastTimestamp' | tail -20

View Status Conditions

# Bind9Instance status
kubectl get bind9instance primary-dns -n dns-system -o yaml | yq '.status'

# DNSZone status
kubectl get dnszone example-com -n dns-system -o yaml | yq '.status'

Common Issues

See Common Issues for frequently encountered problems and solutions.

DNS Record Zone Reference Issues

If you’re seeing “DNSZone not found” errors:

Records can use zone (matches DNSZone.spec.zoneName) or zoneRef (matches DNSZone.metadata.name)
Common mistake: Using zone: internal-local when the zone name is internal.local
See DNS Record Issues - DNSZone Not Found for detailed troubleshooting

Debugging Steps

See Debugging Guide for detailed debugging procedures.

FAQ

See FAQ for answers to frequently asked questions.

Getting Help

Check Logs

# Controller logs
kubectl logs -n dns-system deployment/bindy --tail=100

# BIND9 instance logs
kubectl logs -n dns-system -l instance=primary-dns

Describe Resources

# Describe Bind9Instance
kubectl describe bind9instance primary-dns -n dns-system

# Describe pods
kubectl describe pod -n dns-system <pod-name>

Check Resource Status

# Get detailed status
kubectl get bind9instance primary-dns -n dns-system -o jsonpath='{.status}' | jq

Escalation

If issues persist:

Check Common Issues
Review Debugging Guide
Check FAQ
Search GitHub issues: https://github.com/firestoned/bindy/issues
Create a new issue with:
- Kubernetes version
- Bindy version
- Resource YAMLs
- Controller logs
- Error messages

Next Steps

Common Issues - Frequently encountered problems
Debugging - Step-by-step debugging
FAQ - Frequently asked questions

Error Handling and Retry Logic

Bindy implements robust error handling for DNS record reconciliation, ensuring the operator never crashes when encountering failures. Instead, it updates status conditions, creates Kubernetes Events, and automatically retries with configurable intervals.

Overview

When reconciling DNS records, several failure scenarios can occur:

DNSZone not found: No matching DNSZone resource exists
RNDC key loading fails: Cannot load the RNDC authentication Secret
BIND9 connection fails: Unable to connect to the BIND9 server
Record operation fails: BIND9 rejects the record operation

Bindy handles all these scenarios gracefully with:

✅ Status condition updates following Kubernetes conventions
✅ Kubernetes Events for visibility
✅ Automatic retry with exponential backoff
✅ Configurable retry intervals
✅ Idempotent operations safe for multiple retries

Configuration

Retry Interval

Control how long to wait before retrying failed DNS record operations:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: bindy-operator
  namespace: bindy-system
spec:
  template:
    spec:
      containers:
      - name: bindy
        image: ghcr.io/firestoned/bindy:latest
        env:
        - name: BINDY_RECORD_RETRY_SECONDS
          value: "60"  # Default: 30 seconds

Recommendations:

Development: 10-15 seconds for faster iteration
Production: 30-60 seconds to avoid overwhelming the API server
High-load environments: 60-120 seconds to reduce reconciliation pressure

Error Scenarios

1. DNSZone Not Found

Scenario: DNS record references a zone that doesn’t exist

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zone: example.com  # No DNSZone with zoneName: example.com exists
  name: www
  ipv4Address: 192.0.2.1

Status:

status:
  conditions:
  - type: Ready
    status: "False"
    reason: ZoneNotFound
    message: "No DNSZone found for zone example.com in namespace dns-system"
    lastTransitionTime: "2025-11-29T23:45:00Z"
  observedGeneration: 1

Event:

Type     Reason         Message
Warning  ZoneNotFound   No DNSZone found for zone example.com in namespace dns-system

Resolution:

Create the DNSZone resource:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: bind9-primary

Or fix the zone reference in the record if it’s a typo

2. RNDC Key Load Failed

Scenario: Cannot load the RNDC authentication Secret

Status:

status:
  conditions:
  - type: Ready
    status: "False"
    reason: RndcKeyLoadFailed
    message: "Failed to load RNDC key for cluster bind9-primary: Secret bind9-primary-rndc-key not found"
    lastTransitionTime: "2025-11-29T23:45:00Z"

Event:

Type     Reason              Message
Warning  RndcKeyLoadFailed   Failed to load RNDC key for cluster bind9-primary

Resolution:

Check if the Secret exists:

kubectl get secret -n dns-system bind9-primary-rndc-key

Verify the Bind9Instance is running and has created its Secret:

kubectl get bind9instance -n dns-system bind9-primary -o yaml

If missing, the Bind9Instance reconciler should create it automatically

3. BIND9 Connection Failed

Scenario: Cannot connect to the BIND9 server (network issue, pod not ready, etc.)

Status:

status:
  conditions:
  - type: Ready
    status: "False"
    reason: RecordAddFailed
    message: "Cannot connect to BIND9 server at bind9-primary.dns-system.svc.cluster.local:953: connection refused. Will retry in 30s"
    lastTransitionTime: "2025-11-29T23:45:00Z"

Event:

Type     Reason           Message
Warning  RecordAddFailed  Cannot connect to BIND9 server at bind9-primary.dns-system.svc.cluster.local:953

Resolution:

Check BIND9 pod status:

kubectl get pods -n dns-system -l app=bind9-primary

Check BIND9 logs:

kubectl logs -n dns-system -l app=bind9-primary --tail=50

Verify network connectivity:

kubectl run -it --rm debug --image=nicolaka/netshoot --restart=Never -- \
  nc -zv bind9-primary.dns-system.svc.cluster.local 953

The operator will automatically retry after the configured interval

4. Record Created Successfully

Scenario: DNS record successfully created in BIND9

Status:

status:
  conditions:
  - type: Ready
    status: "True"
    reason: RecordCreated
    message: "A record www.example.com created successfully"
    lastTransitionTime: "2025-11-29T23:45:00Z"
  observedGeneration: 1

Event:

Type    Reason         Message
Normal  RecordCreated  A record www.example.com created successfully

Monitoring

View Record Status

# List all DNS records with status
kubectl get arecords,aaaarecords,cnamerecords,mxrecords,txtrecords -A

# Check specific record status
kubectl get arecord www-example -n dns-system -o jsonpath='{.status.conditions[0]}' | jq .

# Find failing records
kubectl get arecords -A -o json | \
  jq -r '.items[] | select(.status.conditions[0].status == "False") |
  "\(.metadata.namespace)/\(.metadata.name): \(.status.conditions[0].reason) - \(.status.conditions[0].message)"'

View Events

# Recent events in namespace
kubectl get events -n dns-system --sort-by='.lastTimestamp' | tail -20

# Watch events in real-time
kubectl get events -n dns-system --watch

# Filter for DNS record events
kubectl get events -n dns-system --field-selector involvedObject.kind=ARecord

Prometheus Metrics

Bindy exposes reconciliation metrics (if enabled):

# Reconciliation errors by reason
bindy_reconcile_errors_total{resource="ARecord", reason="ZoneNotFound"}

# Reconciliation duration
histogram_quantile(0.95, bindy_reconcile_duration_seconds_bucket{resource="ARecord"})

Status Reason Codes

Reason	Status	Meaning	Action Required
`RecordCreated`	`Ready=True`	DNS record successfully created in BIND9	None - record is operational
`ZoneNotFound`	`Ready=False`	No matching DNSZone resource exists	Create DNSZone or fix zone reference
`RndcKeyLoadFailed`	`Ready=False`	Cannot load RNDC key Secret	Verify Bind9Instance is running and Secret exists
`RecordAddFailed`	`Ready=False`	Failed to communicate with BIND9 or add record	Check BIND9 pod status and network connectivity

Idempotent Operations

All BIND9 operations are idempotent, making them safe for controller retries:

add_zones / add_primary_zone / add_secondary_zone

add_zones: Centralized dispatcher that routes to add_primary_zone or add_secondary_zone based on zone type
add_primary_zone: Checks if zone exists before attempting to add primary zone
add_secondary_zone: Checks if zone exists before attempting to add secondary zone
All functions return success if zone already exists
Safe to call multiple times (idempotent)

reload_zone

Returns clear error if zone doesn’t exist
Otherwise performs reload operation
Safe to call multiple times

Record Operations

All record add/update operations are idempotent
Retrying a failed operation won’t create duplicates
Controller can safely requeue failed reconciliations

Best Practices

1. Monitor Status Conditions

Always check status conditions when debugging DNS record issues:

kubectl describe arecord www-example -n dns-system

Look for the Status section showing current conditions.

2. Use Events for Troubleshooting

Events provide a timeline of what happened:

kubectl get events -n dns-system --field-selector involvedObject.name=www-example

3. Adjust Retry Interval for Your Needs

Fast feedback during development: BINDY_RECORD_RETRY_SECONDS=10
Production stability: BINDY_RECORD_RETRY_SECONDS=60
High-load clusters: BINDY_RECORD_RETRY_SECONDS=120

4. Create DNSZones Before Records

To avoid ZoneNotFound errors, always create DNSZone resources before creating DNS records:

# 1. Create DNSZone
kubectl apply -f dnszone.yaml

# 2. Wait for it to be ready
kubectl wait --for=condition=Ready dnszone/example-com -n dns-system --timeout=60s

# 3. Create DNS records
kubectl apply -f records/

5. Use Labels for Organization

Tag related resources for easier monitoring:

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
  labels:
    app: web-frontend
    environment: production
spec:
  zone: example.com
  name: www
  ipv4Address: 192.0.2.1

Then filter:

kubectl get arecords -n dns-system -l environment=production

Troubleshooting Guide

Record Stuck in “ZoneNotFound”

Verify DNSZone exists:
```
kubectl get dnszones -A
```

Check zone name matches:

kubectl get dnszone example-com -n dns-system -o jsonpath='{.spec.zoneName}'

Ensure they’re in the same namespace

Record Stuck in “RndcKeyLoadFailed”

Check Secret exists:

kubectl get secret -n dns-system {cluster-name}-rndc-key

Verify Bind9Instance is Ready:

kubectl get bind9instance -n dns-system

Check Bind9Instance logs:

kubectl logs -n bindy-system -l app=bindy-operator

Record Stuck in “RecordAddFailed”

Check BIND9 pod is running:

kubectl get pods -n dns-system -l app={cluster-name}

Test network connectivity:

kubectl run -it --rm debug --image=nicolaka/netshoot -- \
  nc -zv {cluster-name}.dns-system.svc.cluster.local 953

Check BIND9 logs for errors:

kubectl logs -n dns-system -l app={cluster-name} | grep -i error

Verify RNDC is listening on port 953:

kubectl exec -n dns-system {bind9-pod} -- ss -tlnp | grep 953

Common Issues

Solutions to frequently encountered problems.

Bind9Instance Issues

Pods Not Starting

Symptom: Bind9Instance created but pods not running

Diagnosis:

kubectl get pods -n dns-system -l instance=primary-dns
kubectl describe pod -n dns-system <pod-name>

Common Causes:

Image pull errors - Check image name and registry access
Resource limits - Insufficient CPU/memory on nodes
RBAC issues - ServiceAccount lacks permissions

Solution:

# Check events
kubectl get events -n dns-system

# Fix resource limits
kubectl edit bind9instance primary-dns -n dns-system
# Increase resources.requests and resources.limits

# Verify RBAC
kubectl auth can-i create deployments \
  --as=system:serviceaccount:dns-system:bindy

ConfigMap Not Created

Symptom: ConfigMap missing for Bind9Instance

Diagnosis:

kubectl get configmap -n dns-system
kubectl logs -n dns-system deployment/bindy | grep ConfigMap

Solution:

# Check controller logs for errors
kubectl logs -n dns-system deployment/bindy --tail=50

# Delete and recreate instance
kubectl delete bind9instance primary-dns -n dns-system
kubectl apply -f instance.yaml

DNSZone Issues

No Instances Match Selector

Symptom: DNSZone status shows “No Bind9Instances matched selector”

Diagnosis:

kubectl get bind9instances -n dns-system --show-labels
kubectl get dnszone example-com -n dns-system -o yaml | yq '.spec.instanceSelector'

Solution:

# Verify labels on instances
kubectl label bind9instance primary-dns dns-role=primary -n dns-system

# Or update zone selector
kubectl edit dnszone example-com -n dns-system

Zone File Not Created

Symptom: Zone exists but no zone file in BIND9

Diagnosis:

kubectl exec -n dns-system deployment/primary-dns -- ls -la /var/lib/bind/zones/
kubectl logs -n dns-system deployment/bindy | grep "example-com"

Solution:

# Check if zone reconciliation succeeded
kubectl describe dnszone example-com -n dns-system

# Trigger reconciliation by updating zone
kubectl annotate dnszone example-com reconcile=true -n dns-system

DNS Record Issues

DNSZone Not Found

Symptom: Controller logs show “DNSZone not found” errors for a zone that exists

Example Error:

ERROR Failed to find DNSZone for zone 'internal-local' in namespace 'dns-system'

Root Cause: Mismatch between how the record references the zone and the actual DNSZone fields.

Diagnosis:

# Check what the record is trying to reference
kubectl get arecord www-example -n dns-system -o yaml | grep -A2 spec:

# Check available DNSZones
kubectl get dnszones -n dns-system

# Check the DNSZone details
kubectl get dnszone example-com -n dns-system -o yaml

Understanding the Problem:

DNS records can reference zones using two different fields:

zone field - Matches against DNSZone.spec.zoneName (the actual DNS zone name like example.com)
zoneRef field - Matches against DNSZone.metadata.name (the Kubernetes resource name like example-com)

Common mistakes:

Using zone: internal-local when spec.zoneName: internal.local (dots vs dashes)
Using zone: example-com when it should be zone: example.com
Using zoneRef: example.com when it should be zoneRef: example-com

Solution:

Option 1: Use zone field with the actual DNS zone name

spec:
  zone: example.com  # Must match DNSZone spec.zoneName
  name: www

Option 2: Use zoneRef field with the resource name (recommended)

spec:
  zoneRef: example-com  # Must match DNSZone metadata.name
  name: www

Example Fix:

Given this DNSZone:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: internal-local      # ← Resource name
  namespace: dns-system
spec:
  zoneName: internal.local  # ← Actual zone name

Wrong:

spec:
  zone: internal-local  # ✗ This looks for spec.zoneName = "internal-local"

Correct:

# Method 1: Use actual zone name
spec:
  zone: internal.local  # ✓ Matches spec.zoneName

# Method 2: Use resource name (more efficient)
spec:
  zoneRef: internal-local  # ✓ Matches metadata.name

Verification:

# After fixing, check the record reconciles
kubectl describe arecord www-example -n dns-system

# Should see no errors in events
kubectl get events -n dns-system --sort-by='.lastTimestamp' | tail -10

See Records Guide - Referencing DNS Zones for more details.

Record Not Appearing in Zone

Symptom: ARecord created but not in zone file

Diagnosis:

# Check record status
kubectl get arecord www-example -n dns-system -o yaml

# Check zone file
kubectl exec -n dns-system deployment/primary-dns -- cat /var/lib/bind/zones/example.com.zone

Solution:

# Verify zone reference is correct (use zone or zoneRef)
kubectl get arecord www-example -n dns-system -o yaml | grep -E 'zone:|zoneRef:'

# Check available DNSZones
kubectl get dnszones -n dns-system

# Update if incorrect - use zone (matches spec.zoneName) or zoneRef (matches metadata.name)
kubectl edit arecord www-example -n dns-system

DNS Query Not Resolving

Symptom: dig/nslookup fails to resolve

Diagnosis:

# Get DNS service IP
SERVICE_IP=$(kubectl get svc primary-dns -n dns-system -o jsonpath='{.spec.clusterIP}')

# Test query
dig @$SERVICE_IP www.example.com

# Check BIND9 logs
kubectl logs -n dns-system -l instance=primary-dns | tail -20

Solutions:

Record doesn’t exist:

kubectl get arecords -n dns-system
kubectl apply -f record.yaml

Zone not loaded:

kubectl logs -n dns-system -l instance=primary-dns | grep "loaded serial"

Network policy blocking:

kubectl get networkpolicies -n dns-system

Zone Transfer Issues

Secondary Not Receiving Transfers

Symptom: Secondary instance not getting zone updates

Diagnosis:

# Check secondary logs
kubectl logs -n dns-system -l dns-role=secondary | grep transfer

# Check if zone has secondary IPs configured
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.secondaryIps}'

# Check if secondaries are discovered
kubectl get bind9instance -n dns-system -l role=secondary -o jsonpath='{.items[*].status.podIP}'

Automatic Configuration:

As of v0.1.0, Bindy automatically discovers secondary IPs and configures zone transfers:

Secondary pods are discovered via Kubernetes API using label selectors (role=secondary)
Primary zones are configured with also-notify and allow-transfer directives
Secondary IPs are stored in DNSZone.status.secondaryIps for tracking
When secondary pods restart/reschedule and get new IPs, zones are automatically updated

Manual Verification:

# Check if zone has secondary IPs in status
kubectl get dnszone example-com -n dns-system -o yaml | yq '.status.secondaryIps'

# Expected output: List of secondary pod IPs
# - 10.244.1.5
# - 10.244.2.8

# Verify zone configuration on primary
kubectl exec -n dns-system deployment/primary-dns -- \
  curl -s localhost:8080/api/zones/example.com | jq '.alsoNotify, .allowTransfer'

If Automatic Configuration Fails:

Verify secondary instances are labeled correctly:

kubectl get bind9instance -n dns-system -o yaml | yq '.items[].metadata.labels'

# Expected labels for secondaries:
# role: secondary
# cluster: <cluster-name>

Check DNSZone reconciler logs:

kubectl logs -n dns-system deployment/bindy | grep "secondary"

Verify network connectivity:

# Test AXFR from secondary to primary
kubectl exec -n dns-system deployment/secondary-dns -- \
  dig @primary-dns-service AXFR example.com

Recovery After Secondary Pod Restart:

When secondary pods are rescheduled and get new IPs:

Detection: Reconciler automatically detects IP change within 5-10 minutes (next reconciliation)
Update: Zones are deleted and recreated with new secondary IPs
Transfer: Zone transfers resume automatically with new IPs

Manual Trigger (if needed):

# Force reconciliation by updating zone annotation
kubectl annotate dnszone example-com -n dns-system \
  reconcile.bindy.firestoned.io/trigger="$(date +%s)" --overwrite

Performance Issues

High Query Latency

Symptom: DNS queries taking too long

Diagnosis:

# Test query time
time dig @$SERVICE_IP example.com

# Check resource usage
kubectl top pods -n dns-system -l instance=primary-dns

Solutions:

Increase resources:

spec:
  resources:
    limits:
      cpu: "1000m"
      memory: "1Gi"

Add more replicas:

spec:
  replicas: 3

Enable caching (if appropriate for your use case)

RBAC Issues

Forbidden Errors in Logs

Symptom: Controller logs show “Forbidden” errors

Diagnosis:

kubectl logs -n dns-system deployment/bindy | grep Forbidden

# Check permissions
kubectl auth can-i create deployments \
  --as=system:serviceaccount:dns-system:bindy \
  -n dns-system

Solution:

# Reapply RBAC
kubectl apply -f deploy/rbac/

# Verify ClusterRoleBinding
kubectl get clusterrolebinding bindy-rolebinding -o yaml

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

Next Steps

Debugging Guide - Detailed debugging procedures
FAQ - Frequently asked questions
Logging - Log analysis

Debugging

Step-by-step guide to debugging Bindy DNS operator issues.

Debug Workflow

1. Identify the Problem

Determine what’s not working:

Bind9Instance not creating pods?
DNSZone not loading?
DNS records not resolving?
Zone transfers failing?

2. Check Resource Status

# Get high-level status
kubectl get bind9instances,dnszones,arecords -A

# Check specific resource
kubectl describe bind9instance primary-dns -n dns-system
kubectl describe dnszone example-com -n dns-system

3. Review Events

# Recent events
kubectl get events -n dns-system --sort-by='.lastTimestamp'

# Events for specific resource
kubectl describe dnszone example-com -n dns-system | grep -A10 Events

4. Examine Logs

# Controller logs
kubectl logs -n dns-system deployment/bindy --tail=100

# BIND9 instance logs
kubectl logs -n dns-system -l instance=primary-dns --tail=50

# Follow logs in real-time
kubectl logs -n dns-system deployment/bindy -f

Debugging Bind9Instance

Issue: Pods Not Starting

# 1. Check pod status
kubectl get pods -n dns-system -l instance=primary-dns

# 2. Describe pod
kubectl describe pod -n dns-system <pod-name>

# 3. Check events
kubectl get events -n dns-system --field-selector involvedObject.name=<pod-name>

# 4. Check logs if pod is running
kubectl logs -n dns-system <pod-name>

# 5. Check deployment
kubectl describe deployment primary-dns -n dns-system

Issue: ConfigMap Not Created

# 1. List ConfigMaps
kubectl get configmaps -n dns-system

# 2. Check controller logs
kubectl logs -n dns-system deployment/bindy | grep -i configmap

# 3. Check RBAC permissions
kubectl auth can-i create configmaps \
  --as=system:serviceaccount:dns-system:bindy \
  -n dns-system

# 4. Manually trigger reconciliation
kubectl annotate bind9instance primary-dns reconcile=true -n dns-system --overwrite

Debugging DNSZone

Issue: No Instances Match Selector

# 1. Check zone selector
kubectl get dnszone example-com -n dns-system -o yaml | grep -A5 instanceSelector

# 2. List instances with labels
kubectl get bind9instances -n dns-system --show-labels

# 3. Test selector match
kubectl get bind9instances -n dns-system \
  -l dns-role=primary,environment=production

# 4. Fix labels or selector
kubectl label bind9instance primary-dns dns-role=primary -n dns-system
# Or edit zone selector
kubectl edit dnszone example-com -n dns-system

Issue: Zone File Missing

# 1. Check if zone reconciliation succeeded
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.conditions}'

# 2. Exec into pod and check zones directory
kubectl exec -n dns-system deployment/primary-dns -- ls -la /var/lib/bind/zones/

# 3. Check BIND9 configuration
kubectl exec -n dns-system deployment/primary-dns -- cat /etc/bind/named.conf

# 4. Check BIND9 logs
kubectl logs -n dns-system -l instance=primary-dns | grep "example.com"

# 5. Reload BIND9 configuration
kubectl exec -n dns-system deployment/primary-dns -- rndc reload

Debugging DNS Records

Issue: Record Not in Zone File

# 1. Verify record exists
kubectl get arecord www-example -n dns-system

# 2. Check record status
kubectl get arecord www-example -n dns-system -o jsonpath='{.status}'

# 3. Verify zone reference
kubectl get arecord www-example -n dns-system -o jsonpath='{.spec.zone}'
# Should match a DNSZone resource name

# 4. Check zone file contents
kubectl exec -n dns-system deployment/primary-dns -- \
  cat /var/lib/bind/zones/example.com.zone

# 5. Trigger record reconciliation
kubectl annotate arecord www-example reconcile=true -n dns-system --overwrite

Issue: DNS Query Not Resolving

# 1. Get DNS service IP
SERVICE_IP=$(kubectl get svc primary-dns -n dns-system -o jsonpath='{.spec.clusterIP}')

# 2. Test query from within cluster
kubectl run -it --rm debug --image=nicolaka/netshoot --restart=Never -- \
  dig @$SERVICE_IP www.example.com

# 3. Test query from BIND9 pod directly
kubectl exec -n dns-system deployment/primary-dns -- \
  dig @localhost www.example.com

# 4. Check if zone is loaded
kubectl exec -n dns-system deployment/primary-dns -- \
  rndc status | grep "zones loaded"

# 5. Query zone status
kubectl exec -n dns-system deployment/primary-dns -- \
  rndc zonestatus example.com

Debugging Zone Transfers

Issue: Secondary Not Receiving Transfers

# 1. Check primary allows transfers
kubectl get bind9instance primary-dns -n dns-system \
  -o jsonpath='{.spec.config.allowTransfer}'

# 2. Check secondary configuration
kubectl get dnszone example-com-secondary -n dns-system \
  -o jsonpath='{.spec.secondaryConfig}'

# 3. Test network connectivity
kubectl exec -n dns-system deployment/secondary-dns -- \
  nc -zv primary-dns-service 53

# 4. Attempt manual transfer
kubectl exec -n dns-system deployment/secondary-dns -- \
  dig @primary-dns-service example.com AXFR

# 5. Check transfer logs
kubectl logs -n dns-system -l dns-role=secondary | grep -i transfer

# 6. Check NOTIFY messages
kubectl logs -n dns-system -l dns-role=primary | grep -i notify

Enable Debug Logging

Controller Debug Logging

# Edit controller deployment
kubectl set env deployment/bindy RUST_LOG=debug -n dns-system

# Or patch deployment
kubectl patch deployment bindy -n dns-system \
  -p '{"spec":{"template":{"spec":{"containers":[{"name":"controller","env":[{"name":"RUST_LOG","value":"debug"}]}]}}}}'

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

# View debug logs
kubectl logs -n dns-system deployment/bindy -f

Enable JSON Logging

For easier parsing and integration with log aggregation tools:

# Set JSON format
kubectl set env deployment/bindy RUST_LOG_FORMAT=json -n dns-system

# Or patch deployment for both debug level and JSON format
kubectl patch deployment bindy -n dns-system \
  -p '{"spec":{"template":{"spec":{"containers":[{"name":"controller","env":[{"name":"RUST_LOG","value":"debug"},{"name":"RUST_LOG_FORMAT","value":"json"}]}]}}}}'

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system

# View JSON logs (can be piped to jq for parsing)
kubectl logs -n dns-system deployment/bindy -f | jq .

BIND9 Debug Logging

# Enable query logging
kubectl exec -n dns-system deployment/primary-dns -- \
  rndc querylog on

# View queries
kubectl logs -n dns-system -l instance=primary-dns -f | grep "query:"

# Disable query logging
kubectl exec -n dns-system deployment/primary-dns -- \
  rndc querylog off

Network Debugging

Test DNS Resolution

# From debug pod
kubectl run -it --rm debug --image=nicolaka/netshoot --restart=Never -- /bin/bash

# Inside pod:
dig @primary-dns-service.dns-system.svc.cluster.local www.example.com
nslookup www.example.com primary-dns-service.dns-system.svc.cluster.local
host www.example.com primary-dns-service.dns-system.svc.cluster.local

Check Network Policies

# List network policies
kubectl get networkpolicies -n dns-system

# Describe policy
kubectl describe networkpolicy <policy-name> -n dns-system

# Temporarily remove policy for testing
kubectl delete networkpolicy <policy-name> -n dns-system

Performance Debugging

Check Resource Usage

# Pod resource usage
kubectl top pods -n dns-system

# Node pressure
kubectl describe nodes | grep -A5 "Conditions:\|Allocated resources:"

# Detailed pod metrics
kubectl describe pod <pod-name> -n dns-system | grep -A10 "Limits:\|Requests:"

Profile DNS Queries

# Measure query latency
for i in {1..100}; do
  dig @$SERVICE_IP www.example.com +stats | grep "Query time:"
done | awk '{sum+=$4; count++} END {print "Average:", sum/count, "ms"}'

# Test concurrent queries
seq 1 100 | xargs -I{} -P10 dig @$SERVICE_IP www.example.com +short

Collect Diagnostic Information

Create Support Bundle

#!/bin/bash
# collect-diagnostics.sh

NAMESPACE="dns-system"
OUTPUT_DIR="bindy-diagnostics-$(date +%Y%m%d-%H%M%S)"

mkdir -p $OUTPUT_DIR

# Collect resources
kubectl get all -n $NAMESPACE -o yaml > $OUTPUT_DIR/resources.yaml
kubectl get bind9instances,dnszones,arecords,aaaarecords,cnamerecords -A -o yaml > $OUTPUT_DIR/crds.yaml

# Collect logs
kubectl logs -n $NAMESPACE deployment/bindy --tail=1000 > $OUTPUT_DIR/controller.log
kubectl logs -n $NAMESPACE -l app=bind9 --tail=1000 > $OUTPUT_DIR/bind9.log

# Collect events
kubectl get events -n $NAMESPACE --sort-by='.lastTimestamp' > $OUTPUT_DIR/events.txt

# Collect status
kubectl describe bind9instances -A > $OUTPUT_DIR/bind9instances-describe.txt
kubectl describe dnszones -A > $OUTPUT_DIR/dnszones-describe.txt

# Create archive
tar -czf $OUTPUT_DIR.tar.gz $OUTPUT_DIR/

echo "Diagnostics collected in $OUTPUT_DIR.tar.gz"

Next Steps

Common Issues - Known problems and solutions
FAQ - Frequently asked questions
Logging - Log configuration and analysis

FAQ (Frequently Asked Questions)

General

What is Bindy?

Bindy is a Kubernetes operator that manages BIND9 DNS servers using Custom Resource Definitions (CRDs). It allows you to manage DNS zones and records declaratively using Kubernetes resources.

Why use Bindy instead of manual BIND9 configuration?

Declarative: Define DNS infrastructure as Kubernetes resources
GitOps-friendly: Version control your DNS configuration
Kubernetes-native: Uses familiar kubectl commands
Automated: Controller handles BIND9 configuration and reloading
Scalable: Easy multi-region, multi-instance deployments

What BIND9 versions are supported?

Bindy supports BIND 9.16 and 9.18. The version is configurable per Bind9Instance.

Installation

Can I run Bindy in a namespace other than dns-system?

Yes, you can deploy Bindy in any namespace. Update the namespace in deployment YAMLs and RBAC resources.

Do I need cluster-admin permissions?

You need permissions to:

Create CRDs (cluster-scoped)
Create ClusterRole and ClusterRoleBinding
Create resources in the operator namespace

A cluster administrator can pre-install CRDs and RBAC, then delegate namespace management.

Configuration

How do I update BIND9 configuration?

Edit the Bind9Instance resource:

kubectl edit bind9instance primary-dns -n dns-system

The controller will automatically update the ConfigMap and restart pods if needed.

Can I use external BIND9 servers?

No, Bindy manages BIND9 instances running in Kubernetes. For external servers, consider DNS integration tools.

How do I enable query logging?

Currently, enable it manually in the BIND9 pod:

kubectl exec -n dns-system deployment/primary-dns -- rndc querylog on

Future versions may support configuration through Bind9Instance spec.

DNS Zones

How many zones can one instance host?

BIND9 can handle thousands of zones. Practical limits depend on:

Resource allocation (CPU/memory)
Query volume
Zone size

Start with 100-500 zones per instance and scale as needed.

Can I host the same zone on multiple instances?

Yes! Use label selectors to target multiple instances:

instanceSelector:
  matchLabels:
    environment: production

This deploys the zone to all matching instances.

How do I migrate zones between instances?

Update the DNSZone’s instance Selector:

instanceSelector:
  matchLabels:
    dns-role: new-primary

The zone will be created on new instances and you can delete from old ones.

DNS Records

How do I create multiple A records for the same name?

Create multiple ARecord resources with different names but same spec.name:

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-1
spec:
  zone: example-com
  name: www
  ipv4Address: "192.0.2.1"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-2
spec:
  zone: example-com
  name: www
  ipv4Address: "192.0.2.2"

Can I import existing zone files?

Not directly. You need to convert zone files to Bindy CRD resources. Future versions may include an import tool.

How do I delete all records in a zone?

kubectl delete arecords,aaaarecords,cnamerecords,mxrecords,txtrecords \
  -n dns-system -l zone=example-com

(If you label records with their zone)

Operations

How do I upgrade Bindy?

Update CRDs: kubectl apply -k deploy/crds/
Update controller: kubectl set image deployment/bindy controller=new-image
Monitor rollout: kubectl rollout status deployment/bindy -n dns-system

How do I backup DNS configuration?

# Export all CRDs
kubectl get bind9instances,dnszones,arecords,aaaarecords,cnamerecords,mxrecords,txtrecords,nsrecords,srvrecords,caarecords \
  -A -o yaml > bindy-backup.yaml

Store in version control or backup storage.

How do I restore from backup?

kubectl apply -f bindy-backup.yaml

Can I run Bindy in high availability mode?

Yes, run multiple controller replicas:

spec:
  replicas: 2  # Multiple controller replicas

Only one will be active (leader election), others are standby.

Troubleshooting

Pods are crashlooping

Check pod logs and events:

kubectl logs -n dns-system <pod-name>
kubectl describe pod -n dns-system <pod-name>

Common causes:

Invalid BIND9 configuration
Insufficient resources
Image pull errors

DNS queries timing out

Check:

Service is correctly exposing pods
Pods are ready
Query is reaching BIND9 (check logs)
Zone is loaded
Record exists

kubectl get svc -n dns-system
kubectl get pods -n dns-system
kubectl logs -n dns-system -l instance=primary-dns

Zone transfers not working

Ensure:

Primary allows transfers: spec.config.allowTransfer
Network connectivity between primary and secondary
Secondary has correct primary server IPs
Firewall rules allow TCP port 53

Performance

How do I optimize for high query volume?

Increase replicas: More pods = more capacity
Add resources: Increase CPU/memory limits
Use caching: If appropriate for your use case
Geographic distribution: Deploy instances near clients
Load balancing: Use service load balancing

What are typical resource requirements?

Deployment Size	CPU Request	Memory Request	CPU Limit	Memory Limit
Small (<50 zones)	100m	128Mi	500m	512Mi
Medium (50-500 zones)	200m	256Mi	1000m	1Gi
Large (500+ zones)	500m	512Mi	2000m	2Gi

Adjust based on actual usage monitoring.

Security

Is DNSSEC supported?

Yes, enable DNSSEC in Bind9Instance spec:

spec:
  config:
    dnssec:
      enabled: true
      validation: true

How do I restrict access to DNS queries?

Use allowQuery in Bind9Instance spec:

spec:
  config:
    allowQuery:
      - "10.0.0.0/8"  # Only internal network

Are zone transfers secure?

Zone transfers occur over TCP and can be restricted by IP address using allowTransfer. For additional security, consider:

Network policies
IPsec or VPN between regions
TSIG keys (future enhancement)

Integration

Can I use Bindy with external-dns?

Bindy manages internal DNS infrastructure. external-dns manages external DNS providers. They serve different purposes and can coexist.

Does Bindy work with Linkerd?

Yes, Bindy DNS servers can be used by Linkerd for internal DNS resolution. The DNS service has Linkerd injection disabled (DNS doesn’t work well with mesh sidecars), while management services can be Linkerd-injected for secure mTLS communication.

Can I integrate with existing DNS infrastructure?

Yes, configure Bindy instances as secondaries receiving transfers from existing primaries, or vice versa.

Next Steps

Troubleshooting - Debug issues
Common Issues - Known problems
Debugging - Detailed debugging steps

Replacing CoreDNS with Bind9GlobalCluster

Bind9GlobalCluster provides a powerful alternative to CoreDNS for cluster-wide DNS infrastructure. This guide explores using Bindy as a CoreDNS replacement in Kubernetes clusters.

Why Consider Replacing CoreDNS?

CoreDNS is the default DNS solution for Kubernetes, but you might want an alternative if you need:

Enterprise DNS Features: Advanced BIND9 capabilities like DNSSEC, dynamic updates via RNDC, and comprehensive zone management
Centralized DNS Management: Declarative DNS infrastructure managed via Kubernetes CRDs
GitOps-Ready DNS: DNS configuration as code, versioned and auditable
Integration with Existing Infrastructure: Organizations already using BIND9 for external DNS
Compliance Requirements: Full audit trails, signed releases, and documented controls (SOX, NIST 800-53)
Advanced Zone Management: Programmatic control over zones and records without editing configuration files

Architecture Comparison

CoreDNS (Default)

┌─────────────────────────────────────────┐
│ CoreDNS DaemonSet/Deployment            │
│ - Serves cluster.local queries          │
│ - Configured via ConfigMap               │
│ - Limited to Corefile syntax             │
└─────────────────────────────────────────┘

Characteristics:

Simple, built-in solution
ConfigMap-based configuration
Limited declarative management
Manual ConfigMap edits for changes

Bindy with Bind9GlobalCluster

┌──────────────────────────────────────────────────┐
│ Bind9GlobalCluster (cluster-scoped)             │
│ - Cluster-wide DNS infrastructure                │
│ - Platform team managed                          │
└──────────────────────────────────────────────────┘
         │
         ├─ Creates → Bind9Cluster (per namespace)
         │            └─ Creates → Bind9Instance (BIND9 pods)
         │
         └─ Referenced by DNSZones (any namespace)
                       └─ Records (A, AAAA, CNAME, MX, TXT, etc.)

Characteristics:

Declarative infrastructure-as-code
GitOps-ready (all configuration in YAML)
Dynamic updates via RNDC API (no restarts)
Full DNSSEC support
Programmatic record management
Multi-tenancy with RBAC

Use Cases

1. Platform DNS Service

Replace CoreDNS with a platform-managed DNS service accessible to all namespaces:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: platform-dns
  labels:
    app.kubernetes.io/component: dns
    app.kubernetes.io/part-of: platform-services
spec:
  version: "9.18"
  primary:
    replicas: 3  # HA for cluster DNS
    service:
      spec:
        type: ClusterIP
        clusterIP: 10.96.0.10  # Standard kube-dns ClusterIP
  secondary:
    replicas: 2
  global:
    recursion: true  # Important for cluster DNS
    allowQuery:
      - "0.0.0.0/0"
    forwarders:  # Forward external queries
      - "8.8.8.8"
      - "8.8.4.4"

Benefits:

High availability with multiple replicas
Declarative configuration (no ConfigMap editing)
Version-controlled DNS infrastructure
Gradual migration path from CoreDNS

2. Hybrid DNS Architecture

Use Bindy for application DNS while keeping CoreDNS for cluster.local:

# CoreDNS continues handling cluster.local
# Bindy handles application-specific zones

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: app-dns
spec:
  version: "9.18"
  primary:
    replicas: 2
  secondary:
    replicas: 1
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: internal-services
  namespace: platform
spec:
  zoneName: internal.example.com
  globalClusterRef: app-dns
  soaRecord:
    primaryNs: ns1.internal.example.com.
    adminEmail: platform.example.com.

Benefits:

Zero risk to existing cluster DNS
Application teams get advanced DNS features
Incremental adoption
Clear separation of concerns

3. Service Mesh Integration

Provide DNS for service mesh configurations:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: mesh-dns
  labels:
    linkerd.io/control-plane-ns: linkerd
spec:
  version: "9.18"
  primary:
    replicas: 2
    service:
      annotations:
        linkerd.io/inject: enabled
  global:
    recursion: false  # Authoritative only
    allowQuery:
      - "10.0.0.0/8"  # Service mesh network
---
# Application teams create zones
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-zone
  namespace: api-team
spec:
  zoneName: api.mesh.local
  globalClusterRef: mesh-dns
---
# Dynamic service records
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: api-v1
  namespace: api-team
spec:
  zoneRef: api-zone
  name: v1
  ipv4Address: "10.0.1.100"

Benefits:

Service mesh can use DNS for routing
Dynamic record updates without mesh controller changes
Platform team manages DNS infrastructure
Application teams manage their service records

Migration Strategies

Strategy 1: Parallel Deployment (Recommended)

Run Bindy alongside CoreDNS during migration:

Deploy Bindy Global Cluster:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: platform-dns-migration
spec:
  version: "9.18"
  primary:
    replicas: 2
    service:
      spec:
        type: ClusterIP  # Different IP from CoreDNS
  global:
    recursion: true
    forwarders:
      - "8.8.8.8"

Test DNS Resolution:

# Get Bindy DNS service IP
kubectl get svc -n dns-system -l app.kubernetes.io/name=bind9

# Test queries
dig @<bindy-service-ip> kubernetes.default.svc.cluster.local
dig @<bindy-service-ip> google.com

Gradually Migrate Applications: Update pod specs to use Bindy DNS:

spec:
  dnsPolicy: None
  dnsConfig:
    nameservers:
      - <bindy-service-ip>
    searches:
      - default.svc.cluster.local
      - svc.cluster.local
      - cluster.local

Switch Cluster Default (final step):

# Update kubelet DNS config
# Change --cluster-dns to Bindy service IP
# Rolling restart nodes

Strategy 2: Zone-by-Zone Migration

Keep CoreDNS for cluster.local, migrate application zones:

Keep CoreDNS for Cluster Services:

# CoreDNS ConfigMap unchanged
# Handles *.cluster.local, *.svc.cluster.local

Create Application Zones in Bindy:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: apps-zone
  namespace: platform
spec:
  zoneName: apps.example.com
  globalClusterRef: platform-dns

Configure Forwarding (CoreDNS → Bindy):

# CoreDNS Corefile
apps.example.com:53 {
  forward . <bindy-service-ip>
}

Benefits:

Zero risk to cluster stability
Incremental testing
Easy rollback
Coexistence of both solutions

Configuration for Cluster DNS

Essential Settings

For cluster DNS replacement, configure these settings:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: cluster-dns
spec:
  version: "9.18"
  primary:
    replicas: 3  # HA requirement
    service:
      spec:
        type: ClusterIP
        clusterIP: 10.96.0.10  # kube-dns default
  global:
    # CRITICAL: Enable recursion for cluster DNS
    recursion: true

    # Allow queries from all pods
    allowQuery:
      - "0.0.0.0/0"

    # Forward external queries to upstream DNS
    forwarders:
      - "8.8.8.8"
      - "8.8.4.4"

    # Cluster.local zone configuration
    zones:
      - name: cluster.local
        type: forward
        forwarders:
          - "10.96.0.10"  # Forward to Bindy itself for cluster zones

Recommended Zones

Create these zones for Kubernetes cluster DNS:

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: cluster-local
  namespace: dns-system
spec:
  zoneName: cluster.local
  globalClusterRef: cluster-dns
  soaRecord:
    primaryNs: ns1.cluster.local.
    adminEmail: dns-admin.cluster.local.
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: svc-cluster-local
  namespace: dns-system
spec:
  zoneName: svc.cluster.local
  globalClusterRef: cluster-dns
  soaRecord:
    primaryNs: ns1.svc.cluster.local.
    adminEmail: dns-admin.svc.cluster.local.

Advantages Over CoreDNS

1. Declarative Infrastructure

CoreDNS:

# Manual ConfigMap editing
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
data:
  Corefile: |
    .:53 {
        errors
        health
        # ... manual editing required
    }

Bindy:

# Infrastructure as code
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
# ... declarative specs
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
# ... versioned, reviewable YAML

2. Dynamic Updates

CoreDNS:

Requires ConfigMap changes
Requires pod restarts
No programmatic API

Bindy:

Dynamic record updates via RNDC
Zero downtime changes
Programmatic API (Kubernetes CRDs)

3. Multi-Tenancy

CoreDNS:

Single shared ConfigMap
No namespace isolation
Platform team controls everything

Bindy:

Platform team: Manages Bind9GlobalCluster
Application teams: Manage DNSZone and records in their namespace
RBAC-enforced isolation

4. Enterprise Features

Bindy Provides:

✅ DNSSEC with automatic key management
✅ Zone transfers (AXFR/IXFR)
✅ Split-horizon DNS (views/ACLs)
✅ Audit logging for compliance
✅ SOA record management
✅ Full BIND9 feature set

CoreDNS:

❌ Limited DNSSEC support
❌ No zone transfers
❌ Basic view support
❌ Limited audit capabilities

Operational Considerations

Performance

Memory Usage:

CoreDNS: ~30-50 MB per pod
Bindy (BIND9): ~100-200 MB per pod
Trade-off: More features, slightly higher resource use

Query Performance:

Both handle 10K+ queries/sec per pod
BIND9 excels at authoritative zones
CoreDNS excels at simple forwarding

Recommendation: Use Bindy where you need advanced features; CoreDNS is lighter for simple forwarding.

High Availability

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: ha-dns
spec:
  primary:
    replicas: 3  # Spread across zones
    affinity:
      podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app.kubernetes.io/name: bind9
            topologyKey: kubernetes.io/hostname
  secondary:
    replicas: 2  # Read replicas for query load

Monitoring

# Check DNS cluster status
kubectl get bind9globalcluster -o wide

# Check instance health
kubectl get bind9instances -n dns-system

# Query metrics (if Prometheus enabled)
kubectl port-forward -n dns-system svc/bindy-metrics 8080:8080
curl localhost:8080/metrics | grep bindy_

Limitations

Not Suitable For:

Clusters requiring ultra-low resource usage
- CoreDNS is lighter for simple forwarding
Simple forwarding-only scenarios
- CoreDNS is simpler if you don’t need BIND9 features
Rapid pod scaling (1000s/sec)
- CoreDNS has slightly faster startup time

Well-Suited For:

Enterprise environments with compliance requirements
Multi-tenant platforms with RBAC requirements
Complex DNS requirements (DNSSEC, zone transfers, dynamic updates)
GitOps workflows where DNS is infrastructure-as-code
Organizations standardizing on BIND9 across infrastructure

Best Practices

1. Start with Hybrid Approach

Keep CoreDNS for cluster.local, add Bindy for application zones:

# CoreDNS: cluster.local, svc.cluster.local
# Bindy: apps.example.com, internal.example.com

2. Use Health Checks

spec:
  primary:
    livenessProbe:
      tcpSocket:
        port: 53
      initialDelaySeconds: 30
    readinessProbe:
      exec:
        command: ["/usr/bin/dig", "@127.0.0.1", "health.check.local"]

3. Enable Audit Logging

spec:
  global:
    logging:
      channels:
        - name: audit_log
          file: /var/log/named/audit.log
          severity: info
      categories:
        - name: update
          channels: [audit_log]

4. Plan for Disaster Recovery

# Backup DNS zones
kubectl get dnszones -A -o yaml > dns-zones-backup.yaml

# Backup records
kubectl get arecords,cnamerecords,mxrecords -A -o yaml > dns-records-backup.yaml

Conclusion

Bind9GlobalCluster provides a powerful, enterprise-grade alternative to CoreDNS for Kubernetes clusters. While CoreDNS remains an excellent choice for simple forwarding scenarios, Bindy excels when you need:

Declarative DNS infrastructure-as-code
GitOps workflows for DNS management
Multi-tenancy with namespace isolation
Enterprise features (DNSSEC, zone transfers, dynamic updates)
Compliance and audit requirements
Integration with existing BIND9 infrastructure

Recommendation: Start with a hybrid approach—keep CoreDNS for cluster services, and adopt Bindy for application DNS zones. This provides a safe migration path with the ability to leverage advanced DNS features where needed.

Next Steps

Multi-Tenancy Guide - RBAC setup for platform and application teams
Choosing a Cluster Type - When to use Bind9GlobalCluster vs Bind9Cluster
High Availability - HA configuration for production DNS
DNSSEC - Enabling DNSSEC for secure DNS

High Availability

Design and implement highly available DNS infrastructure with Bindy.

Overview

High availability (HA) DNS ensures continuous DNS service even during:

Pod failures
Node failures
Availability zone outages
Regional outages
Planned maintenance

HA Architecture Components

1. Multiple Replicas

Run multiple replicas of each Bind9Instance:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
spec:
  replicas: 3  # Multiple replicas for pod-level HA

Benefits:

Survives pod crashes
Load distribution
Zero-downtime updates

2. Multiple Instances

Deploy separate primary and secondary instances:

# Primary instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  labels:
    dns-role: primary
spec:
  replicas: 2
---
# Secondary instance  
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  labels:
    dns-role: secondary
spec:
  replicas: 2

Benefits:

Role separation
Independent scaling
Failover capability

3. Geographic Distribution

Deploy instances across multiple regions:

# US East primary
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-us-east
  labels:
    dns-role: primary
    region: us-east-1
spec:
  replicas: 2
---
# US West secondary
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-us-west
  labels:
    dns-role: secondary
    region: us-west-2
spec:
  replicas: 2
---
# EU secondary
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-eu-west
  labels:
    dns-role: secondary
    region: eu-west-1
spec:
  replicas: 2

Benefits:

Regional failure tolerance
Lower latency for global users
Regulatory compliance (data locality)

HA Patterns

Pattern 1: Active-Passive

One active primary, multiple passive secondaries:

graph LR
    primary["Primary<br/>(Active)<br/>us-east-1"]
    sec1["Secondary<br/>(Passive)<br/>us-west-2"]
    sec2["Secondary<br/>(Passive)<br/>eu-west-1"]
    clients["Clients query any"]

    primary -->|AXFR| sec1
    sec1 -->|AXFR| sec2
    primary --> clients
    sec1 --> clients
    sec2 --> clients

    style primary fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style sec1 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style clients fill:#fff9c4,stroke:#f57f17,stroke-width:2px

Updates go to primary only
Secondaries receive via zone transfer
Clients query any available instance

Pattern 2: Multi-Primary

Multiple primaries in different regions:

graph LR
    primary1["Primary<br/>(zone-a)<br/>us-east-1"]
    primary2["Primary<br/>(zone-b)<br/>eu-west-1"]

    primary1 <-->|Sync| primary2

    style primary1 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style primary2 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px

Different zones on different primaries
Geographic distribution of updates
Careful coordination required

Pattern 3: Anycast

Same IP announced from multiple locations:

graph TB
    client["Client Query (192.0.2.53)"]
    dns_us["DNS<br/>US"]
    dns_eu["DNS<br/>EU"]
    dns_apac["DNS<br/>APAC"]

    client --> dns_us
    client --> dns_eu
    client --> dns_apac

    style client fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style dns_us fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style dns_eu fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style dns_apac fill:#f3e5f5,stroke:#4a148c,stroke-width:2px

Requires BGP routing
Lowest latency routing
Automatic failover

Pod-Level HA

Anti-Affinity

Spread pods across nodes:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: primary-dns
spec:
  replicas: 3
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: instance
                  operator: In
                  values:
                  - primary-dns
              topologyKey: kubernetes.io/hostname

Topology Spread

Distribute across availability zones:

spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        instance: primary-dns

Service-Level HA

Liveness and Readiness Probes

Ensure only healthy pods serve traffic:

livenessProbe:
  exec:
    command: ["dig", "@localhost", "version.bind", "txt", "chaos"]
  initialDelaySeconds: 30
  periodSeconds: 10
  
readinessProbe:
  exec:
    command: ["dig", "@localhost", "version.bind", "txt", "chaos"]
  initialDelaySeconds: 5
  periodSeconds: 5

Pod Disruption Budgets

Limit concurrent disruptions:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: primary-dns-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      instance: primary-dns

Monitoring HA

Check Instance Distribution

# View instances across regions
kubectl get bind9instances -A -L region

# View pod distribution
kubectl get pods -n dns-system -o wide

# Check zone spread
kubectl get pods -n dns-system \
  -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,ZONE:.spec.nodeSelector

Test Failover

# Simulate pod failure
kubectl delete pod -n dns-system <pod-name>

# Verify automatic recovery
kubectl get pods -n dns-system -w

# Test DNS during failover
while true; do dig @$SERVICE_IP example.com +short; sleep 1; done

Disaster Recovery

Backup Strategy

# Regular backups of all CRDs
kubectl get bind9instances,dnszones,arecords,aaaarecords,cnamerecords,mxrecords,txtrecords,nsrecords,srvrecords,caarecords \
  -A -o yaml > backup-$(date +%Y%m%d).yaml

Recovery Procedures

Single Pod Failure - Kubernetes automatically recreates
Instance Failure - Clients fail over to other instances
Regional Failure - Zone data available from other regions
Complete Loss - Restore from backup

# Restore from backup
kubectl apply -f backup-20241126.yaml

Operator High Availability

The Bindy operator itself can run in high availability mode with automatic leader election. This ensures continuous DNS management even if operator pods fail.

Leader Election

Multiple operator instances use Kubernetes Lease objects for distributed leader election:

graph TB
    op1["Operator<br/>Instance 1<br/>(Leader)"]
    op2["Operator<br/>Instance 2<br/>(Standby)"]
    op3["Operator<br/>Instance 3<br/>(Standby)"]
    lease["Kubernetes API<br/>Lease Object"]

    op1 --> lease
    op2 --> lease
    op3 --> lease

    style op1 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style op2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style op3 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style lease fill:#fff9c4,stroke:#f57f17,stroke-width:2px

How it works:

All operator instances attempt to acquire the lease
One instance becomes the leader and starts reconciling resources
Standby instances wait and monitor the lease
If the leader fails, a standby automatically takes over (~15 seconds)

HA Operator Deployment

Deploy multiple operator replicas with leader election enabled:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: bindy
  namespace: dns-system
spec:
  replicas: 3  # Run 3 instances for HA
  selector:
    matchLabels:
      app: bindy
  template:
    metadata:
      labels:
        app: bindy
    spec:
      serviceAccountName: bindy
      containers:
      - name: operator
        image: ghcr.io/firestoned/bindy:latest
        env:
        # Leader election configuration
        - name: ENABLE_LEADER_ELECTION
          value: "true"
        - name: LEASE_NAME
          value: "bindy-leader"
        - name: LEASE_NAMESPACE
          value: "dns-system"
        - name: LEASE_DURATION_SECONDS
          value: "15"
        - name: LEASE_RENEW_DEADLINE_SECONDS
          value: "10"
        - name: LEASE_RETRY_PERIOD_SECONDS
          value: "2"
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name

Configuration Options

Environment variables for leader election:

Variable	Default	Description
`ENABLE_LEADER_ELECTION`	`true`	Enable/disable leader election
`LEASE_NAME`	`bindy-leader`	Name of the Lease resource
`LEASE_NAMESPACE`	`dns-system`	Namespace for the Lease
`LEASE_DURATION_SECONDS`	`15`	How long leader holds lease
`LEASE_RENEW_DEADLINE_SECONDS`	`10`	Leader must renew before this
`LEASE_RETRY_PERIOD_SECONDS`	`2`	How often to attempt lease acquisition
`POD_NAME`	`$HOSTNAME`	Unique identity for this operator instance

Monitoring Leader Election

Check which operator instance is the current leader:

# View the lease object
kubectl get lease -n dns-system bindy-leader -o yaml

# Output shows current leader
spec:
  holderIdentity: bindy-7d8f9c5b4d-x7k2m  # Current leader pod
  leaseDurationSeconds: 15
  renewTime: "2025-11-30T12:34:56Z"

Monitor operator logs to see leadership changes:

# Watch operator logs
kubectl logs -n dns-system deployment/bindy -f

# Look for leadership events
INFO Attempting to acquire lease bindy-leader
INFO Lease acquired, this instance is now the leader
INFO Starting all controllers
WARN Leadership lost! Stopping all controllers...
INFO Lease acquired, this instance is now the leader

Failover Testing

Test automatic failover:

# Find current leader
LEADER=$(kubectl get lease -n dns-system bindy-leader -o jsonpath='{.spec.holderIdentity}')
echo "Current leader: $LEADER"

# Delete the leader pod
kubectl delete pod -n dns-system $LEADER

# Watch for new leader election (typically ~15 seconds)
kubectl get lease -n dns-system bindy-leader -w

# Verify DNS operations continue uninterrupted
kubectl get bind9instances -A

RBAC Requirements

Leader election requires additional permissions in the operator’s Role:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: bindy
  namespace: dns-system
rules:
# Leases for leader election
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["get", "create", "update", "patch"]

Troubleshooting

Operator not reconciling resources:

# Check which instance is leader
kubectl get lease -n dns-system bindy-leader -o jsonpath='{.spec.holderIdentity}'

# Verify that pod exists and is running
kubectl get pods -n dns-system

# Check operator logs
kubectl logs -n dns-system deployment/bindy -f

Multiple operators reconciling (split brain):

This should never happen with proper leader election. If you suspect it:

# Check lease configuration
kubectl get lease -n dns-system bindy-leader -o yaml

# Verify all operators use the same LEASE_NAME
kubectl get deployment -n dns-system bindy -o yaml | grep LEASE_NAME

# Force lease release (recreate it)
kubectl delete lease -n dns-system bindy-leader

Leader election disabled but multiple replicas running:

This will cause conflicts. Either:

Enable leader election: Set ENABLE_LEADER_ELECTION=true
Or run single replica: kubectl scale deployment bindy --replicas=1

Performance Impact

Leader election adds minimal overhead:

Failover time: ~15 seconds (configurable via LEASE_DURATION_SECONDS)
Network traffic: 1 lease renewal every 2 seconds from leader only
CPU/Memory: Negligible (<1% increase)

Best Practices

Run 3+ Operator Replicas - For operator HA with leader election
Run 3+ DNS Instance Replicas - Odd numbers for quorum
Multi-AZ Deployment - Spread across availability zones
Geographic Redundancy - At least 2 regions for critical zones
Monitor Continuously - Alert on degraded HA
Test Failover - Regular disaster recovery drills (both operator and DNS instances)
Automate Recovery - Use Kubernetes self-healing
Document Procedures - Runbooks for incidents
Enable Leader Election - Always run operator with ENABLE_LEADER_ELECTION=true in production
Monitor Lease Health - Alert if lease ownership changes frequently (indicates instability)

Next Steps

Zone Transfers - Configure zone replication
Replication - Multi-region replication strategies
Performance - Optimize for high availability

Zone Transfers

Configure and optimize DNS zone transfers between primary and secondary instances.

Overview

Zone transfers replicate DNS zone data from primary to secondary servers using AXFR (full transfer) or IXFR (incremental transfer).

Configuring Zone Transfers

Primary Instance Setup

Allow zone transfers to secondary servers:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
spec:
  config:
    allowTransfer:
      - "10.0.0.0/8"        # Secondary network
      - "192.168.100.0/24"  # Specific secondary subnet

Secondary Instance Setup

Configure secondary zones to transfer from primary:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
spec:
  zoneName: example.com
  type: secondary
  instanceSelector:
    matchLabels:
      dns-role: secondary
  secondaryConfig:
    primaryServers:
      - "10.0.1.10"  # Primary DNS server IP
      - "10.0.1.11"  # Backup primary IP

Transfer Types

Full Transfer (AXFR)

Transfers entire zone:

Used for initial zone load
Triggered manually or when IXFR unavailable
More bandwidth intensive

Incremental Transfer (IXFR)

Transfers only changes since last serial:

More efficient for large zones
Requires serial number tracking
Automatically used when available

Transfer Triggers

NOTIFY Messages

Primary sends NOTIFY when zone changes:

graph TB
    primary["Primary Updates Zone"]
    sec1["Secondary 1"]
    sec2["Secondary 2"]
    sec3["Secondary 3"]
    transfer["Secondaries initiate IXFR/AXFR"]

    primary -->|NOTIFY| sec1
    primary -->|NOTIFY| sec2
    primary -->|NOTIFY| sec3
    sec1 --> transfer
    sec2 --> transfer
    sec3 --> transfer

    style primary fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style sec1 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec3 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style transfer fill:#fff9c4,stroke:#f57f17,stroke-width:2px

Refresh Timer

Secondary checks for updates periodically:

soaRecord:
  refresh: 3600  # Check every hour
  retry: 600     # Retry after 10 minutes if failed

Manual Trigger

Force zone transfer:

# On secondary pod
kubectl exec -n dns-system deployment/secondary-dns -- \
  rndc retransfer example.com

Monitoring Zone Transfers

Check Transfer Status

# View transfer logs
kubectl logs -n dns-system -l dns-role=secondary | grep "transfer of"

# Successful transfer
# transfer of 'example.com/IN' from 10.0.1.10#53: Transfer completed: 1 messages, 42 records

# Check zone status
kubectl exec -n dns-system deployment/secondary-dns -- \
  rndc zonestatus example.com

Verify Serial Numbers

# Primary serial
kubectl exec -n dns-system deployment/primary-dns -- \
  dig @localhost example.com SOA +short | awk '{print $3}'

# Secondary serial  
kubectl exec -n dns-system deployment/secondary-dns -- \
  dig @localhost example.com SOA +short | awk '{print $3}'

# Should match when in sync

Transfer Performance

Optimize Transfer Speed

Use IXFR - Only transfer changes
Increase Bandwidth - Adequate network resources
Compress Transfers - Enable BIND9 compression
Parallel Transfers - Multiple zones transfer concurrently

Transfer Limits

Configure maximum concurrent transfers:

# In BIND9 config (future enhancement)
options {
  transfers-in 10;   # Max incoming transfers
  transfers-out 10;  # Max outgoing transfers
};

Security

Access Control

Restrict transfers by IP:

spec:
  config:
    allowTransfer:
      - "10.0.0.0/8"  # Only this network

TSIG Authentication

Use TSIG keys for authenticated transfers:

# 1. Create a Kubernetes Secret with RNDC/TSIG credentials
apiVersion: v1
kind: Secret
metadata:
  name: transfer-key-secret
  namespace: dns-system
type: Opaque
stringData:
  key-name: transfer-key
  secret: K2xkajflkajsdf09asdfjlaksjdf==  # base64-encoded HMAC key

---
# 2. Reference the secret in Bind9Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  rndcSecretRefs:
    - name: transfer-key-secret
      algorithm: hmac-sha256  # Algorithm for this key

The secret will be used for authenticated zone transfers between primary and secondary servers.

Troubleshooting

Transfer Failures

Check network connectivity:

kubectl exec -n dns-system deployment/secondary-dns -- \
  nc -zv primary-dns-service 53

Test manual transfer:

kubectl exec -n dns-system deployment/secondary-dns -- \
  dig @primary-dns-service example.com AXFR

Check ACLs:

kubectl get bind9instance primary-dns -o jsonpath='{.spec.config.allowTransfer}'

Slow Transfers

Check zone size:

kubectl exec -n dns-system deployment/primary-dns -- \
  wc -l /var/lib/bind/zones/example.com.zone

Monitor transfer time:

kubectl logs -n dns-system -l dns-role=secondary | \
  grep "transfer of" | grep "msecs"

Transfer Lag

Check refresh interval:

kubectl get dnszone example-com -o jsonpath='{.spec.soaRecord.refresh}'

Force immediate transfer:

kubectl exec -n dns-system deployment/secondary-dns -- \
  rndc retransfer example.com

Best Practices

Use IXFR - More efficient than full transfers
Set Appropriate Refresh - Balance freshness vs load
Monitor Serial Numbers - Detect sync issues
Secure Transfers - Use ACLs and TSIG
Test Failover - Verify secondaries work when primary fails
Log Transfers - Monitor for failures
Geographic Distribution - Secondaries in different regions

Example: Complete Setup

# Primary Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  labels:
    dns-role: primary
spec:
  replicas: 2
  config:
    allowTransfer:
      - "10.0.0.0/8"
---
# Primary Zone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-primary
spec:
  zoneName: example.com
  type: primary
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin@example.com
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
---
# Secondary Instance  
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  labels:
    dns-role: secondary
spec:
  replicas: 2
---
# Secondary Zone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
spec:
  zoneName: example.com
  type: secondary
  instanceSelector:
    matchLabels:
      dns-role: secondary
  secondaryConfig:
    primaryServers:
      - "primary-dns-service.dns-system.svc.cluster.local"

Next Steps

Replication - Multi-region replication strategies
High Availability - HA architecture
Performance - Optimize zone transfer performance

Replication

Implement multi-region DNS replication strategies for global availability.

Replication Models

Hub-and-Spoke

One central primary, multiple regional secondaries:

graph TB
    primary["Primary (us-east-1)"]
    sec1["Secondary<br/>(us-west)"]
    sec2["Secondary<br/>(eu-west)"]
    sec3["Secondary<br/>(ap-south)"]

    primary --> sec1
    primary --> sec2
    primary --> sec3

    style primary fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style sec1 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec3 fill:#e1f5ff,stroke:#01579b,stroke-width:2px

Pros: Simple, clear source of truth Cons: Single point of failure, latency for distant regions

Multi-Primary

Multiple primaries in different regions:

graph TB
    primaryA["Primary A<br/>(us-east)"]
    primaryB["Primary B<br/>(eu-west)"]
    sec1["Secondary<br/>(us-west)"]
    sec2["Secondary<br/>(ap-south)"]

    primaryA <-->|Sync| primaryB
    primaryA --> sec1
    primaryB --> sec2

    style primaryA fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style primaryB fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style sec1 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style sec2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px

Pros: Regional updates, better latency Cons: Complex synchronization, conflict resolution

Hierarchical

Tiered replication structure:

graph TB
    global["Global Primary"]
    reg1["Regional<br/>Primary"]
    reg2["Regional<br/>Primary"]
    reg3["Regional<br/>Primary"]
    local1["Local<br/>Secondary"]
    local2["Local<br/>Secondary"]
    local3["Local<br/>Secondary"]

    global --> reg1
    global --> reg2
    global --> reg3
    reg1 --> local1
    reg2 --> local2
    reg3 --> local3

    style global fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style reg1 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style reg2 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style reg3 fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style local1 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style local2 fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style local3 fill:#e1f5ff,stroke:#01579b,stroke-width:2px

Pros: Scales well, reduces global load Cons: More complex, longer propagation time

Configuration Examples

Hub-and-Spoke Setup

# Central Primary (us-east-1)
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: global-primary
  labels:
    dns-role: primary
    region: us-east-1
spec:
  replicas: 3
  config:
    allowTransfer:
      - "10.0.0.0/8"  # Allow all regional networks
---
# Regional Secondaries
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-us-west
  labels:
    dns-role: secondary
    region: us-west-2
spec:
  replicas: 2
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-eu-west
  labels:
    dns-role: secondary
    region: eu-west-1
spec:
  replicas: 2

Replication Latency

Measuring Propagation Time

# Update record on primary
kubectl apply -f new-record.yaml

# Check serial on primary
PRIMARY_SERIAL=$(kubectl exec -n dns-system deployment/global-primary -- \
  dig @localhost example.com SOA +short | awk '{print $3}')

# Wait and check secondary
SECONDARY_SERIAL=$(kubectl exec -n dns-system deployment/secondary-eu-west -- \
  dig @localhost example.com SOA +short | awk '{print $3}')

# Calculate lag
echo "Primary: $PRIMARY_SERIAL, Secondary: $SECONDARY_SERIAL"

Optimizing Propagation

Reduce refresh interval - More frequent checks
Enable NOTIFY - Immediate notification of changes
Use IXFR - Faster incremental transfers
Optimize network - Low-latency connections between regions

Automatic Zone Transfer Configuration

New in v0.1.0: Bindy automatically configures zone transfers between primary and secondary instances.

When you create a DNSZone resource, Bindy automatically:

Discovers secondary instances - Finds all Bind9Instance resources labeled with role=secondary in the cluster
Configures zone transfers - Adds also-notify and allow-transfer directives with secondary IP addresses
Tracks secondary IPs - Stores current secondary IPs in DNSZone.status.secondaryIps
Detects IP changes - Monitors for secondary pod IP changes (due to restarts, rescheduling, scaling)
Auto-updates zones - Automatically reconfigures zones when secondary IPs change

Example:

# Check automatically configured secondary IPs
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.secondaryIps}'
# Output: ["10.244.1.5","10.244.2.8"]

# Verify zone configuration on primary
kubectl exec -n dns-system deployment/primary-dns -- \
  curl -s localhost:8080/api/zones/example.com | jq '.alsoNotify, .allowTransfer'

Self-Healing: When secondary pods are rescheduled and get new IPs:

Detection happens within 5-10 minutes (next reconciliation cycle)
Zones are automatically updated with new secondary IPs
Zone transfers resume automatically with no manual intervention

No manual configuration needed! The old approach of manually configuring allowTransfer networks is no longer required for Kubernetes-managed instances.

Conflict Resolution

When using multi-primary setups, handle conflicts:

Prevention

Separate zones per primary
Use different subdomains per region
Implement locking mechanism

Detection

# Compare zones between primaries
diff <(kubectl exec deployment/primary-us -- cat /var/lib/bind/zones/example.com.zone) \
     <(kubectl exec deployment/primary-eu -- cat /var/lib/bind/zones/example.com.zone)

Monitoring Replication

Replication Dashboard

Monitor:

Serial number sync status
Replication lag per region
Transfer success/failure rate
Zone size and growth

Alerts

Set up alerts for:

Serial number drift > threshold
Failed zone transfers
Replication lag > SLA
Network connectivity issues

Best Practices

Document topology - Clear replication map
Monitor lag - Track propagation time
Test failover - Regular DR drills
Use consistent serials - YYYYMMDDnn format
Automate updates - GitOps for all regions
Capacity planning - Account for replication traffic

Next Steps

High Availability - HA architecture
Zone Transfers - Transfer configuration
Performance - Optimize replication performance

Security

Secure your Bindy DNS infrastructure against threats and unauthorized access.

Security Layers

1. Network Security

Firewall rules limiting DNS access
Network policies in Kubernetes
Private networks for zone transfers

2. Access Control

Query restrictions (allowQuery)
Transfer restrictions (allowTransfer)
RBAC for Kubernetes resources

3. DNSSEC

Cryptographic validation
Zone signing
Trust chain verification

4. Pod Security

Pod Security Standards
SecurityContext settings
Read-only filesystems

Best Practices

Principle of Least Privilege - Minimal permissions
Defense in Depth - Multiple security layers
Regular Updates - Keep BIND9 and controller updated
Audit Logging - Track all changes
Encryption - TLS for management, DNSSEC for queries

Quick Security Checklist

Enable DNSSEC for public zones
Restrict allowQuery to expected networks
Limit allowTransfer to secondary servers only
Use RBAC for Kubernetes access
Enable Pod Security Standards
Regular security audits
Monitor for suspicious queries
Keep software updated

Next Steps

DNSSEC - Enable cryptographic validation
Access Control - Configure query and transfer restrictions

DNSSEC

Enable DNS Security Extensions (DNSSEC) for cryptographic validation of DNS responses.

Overview

DNSSEC adds cryptographic signatures to DNS records, preventing:

Cache poisoning
Man-in-the-middle attacks
Response tampering

Enabling DNSSEC

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
spec:
  config:
    dnssec:
      enabled: true      # Enable DNSSEC signing
      validation: true   # Enable DNSSEC validation

DNSSEC Record Types

DNSKEY - Public signing keys
RRSIG - Resource record signatures
NSEC/NSEC3 - Proof of non-existence
DS - Delegation signer (at parent zone)

Verification

Check DNSSEC Status

# Query with DNSSEC validation
dig @$SERVICE_IP example.com +dnssec

# Check for ad (authentic data) flag
dig @$SERVICE_IP example.com +dnssec | grep "flags.*ad"

# Verify RRSIG records
dig @$SERVICE_IP example.com RRSIG

Validate Chain of Trust

# Check DS record at parent
dig @parent-dns example.com DS

# Verify DNSKEY matches DS
dig @$SERVICE_IP example.com DNSKEY

Key Management

Automatic Key Rotation

BIND9 handles automatic key rotation (future enhancement for Bindy configuration).

Manual Key Management

# Generate keys (inside BIND9 pod)
kubectl exec -n dns-system deployment/primary-dns -- \
  dnssec-keygen -a RSASHA256 -b 2048 -n ZONE example.com

# Sign zone
kubectl exec -n dns-system deployment/primary-dns -- \
  dnssec-signzone -o example.com /var/lib/bind/zones/example.com.zone

Troubleshooting

DNSSEC Validation Failures

# Check validation logs
kubectl logs -n dns-system -l instance=primary-dns | grep dnssec

# Test with validation disabled
dig @$SERVICE_IP example.com +cd

# Verify time synchronization (critical for DNSSEC)
kubectl exec -n dns-system deployment/primary-dns -- date

Best Practices

Enable on primaries - Sign at source
Monitor expiration - Alert on expiring signatures
Test before enabling - Verify in staging first
Keep clocks synced - NTP critical for DNSSEC
Plan key rotation - Regular key updates

Next Steps

Security - Overall security strategy
Access Control - Query restrictions

Access Control

Configure fine-grained access control for DNS queries and zone transfers.

Query Access Control

Restrict who can query your DNS servers:

Public DNS (Allow All)

spec:
  config:
    allowQuery:
      - "0.0.0.0/0"  # IPv4 - anyone
      - "::/0"       # IPv6 - anyone

Internal DNS (Restricted)

spec:
  config:
    allowQuery:
      - "10.0.0.0/8"      # RFC1918 private
      - "172.16.0.0/12"   # RFC1918 private
      - "192.168.0.0/16"  # RFC1918 private

Specific Networks

spec:
  config:
    allowQuery:
      - "192.168.1.0/24"   # Office network
      - "10.100.0.0/16"    # VPN network
      - "172.20.5.10"      # Specific host

Zone Transfer Access Control

Restrict zone transfers to authorized servers:

spec:
  config:
    allowTransfer:
      - "10.0.1.0/24"      # Secondary DNS subnet
      - "192.168.100.5"    # Specific secondary
      - "192.168.100.6"    # Another secondary

Block All Transfers

spec:
  config:
    allowTransfer: []  # No transfers allowed

ACL Best Practices

Default Deny - Start restrictive, open as needed
Use CIDR Blocks - More maintainable than individual IPs
Document ACLs - Note why each entry exists
Regular Review - Remove obsolete entries
Test Changes - Verify before production

Network Policies

Kubernetes NetworkPolicies add another layer:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dns-ingress
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app: bind9
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector: {}  # Allow from all namespaces
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53

Testing Access Control

# From allowed network (should work)
dig @$SERVICE_IP example.com

# From blocked network (should timeout or refuse)
dig @$SERVICE_IP example.com
# ;; communications error: connection timed out

# Test zone transfer restriction
dig @$SERVICE_IP example.com AXFR
# Transfer should fail if not in allowTransfer list

Next Steps

Security - Overall security
DNSSEC - Cryptographic validation

Performance

Optimize Bindy DNS infrastructure for maximum performance and efficiency.

Performance Metrics

Key metrics to monitor:

Query latency - Time to respond to DNS queries
Throughput - Queries per second (QPS)
Resource usage - CPU and memory utilization
Cache hit ratio - Percentage of cached responses
Reconciliation loops - Unnecessary status updates

Controller Performance

Status Update Optimization

The Bindy operator implements status change detection in all reconcilers to prevent tight reconciliation loops. This optimization:

Reduces Kubernetes API calls by skipping unnecessary status updates
Prevents reconciliation storms that can occur when status updates trigger new reconciliations
Improves overall system performance by reducing CPU and network overhead

All reconcilers check if the status has actually changed before updating the status subresource. Status updates only occur when:

Condition type changes
Status value changes
Message changes
Status doesn’t exist yet

This optimization is implemented across all resource types:

Bind9Cluster
Bind9Instance
DNSZone
All DNS record types (A, AAAA, CNAME, MX, NS, SRV, TXT, CAA)

For more details, see the Reconciliation Logic documentation.

Optimization Strategies

1. Resource Allocation

Provide adequate CPU and memory:

spec:
  resources:
    requests:
      cpu: "500m"
      memory: "512Mi"
    limits:
      cpu: "2000m"
      memory: "2Gi"

2. Horizontal Scaling

Add more replicas for higher capacity:

spec:
  replicas: 5  # More replicas = more capacity

3. Geographic Distribution

Place DNS servers near clients:

Reduced network latency
Better user experience
Regional load distribution

4. Caching Strategy

Configure BIND9 caching (when appropriate):

Longer TTLs reduce upstream queries
Negative caching for NXDOMAIN
Prefetching for popular domains

Performance Testing

Baseline Testing

# Single query latency
time dig @$SERVICE_IP example.com

# Sustained load (100 QPS for 60 seconds)
dnsp erf -s $SERVICE_IP -d example.com -q 100 -t 60

Load Testing

# Using dnsperf
dnsperf -s $SERVICE_IP -d queries.txt -l 60 -Q 1000

# Using custom script
for i in {1..1000}; do
  dig @$SERVICE_IP test$i.example.com &
done
wait

Resource Optimization

CPU Optimization

Use efficient query algorithms
Enable query parallelization
Optimize zone file format

Memory Optimization

Right-size zone cache
Limit journal size
Regular zone file cleanup

Network Optimization

Use UDP for queries (TCP for transfers)
Enable TCP Fast Open
Optimize MTU size

Monitoring Performance

# Real-time resource usage
kubectl top pods -n dns-system -l app=bind9

# Query statistics
kubectl exec -n dns-system deployment/primary-dns -- \
  rndc stats

# View statistics file
kubectl exec -n dns-system deployment/primary-dns -- \
  cat /var/cache/bind/named.stats

Performance Targets

Metric	Target	Good	Excellent
Query Latency	< 50ms	< 20ms	< 10ms
Throughput	> 1000 QPS	> 5000 QPS	> 10000 QPS
CPU Usage	< 70%	< 50%	< 30%
Memory Usage	< 80%	< 60%	< 40%
Cache Hit Ratio	> 60%	> 80%	> 90%

Next Steps

Tuning - Detailed tuning parameters
Benchmarking - Performance testing methodology

Tuning

Fine-tune BIND9 and Kubernetes parameters for optimal performance.

BIND9 Tuning

Query Performance

# Future enhancement - BIND9 tuning via Bind9Instance spec
spec:
  config:
    tuning:
      maxCacheSize: "512M"
      maxCacheTTL: 86400
      recursiveClients: 1000

Zone Transfer Tuning

Concurrent transfers: transfers-in, transfers-out
Transfer timeout: Adjust for large zones
Compression: Enable for faster transfers

Kubernetes Tuning

Pod Resources

Right-size based on load:

# Light load
resources:
  requests: {cpu: "100m", memory: "128Mi"}
  limits: {cpu: "500m", memory: "512Mi"}

# Medium load
resources:
  requests: {cpu: "500m", memory: "512Mi"}
  limits: {cpu: "2000m", memory: "2Gi"}

# Heavy load
resources:
  requests: {cpu: "2000m", memory: "2Gi"}
  limits: {cpu: "4000m", memory: "4Gi"}

HPA (Horizontal Pod Autoscaling)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: bind9-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: primary-dns
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Node Affinity

Place DNS pods on optimized nodes:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: workload-type
          operator: In
          values:
          - dns

Network Tuning

Service Type

Consider NodePort or LoadBalancer for external access:

apiVersion: v1
kind: Service
spec:
  type: LoadBalancer  # Or NodePort
  externalTrafficPolicy: Local  # Preserve source IP

DNS Caching

Adjust TTL values:

# Short TTL for dynamic records
spec:
  ttl: 60  # 1 minute

# Long TTL for static records
spec:
  ttl: 86400  # 24 hours

OS-Level Tuning

File Descriptors

Increase limits for high query volume:

# In pod security context (future enhancement)
securityContext:
  limits:
    nofile: 65536

Network Buffers

Optimize for DNS traffic (node-level):

# Increase UDP buffer sizes
sysctl -w net.core.rmem_max=8388608
sysctl -w net.core.wmem_max=8388608

Monitoring Tuning Impact

# Before tuning - baseline
kubectl top pods -n dns-system
time dig @$SERVICE_IP example.com

# Apply tuning
kubectl apply -f tuned-config.yaml

# After tuning - compare
kubectl top pods -n dns-system
time dig @$SERVICE_IP example.com

Tuning Checklist

Right-sized pod resources
Optimal replica count
HPA configured
Appropriate TTL values
Network policies optimized
Node placement configured
Monitoring enabled
Performance tested

Next Steps

Performance - Performance overview
Benchmarking - Testing methodology

Benchmarking

Measure and analyze DNS performance using industry-standard tools.

Tools

dnsperf

Industry-standard DNS benchmarking:

# Install dnsperf
apt-get install dnsperf

# Create query file
cat > queries.txt <<'QUERIES'
example.com A
www.example.com A
mail.example.com MX
QUERIES

# Run benchmark
dnsperf -s $SERVICE_IP -d queries.txt -l 60 -Q 1000

resperf

Response rate testing:

# Test maximum QPS
resperf -s $SERVICE_IP -d queries.txt -m 10000

dig

Simple latency testing:

# Measure query time
dig @$SERVICE_IP example.com | grep "Query time"

# Multiple queries for average
for i in {1..100}; do
  dig @$SERVICE_IP example.com +stats | grep "Query time"
done | awk '{sum+=$4; count++} END {print "Average:", sum/count, "ms"}'

Benchmark Scenarios

Scenario 1: Baseline Performance

Single client, sequential queries:

dnsperf -s $SERVICE_IP -d queries.txt -l 60 -Q 100

Expected: < 10ms latency, > 90% success

Scenario 2: Load Test

Multiple clients, high QPS:

dnsperf -s $SERVICE_IP -d queries.txt -l 300 -Q 5000 -c 50

Expected: < 50ms latency under load

Scenario 3: Stress Test

Maximum capacity test:

resperf -s $SERVICE_IP -d queries.txt -m 50000

Expected: Find maximum QPS before degradation

Metrics to Collect

Response Time

Minimum latency
Average latency
95th percentile
99th percentile
Maximum latency

Throughput

Queries per second
Successful responses
Failed queries
Timeout rate

Resource Usage

# During benchmark
kubectl top pods -n dns-system

# CPU and memory trends
kubectl top pods -n dns-system --use-protocol-buffers

Sample Benchmark Report

Benchmark: Load Test
Date: 2024-11-26
Duration: 300 seconds
Target QPS: 5000

Results:
- Queries sent: 1,500,000
- Queries completed: 1,498,500
- Success rate: 99.9%
- Average latency: 12.3ms
- 95th percentile: 24.1ms
- 99th percentile: 45.2ms
- Max latency: 89.5ms

Resource Usage:
- Average CPU: 1.2 cores
- Average Memory: 512MB
- Peak CPU: 1.8 cores
- Peak Memory: 768MB

Continuous Benchmarking

Automated Testing

apiVersion: batch/v1
kind: CronJob
metadata:
  name: dns-benchmark
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: dnsperf
            image: dnsperf:latest
            command:
            - /bin/sh
            - -c
            - dnsperf -s primary-dns -d /queries.txt -l 60 >> /results/benchmark.log

Trend Analysis

Track performance over time:

Daily benchmarks
Compare before/after changes
Identify degradation early
Capacity planning

Best Practices

Consistent tests - Same queries, duration
Isolated environment - Minimize external factors
Multiple runs - Average results
Document changes - Link to config changes
Realistic load - Match production patterns

Next Steps

Performance - Performance overview
Tuning - Optimization parameters

Integration

Integrate Bindy with other Kubernetes and DNS systems.

Integration Patterns

1. Internal Service Discovery

Use Bindy for internal service DNS.

2. Hybrid DNS

Combine Bindy with external DNS providers.

3. GitOps

Manage DNS configuration through Git.

Kubernetes Integration

CoreDNS Integration

Use Bindy alongside CoreDNS:

# CoreDNS for cluster.local
# Bindy for custom domains

Linkerd Service Mesh

Integrate with Linkerd:

Custom DNS resolution for internal services
Service discovery integration
Traffic routing with DNS-based endpoints
mTLS-secured management communication (RNDC API)

Next Steps

External DNS - External provider integration
Service Discovery - Kubernetes service discovery

External DNS Integration

Integrate Bindy with external DNS management systems.

Use Cases

Hybrid Cloud - Internal DNS in Bindy, external in cloud provider
Public/Private Split - Public zones external, private in Bindy
Migration - Gradual migration from external to Bindy

Integration with external-dns

External-dns manages external providers (Route53, CloudDNS), Bindy manages internal BIND9.

Separate Domains

# external-dns manages example.com (public)
# Bindy manages internal.example.com (private)

Forwarding

Configure external DNS to forward to Bindy for internal zones.

Best Practices

Clear boundaries - Document which system owns which zones
Consistent records - Synchronize where needed
Separate responsibilities - External for public, Bindy for internal

Next Steps

Integration - Integration overview
Service Discovery - Kubernetes service discovery

Service Discovery

Use Bindy for Kubernetes service discovery and internal DNS.

Kubernetes Service DNS

Automatic Service Records

Create DNS records for Kubernetes services:

apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: production
spec:
  selector:
    app: myapp
  ports:
  - port: 80
---
# Create corresponding DNS record
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: myapp
spec:
  zone: internal-local
  name: myapp.production
  ipv4Address: "10.100.5.10"  # Service ClusterIP

Service Discovery Pattern

graph TB
    app["Application Query:<br/>myapp.production.internal.local"]
    dns["Bindy DNS Server"]
    result["Returns: 10.100.5.10"]
    svc["Kubernetes Service"]

    app --> dns
    dns --> result
    result --> svc

    style app fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    style dns fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    style result fill:#e1f5ff,stroke:#01579b,stroke-width:2px
    style svc fill:#f3e5f5,stroke:#4a148c,stroke-width:2px

Dynamic Updates

Automatically update DNS when services change (future enhancement):

# Controller watches Services and creates DNS records

Best Practices

Consistent naming - Match service names to DNS names
Namespace separation - Use subdomains per namespace
TTL management - Short TTLs for dynamic services
Health checks - Only advertise healthy services

Next Steps

Integration - Integration patterns
External DNS - External DNS integration

Development Setup

Set up your development environment for contributing to Bindy.

Prerequisites

Required Tools

Rust - 1.70 or later
Kubernetes - 1.27 or later (for testing)
kubectl - Matching your Kubernetes version
Docker - For building images
kind - For local Kubernetes testing (optional)

Install Rust

# Install rustup
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Verify installation
rustc --version
cargo --version

Install Development Tools

# Install cargo tools
cargo install cargo-watch  # Auto-rebuild on changes
cargo install cargo-tarpaulin  # Code coverage

# Install mdbook for documentation
cargo install mdbook

Clone Repository

git clone https://github.com/firestoned/bindy.git
cd bindy

Project Structure

bindy/
├── src/              # Rust source code
│   ├── main.rs       # Entry point
│   ├── crd.rs        # CRD definitions
│   ├── reconcilers/  # Reconciliation logic
│   └── bind9.rs      # BIND9 integration
├── deploy/           # Kubernetes manifests
│   ├── crds/         # CRD definitions
│   ├── rbac/         # RBAC resources
│   └── controller/   # Controller deployment
├── tests/            # Integration tests
├── examples/         # Example configurations
├── docs/             # Documentation
└── Cargo.toml        # Rust dependencies

Dependencies

Key dependencies:

kube - Kubernetes client
tokio - Async runtime
serde - Serialization
tracing - Logging

See Cargo.toml for full list.

IDE Setup

VS Code

Recommended extensions:

rust-analyzer
crates
Even Better TOML
Kubernetes

IntelliJ IDEA / CLion

Install Rust plugin
Install Kubernetes plugin

Verify Setup

# Build the project
cargo build

# Run tests
cargo test

# Run clippy (linter)
cargo clippy

# Format code
cargo fmt

If all commands succeed, your development environment is ready!

Next Steps

Building from Source - Build the controller
Running Tests - Test your changes
Development Workflow - Daily development workflow

Building from Source

Build the Bindy controller from source code.

Build Debug Version

For development with debug symbols:

cargo build

Binary location: target/debug/bindy

Build Release Version

Optimized for production:

cargo build --release

Binary location: target/release/bindy

Run Locally

# Set log level
export RUST_LOG=info

# Run controller (requires kubeconfig)
cargo run --release

Build Docker Image

# Build image
docker build -t bindy:dev .

# Or use make
make docker-build TAG=dev

Build for Different Platforms

Cross-Compilation

# Install cross
cargo install cross

# Build for Linux (from macOS/Windows)
cross build --release --target x86_64-unknown-linux-gnu

Multi-Architecture Images

# Build for multiple architectures
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t bindy:multi \
  --push .

Build Documentation

Rustdoc (API docs)

cargo doc --no-deps --open

mdBook (User guide)

Prerequisites:

The documentation uses Mermaid diagrams which require the mdbook-mermaid preprocessor:

# Install mdbook-mermaid
cargo install mdbook-mermaid

# Ensure ~/.cargo/bin is in your PATH
export PATH="$HOME/.cargo/bin:$PATH"

# Initialize Mermaid support (first time only)
mdbook-mermaid install .

Build and serve:

# Build book
mdbook build

# Serve locally
mdbook serve --open

Combined Documentation

make docs

Optimization

Profile-Guided Optimization

# Generate profile data
cargo build --release
./target/release/bindy  # Run workload

# Build with PGO
cargo build --release

Size Optimization

# In Cargo.toml
[profile.release]
opt-level = 'z'     # Optimize for size
lto = true          # Link-time optimization
codegen-units = 1   # Better optimization
strip = true        # Strip symbols

Troubleshooting

Build Errors

OpenSSL not found:

# Ubuntu/Debian
apt-get install libssl-dev pkg-config

# macOS
brew install openssl

Linker errors:

# Install build essentials
apt-get install build-essential

Next Steps

Running Tests - Test your build
Development Workflow - Daily development

Running Tests

Run and write tests for Bindy.

Unit Tests

# Run all tests
cargo test

# Run specific test
cargo test test_name

# Run with output
cargo test -- --nocapture

Integration Tests

# Requires Kubernetes cluster
cargo test --test simple_integration -- --ignored

# Or use make
make test-integration

Test Coverage

# Install tarpaulin
cargo install cargo-tarpaulin

# Generate coverage
cargo tarpaulin --out Html

# Open report
open tarpaulin-report.html

Writing Tests

See Testing Guidelines for details.

Bindy DNS Controller - Testing Guide

Complete guide for testing the Bindy DNS Controller, including unit tests and integration tests with Kind (Kubernetes in Docker).

Quick Start

# Unit tests (fast, no Kubernetes required)
make test

# Integration tests (automated with Kind cluster)
make kind-integration-test

# View results
# Unit: 62 tests passing
# Integration: All 8 DNS record types + infrastructure tests

Test Overview
Unit Tests
Integration Tests
Makefile Targets
Troubleshooting
CI/CD Integration

Test Overview

Test Results

Unit Tests: 62 PASSING ✅

test result: ok. 62 passed; 0 failed; 0 ignored

Integration Tests: Automated with Kind

Kubernetes connectivity ✅
CRD verification ✅
All 8 DNS record types ✅
Resource lifecycle ✅

Test Structure

bindy/
├── src/
│   ├── crd_tests.rs              # CRD structure tests (28 tests)
│   └── reconcilers/
│       └── tests.rs              # Bind9Manager tests (34 tests)
├── tests/
│   ├── simple_integration.rs     # Rust integration tests
│   ├── integration_test.sh       # Full integration test suite
│   └── common/mod.rs            # Shared test utilities
└── deploy/
    ├── kind-deploy.sh           # Deploy to Kind cluster
    ├── kind-test.sh             # Basic functional tests
    └── kind-cleanup.sh          # Cleanup Kind cluster

Unit Tests

Unit tests run locally without Kubernetes (< 1 second).

Running Unit Tests

# All unit tests
make test
# or
cargo test

# Specific module
cargo test crd_tests::
cargo test bind9::tests::

# With output
cargo test -- --nocapture

Unit Test Coverage (62 tests)

CRD Tests (28 tests)

Label selectors and matching
SOA record structure
DNSZone specs (primary/secondary)
All DNS record types (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)
Bind9Instance configurations
DNSSEC settings

Bind9Manager Tests (34 tests)

Zone file creation
Email formatting for DNS
All DNS record types (with/without TTL)
Secondary zone configuration
Zone lifecycle (create, exists, delete)
Edge cases and workflows

Integration Tests

Integration tests run against Kind (Kubernetes in Docker) clusters.

Prerequisites

# Docker
docker --version  # 20.10+

# Kind
kind --version    # 0.20.0+
brew install kind  # macOS

# kubectl
kubectl version --client  # 1.24+

Running Integration Tests

Full Integration Suite (Recommended)

make kind-integration-test

This automatically:

Creates Kind cluster (if needed)
Builds and deploys controller
Runs all integration tests
Cleans up test resources

Step-by-Step

# 1. Deploy to Kind
make kind-deploy

# 2. Run functional tests
make kind-test

# 3. Run comprehensive integration tests
make kind-integration-test

# 4. View logs
make kind-logs

# 5. Cleanup
make kind-cleanup

Integration Test Coverage

Rust Integration Tests

test_kubernetes_connectivity - Cluster access
test_crds_installed - CRD verification
test_create_and_cleanup_namespace - Namespace lifecycle

Full Integration Suite (integration_test.sh)

Bind9Instance creation
DNSZone creation
A Record (IPv4)
AAAA Record (IPv6)
CNAME Record
MX Record
TXT Record
NS Record
SRV Record
CAA Record

Expected Output

🧪 Running Bindy Integration Tests

✅ Using existing cluster 'bindy-test'

1️⃣  Running Rust integration tests...
test test_kubernetes_connectivity ... ok
test test_crds_installed ... ok
test test_create_and_cleanup_namespace ... ok

2️⃣  Running functional tests with kubectl...
Testing Bind9Instance creation...
Testing DNSZone creation...
Testing all DNS record types...

3️⃣  Verifying resources...
  ✓ Bind9Instance created
  ✓ DNSZone created
  ✓ arecord created
  ✓ aaaarecord created
  ✓ cnamerecord created
  ✓ mxrecord created
  ✓ txtrecord created
  ✓ nsrecord created
  ✓ srvrecord created
  ✓ caarecord created

✅ All integration tests passed!

Makefile Targets

Test Targets

make test                   # Run unit tests
make test-lib              # Library tests only
make test-integration      # Rust integration tests
make test-all             # Unit + Rust integration tests
make test-cov             # Coverage report (HTML)
make test-cov-view        # Generate and open coverage

Kind Targets

make kind-create          # Create Kind cluster
make kind-deploy          # Deploy controller
make kind-test            # Basic functional tests
make kind-integration-test # Full integration suite
make kind-logs            # View controller logs
make kind-cleanup         # Delete cluster

Other Targets

make lint                 # Run clippy and fmt check
make format               # Format code
make build                # Build release binary
make docker-build         # Build Docker image

Troubleshooting

Unit Tests

Tests fail to compile

cargo clean
cargo test

Specific test fails

cargo test test_name -- --nocapture

Integration Tests

“Cluster not found”

# Auto-created by integration test, or:
./deploy/kind-deploy.sh

“Controller not ready”

# Check status
kubectl get pods -n dns-system

# View logs
kubectl logs -n dns-system -l app=bindy

# Redeploy
./deploy/kind-deploy.sh

“CRDs not installed”

# Check CRDs
kubectl get crds | grep bindy.firestoned.io

# Install
kubectl apply -k deploy/crds

Resource creation fails

# Controller logs
kubectl logs -n dns-system -l app=bindy --tail=50

# Resource status
kubectl describe bind9instance <name> -n dns-system

# Events
kubectl get events -n dns-system --sort-by='.lastTimestamp'

Manual Cleanup

# Delete test resources
kubectl delete bind9instances,dnszones,arecords,aaaarecords,cnamerecords,mxrecords,txtrecords,nsrecords,srvrecords,caarecords --all -n dns-system

# Delete cluster
kind delete cluster --name bindy-test

# Clean build
cargo clean

CI/CD Integration

GitHub Actions

Current PR workflow (.github/workflows/pr.yaml):

Lint (formatting, clippy)
Test (unit tests)
Build (stable, beta)
Docker (build and push to ghcr.io)
Security audit
Coverage

Add Integration Tests

integration-tests:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: dtolnay/rust-toolchain@stable

    - name: Install Kind
      run: |
        curl -Lo ./kind https://kind.sigs.k8s.io/dl/latest/kind-linux-amd64
        chmod +x ./kind
        sudo mv ./kind /usr/local/bin/kind

    - name: Run Integration Tests
      run: |
        chmod +x tests/integration_test.sh
        ./tests/integration_test.sh

Test Development

Writing Unit Tests

Add to src/crd_tests.rs or src/reconcilers/tests.rs:

#![allow(unused)]
fn main() {
#[test]
fn test_my_feature() {
    // Arrange
    let (_temp_dir, manager) = create_test_manager();

    // Act
    let result = manager.my_operation();

    // Assert
    assert!(result.is_ok());
}
}

Writing Integration Tests

Add to tests/simple_integration.rs:

#![allow(unused)]
fn main() {
#[tokio::test]
#[ignore]  // Always mark as ignored
async fn test_my_scenario() {
    let client = match get_kube_client_or_skip().await {
        Some(c) => c,
        None => return,  // Skip if no cluster
    };

    // Test code here
}
}

Using Test Helpers

From tests/common/mod.rs:

#![allow(unused)]
fn main() {
use common::*;

let client = setup_dns_test_environment("my-test-ns").await?;
create_bind9_instance(&client, "ns", "dns", None).await?;
wait_for_ready(Duration::from_secs(10)).await;
cleanup_test_namespace(&client, "ns").await?;
}

Performance Testing

Coverage

make test-cov-view
# Opens coverage/tarpaulin-report.html

Load Testing

# Create many resources
for i in {1..100}; do
  kubectl apply -f - <<EOF
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: test-${i}
  namespace: dns-system
spec:
  zone: example.com
  name: host-${i}
  ipv4Address: "192.0.2.${i}"
EOF
done

# Monitor
kubectl top pod -n dns-system

Best Practices

Unit Tests

Test one thing at a time
Fast (< 1s each)
No external dependencies
Descriptive names

Integration Tests

Always use #[ignore]
Check cluster connectivity first
Unique namespaces
Always cleanup
Good error messages

General

Run cargo fmt before committing
Run cargo clippy to catch issues
Keep tests updated
Document complex scenarios

Additional Resources

Support

GitHub Issues: https://github.com/firestoned/bindy/issues
Controller logs: make kind-logs
Test with output: cargo test -- --nocapture

Test Coverage

Test Statistics

Total Unit Tests: 95 (96 including helper tests)

Test Breakdown by Module

bind9 Module (34 tests)

Zone file and DNS record management tests:

Zone creation and management (primary/secondary)
All 8 DNS record types (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)
Record lifecycle (add, update, delete)
TTL handling
Special characters and edge cases
Complete workflow tests

bind9_resources Module (21 tests)

Kubernetes resource builder tests:

Label generation and consistency
ConfigMap creation with BIND9 configuration
Deployment creation with proper specs
Service creation with TCP/UDP ports
Pod specification validation
Volume and volume mount configuration
Health and readiness probes
BIND9 configuration options:
- Recursion settings
- ACL configuration (allowQuery, allowTransfer)
- DNSSEC configuration
- Multiple ACL entries
Resource naming conventions
Selector matching (Deployment ↔ Service)

crd_tests Module (28 tests)

CRD structure and validation tests:

Label selectors and requirements
SOA record structure
Secondary zone configuration
All DNS record specs (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)
BIND9 configuration structures
DNSSEC configuration
Bind9Instance specifications
Status structures for all resource types

Status and Condition Tests (17 new tests)

Comprehensive condition type validation:

All 5 condition types: Ready, Available, Progressing, Degraded, Failed
All 3 status values: True, False, Unknown
Condition field validation (type, status, reason, message, lastTransitionTime)
Multiple conditions support
Status structures for:
- Bind9Instance (with replicas tracking)
- DNSZone (with record count)
- All DNS record types
Condition serialization/deserialization
Observed generation tracking
Edge cases (no conditions, empty status)

Integration Tests (4 tests, 3 ignored)

Kubernetes connectivity (ignored - requires cluster)
CRD installation verification (ignored - requires cluster)
Namespace creation/cleanup (ignored - requires cluster)
Unit test verification (always runs)

Test Categories

Unit Tests (95)

Pure Functions: All resource builders, configuration generators
Data Structures: All CRD types, status structures, conditions
Business Logic: Zone management, record handling
Validation: Condition types, status values, configuration options

Integration Tests (3 ignored + 1 running)

Kubernetes cluster connectivity
CRD deployment
Resource lifecycle
End-to-end workflows

Coverage by Feature

CRD Validation

✅ All 10 CRDs have proper structure tests
✅ Condition types validated (Ready, Available, Progressing, Degraded, Failed)
✅ Status values validated (True, False, Unknown)
✅ Required fields enforced in CRD definitions
✅ Serialization/deserialization tested

BIND9 Configuration

✅ Named configuration file generation
✅ Options configuration with all settings
✅ Recursion control
✅ ACL management (query, transfer)
✅ DNSSEC configuration (enable, validation)
✅ Default value handling
✅ Multiple ACL entries
✅ Empty ACL lists

Kubernetes Resources

✅ Deployment creation with proper replica counts
✅ Service creation with TCP/UDP ports
✅ ConfigMap creation with BIND9 config
✅ Label consistency across resources
✅ Selector matching
✅ Volume and volume mount configuration
✅ Health probes (liveness, readiness)
✅ Container image version handling

DNS Records

✅ All 8 record types (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)
✅ Record creation with TTL
✅ Default TTL handling
✅ Multiple records per zone
✅ Special characters in records
✅ Record deletion
✅ Zone apex vs subdomain records

Status Management

✅ Condition creation with all fields
✅ Multiple conditions per resource
✅ Observed generation tracking
✅ Replica count tracking (Bind9Instance)
✅ Record count tracking (DNSZone)
✅ Status transitions (Ready ↔ Failed)
✅ Degraded state handling

Running Tests

All Tests

cargo test

Unit Tests Only

cargo test --lib

Specific Module

cargo test --lib bind9_resources
cargo test --lib crd_tests

Integration Tests

cargo test --test simple_integration -- --ignored

With Coverage

cargo tarpaulin --verbose --all-features --workspace --timeout 120 --out Xml

Test Quality Metrics

Coverage: High coverage of core functionality
Isolation: All unit tests are isolated and independent
Speed: All unit tests complete in < 0.01 seconds
Deterministic: No flaky tests, all results are reproducible
Comprehensive: Tests cover happy paths, edge cases, and error conditions

Recent Additions (26 new tests)

bind9_resources Module (+14 tests)

test_build_pod_spec - Pod specification validation
test_build_deployment_replicas - Replica count configuration
test_build_deployment_version - BIND9 version handling
test_build_service_ports - TCP/UDP port configuration
test_configmap_contains_all_files - ConfigMap completeness
test_options_conf_with_recursion_enabled - Recursion configuration
test_options_conf_with_multiple_acls - Multiple ACL entries
test_labels_consistency - Label validation
test_configmap_naming - Naming conventions
test_deployment_selector_matches_labels - Selector consistency
test_service_selector_matches_deployment - Service selector matching
test_dnssec_config_enabled - DNSSEC enable flag
test_dnssec_config_validation_only - DNSSEC validation flag
test_options_conf_with_empty_transfer - Empty transfer lists

crd_tests Module (+17 tests)

test_condition_types - All 5 condition types validation
test_condition_status_values - All 3 status values validation
test_condition_with_all_fields - Complete condition structure
test_multiple_conditions - Multiple conditions support
test_dnszone_status_with_conditions - DNSZone status
test_record_status_with_condition - Record status
test_degraded_condition - Degraded state handling
test_failed_condition - Failed state handling
test_available_condition - Available state
test_progressing_condition - Progressing state
test_condition_serialization - JSON serialization
test_status_with_no_conditions - Empty conditions list
test_observed_generation_tracking - Generation tracking
test_bind9_config - BIND9 configuration structure
test_dnssec_config - DNSSEC configuration
test_bind9instance_spec - Instance specification
test_bind9instance_status_default - Status defaults

Next Steps

Potential Test Additions

Integration tests for actual BIND9 deployment
Integration tests for zone transfer between primary/secondary
Performance tests for large zone files
Stress tests with many concurrent updates
Property-based tests for configuration generation
Mock reconciler tests
Controller loop tests

Test Infrastructure

Add benchmarks for critical paths
Add mutation testing
Add fuzz testing for DNS record parsing
Set up continuous coverage tracking
Add test fixtures and helpers

Continuous Integration

All tests run automatically in GitHub Actions:

PR Workflow: Runs on every pull request
Main Workflow: Runs on pushes to main branch
Coverage: Uploaded to Codecov after each run
Integration: Runs in dedicated workflow with Kind cluster

Development Workflow

Daily development workflow for Bindy contributors.

Development Cycle

Create feature branch

git checkout -b feature/my-feature

Make changes

Edit code in src/
If modifying CRDs, edit Rust types in src/crd.rs
Add tests
Update documentation

Regenerate CRDs (if modified)

# If you modified src/crd.rs, regenerate YAML files
cargo run --bin crdgen
# or
make crds

Test locally

cargo test
cargo clippy -- -D warnings
cargo fmt

Validate CRDs

# Ensure generated CRDs are valid
kubectl apply --dry-run=client -f deploy/crds/

Commit changes

git add .
git commit -m "Add feature: description"

Push and create PR

git push origin feature/my-feature
# Create PR on GitHub

CRD Development

IMPORTANT: src/crd.rs is the source of truth. CRD YAML files in deploy/crds/ are auto-generated.

Modifying Existing CRDs

Edit the Rust type in src/crd.rs:

#![allow(unused)]
fn main() {
#[derive(CustomResource, Clone, Debug, Serialize, Deserialize, JsonSchema)]
#[kube(
    group = "bindy.firestoned.io",
    version = "v1alpha1",
    kind = "Bind9Cluster",
    namespaced
)]
#[serde(rename_all = "camelCase")]
pub struct Bind9ClusterSpec {
    pub version: Option<String>,
    // Add new fields here
    pub new_field: Option<String>,
}
}

Regenerate YAML files:

cargo run --bin crdgen
# or
make crds

Verify the generated YAML:

# Check the generated file
cat deploy/crds/bind9clusters.crd.yaml

# Validate it
kubectl apply --dry-run=client -f deploy/crds/bind9clusters.crd.yaml

Update documentation to describe the new field

Adding New CRDs

Define the CustomResource in src/crd.rs
Add to crdgen in src/bin/crdgen.rs:

#![allow(unused)]
fn main() {
generate_crd::<MyNewResource>("mynewresources.crd.yaml", output_dir)?;
}

Regenerate YAMLs: make crds
Export the type in src/lib.rs if needed

Generated YAML Format

All generated CRD files include:

Copyright header
SPDX license identifier
Auto-generated warning

Never edit YAML files directly - they will be overwritten!

Local Testing

# Start kind cluster
kind create cluster --name bindy-dev

# Deploy CRDs (regenerate first if modified)
make crds
kubectl apply -k deploy/crds/

# Run controller locally
RUST_LOG=debug cargo run

Hot Reload

# Auto-rebuild on changes
cargo watch -x 'run --release'

GitHub Pages Setup Guide

This guide explains how to enable GitHub Pages for the Bindy documentation.

Prerequisites

Repository must be pushed to GitHub
You must have admin access to the repository
The .github/workflows/docs.yaml workflow file must be present

Setup Steps

1. Enable GitHub Pages

Go to your repository on GitHub: https://github.com/firestoned/bindy
Click Settings (in the repository menu)
Scroll down to the Pages section in the left sidebar
Click on Pages

2. Configure Source

Under “Build and deployment”:

Source: Select “GitHub Actions”
This will use the workflow in .github/workflows/docs.yaml

That’s it! GitHub will automatically use the workflow.

3. Trigger the First Build

The documentation will be built and deployed automatically when you push to the main branch.

To trigger the first build:

Push any change to main:
```
git push origin main
```
Or manually trigger the workflow:
- Go to Actions tab
- Click on “Documentation” workflow
- Click “Run workflow”
- Select main branch
- Click “Run workflow”

4. Monitor the Build

Go to the Actions tab in your repository
Click on the “Documentation” workflow run
Watch the build progress
Once complete, the “deploy” job will show the URL

5. Access Your Documentation

Once deployed, your documentation will be available at:

https://firestoned.github.io/bindy/

Verification

Check Deployment Status

Go to Settings → Pages
You should see: “Your site is live at https://firestoned.github.io/bindy/”
Click “Visit site” to view the documentation

Verify Documentation Structure

Your deployed site should have:

Main documentation (mdBook): https://firestoned.github.io/bindy/
API reference (rustdoc): https://firestoned.github.io/bindy/rustdoc/

Troubleshooting

Build Fails

Check workflow logs:

Go to Actions tab
Click on the failed workflow run
Expand the failed step to see the error
Common issues:
- Rust compilation errors
- mdBook build errors
- Missing files

Fix and retry:

Fix the issue locally
Test with make docs
Push the fix to main
GitHub Actions will automatically retry

Pages Not Showing

Verify GitHub Pages is enabled:

Go to Settings → Pages
Ensure source is set to “GitHub Actions”
Check that at least one successful deployment has completed

Check permissions:

The workflow needs these permissions (already configured in docs.yaml):

permissions:
  contents: read
  pages: write
  id-token: write

404 Errors on Subpages

Check base URL configuration:

The book.toml has:

site-url = "/bindy/"

This must match your repository name. If your repository is named differently, update this value.

Custom Domain (Optional)

To use a custom domain:

Go to Settings → Pages
Under “Custom domain”, enter your domain
Update the CNAME field in book.toml:
```
cname = "docs.yourdomain.com"
```
Configure DNS:
- Add a CNAME record pointing to firestoned.github.io
- Or A records pointing to GitHub Pages IPs

Updating Documentation

Documentation is automatically deployed on every push to main:

# Make changes to documentation
vim docs/src/introduction.md

# Commit and push
git add docs/src/introduction.md
git commit -m "Update introduction"
git push origin main

# GitHub Actions will automatically build and deploy

Local Preview

Before pushing, preview your changes locally:

# Build and serve documentation
make docs-serve

# Or watch for changes
make docs-watch

# Open http://localhost:3000 in your browser

Workflow Details

The GitHub Actions workflow (.github/workflows/docs.yaml):

Build job:
- Checks out the repository
- Sets up Rust toolchain
- Installs mdBook
- Builds rustdoc API documentation
- Builds mdBook user documentation
- Combines both into a single site
- Uploads artifact to GitHub Pages
Deploy job (only on main):
- Deploys the artifact to GitHub Pages
- Updates the live site

Branch Protection (Recommended)

To ensure documentation quality:

Go to Settings → Branches
Add a branch protection rule for main:
- Require pull request reviews
- Require status checks (include “Documentation / Build Documentation”)
- This ensures the documentation builds before merging

Additional Configuration

Custom Theme

The documentation uses a custom theme defined in:

docs/theme/custom.css - Custom styling

To customize:

Edit the CSS file
Test locally with make docs-watch
Push to main

Search Configuration

Search is configured in book.toml:

[output.html.search]
enable = true
limit-results = 30

Adjust as needed for your use case.

Support

For issues with GitHub Pages deployment:

GitHub Pages Status: https://www.githubstatus.com/
GitHub Actions Documentation: https://docs.github.com/en/actions
GitHub Pages Documentation: https://docs.github.com/en/pages

For issues with the documentation content:

Create an issue: https://github.com/firestoned/bindy/issues
Start a discussion: https://github.com/firestoned/bindy/discussions

Architecture Deep Dive

Technical architecture of the Bindy DNS operator.

System Architecture

┌─────────────────────────────────────┐
│     Kubernetes API Server           │
└──────────────┬──────────────────────┘
               │ Watch/Update
     ┌─────────▼────────────┐
     │  Bindy Controller    │
     │  ┌────────────────┐  │
     │  │ Reconcilers    │  │
     │  │  - Bind9Inst   │  │
     │  │  - DNSZone     │  │
     │  │  - Records     │  │
     │  └────────────────┘  │
     └──────┬───────────────┘
            │ Manages
     ┌──────▼────────────────┐
     │  BIND9 Pods           │
     │  ┌──────────────────┐ │
     │  │ ConfigMaps       │ │
     │  │ Deployments      │ │
     │  │ Services         │ │
     │  └──────────────────┘ │
     └───────────────────────┘

Components

Controller

Watches CRD resources
Reconciles desired vs actual state
Manages Kubernetes resources

Reconcilers

Per-resource reconciliation logic
Idempotent operations
Error handling and retries

BIND9 Integration

Configuration generation
Zone file management
BIND9 lifecycle management

See detailed docs:

Controller Design
Reconciliation Logic
BIND9 Integration

Controller Design

Design and implementation of the Bindy controller.

Controller Pattern

Bindy implements the Kubernetes controller pattern:

Watch - Monitor CRD resources
Reconcile - Ensure actual state matches desired
Update - Apply changes to Kubernetes resources

Reconciliation Loop

#![allow(unused)]
fn main() {
loop {
    // Get resource from work queue
    let resource = queue.pop();
    
    // Reconcile
    match reconcile(resource).await {
        Ok(_) => {
            // Success - requeue with normal delay
            queue.requeue(resource, Duration::from_secs(300));
        }
        Err(e) => {
            // Error - retry with backoff
            queue.requeue_with_backoff(resource, e);
        }
    }
}
}

State Management

Controller maintains no local state - all state in Kubernetes:

CRD resources (desired state)
Deployments, Services, ConfigMaps (actual state)
Status fields (observed state)

Error Handling

Transient errors: Retry with exponential backoff
Permanent errors: Update status, log, requeue
Resource conflicts: Retry with latest version

Reconciliation Logic

Detailed reconciliation logic for each resource type.

Status Update Optimization

All reconcilers implement status change detection to prevent tight reconciliation loops. Before updating the status subresource, each reconciler checks if the status has actually changed. This prevents unnecessary API calls and reconciliation cycles.

Status is only updated when:

Condition type changes
Status value changes
Message changes
Status doesn’t exist yet

This optimization is implemented in:

Bind9Cluster reconciler (src/reconcilers/bind9cluster.rs:394-430)
Bind9Instance reconciler (src/reconcilers/bind9instance.rs:736-758)
DNSZone reconciler (src/reconcilers/dnszone.rs:535-565)
All record reconcilers (src/reconcilers/records.rs:1032-1072)

Bind9Instance Reconciliation

#![allow(unused)]
fn main() {
async fn reconcile_bind9instance(instance: Bind9Instance) -> Result<()> {
    // 1. Build desired resources
    let configmap = build_configmap(&instance);
    let deployment = build_deployment(&instance);
    let service = build_service(&instance);
    
    // 2. Apply or update ConfigMap
    apply_configmap(configmap).await?;
    
    // 3. Apply or update Deployment
    apply_deployment(deployment).await?;
    
    // 4. Apply or update Service
    apply_service(service).await?;
    
    // 5. Update status
    update_status(&instance, "Ready").await?;
    
    Ok(())
}
}

DNSZone Reconciliation

DNSZone reconciliation uses granular status updates to provide real-time progress visibility and better error reporting. The reconciliation follows a multi-phase approach with status updates at each phase.

Reconciliation Flow

#![allow(unused)]
fn main() {
async fn reconcile_dnszone(zone: DNSZone) -> Result<()> {
    // Phase 1: Set Progressing status before primary reconciliation
    update_condition(&zone, "Progressing", "True", "PrimaryReconciling",
                     "Configuring zone on primary instances").await?;

    // Phase 2: Configure zone on primary instances
    let primary_count = add_dnszone(client, &zone, zone_manager).await
        .map_err(|e| {
            // On failure: Set Degraded status (primary failure is fatal)
            update_condition(&zone, "Degraded", "True", "PrimaryFailed",
                           &format!("Failed to configure zone on primaries: {}", e)).await?;
            e
        })?;

    // Phase 3: Set Progressing status after primary success
    update_condition(&zone, "Progressing", "True", "PrimaryReconciled",
                     &format!("Configured on {} primary server(s)", primary_count)).await?;

    // Phase 4: Set Progressing status before secondary reconciliation
    let secondary_msg = format!("Configured on {} primary server(s), now configuring secondaries", primary_count);
    update_condition(&zone, "Progressing", "True", "SecondaryReconciling", &secondary_msg).await?;

    // Phase 5: Configure zone on secondary instances (non-fatal if fails)
    match add_dnszone_to_secondaries(client, &zone, zone_manager).await {
        Ok(secondary_count) => {
            // Phase 6: Success - Set Ready status
            let msg = format!("Configured on {} primary server(s) and {} secondary server(s)",
                            primary_count, secondary_count);
            update_status_with_secondaries(&zone, "Ready", "True", "ReconcileSucceeded",
                                          &msg, secondary_ips).await?;
        }
        Err(e) => {
            // Phase 6: Partial success - Set Degraded status (primaries work, secondaries failed)
            let msg = format!("Configured on {} primary server(s), but secondary configuration failed: {}",
                            primary_count, e);
            update_status_with_secondaries(&zone, "Degraded", "True", "SecondaryFailed",
                                          &msg, secondary_ips).await?;
        }
    }

    Ok(())
}
}

Status Conditions

DNSZone reconciliation uses three condition types:

Progressing - During reconciliation phases
- Reason: PrimaryReconciling - Before primary configuration
- Reason: PrimaryReconciled - After primary configuration succeeds
- Reason: SecondaryReconciling - Before secondary configuration
- Reason: SecondaryReconciled - After secondary configuration succeeds
Ready - Successful reconciliation
- Reason: ReconcileSucceeded - All phases completed successfully
Degraded - Partial or complete failure
- Reason: PrimaryFailed - Primary configuration failed (fatal, reconciliation aborts)
- Reason: SecondaryFailed - Secondary configuration failed (non-fatal, primaries still work)

Benefits

Real-time progress visibility - Users can see which phase is running
Better error reporting - Know exactly which phase failed (primary vs secondary)
Graceful degradation - Secondary failures don’t break the zone (primaries still work)
Accurate status - Endpoint counts reflect actual configured servers

Record Reconciliation

All record types (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA) follow a consistent pattern with granular status updates for better observability.

Reconciliation Flow

#![allow(unused)]
fn main() {
async fn reconcile_record(record: Record) -> Result<()> {
    // Phase 1: Set Progressing status before configuration
    update_record_status(&record, "Progressing", "True", "RecordReconciling",
                        "Configuring A record on zone endpoints").await?;

    // Phase 2: Get zone and configure record on all endpoints
    let zone = get_zone(&record.spec.zone).await?;

    match add_record_to_all_endpoints(&zone, &record).await {
        Ok(endpoint_count) => {
            // Phase 3: Success - Set Ready status with endpoint count
            let msg = format!("Record configured on {} endpoint(s)", endpoint_count);
            update_record_status(&record, "Ready", "True", "ReconcileSucceeded", &msg).await?;
        }
        Err(e) => {
            // Phase 3: Failure - Set Degraded status with error details
            let msg = format!("Failed to configure record: {}", e);
            update_record_status(&record, "Degraded", "True", "RecordFailed", &msg).await?;
            return Err(e);
        }
    }

    Ok(())
}
}

Status Conditions

All DNS record types use three condition types:

Progressing - During record configuration
- Reason: RecordReconciling - Before adding record to zone endpoints
Ready - Successful configuration
- Reason: ReconcileSucceeded - Record configured on all endpoints
- Message includes count of configured endpoints (e.g., “Record configured on 3 endpoint(s)”)
Degraded - Configuration failure
- Reason: RecordFailed - Failed to configure record (includes error details)

Benefits

Real-time progress - See when records are being configured
Better debugging - Know immediately if/why a record failed
Accurate reporting - Status shows exact number of endpoints configured
Consistent with zones - Same status pattern as DNSZone reconciliation

Supported Record Types

All 8 record types use this granular status approach:

A - IPv4 address records
AAAA - IPv6 address records
CNAME - Canonical name (alias) records
MX - Mail exchange records
TXT - Text records (SPF, DKIM, DMARC, etc.)
NS - Nameserver delegation records
SRV - Service location records
CAA - Certificate authority authorization records

Reconciler Hierarchy and Delegation

This document describes the simplified reconciler architecture in Bindy, showing how each controller watches for resources and delegates to sub-resources.

Overview

Bindy follows a hierarchical delegation pattern where each reconciler is responsible for creating and managing its immediate child resources. This creates a clean separation of concerns and makes the system easier to understand and maintain.

graph TD
    GC[Bind9GlobalCluster<br/>cluster-scoped] -->|creates| BC[Bind9Cluster<br/>namespace-scoped]
    BC -->|creates| BI[Bind9Instance<br/>namespace-scoped]
    BI -->|creates| RES[Kubernetes Resources<br/>ServiceAccount, Secret,<br/>ConfigMap, Deployment, Service]
    BI -.->|targets| DZ[DNSZone<br/>namespace-scoped]
    BI -.->|targets| REC[DNS Records<br/>namespace-scoped]
    DZ -->|creates zones via<br/>bindcar HTTP API| BIND9[BIND9 Pods]
    REC -->|creates records via<br/>hickory DNS UPDATE| BIND9
    REC -->|notifies via<br/>bindcar HTTP API| BIND9

    style GC fill:#e1f5ff
    style BC fill:#e1f5ff
    style BI fill:#e1f5ff
    style DZ fill:#fff4e1
    style REC fill:#fff4e1
    style RES fill:#e8f5e9
    style BIND9 fill:#f3e5f5

Reconciler Details

1. Bind9GlobalCluster Reconciler

Scope: Cluster-scoped resource

Purpose: Creates Bind9Cluster resources in desired namespaces to enable multi-tenant DNS infrastructure.

Watches: Bind9GlobalCluster resources

Creates: Bind9Cluster resources in the namespace specified in the spec, or defaults to dns-system

Change Detection:

Spec changed: Uses should_reconcile() to compare metadata.generation with status.observed_generation
Desired vs actual state: Verifies all Bind9Cluster resources exist in target namespaces

Implementation: src/reconcilers/bind9globalcluster.rs

Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9GlobalCluster
metadata:
  name: global-dns
spec:
  namespaces:
    - platform-dns
    - team-web
    - team-api
  primaryReplicas: 2
  secondaryReplicas: 3

Creates Bind9Cluster resources in each namespace: platform-dns, team-web, team-api.

2. Bind9Cluster Reconciler

Scope: Namespace-scoped resource

Purpose: Creates and manages Bind9Instance resources based on desired replica counts for primary and secondary servers.

Watches: Bind9Cluster resources

Creates:

Bind9Instance resources for primaries (e.g., my-cluster-primary-0, my-cluster-primary-1)
Bind9Instance resources for secondaries (e.g., my-cluster-secondary-0, my-cluster-secondary-1)
ConfigMap with shared BIND9 configuration (optional, for standalone configs)

Change Detection:

Spec changed: Uses should_reconcile() to compare metadata.generation with status.observed_generation
Desired vs actual state:
- Verifies all Bind9Instance resources exist
- Scales instances up/down based on primaryReplicas and secondaryReplicas

Implementation: src/reconcilers/bind9cluster.rs

Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: my-cluster
  namespace: platform-dns
spec:
  primaryReplicas: 2
  secondaryReplicas: 3

Creates:

my-cluster-primary-0, my-cluster-primary-1 (primaries)
my-cluster-secondary-0, my-cluster-secondary-1, my-cluster-secondary-2 (secondaries)

3. Bind9Instance Reconciler

Scope: Namespace-scoped resource

Purpose: Creates all Kubernetes resources needed to run a single BIND9 server pod.

Watches: Bind9Instance resources

Creates:

ServiceAccount: For pod identity and RBAC
Secret: Contains auto-generated RNDC key (HMAC-SHA256) for authentication
ConfigMap: BIND9 configuration (named.conf, zone files, etc.) - only for standalone instances
Deployment: Runs the BIND9 pod with bindcar HTTP API sidecar
Service: Exposes DNS (UDP/TCP 53) and HTTP API (TCP 8080) ports

Change Detection:

Spec changed: Uses should_reconcile() to compare metadata.generation with status.observed_generation
Desired vs actual state (drift detection):
- Checks if Deployment resource exists
- Recreates missing resources if detected

Implementation: src/reconcilers/bind9instance.rs

Drift Detection Logic:

#![allow(unused)]
fn main() {
// Only reconcile resources if:
// 1. Spec changed (generation mismatch), OR
// 2. We haven't processed this resource yet (no observed_generation), OR
// 3. Resources are missing (drift detected)
let should_reconcile = should_reconcile(current_generation, observed_generation);

if !should_reconcile && deployment_exists {
    // Skip reconciliation - spec unchanged and resources exist
    return Ok(());
}

if !should_reconcile && !deployment_exists {
    // Drift detected - recreate missing resources
    info!("Spec unchanged but Deployment missing - drift detected, reconciling resources");
}
}

Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: my-cluster-primary-0
  namespace: platform-dns
spec:
  role: Primary
  clusterRef: my-cluster
  replicas: 1

Creates: ServiceAccount, Secret, ConfigMap, Deployment, Service for my-cluster-primary-0.

4. DNSZone Reconciler

Scope: Namespace-scoped resource

Purpose: Creates DNS zones in ALL BIND9 instances (primary and secondary) via the bindcar HTTP API.

Watches: DNSZone resources

Creates: DNS zones in BIND9 using the bindcar HTTP API sidecar

Change Detection:

Spec changed: Uses should_reconcile() to compare metadata.generation with status.observed_generation
Desired vs actual state:
- Checks if zone exists using zone_manager.zone_exists() via HTTP API
- Early returns if spec unchanged

Implementation: src/reconcilers/dnszone.rs

Protocol Details:

Zone operations: HTTP API via bindcar sidecar (port 8080)
Endpoints:
- POST /api/addzone/{zone} - Add primary/secondary zone
- DELETE /api/delzone/{zone} - Delete zone
- POST /api/notify/{zone} - Trigger zone transfer (NOTIFY)
- GET /api/zonestatus/{zone} - Check if zone exists

Logic Flow:

Finds all primary instances for the cluster (namespace-scoped or global)
Loads RNDC key for each instance (from Secret {instance}-rndc-key)
Calls zone_manager.add_zones() via HTTP API on all primary endpoints
Finds all secondary instances for the cluster
Calls zone_manager.add_secondary_zone() via HTTP API on all secondary endpoints
Notifies secondaries via zone_manager.notify_zone() to trigger zone transfer

Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-zone
  namespace: platform-dns
spec:
  zoneName: example.com
  clusterRef: my-cluster
  soa:
    primaryNameServer: ns1.example.com
    adminEmail: admin.example.com
    ttl: 3600

Creates zone example.com in all instances of my-cluster via HTTP API.

5. DNS Record Reconcilers

Scope: Namespace-scoped resources

Purpose: Create DNS records in zones using hickory DNS UPDATE (RFC 2136) and notify secondaries via bindcar HTTP API.

Watches: ARecord, AAAARecord, CNAMERecord, TXTRecord, MXRecord, NSRecord, SRVRecord, CAARecord

Creates: DNS records in BIND9 using two protocols:

DNS UPDATE (RFC 2136) via hickory client - for creating records
HTTP API via bindcar sidecar - for notifying secondaries

Change Detection:

Spec changed: Uses should_reconcile() to compare metadata.generation with status.observed_generation
Desired vs actual state:
- Checks if zone exists using HTTP API before adding records
- Returns error if zone doesn’t exist

Implementation: src/reconcilers/records.rs

Protocol Details:

Operation	Protocol	Port	Authentication
Check zone exists	HTTP API (bindcar)	8080	ServiceAccount token
Add/update records	DNS UPDATE (hickory)	53 (TCP)	TSIG (RNDC key)
Notify secondaries	HTTP API (bindcar)	8080	ServiceAccount token

Logic Flow:

Looks up the DNSZone resource to get zone info
Finds all primary instances for the zone’s cluster
For each primary instance:
- Checks if zone exists via HTTP API (port 8080)
- Loads RNDC key from Secret
- Creates TSIG signer for authentication
- Sends DNS UPDATE message via hickory client (port 53 TCP)
After all records are added, notifies first primary via HTTP API to trigger zone transfer

Example:

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example-com
  namespace: platform-dns
spec:
  zone: example.com
  name: www
  ipv4: 192.0.2.1
  ttl: 300

Creates A record www.example.com → 192.0.2.1 in all primary instances via DNS UPDATE, then notifies secondaries via HTTP API.

Change Detection Logic

All reconcilers implement the “changed” detection pattern, which means they reconcile when:

Spec changed: metadata.generation ≠ status.observed_generation
First reconciliation: status.observed_generation is None
Drift detected: Desired state (YAML) ≠ actual state (cluster)

Implementation: `should_reconcile()`

Located in src/reconcilers/mod.rs:127-133:

#![allow(unused)]
fn main() {
pub fn should_reconcile(current_generation: Option<i64>, observed_generation: Option<i64>) -> bool {
    match (current_generation, observed_generation) {
        (Some(current), Some(observed)) => current != observed,
        (Some(_), None) => true, // First reconciliation
        _ => false,              // No generation tracking available
    }
}
}

Kubernetes Generation Semantics

metadata.generation: Incremented by Kubernetes API server only when spec changes
status.observed_generation: Set by controller to match metadata.generation after successful reconciliation
Status-only updates: Do NOT increment metadata.generation, preventing unnecessary reconciliations

Example: Reconciliation Flow

sequenceDiagram
    participant User
    participant K8s API
    participant Reconciler
    participant Status

    User->>K8s API: Create DNSZone (generation=1)
    K8s API->>Reconciler: Watch event (generation=1)
    Reconciler->>Reconciler: should_reconcile(1, None) → true
    Reconciler->>Reconciler: Create zone via HTTP API
    Reconciler->>Status: Update observed_generation=1

    User->>K8s API: Update DNSZone spec (generation=2)
    K8s API->>Reconciler: Watch event (generation=2)
    Reconciler->>Reconciler: should_reconcile(2, 1) → true
    Reconciler->>Reconciler: Update zone via HTTP API
    Reconciler->>Status: Update observed_generation=2

    Note over Reconciler: Status-only update (no spec change)
    Reconciler->>Status: Update phase=Ready (generation stays 2)
    Reconciler->>Reconciler: should_reconcile(2, 2) → false
    Reconciler->>Reconciler: Skip reconciliation ✓

Protocol Summary

Component	Creates	Protocol	Port	Authentication
Bind9GlobalCluster	Bind9Cluster	Kubernetes API	-	ServiceAccount
Bind9Cluster	Bind9Instance	Kubernetes API	-	ServiceAccount
Bind9Instance	K8s Resources	Kubernetes API	-	ServiceAccount
DNSZone	Zones in BIND9	HTTP API (bindcar)	8080	ServiceAccount token
DNS Records	Records in zones	DNS UPDATE (hickory)	53 TCP	TSIG (RNDC key)
DNS Records	Notify secondaries	HTTP API (bindcar)	8080	ServiceAccount token

Key Architectural Principles

1. Hierarchical Delegation

Each reconciler creates and manages only its immediate children:

Bind9GlobalCluster → Bind9Cluster
Bind9Cluster → Bind9Instance
Bind9Instance → Kubernetes resources

2. Namespace Scoping

All resources (except Bind9GlobalCluster) are namespace-scoped, enabling multi-tenancy:

Teams can manage their own DNS infrastructure in their namespaces
No cross-namespace resource access required

3. Change Detection

All reconcilers implement consistent change detection:

Skip work if spec unchanged and resources exist
Detect drift and recreate missing resources
Use generation tracking to avoid unnecessary reconciliations

4. Protocol Separation

HTTP API (bindcar): Zone-level operations (add, delete, notify)
DNS UPDATE (hickory): Record-level operations (add, update, delete records)
Kubernetes API: Resource lifecycle management

5. Idempotency

All operations are idempotent:

Adding an existing zone returns success
Adding an existing record updates it
Deleting a non-existent resource returns success

6. Error Handling

Each reconciler handles errors gracefully:

Updates status with error conditions
Retries on transient failures (exponential backoff)
Requeues on permanent errors with longer delays

Owner References and Resource Cleanup

Bindy implements proper Kubernetes owner references to ensure automatic cascade deletion and prevent resource leaks.

What are Owner References?

Owner references are Kubernetes metadata that establish parent-child relationships between resources. When set, Kubernetes automatically:

Garbage collects child resources when the parent is deleted
Blocks deletion of the parent if children still exist (when blockOwnerDeletion: true)
Shows ownership in resource metadata for easy tracking

Owner Reference Hierarchy in Bindy

graph TD
    GC[Bind9GlobalCluster<br/>cluster-scoped] -->|ownerReference| BC[Bind9Cluster<br/>namespace-scoped]
    BC -->|ownerReference| BI[Bind9Instance<br/>namespace-scoped]
    BI -->|ownerReferences| DEP[Deployment]
    BI -->|ownerReferences| SVC[Service]
    BI -->|ownerReferences| CM[ConfigMap]
    BI -->|ownerReferences| SEC[Secret]

    style GC fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style BC fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style BI fill:#e1f5ff,stroke:#0288d1,stroke-width:2px
    style DEP fill:#e8f5e9,stroke:#4caf50
    style SVC fill:#e8f5e9,stroke:#4caf50
    style CM fill:#e8f5e9,stroke:#4caf50
    style SEC fill:#e8f5e9,stroke:#4caf50

Implementation Details

1. Bind9GlobalCluster → Bind9Cluster

Location: src/reconcilers/bind9globalcluster.rs:340-352

#![allow(unused)]
fn main() {
// Create ownerReference to global cluster (cluster-scoped can own namespace-scoped)
let owner_ref = OwnerReference {
    api_version: API_GROUP_VERSION.to_string(),
    kind: KIND_BIND9_GLOBALCLUSTER.to_string(),
    name: global_cluster_name.clone(),
    uid: global_cluster.metadata.uid.clone().unwrap_or_default(),
    controller: Some(true),
    block_owner_deletion: Some(true),
};
}

Key Points:

Cluster-scoped resources CAN own namespace-scoped resources
controller: true means this is the primary controller for the child
block_owner_deletion: true prevents deleting parent while children exist
Finalizer ensures manual cleanup of Bind9Cluster resources before parent deletion

2. Bind9Cluster → Bind9Instance

Location: src/reconcilers/bind9cluster.rs:592-599

#![allow(unused)]
fn main() {
// Create ownerReference to the Bind9Cluster
let owner_ref = OwnerReference {
    api_version: API_GROUP_VERSION.to_string(),
    kind: KIND_BIND9_CLUSTER.to_string(),
    name: cluster_name.clone(),
    uid: cluster.metadata.uid.clone().unwrap_or_default(),
    controller: Some(true),
    block_owner_deletion: Some(true),
};
}

Key Points:

Both resources are namespace-scoped, so they must be in the same namespace
Finalizer ensures manual cleanup of Bind9Instance resources before parent deletion
Each instance created includes this owner reference

3. Bind9Instance → Kubernetes Resources

Location: src/bind9_resources.rs:188-197

#![allow(unused)]
fn main() {
pub fn build_owner_references(instance: &Bind9Instance) -> Vec<OwnerReference> {
    vec![OwnerReference {
        api_version: API_GROUP_VERSION.to_string(),
        kind: KIND_BIND9_INSTANCE.to_string(),
        name: instance.name_any(),
        uid: instance.metadata.uid.clone().unwrap_or_default(),
        controller: Some(true),
        block_owner_deletion: Some(true),
    }]
}
}

Resources with Owner References:

✅ Deployment: Managed by Bind9Instance
✅ Service: Managed by Bind9Instance
✅ ConfigMap: Managed by Bind9Instance (standalone instances only)
✅ Secret (RNDC key): Managed by Bind9Instance
❌ ServiceAccount: Shared resource, no owner reference (prevents conflicts)

Deletion Flow

When a Bind9GlobalCluster is deleted, the following cascade occurs:

sequenceDiagram
    participant User
    participant K8s as Kubernetes API
    participant GC as Bind9GlobalCluster<br/>Reconciler
    participant C as Bind9Cluster<br/>Reconciler
    participant I as Bind9Instance<br/>Reconciler
    participant GC_Obj as Garbage<br/>Collector

    User->>K8s: kubectl delete bind9globalcluster global-dns
    K8s->>GC: Reconcile (deletion_timestamp set)
    GC->>GC: Check finalizer present

    Note over GC: Step 1: Delete managed Bind9Cluster resources
    GC->>K8s: List Bind9Cluster with labels<br/>managed-by=Bind9GlobalCluster
    K8s-->>GC: Return managed clusters

    loop For each Bind9Cluster
        GC->>K8s: Delete Bind9Cluster
        K8s->>C: Reconcile (deletion_timestamp set)
        C->>C: Check finalizer present

        Note over C: Step 2: Delete managed Bind9Instance resources
        C->>K8s: List Bind9Instance with clusterRef
        K8s-->>C: Return managed instances

        loop For each Bind9Instance
            C->>K8s: Delete Bind9Instance
            K8s->>I: Reconcile (deletion_timestamp set)
            I->>I: Check finalizer present

            Note over I: Step 3: Delete Kubernetes resources
            I->>K8s: Delete Deployment, Service, ConfigMap, Secret
            K8s-->>I: Resources deleted

            I->>K8s: Remove finalizer from Bind9Instance
            K8s->>GC_Obj: Bind9Instance deleted
        end

        C->>K8s: Remove finalizer from Bind9Cluster
        K8s->>GC_Obj: Bind9Cluster deleted
    end

    GC->>K8s: Remove finalizer from Bind9GlobalCluster
    K8s->>GC_Obj: Bind9GlobalCluster deleted

    Note over GC_Obj: Kubernetes garbage collector<br/>cleans up any remaining<br/>resources with ownerReferences

Why Both Finalizers AND Owner References?

Bindy uses both finalizers and owner references for robust cleanup:

Mechanism	Purpose	When It Runs
Owner References	Automatic cleanup by Kubernetes	After parent deletion completes
Finalizers	Manual cleanup of children	Before parent deletion completes

The Flow:

Finalizer runs first: Lists and deletes managed children explicitly
Owner reference runs second: Kubernetes garbage collector cleans up any remaining resources

Why this combination?

Finalizers: Give control over deletion order and allow cleanup actions (like calling HTTP APIs)
Owner References: Provide safety net if finalizer fails or is bypassed
Together: Ensure no resource leaks under any circumstances

Verifying Owner References

You can verify owner references are set correctly:

# Check Bind9Cluster owner reference
kubectl get bind9cluster <name> -n <namespace> -o yaml | grep -A 10 ownerReferences

# Check Bind9Instance owner reference
kubectl get bind9instance <name> -n <namespace> -o yaml | grep -A 10 ownerReferences

# Check Deployment owner reference
kubectl get deployment <name> -n <namespace> -o yaml | grep -A 10 ownerReferences

Expected output:

ownerReferences:
- apiVersion: bindy.firestoned.io/v1alpha1
  blockOwnerDeletion: true
  controller: true
  kind: Bind9GlobalCluster  # or Bind9Cluster, Bind9Instance
  name: global-dns
  uid: 12345678-1234-1234-1234-123456789abc

Troubleshooting

Issue: Resources not being deleted

Check:

Verify owner references are set: kubectl get <resource> -o yaml | grep ownerReferences
Check if finalizers are blocking deletion: kubectl get <resource> -o yaml | grep finalizers
Verify garbage collector is running: kubectl get events --field-selector reason=Garbage

Solution:

If owner reference is missing, the resource was created before the fix (manual deletion required)
If finalizer is stuck, check reconciler logs for errors
If garbage collector is not running, check cluster health

Issue: Cannot delete parent resource

Symptom: kubectl delete hangs or shows “waiting for deletion”

Cause: Finalizer is running and cleaning up children

Expected Behavior: This is normal! Wait for the finalizer to complete.

Check Progress:

# Watch deletion progress
kubectl get bind9globalcluster <name> -w

# Check reconciler logs
kubectl logs -n bindy-system -l app=bindy -f

BIND9 Integration

How Bindy integrates with BIND9 DNS server.

Configuration Generation

Bindy generates BIND9 configuration from Bind9Instance specs:

named.conf

options {
    directory "/var/lib/bind";
    recursion no;
    allow-query { 0.0.0.0/0; };
};

zone "example.com" {
    type master;
    file "/var/lib/bind/zones/example.com.zone";
};

Zone Files

$TTL 3600
@   IN  SOA ns1.example.com. admin.example.com. (
        2024010101  ; serial
        3600        ; refresh
        600         ; retry
        604800      ; expire
        86400 )     ; negative TTL
    IN  NS  ns1.example.com.
www IN  A   192.0.2.1

Zone File Management

Operations:

Create new zones
Add/update records
Increment serial numbers
Reload BIND9 configuration

BIND9 Lifecycle

ConfigMap - Contains configuration files
Volume Mount - Mount ConfigMap to BIND9 pod
Init - BIND9 starts with configuration
Reload - rndc reload when configuration changes

Future Enhancements

Dynamic DNS updates (nsupdate)
TSIG key management
Zone transfer monitoring
Query statistics collection

Contributing

Thank you for contributing to Bindy!

Ways to Contribute

Report bugs
Suggest features
Improve documentation
Submit code changes
Review pull requests

Getting Started

Set up development environment
Read Code Style
Check Testing Guidelines
Follow PR Process

Code of Conduct

Be respectful, inclusive, and professional.

Reporting Issues

Use GitHub issues with:

Clear description
Steps to reproduce
Expected vs actual behavior
Environment details

Feature Requests

Open an issue describing:

Use case
Proposed solution
Alternatives considered

Questions

Ask questions in:

GitHub Discussions
Issues (tagged as question)

License

Contributor License Agreement

By contributing to Bindy, you agree that:

Your contributions will be licensed under the MIT License - The same license that covers the project
You have the right to submit the work - You own the copyright or have permission from the copyright holder
You grant a perpetual license - The project maintainers receive a perpetual, worldwide, non-exclusive, royalty-free, irrevocable license to use, modify, and distribute your contributions

What This Means

When you submit a pull request or contribution to Bindy:

✅ Your code will be licensed under the MIT License
✅ You retain copyright to your contributions
✅ Others can use your contributions under the MIT License terms
✅ Your contributions can be used in both open source and commercial projects
✅ You grant irrevocable permission for the project to use your work

SPDX License Identifiers

All source code files in Bindy include SPDX license identifiers. When adding new files, please include the following header:

For Rust files:

#![allow(unused)]
fn main() {
// Copyright (c) 2025 Erick Bourgeois, firestoned
// SPDX-License-Identifier: MIT
}

For shell scripts:

#!/usr/bin/env bash
# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT

For YAML/configuration files:

# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT

For Makefiles and Dockerfiles:

# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT

Why SPDX Identifiers?

SPDX (Software Package Data Exchange) identifiers provide:

Machine-readable license information - Automated tools can scan and verify licenses
SBOM generation - Software Bill of Materials can be automatically created
License compliance - Makes it easier to track and verify licensing
Industry standard - Widely adopted across open source projects

Learn more: https://spdx.dev/

Third-Party Code

If you’re adding code from another source:

Ensure compatibility - The license must be compatible with MIT
Preserve original copyright - Keep the original copyright notice
Document the source - Note where the code came from
Check license requirements - Some licenses require attribution or notices

Compatible licenses include:

✅ MIT License
✅ Apache License 2.0
✅ BSD licenses (2-clause, 3-clause)
✅ ISC License
✅ Public Domain (CC0, Unlicense)

License Questions

If you have questions about:

Whether your contribution is compatible
License requirements for third-party code
Copyright or attribution

Please ask in your pull request or open a discussion before submitting.

Additional Resources

Full Project License - MIT License text
License Documentation - Comprehensive licensing information
SPDX License List - Standard license identifiers
Choose a License - Help choosing licenses for new projects

Code Style

Code style guidelines for Bindy.

Rust Style

Follow official Rust style guide:

# Format code
cargo fmt

# Check for issues
cargo clippy

Naming Conventions

snake_case for functions, variables
PascalCase for types, traits
SCREAMING_SNAKE_CASE for constants

Documentation

Document public APIs:

#![allow(unused)]
fn main() {
/// Reconciles a Bind9Instance resource.
///
/// Creates or updates Kubernetes resources for BIND9.
///
/// # Arguments
///
/// * `instance` - The Bind9Instance to reconcile
///
/// # Returns
///
/// Ok(()) on success, Err on failure
pub async fn reconcile(instance: Bind9Instance) -> Result<()> {
    // Implementation
}
}

Error Handling

Use anyhow::Result for errors:

#![allow(unused)]
fn main() {
use anyhow::{Context, Result};

fn do_thing() -> Result<()> {
    some_operation()
        .context("Failed to do thing")?;
    Ok(())
}
}

Testing

Write tests for all public functions:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    
    #[test]
    fn test_function() {
        assert_eq!(function(), expected);
    }
}
}

Testing Guidelines

Guidelines for writing tests in Bindy.

Test Structure

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    
    #[test]
    fn test_name() {
        // Arrange
        let input = create_input();
        
        // Act
        let result = function_under_test(input);
        
        // Assert
        assert_eq!(result, expected);
    }
}
}

Unit Tests

Test individual functions:

#![allow(unused)]
fn main() {
#[test]
fn test_build_configmap() {
    let instance = create_test_instance();
    let configmap = build_configmap(&instance);
    
    assert_eq!(configmap.metadata.name, Some("test".to_string()));
}
}

Integration Tests

Test with Kubernetes:

#![allow(unused)]
fn main() {
#[tokio::test]
#[ignore]  // Requires cluster
async fn test_full_reconciliation() {
    let client = Client::try_default().await.unwrap();
    // Test logic
}
}

Test Coverage

Aim for >80% coverage on new code.

CI Tests

All tests run on:

Pull requests
Main branch commits

Pull Request Process

Process for submitting and reviewing pull requests.

Before Submitting

Create issue (for non-trivial changes)
Create branch from main
Make changes with tests
Run checks locally:

cargo test
cargo clippy
cargo fmt

PR Requirements

Tests pass
Code formatted
Documentation updated
Commit messages clear
PR description complete

PR Template

## Description
Brief description of changes

## Related Issue
Fixes #123

## Changes
- Added feature X
- Fixed bug Y

## Testing
How changes were tested

## Checklist
- [ ] Tests added/updated
- [ ] Documentation updated
- [ ] Changelog updated (if needed)

Review Process

Automated checks must pass
Maintainer review required
Address feedback
Merge when approved

After Merge

Changes included in next release.

Security & Compliance

Bindy is designed to operate in highly regulated environments, including banking, financial services, healthcare, and government sectors. This section covers both security practices and compliance frameworks implemented throughout the project.

Security

The Security section documents the technical controls, threat models, and security architecture implemented in Bindy:

Architecture - Security architecture and design principles
Threat Model - Threat modeling and attack surface analysis
Incident Response - Security incident response procedures
Vulnerability Management - CVE tracking and vulnerability remediation
Build Reproducibility - Reproducible builds and supply chain security
Secret Access Audit - Kubernetes secret access auditing and monitoring
Audit Log Retention - Audit log retention policies and compliance

These documents provide technical guidance for security engineers, platform teams, and auditors reviewing Bindy’s security posture.

Compliance

The Compliance section maps Bindy’s implementation to specific regulatory frameworks and industry standards:

Overview - High-level compliance summary and roadmap
SOX 404 (Sarbanes-Oxley) - Financial reporting controls for public companies
PCI-DSS (Payment Card Industry) - Payment card data security standards
Basel III (Banking Regulations) - International banking regulatory framework
SLSA (Supply Chain Security) - Software supply chain integrity framework
NIST Cybersecurity Framework - NIST 800-53 control mappings

These documents provide evidence and traceability for compliance audits, including control implementation details and evidence collection procedures.

Who Should Read This?

Security Engineers: Focus on the Security section for technical controls and threat models
Compliance Officers: Focus on the Compliance section for regulatory framework mappings
Auditors: Review both sections for complete security and compliance evidence
Platform Engineers: Reference Security section for operational security practices
Risk Managers: Review Compliance section for risk management frameworks

Key Principles

Bindy’s security and compliance approach is built on these core principles:

Zero Trust Architecture: Never trust, always verify - all access is authenticated and authorized
Least Privilege: Minimal RBAC permissions, time-limited credentials, no shared secrets
Defense in Depth: Multiple layers of security controls (network, application, data)
Auditability: Comprehensive logging, immutable audit trails, cryptographic signatures
Automation: Security controls enforced through CI/CD, not manual processes
Transparency: Open documentation, public security policies, no security through obscurity

Continuous Improvement

Security and compliance are ongoing processes, not one-time achievements. Bindy maintains:

Weekly vulnerability scans with automated dependency updates
Quarterly security audits by independent third parties
Annual compliance reviews for all regulatory frameworks
Continuous monitoring of security controls and audit logs
Incident response drills to validate procedures and playbooks

For security issues, see our Vulnerability Disclosure Policy.

Security Architecture - Bindy DNS Controller

Version: 1.0 Last Updated: 2025-12-17 Owner: Security Team Compliance: SOX 404, PCI-DSS 6.4.1, Basel III

Overview
Security Domains
Data Flow Diagrams
Trust Boundaries
Authentication & Authorization
Secrets Management
Network Security
Container Security
Supply Chain Security

Overview

This document describes the security architecture of the Bindy DNS Controller, including authentication, authorization, secrets management, network segmentation, and container security. The architecture follows defense-in-depth principles with multiple security layers.

Security Principles

Least Privilege: All components have minimal permissions required for their function
Defense in Depth: Multiple security layers protect against single point of failure
Zero Trust: No implicit trust within the cluster; all access is authenticated and authorized
Immutability: Container filesystems are read-only; configuration is declarative
Auditability: All security-relevant events are logged and traceable

Security Domains

Domain 1: Development & CI/CD

Purpose: Code development, review, build, and release

Components:

GitHub repository (source code)
GitHub Actions (CI/CD pipelines)
Container Registry (ghcr.io)
Developer workstations

Security Controls:

✅ Code Signing: All commits cryptographically signed (GPG/SSH) - C-1
✅ Code Review: 2+ reviewers required for all PRs
✅ Vulnerability Scanning: cargo-audit + Trivy in CI/CD - C-3
✅ SBOM Generation: Software Bill of Materials for all releases
✅ Branch Protection: Signed commits required, no direct pushes to main
✅ 2FA: Two-factor authentication required for all contributors

Trust Level: High (controls ensure code integrity)

Domain 2: Kubernetes Control Plane

Purpose: Kubernetes API server, scheduler, controller-manager, etcd

Components:

Kubernetes API server
etcd (cluster state storage)
Scheduler
Controller-manager

Security Controls:

✅ RBAC: Role-Based Access Control enforced for all API requests
✅ Encryption at Rest: etcd data encrypted (including Secrets)
✅ TLS: All control plane communication encrypted
✅ Audit Logging: All API requests logged
✅ Pod Security Admission: Enforces Pod Security Standards

Trust Level: Critical (compromise of control plane = cluster compromise)

Domain 3: dns-system Namespace

Purpose: Bindy controller and BIND9 pods

Components:

Bindy controller (Deployment)
BIND9 primary (StatefulSet)
BIND9 secondaries (StatefulSet)
ConfigMaps (BIND9 configuration)
Secrets (RNDC keys)
Services (DNS, RNDC endpoints)

Security Controls:

✅ RBAC Least Privilege: Controller has minimal permissions - C-2
✅ Non-Root Containers: All pods run as uid 1000+
✅ Read-Only Filesystem: Immutable container filesystems
✅ Pod Security Standards: Restricted profile enforced
✅ Resource Limits: CPU/memory limits prevent DoS
❌ Network Policies (planned - L-1): Restrict pod-to-pod communication

Trust Level: High (protected by RBAC, Pod Security Standards)

Domain 4: Tenant Namespaces

Purpose: DNS zone management by application teams

Components:

DNSZone custom resources
DNS record custom resources (ARecord, CNAMERecord, etc.)
Application pods (may read DNS records)

Security Controls:

✅ Namespace Isolation: Teams cannot access other namespaces
✅ RBAC: Teams can only manage their own DNS zones
✅ CRD Validation: OpenAPI v3 schema validation on all CRs
❌ Admission Webhooks (planned): Additional validation for DNS records

Trust Level: Medium (tenants are trusted but isolated)

Domain 5: External Network

Purpose: Public internet (DNS clients)

Components:

DNS clients (recursive resolvers, end users)
LoadBalancer/NodePort services exposing port 53

Security Controls:

✅ Rate Limiting: BIND9 rate-limit directive prevents query floods
✅ AXFR Restrictions: Zone transfers only to known secondaries
❌ DNSSEC (planned): Cryptographic signing of DNS responses
❌ Edge DDoS Protection (planned): CloudFlare, AWS Shield

Trust Level: Untrusted (all traffic assumed hostile)

Data Flow Diagrams

Diagram 1: DNS Zone Reconciliation Flow

sequenceDiagram
    participant Dev as Developer
    participant Git as Git Repository
    participant K8s as Kubernetes API
    participant Ctrl as Bindy Controller
    participant CM as ConfigMap
    participant Sec as Secret
    participant BIND as BIND9 Pod

    Dev->>Git: Push DNSZone CR (GitOps)
    Git->>K8s: FluxCD applies CR
    K8s->>Ctrl: Watch event (DNSZone created/updated)
    Ctrl->>K8s: Read DNSZone spec
    Ctrl->>K8s: Read Bind9Instance CR
    Ctrl->>Sec: Read RNDC key
    Note over Sec: Audit: Controller read secret<br/>ServiceAccount: bindy<br/>Timestamp: 2025-12-17 10:23:45
    Ctrl->>CM: Create/Update ConfigMap<br/>(named.conf, zone file)
    Ctrl->>BIND: Send RNDC command<br/>(reload zone)
    BIND->>CM: Load updated zone file
    BIND-->>Ctrl: Reload successful
    Ctrl->>K8s: Update DNSZone status<br/>(Ready=True)

Security Notes:

✅ All API calls authenticated with ServiceAccount token (JWT)
✅ RBAC enforced at every step (controller has least privilege)
✅ Secret read is audited (H-3 planned)
✅ RNDC communication uses HMAC key authentication
✅ ConfigMap is immutable (recreated on change, not modified)

Diagram 2: DNS Query Flow

sequenceDiagram
    participant Client as DNS Client<br/>(Untrusted)
    participant LB as LoadBalancer
    participant BIND1 as BIND9 Primary
    participant BIND2 as BIND9 Secondary
    participant CM as ConfigMap<br/>(Zone Data)

    Client->>LB: DNS Query (UDP 53)<br/>example.com A?
    Note over LB: Rate limiting<br/>DDoS protection (planned)
    LB->>BIND1: Forward query
    BIND1->>CM: Read zone file<br/>(cached in memory)
    BIND1-->>LB: DNS Response<br/>93.184.216.34
    LB-->>Client: DNS Response

    Note over BIND1,BIND2: Zone replication (AXFR/IXFR)
    BIND1->>BIND2: Notify (zone updated)
    BIND2->>BIND1: AXFR request<br/>(authenticated with allow-transfer)
    BIND1-->>BIND2: Zone transfer
    BIND2->>CM: Update local zone cache

Security Notes:

✅ DNS port 53 is public (required for DNS service)
✅ Rate limiting prevents query floods
✅ AXFR restricted to known secondary IPs
✅ Zone data is read-only in BIND9 (managed by controller)
❌ DNSSEC (planned): Would sign responses cryptographically

Diagram 3: Secret Access Flow

sequenceDiagram
    participant Ctrl as Bindy Controller
    participant K8s as Kubernetes API
    participant etcd as etcd<br/>(Encrypted at Rest)
    participant Audit as Audit Log

    Ctrl->>K8s: GET /api/v1/namespaces/dns-system/secrets/rndc-key
    Note over K8s: Authentication: JWT<br/>Authorization: RBAC
    K8s->>Audit: Log API request<br/>User: system:serviceaccount:dns-system:bindy<br/>Verb: get<br/>Resource: secrets/rndc-key<br/>Result: allowed
    K8s->>etcd: Read secret (encrypted)
    etcd-->>K8s: Return encrypted data
    K8s-->>Ctrl: Return secret (decrypted)
    Note over Ctrl: Controller uses RNDC key<br/>to authenticate to BIND9

Security Notes:

✅ Secrets encrypted at rest in etcd
✅ Secrets transmitted over TLS (in transit)
✅ RBAC limits secret read access to controller only
✅ Kubernetes audit log captures all secret access
❌ Dedicated secret access audit trail (H-3 planned): More visible tracking

Diagram 4: Container Image Supply Chain

flowchart TD
    Dev[Developer] -->|Signed Commit| Git[Git Repository]
    Git -->|Trigger| CI[GitHub Actions CI/CD]
    CI -->|cargo build| Bin[Rust Binary]
    CI -->|cargo audit| Audit[Vulnerability Scan]
    Audit -->|Pass| Bin
    Bin -->|Multi-stage build| Docker[Docker Build]
    Docker -->|Trivy scan| Scan[Container Scan]
    Scan -->|Pass| Sign[Sign Image<br/>Provenance + SBOM]
    Sign -->|Push| Reg[Container Registry<br/>ghcr.io]
    Reg -->|Pull| K8s[Kubernetes Cluster]
    K8s -->|Verify| Pod[Controller Pod]

    style Git fill:#90EE90
    style Audit fill:#FFD700
    style Scan fill:#FFD700
    style Sign fill:#90EE90
    style Pod fill:#90EE90

Security Controls:

✅ C-1: All commits signed (GPG/SSH)
✅ C-3: Vulnerability scanning (cargo-audit + Trivy)
✅ SLSA Level 2: Build provenance + SBOM
✅ Signed Images: Docker provenance attestation
❌ M-1 (planned): Pin images by digest (not tags)
❌ Image Verification (planned): Admission controller verifies signatures

Trust Boundaries

Boundary Map

graph TB
    subgraph Untrusted["🔴 UNTRUSTED ZONE"]
        Internet[Internet<br/>DNS Clients]
    end

    subgraph Perimeter["🟡 PERIMETER"]
        LB[LoadBalancer<br/>Port 53]
    end

    subgraph Cluster["🟢 KUBERNETES CLUSTER (Trusted)"]
        subgraph ControlPlane["Control Plane"]
            API[Kubernetes API]
            etcd[etcd]
        end

        subgraph DNSNamespace["🟠 dns-system Namespace<br/>(High Privilege)"]
            Ctrl[Bindy Controller]
            BIND[BIND9 Pods]
            Secrets[Secrets]
        end

        subgraph TenantNS["🔵 Tenant Namespaces<br/>(Low Privilege)"]
            App1[team-web]
            App2[team-api]
        end
    end

    Internet -->|DNS Queries| LB
    LB -->|Forwarded| BIND
    BIND -->|Read ConfigMaps| DNSNamespace
    Ctrl -->|Reconcile| API
    Ctrl -->|Read| Secrets
    API -->|Store| etcd
    App1 -->|Create DNSZone| API
    App2 -->|Create DNSZone| API

    style Internet fill:#FF6B6B
    style LB fill:#FFD93D
    style ControlPlane fill:#6BCB77
    style DNSNamespace fill:#FFA500
    style TenantNS fill:#4D96FF

Trust Boundary Rules:

Untrusted → Perimeter: All traffic rate-limited, DDoS protection (planned)
Perimeter → dns-system: Only port 53 allowed, no direct access to controller
dns-system → Control Plane: Authenticated with ServiceAccount token, RBAC enforced
Tenant Namespaces → Control Plane: Authenticated with user credentials, RBAC enforced
Secrets Access: Only controller ServiceAccount can read, audit logged

Authentication & Authorization

RBAC Architecture

graph LR
    subgraph Identities
        SA[ServiceAccount: bindy<br/>ns: dns-system]
        User1[User: alice<br/>Team: web]
        User2[User: bob<br/>Team: api]
    end

    subgraph Roles
        CR[ClusterRole:<br/>bindy-controller]
        NSR[Role:<br/>dnszone-editor<br/>ns: team-web]
    end

    subgraph Bindings
        CRB[ClusterRoleBinding]
        RB[RoleBinding]
    end

    subgraph Resources
        CRD[CRDs<br/>Bind9Cluster]
        Zone[DNSZone<br/>ns: team-web]
        Sec[Secrets<br/>ns: dns-system]
    end

    SA -->|bound to| CRB
    CRB -->|grants| CR
    CR -->|allows| CRD
    CR -->|allows| Sec

    User1 -->|bound to| RB
    RB -->|grants| NSR
    NSR -->|allows| Zone

    style SA fill:#FFD93D
    style CR fill:#6BCB77
    style Sec fill:#FF6B6B

Controller RBAC Permissions

Cluster-Scoped Resources:

Resource	Verbs	Rationale
`bind9clusters.bindy.firestoned.io`	get, list, watch, create, update, patch	Manage cluster topology
`bind9instances.bindy.firestoned.io`	get, list, watch, create, update, patch	Manage BIND9 instances
❌ delete on ANY resource	DENIED	✅ C-2: Least privilege, prevent accidental deletion

Namespaced Resources (dns-system):

Resource	Verbs	Rationale
`secrets`	get, list, watch	Read RNDC keys (READ-ONLY)
`configmaps`	get, list, watch, create, update, patch	Manage BIND9 configuration
`deployments`	get, list, watch, create, update, patch	Manage BIND9 deployments
`services`	get, list, watch, create, update, patch	Expose DNS services
`serviceaccounts`	get, list, watch, create, update, patch	Manage BIND9 ServiceAccounts
❌ secrets	❌ create, update, patch, delete	✅ PCI-DSS 7.1.2: Read-only access
❌ delete on ANY resource	DENIED	✅ C-2: Least privilege

Verification:

# Run automated RBAC verification
deploy/rbac/verify-rbac.sh

User RBAC Permissions (Tenants)

Example: team-web namespace

User	Role	Resources	Verbs	Scope
alice	dnszone-editor	dnszones.bindy.firestoned.io	get, list, watch, create, update, patch	team-web only
alice	dnszone-editor	arecords, cnamerecords, …	get, list, watch, create, update, patch	team-web only
alice	❌	dnszones in other namespaces	❌ DENIED	Cannot access team-api zones
alice	❌	secrets, configmaps	❌ DENIED	Cannot access BIND9 internals

Secrets Management

Secret Types

Secret	Purpose	Access	Rotation	Encryption
RNDC Key	Authenticate to BIND9	Controller: read-only	Manual (planned automation)	At rest: etcd, In transit: TLS
TLS Certificates (future)	HTTPS, DNSSEC	Controller: read-only	Cert-manager (automated)	At rest: etcd, In transit: TLS
ServiceAccount Token	Kubernetes API auth	Auto-mounted	Kubernetes (short-lived)	JWT signed by cluster CA

Secret Lifecycle

stateDiagram-v2
    [*] --> Created: Admin creates secret<br/>(kubectl create secret)
    Created --> Stored: etcd encrypts at rest
    Stored --> Mounted: Controller pod starts<br/>(Kubernetes mounts as volume)
    Mounted --> Used: Controller reads RNDC key
    Used --> Audited: Access logged (H-3 planned)
    Audited --> Rotated: Key rotation (manual)
    Rotated --> Stored: New key stored
    Stored --> Deleted: Old key deleted after grace period
    Deleted --> [*]

Secret Protection

At Rest:

✅ etcd encryption enabled (AES-256-GCM)
✅ Secrets stored in Kubernetes Secrets (not in code, env vars, or ConfigMaps)

In Transit:

✅ All Kubernetes API communication over TLS
✅ ServiceAccount token transmitted over TLS

In Use:

✅ Controller runs as non-root (uid 1000+)
✅ Read-only filesystem (secrets cannot be written to disk)
✅ Memory protection (secrets cleared after use - Rust Drop trait)

Access Control:

✅ RBAC limits secret read to controller only
✅ Kubernetes audit log captures all secret access
❌ H-3 (planned): Dedicated secret access audit trail with alerts

Network Security

Network Architecture

graph TB
    subgraph Internet
        Client[DNS Clients]
    end

    subgraph Kubernetes["Kubernetes Cluster"]
        subgraph Ingress["Ingress"]
            LB[LoadBalancer<br/>Port 53 UDP/TCP]
        end

        subgraph dns-system["dns-system Namespace"]
            Ctrl[Bindy Controller]
            BIND1[BIND9 Primary<br/>Port 53, 953]
            BIND2[BIND9 Secondary<br/>Port 53]
        end

        subgraph kube-system["kube-system"]
            API[Kubernetes API<br/>Port 6443]
        end

        subgraph team-web["team-web Namespace"]
            App1[Application Pods]
        end
    end

    Client -->|UDP/TCP 53| LB
    LB -->|Forward| BIND1
    LB -->|Forward| BIND2
    Ctrl -->|HTTPS 6443| API
    Ctrl -->|TCP 953<br/>RNDC| BIND1
    BIND1 -->|AXFR/IXFR| BIND2
    App1 -->|HTTPS 6443| API

    style Client fill:#FF6B6B
    style LB fill:#FFD93D
    style API fill:#6BCB77
    style Ctrl fill:#4D96FF

Network Policies (Planned - L-1)

Policy 1: Controller Egress

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bindy-controller-egress
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bindy
  policyTypes:
  - Egress
  egress:
  # Allow: Kubernetes API
  - to:
    - namespaceSelector:
        matchLabels:
          name: kube-system
    ports:
    - protocol: TCP
      port: 6443
  # Allow: BIND9 RNDC
  - to:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: bind9
    ports:
    - protocol: TCP
      port: 953
  # Allow: DNS (for cluster DNS resolution)
  - to:
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53

Policy 2: BIND9 Ingress

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bind9-ingress
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bind9
  policyTypes:
  - Ingress
  ingress:
  # Allow: DNS queries from anywhere
  - from:
    - namespaceSelector: {}
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow: RNDC from controller only
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: bindy
    ports:
    - protocol: TCP
      port: 953
  # Allow: AXFR from secondaries only
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: bind9
          app.kubernetes.io/component: secondary
    ports:
    - protocol: TCP
      port: 53

Container Security

Container Hardening

Bindy Controller Pod Security:

apiVersion: v1
kind: Pod
metadata:
  name: bindy-controller
spec:
  serviceAccountName: bindy
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 1000
    fsGroup: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: controller
    image: ghcr.io/firestoned/bindy:latest
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 1000
    resources:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "256Mi"
        cpu: "500m"
    volumeMounts:
    - name: tmp
      mountPath: /tmp
      readOnly: false  # Only /tmp is writable
    - name: rndc-key
      mountPath: /etc/bindy/rndc
      readOnly: true
  volumes:
  - name: tmp
    emptyDir:
      sizeLimit: 100Mi
  - name: rndc-key
    secret:
      secretName: rndc-key

Security Features:

✅ Non-root user (uid 1000)
✅ Read-only root filesystem (only /tmp writable)
✅ No privileged escalation
✅ All capabilities dropped
✅ seccomp profile (restrict syscalls)
✅ Resource limits (prevent DoS)
✅ Secrets mounted read-only

Image Security

Base Image: Chainguard (Zero-CVE)

FROM cgr.dev/chainguard/static:latest
COPY --chmod=755 bindy /usr/local/bin/bindy
USER 1000:1000
ENTRYPOINT ["/usr/local/bin/bindy"]

Features:

✅ Chainguard static base (zero CVEs, no package manager)
✅ Minimal attack surface (~15MB image size)
✅ No shell, no utilities (static binary only)
✅ FIPS-ready (if required)
✅ Signed image with provenance
✅ SBOM included

Vulnerability Scanning:

✅ Trivy scans on every PR, main push, release
✅ CI fails on CRITICAL/HIGH vulnerabilities
✅ Daily scheduled scans detect new CVEs

Supply Chain Security

SLSA Level 2 Compliance

Requirement	Implementation	Status
Build provenance	Signed commits provide authorship proof	✅ C-1
Source integrity	GPG/SSH signatures verify source	✅ C-1
Build integrity	SBOM generated for all releases	✅ SLSA
Build isolation	GitHub Actions ephemeral runners	✅ CI/CD
Parameterless build	Reproducible builds (same input = same output)	❌ H-4 (planned)

Supply Chain Flow

flowchart LR
    A[Developer] -->|Signed Commit| B[Git]
    B -->|Webhook| C[GitHub Actions]
    C -->|Build| D[Binary]
    C -->|Scan| E[cargo-audit]
    E -->|Pass| D
    D -->|Build| F[Container Image]
    F -->|Scan| G[Trivy]
    G -->|Pass| H[Sign Image]
    H -->|Provenance| I[SBOM]
    I -->|Push| J[Registry]
    J -->|Pull| K[Kubernetes]

    style A fill:#90EE90
    style E fill:#FFD700
    style G fill:#FFD700
    style H fill:#90EE90
    style I fill:#90EE90

Supply Chain Threats Mitigated:

✅ Code Injection: Signed commits prevent unauthorized code changes
✅ Dependency Confusion: cargo-audit verifies dependencies from crates.io
✅ Malicious Dependencies: Vulnerability scanning detects known CVEs
✅ Image Tampering: Signed images with provenance attestation
❌ Compromised Build Environment (partially): Ephemeral runners, but build reproducibility not verified (H-4)

References

Last Updated: 2025-12-17 Next Review: 2025-03-17 (Quarterly) Approved By: Security Team

Threat Model - Bindy DNS Controller

Version: 1.0 Last Updated: 2025-12-17 Owner: Security Team Compliance: SOX 404, PCI-DSS 6.4.1, Basel III Cyber Risk

Overview
System Description
Assets
Trust Boundaries
STRIDE Threat Analysis
Attack Surface
Threat Scenarios
Mitigations
Residual Risks
Security Architecture

Overview

This document provides a comprehensive threat model for the Bindy DNS Controller, a Kubernetes operator that manages BIND9 DNS servers. The threat model uses the STRIDE methodology (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege) to identify and analyze security threats.

Objectives

Identify threats to the DNS infrastructure managed by Bindy
Assess risk for each identified threat
Document mitigations (existing and required)
Provide security guidance for deployers and operators
Support compliance with SOX 404, PCI-DSS 6.4.1, Basel III

Scope

In Scope:

Bindy controller container and runtime
Custom Resource Definitions (CRDs) and Kubernetes API interactions
BIND9 pods managed by Bindy
DNS zone data and configuration
RNDC (Remote Name Daemon Control) communication
Container images and supply chain
CI/CD pipeline security

Out of Scope:

Kubernetes cluster security (managed by platform team)
Network infrastructure security (managed by network team)
Physical security of data centers
DNS client security (recursive resolvers outside our control)

System Description

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                     Kubernetes Cluster                       │
│                                                              │
│  ┌────────────────────────────────────────────────────┐    │
│  │              dns-system Namespace                   │    │
│  │                                                      │    │
│  │  ┌──────────────────────────────────────────────┐  │    │
│  │  │        Bindy Controller (Deployment)         │  │    │
│  │  │  ┌────────────────────────────────────────┐  │  │    │
│  │  │  │  Controller Pod (Non-Root, ReadOnly)   │  │    │    │
│  │  │  │  - Watches CRDs                        │  │  │    │
│  │  │  │  - Reconciles DNS zones                │  │  │    │
│  │  │  │  - Manages BIND9 pods                  │  │  │    │
│  │  │  │  - Uses RNDC for zone updates         │  │  │    │
│  │  │  └────────────────────────────────────────┘  │  │    │
│  │  └──────────────────────────────────────────────┘  │    │
│  │                                                      │    │
│  │  ┌──────────────────────────────────────────────┐  │    │
│  │  │       BIND9 Primary (StatefulSet)           │  │    │
│  │  │  ┌────────────────────────────────────────┐  │  │    │
│  │  │  │  BIND Pod (Non-Root, ReadOnly)         │  │  │    │
│  │  │  │  - Authoritative DNS (Port 53)         │  │  │    │
│  │  │  │  - RNDC Control (Port 953)             │  │  │    │
│  │  │  │  - Zone files (ConfigMaps)             │  │  │    │
│  │  │  │  - RNDC key (Secret, read-only)        │  │  │    │
│  │  │  └────────────────────────────────────────┘  │  │    │
│  │  └──────────────────────────────────────────────┘  │    │
│  │                                                      │    │
│  │  ┌──────────────────────────────────────────────┐  │    │
│  │  │      BIND9 Secondaries (StatefulSet)        │  │    │
│  │  │  - Receive zone transfers from primary       │  │    │
│  │  │  - Provide redundancy                        │  │    │
│  │  │  - Geographic distribution                   │  │    │
│  │  └──────────────────────────────────────────────┘  │    │
│  │                                                      │    │
│  └────────────────────────────────────────────────────┘    │
│                                                              │
│  ┌────────────────────────────────────────────────────┐    │
│  │         Other Namespaces (Multi-Tenancy)           │    │
│  │  - team-web (DNSZone CRs)                          │    │
│  │  - team-api (DNSZone CRs)                          │    │
│  │  - platform-dns (Bind9Cluster CRs)                 │    │
│  └────────────────────────────────────────────────────┘    │
│                                                              │
└─────────────────────────────────────────────────────────────┘
          │                           ▲
          │ DNS Queries (UDP/TCP 53)  │
          ▼                           │
    ┌─────────────────────────────────────┐
    │       External DNS Clients          │
    │  - Recursive resolvers              │
    │  - Corporate clients                │
    │  - Internet users                   │
    └─────────────────────────────────────┘

Components

Bindy Controller
- Kubernetes operator written in Rust
- Watches custom resources (Bind9Cluster, Bind9Instance, DNSZone, DNS records)
- Reconciles desired state with actual state
- Manages BIND9 deployments, ConfigMaps, Secrets, Services
- Uses RNDC to update zones on running BIND9 instances
BIND9 Pods
- Authoritative DNS servers running BIND9
- Primary server handles zone updates
- Secondary servers replicate zones via AXFR/IXFR
- Exposed via LoadBalancer or NodePort services
Custom Resources (CRDs)
- Bind9Cluster: Cluster-scoped, defines BIND9 cluster topology
- Bind9Instance: Namespaced, defines individual BIND9 server
- DNSZone: Namespaced, defines DNS zone (e.g., example.com)
- DNS Records: ARecord, CNAMERecord, MXRecord, etc.
Supporting Resources
- ConfigMaps: Store BIND9 configuration and zone files
- Secrets: Store RNDC keys (symmetric HMAC keys)
- Services: Expose DNS (port 53) and RNDC (port 953)
- ServiceAccounts: RBAC for controller access

Assets

High-Value Assets

Asset	Description	Confidentiality	Integrity	Availability	Owner
DNS Zone Data	Authoritative DNS records for all managed domains	Medium	Critical	Critical	Teams/Platform
RNDC Keys	Symmetric HMAC keys for BIND9 control	Critical	Critical	High	Security Team
Controller Binary	Signed container image with controller logic	Medium	Critical	High	Development Team
BIND9 Configuration	named.conf, zone configs	Low	Critical	High	Platform Team
Kubernetes API Access	ServiceAccount token for controller	Critical	Critical	Critical	Platform Team
CRD Schemas	Define API contract for DNS management	Low	Critical	Medium	Development Team
Audit Logs	Record of all DNS changes and access	High	Critical	High	Security Team
SBOM	Software Bill of Materials for compliance	Low	Critical	Medium	Compliance Team

Asset Protection Goals

DNS Zone Data: Prevent unauthorized modification (tampering), ensure availability
RNDC Keys: Prevent disclosure (compromise allows full BIND9 control)
Controller Binary: Prevent supply chain attacks, ensure code integrity
Kubernetes API Access: Prevent privilege escalation, enforce least privilege
Audit Logs: Ensure non-repudiation, prevent tampering, retain for compliance

Trust Boundaries

Boundary 1: Kubernetes Cluster Perimeter

Trust Level: High Description: Kubernetes API server, etcd, and cluster networking

Assumptions:

Kubernetes RBAC is properly configured
etcd is encrypted at rest
Network policies are enforced
Node security is managed by platform team

Threats if Compromised:

Attacker gains full control of all resources in cluster
DNS data can be exfiltrated or modified
Controller can be manipulated or replaced

Boundary 2: dns-system Namespace

Trust Level: High Description: Namespace containing Bindy controller and BIND9 pods

Assumptions:

RBAC limits access to authorized ServiceAccounts only
Secrets are encrypted at rest in etcd
Pod Security Standards enforced (Restricted)

Threats if Compromised:

Attacker can read RNDC keys
Attacker can modify DNS zones
Attacker can disrupt DNS service

Boundary 3: Controller Container

Trust Level: Medium-High Description: Bindy controller runtime environment

Assumptions:

Container runs as non-root user
Filesystem is read-only except /tmp
No privileged capabilities
Resource limits enforced

Threats if Compromised:

Attacker can abuse Kubernetes API access
Attacker can read secrets controller has access to
Attacker can disrupt reconciliation loops

Boundary 4: BIND9 Container

Trust Level: Medium Description: BIND9 DNS server runtime

Assumptions:

Container runs as non-root
Exposed to internet (port 53)
Configuration is managed by controller (read-only)

Threats if Compromised:

Attacker can serve malicious DNS responses
Attacker can exfiltrate zone data
Attacker can pivot to other cluster resources (if network policies weak)

Boundary 5: External Network (Internet)

Trust Level: Untrusted Description: Public internet where DNS clients reside

Assumptions:

All traffic is potentially hostile
DDoS attacks are likely
DNS protocol vulnerabilities will be exploited

Threats:

DNS amplification attacks (abuse open resolvers)
Cache poisoning attempts
Zone enumeration (AXFR abuse)
DoS via query floods

STRIDE Threat Analysis

S - Spoofing (Identity)

S1: Spoofed Kubernetes API Requests

Threat: Attacker impersonates the Bindy controller ServiceAccount to make unauthorized API calls.

Impact: HIGH Likelihood: LOW (requires compromised cluster or stolen token)

Attack Scenario:

Attacker compromises a pod in the cluster
Steals ServiceAccount token from /var/run/secrets/kubernetes.io/serviceaccount/token
Uses token to impersonate controller and modify DNS zones

Mitigations:

✅ RBAC least privilege (controller cannot delete resources)
✅ Pod Security Standards (non-root, read-only filesystem)
✅ Short-lived ServiceAccount tokens (TokenRequest API)
❌ MISSING: Network policies to restrict egress from controller pod
❌ MISSING: Audit logging for all ServiceAccount API calls

Residual Risk: MEDIUM (need network policies and audit logs)

S2: Spoofed RNDC Commands

Threat: Attacker gains access to RNDC key and sends malicious commands to BIND9.

Impact: CRITICAL Likelihood: LOW (RNDC keys stored in Kubernetes Secrets with RBAC)

Attack Scenario:

Attacker compromises controller pod or namespace
Reads RNDC key from Kubernetes Secret
Connects to BIND9 RNDC port (953) and issues commands (e.g., reload, freeze, thaw)

Mitigations:

✅ Secrets encrypted at rest (Kubernetes)
✅ RBAC limits secret read access to controller only
✅ RNDC port (953) not exposed externally
❌ MISSING: Secret access audit trail (H-3)
❌ MISSING: RNDC key rotation policy

Residual Risk: MEDIUM (need secret audit trail)

S3: Spoofed Git Commits (Supply Chain)

Threat: Attacker forges commits without proper signature, injecting malicious code.

Impact: CRITICAL Likelihood: VERY LOW (branch protection enforces signed commits)

Attack Scenario:

Attacker compromises GitHub account or uses stolen SSH key
Pushes unsigned commit to feature branch
Attempts to merge to main without proper review

Mitigations:

✅ All commits MUST be signed (GPG/SSH)
✅ GitHub branch protection requires signed commits
✅ CI/CD verifies commit signatures
✅ 2+ reviewers required for all PRs
✅ Linear history (no merge commits)

Residual Risk: VERY LOW (strong controls in place)

T - Tampering (Data Integrity)

T1: Tampering with DNS Zone Data

Threat: Attacker modifies DNS records to redirect traffic or cause outages.

Impact: CRITICAL Likelihood: LOW (requires Kubernetes API access)

Attack Scenario:

Attacker gains write access to DNSZone CRs (via compromised RBAC or stolen credentials)
Modifies A/CNAME records to point to attacker-controlled servers
Traffic is redirected, enabling phishing, data theft, or service disruption

Mitigations:

✅ RBAC enforces least privilege (users can only modify zones in their namespace)
✅ GitOps workflow (changes via pull requests, not direct kubectl)
✅ Audit logging in Kubernetes (all CR modifications logged)
❌ MISSING: Webhook validation for DNS records (prevent obviously malicious changes)
❌ MISSING: DNSSEC signing (prevents tampering of DNS responses in transit)

Residual Risk: MEDIUM (need validation webhooks and DNSSEC)

T2: Tampering with Container Images

Threat: Attacker replaces legitimate Bindy/BIND9 container image with malicious version.

Impact: CRITICAL Likelihood: VERY LOW (signed images, supply chain controls)

Attack Scenario:

Attacker compromises CI/CD pipeline or registry credentials
Pushes malicious image with same tag (e.g., :latest)
Controller pulls compromised image on next rollout

Mitigations:

✅ All images signed with provenance attestation (SLSA Level 2)
✅ SBOM generated for all releases
✅ GitHub Actions signed commits verification
✅ Multi-stage builds minimize attack surface
❌ MISSING: Image digests pinned (not tags) - see M-1
❌ MISSING: Admission controller to verify image signatures (e.g., Sigstore Cosign)

Residual Risk: LOW (strong supply chain controls, but pinning digests would further reduce risk)

T3: Tampering with ConfigMaps/Secrets

Threat: Attacker modifies BIND9 configuration or RNDC keys via Kubernetes API.

Impact: HIGH Likelihood: LOW (RBAC protects ConfigMaps/Secrets)

Attack Scenario:

Attacker gains elevated privileges in dns-system namespace
Modifies BIND9 ConfigMap to disable security features or add backdoor zones
BIND9 pod restarts with malicious configuration

Mitigations:

✅ Controller has NO delete permissions on Secrets/ConfigMaps (C-2)
✅ RBAC limits write access to controller only
✅ Immutable ConfigMaps (once created, cannot be modified - requires recreation)
❌ MISSING: ConfigMap/Secret integrity checks (hash validation)
❌ MISSING: Automated drift detection (compare running config vs desired state)

Residual Risk: MEDIUM (need integrity checks)

R - Repudiation (Non-Repudiation)

R1: Unauthorized DNS Changes Without Attribution

Threat: Attacker modifies DNS zones and there’s no audit trail proving who made the change.

Impact: HIGH (compliance violation, incident response hindered) Likelihood: LOW (Kubernetes audit logs capture API calls)

Attack Scenario:

Attacker gains access to cluster with weak RBAC
Modifies DNSZone CRs
No log exists linking the change to a specific user or ServiceAccount

Mitigations:

✅ Kubernetes audit logs enabled (captures all API requests)
✅ All commits signed (non-repudiation for code changes)
✅ GitOps workflow (changes traceable to Git commits and PR reviews)
❌ MISSING: Centralized log aggregation with tamper-proof storage (H-2)
❌ MISSING: Log retention policy (90 days active, 1 year archive per PCI-DSS)
❌ MISSING: Audit trail queries documented for compliance reviews

Residual Risk: MEDIUM (need H-2 - Audit Log Retention Policy)

R2: Secret Access Without Audit Trail

Threat: Attacker reads RNDC keys from Secrets, no record of who accessed them.

Impact: HIGH Likelihood: LOW (secret access is logged by Kubernetes, but not prominently tracked)

Attack Scenario:

Attacker compromises ServiceAccount with secret read access
Reads RNDC key from Kubernetes Secret
Uses key to control BIND9, but no clear audit trail of secret access

Mitigations:

✅ Kubernetes audit logs capture Secret read operations
❌ MISSING: Dedicated audit trail for secret access (H-3)
❌ MISSING: Alerts on unexpected secret reads
❌ MISSING: Secret access dashboard for compliance reviews

Residual Risk: MEDIUM (need H-3 - Secret Access Audit Trail)

I - Information Disclosure

I1: Exposure of RNDC Keys

Threat: RNDC keys leaked via logs, environment variables, or insecure storage.

Impact: CRITICAL Likelihood: VERY LOW (secrets stored in Kubernetes Secrets, not in code)

Attack Scenario:

Developer hardcodes RNDC key in code or logs it for debugging
Key is committed to Git or appears in log aggregation system
Attacker finds key and uses it to control BIND9

Mitigations:

✅ Secrets stored in Kubernetes Secrets (encrypted at rest)
✅ Pre-commit hooks to detect secrets in code
✅ GitHub secret scanning enabled
✅ CI/CD fails if secrets detected
❌ MISSING: Log sanitization (ensure secrets never appear in logs)
❌ MISSING: Secret rotation policy (rotate RNDC keys periodically)

Residual Risk: LOW (good controls, but rotation would improve)

I2: Zone Data Enumeration

Threat: Attacker uses AXFR (zone transfer) to download entire zone contents.

Impact: MEDIUM (zone data is semi-public, but bulk enumeration aids reconnaissance) Likelihood: MEDIUM (AXFR often left open by mistake)

Attack Scenario:

Attacker sends AXFR request to BIND9 server
If AXFR is not restricted, server returns all records in zone
Attacker uses zone data for targeted attacks (subdomain enumeration, email harvesting)

Mitigations:

✅ AXFR restricted to secondary servers only (BIND9 allow-transfer directive)
✅ BIND9 configuration managed by controller (prevents manual misconfig)
❌ MISSING: TSIG authentication for zone transfers (H-4)
❌ MISSING: Rate limiting on AXFR requests

Residual Risk: MEDIUM (need TSIG for AXFR)

I3: Container Image Vulnerability Disclosure

Threat: Container images contain vulnerabilities that could be exploited if disclosed.

Impact: MEDIUM Likelihood: MEDIUM (vulnerabilities exist in all software)

Attack Scenario:

Vulnerability is disclosed in a dependency (e.g., CVE in glibc)
Attacker scans for services using vulnerable version
Exploits vulnerability to gain RCE or escalate privileges

Mitigations:

✅ Automated vulnerability scanning (cargo-audit + Trivy) - C-3
✅ CI blocks on CRITICAL/HIGH vulnerabilities
✅ Daily scheduled scans detect new CVEs
✅ Remediation SLAs defined (CRITICAL: 24h, HIGH: 7d)
✅ Chainguard zero-CVE base images used

Residual Risk: LOW (strong vulnerability management)

D - Denial of Service

D1: DNS Query Flood (DDoS)

Threat: Attacker floods BIND9 servers with DNS queries, exhausting resources.

Impact: CRITICAL (DNS unavailability impacts all services) Likelihood: HIGH (DNS is a common DDoS target)

Attack Scenario:

Attacker uses botnet to send millions of DNS queries to BIND9 servers
BIND9 CPU/memory exhausted, becomes unresponsive
Legitimate DNS queries fail, causing outages

Mitigations:

✅ Rate limiting in BIND9 (rate-limit directive)
✅ Resource limits on BIND9 pods (CPU/memory requests/limits)
✅ Horizontal scaling (multiple BIND9 secondaries)
❌ MISSING: DDoS protection at network edge (e.g., CloudFlare, AWS Shield)
❌ MISSING: Query pattern analysis and anomaly detection
❌ MISSING: Automated pod scaling based on query load (HPA)

Residual Risk: MEDIUM (need edge DDoS protection)

D2: Controller Resource Exhaustion

Threat: Attacker creates thousands of DNSZone CRs, overwhelming controller.

Impact: HIGH (controller fails, DNS updates stop) Likelihood: LOW (requires cluster access)

Attack Scenario:

Attacker gains write access to Kubernetes API
Creates 10,000+ DNSZone CRs
Controller reconciliation queue overwhelms CPU/memory
Controller crashes or becomes unresponsive

Mitigations:

✅ Resource limits on controller pod
✅ Exponential backoff for failed reconciliations
❌ MISSING: Rate limiting on reconciliation loops (M-3)
❌ MISSING: Admission webhook to limit number of CRs per namespace
❌ MISSING: Horizontal scaling of controller (leader election)

Residual Risk: MEDIUM (need M-3 - Rate Limiting)

D3: AXFR Amplification Attack

Threat: Attacker abuses AXFR to amplify traffic in DDoS attack.

Impact: MEDIUM Likelihood: LOW (AXFR restricted to secondaries)

Attack Scenario:

Attacker spoofs source IP of DDoS target
Sends AXFR request to BIND9
BIND9 sends large zone file to spoofed IP (amplification)

Mitigations:

✅ AXFR restricted to known secondary IPs (allow-transfer)
✅ BIND9 does not respond to spoofed source IPs (anti-spoofing)
❌ MISSING: Response rate limiting (RRL) for AXFR

Residual Risk: LOW (AXFR restrictions effective)

E - Elevation of Privilege

E1: Container Escape to Node

Threat: Attacker escapes from Bindy or BIND9 container to underlying Kubernetes node.

Impact: CRITICAL (full node compromise, lateral movement) Likelihood: VERY LOW (Pod Security Standards enforced)

Attack Scenario:

Attacker exploits container runtime vulnerability (e.g., runc CVE)
Escapes container to host filesystem
Gains root access on node, compromises kubelet and other pods

Mitigations:

✅ Non-root containers (uid 1000+)
✅ Read-only root filesystem
✅ No privileged capabilities
✅ Pod Security Standards (Restricted)
✅ seccomp profile (restrict syscalls)
✅ AppArmor/SELinux profiles
❌ MISSING: Regular node patching (managed by platform team)

Residual Risk: VERY LOW (defense in depth)

E2: RBAC Privilege Escalation

Threat: Attacker escalates from limited RBAC role to cluster-admin.

Impact: CRITICAL Likelihood: VERY LOW (RBAC reviewed, least privilege enforced)

Attack Scenario:

Attacker compromises ServiceAccount with limited permissions
Exploits RBAC misconfiguration (e.g., wildcard permissions)
Gains cluster-admin and full control of cluster

Mitigations:

✅ RBAC least privilege (controller has NO delete permissions) - C-2
✅ Automated RBAC verification script (deploy/rbac/verify-rbac.sh)
✅ No wildcard permissions in controller RBAC
✅ Regular RBAC audits (quarterly)
❌ MISSING: RBAC policy-as-code validation (OPA/Gatekeeper)

Residual Risk: VERY LOW (strong RBAC controls)

E3: Exploiting Vulnerable Dependencies

Threat: Attacker exploits vulnerability in Rust dependency to gain code execution.

Impact: HIGH Likelihood: LOW (automated vulnerability scanning, rapid patching)

Attack Scenario:

CVE disclosed in dependency (e.g., tokio, hyper, kube)
Attacker crafts malicious Kubernetes API response to trigger vulnerability
Controller crashes or attacker gains RCE in controller pod

Mitigations:

✅ Automated vulnerability scanning (cargo-audit) - C-3
✅ CI blocks on CRITICAL/HIGH vulnerabilities
✅ Remediation SLAs enforced (CRITICAL: 24h)
✅ Daily scheduled scans
✅ Dependency updates via Dependabot

Residual Risk: LOW (excellent vulnerability management)

Attack Surface

1. Kubernetes API

Exposure: Internal (within cluster) Authentication: ServiceAccount token (JWT) Authorization: RBAC (least privilege)

Attack Vectors:

Token theft from compromised pod
RBAC misconfiguration allowing excessive permissions
API server vulnerability (CVE in Kubernetes)

Mitigations:

Short-lived tokens (TokenRequest API)
RBAC verification script
Regular Kubernetes upgrades

Risk: MEDIUM

2. DNS Port 53 (UDP/TCP)

Exposure: External (internet-facing) Authentication: None (public DNS) Authorization: None

Attack Vectors:

DNS amplification attacks
Query floods (DDoS)
Cache poisoning attempts (if recursion enabled)
NXDOMAIN attacks

Mitigations:

Rate limiting (BIND9 rate-limit)
Recursion disabled (authoritative-only)
DNSSEC (planned)
DDoS protection at edge

Risk: HIGH (public-facing, no authentication)

3. RNDC Port 953

Exposure: Internal (within cluster, not exposed externally) Authentication: HMAC key (symmetric) Authorization: Key-based (all-or-nothing)

Attack Vectors:

RNDC key theft from Kubernetes Secret
Brute-force HMAC key (unlikely with strong key)
MITM attack (if network not encrypted)

Mitigations:

Secrets encrypted at rest
RBAC limits secret read access
RNDC port not exposed externally
NetworkPolicy (planned - L-1)

Risk: MEDIUM

4. Container Images (Supply Chain)

Exposure: Public (GitHub Container Registry) Authentication: Pull is unauthenticated (public repo) Authorization: Push requires GitHub token with packages:write

Attack Vectors:

Compromised CI/CD pipeline pushing malicious image
Dependency confusion (malicious crate with same name)
Compromised base image (upstream supply chain attack)

Mitigations:

Signed commits (all code changes)
Signed container images (provenance)
SBOM generation
Vulnerability scanning (Trivy)
Chainguard zero-CVE base images
Dependabot for dependency updates

Risk: LOW (strong supply chain security)

5. Custom Resource Definitions (CRDs)

Exposure: Internal (Kubernetes API) Authentication: Kubernetes user/ServiceAccount Authorization: RBAC (namespace-scoped for DNSZone)

Attack Vectors:

Malicious CRs with crafted input (e.g., XXL zone names)
Schema validation bypass
CR injection via compromised user

Mitigations:

Schema validation in CRD (OpenAPI v3)
Input sanitization in controller
Namespace isolation (RBAC)
Admission webhooks (planned)

Risk: MEDIUM

6. Git Repository (Code)

Exposure: Public (GitHub) Authentication: Push requires GitHub 2FA + signed commits Authorization: Branch protection on main

Attack Vectors:

Compromised GitHub account
Unsigned commit merged to main
Malicious PR approved by reviewers

Mitigations:

All commits signed (GPG/SSH) - C-1
Branch protection (2+ reviewers required)
CI/CD verifies signatures
Linear history (no merge commits)

Risk: VERY LOW (strong controls)

Threat Scenarios

Scenario 1: Compromised Controller Pod

Severity: HIGH

Attack Path:

Attacker exploits vulnerability in controller code (e.g., memory corruption, logic bug)
Gains code execution in controller pod
Reads ServiceAccount token from /var/run/secrets/
Uses token to modify DNSZone CRs or read RNDC keys from Secrets

Impact:

Attacker can modify DNS records (redirect traffic)
Attacker can disrupt DNS service (delete zones, BIND9 pods)
Attacker can pivot to other namespaces (if RBAC is weak)

Mitigations:

Controller runs as non-root, read-only filesystem
RBAC least privilege (no delete permissions)
Resource limits prevent resource exhaustion
Vulnerability scanning (cargo-audit, Trivy)
Network policies (planned - L-1)

Residual Risk: MEDIUM (need network policies)

Scenario 2: DNS Cache Poisoning

Severity: MEDIUM

Attack Path:

Attacker sends forged DNS responses to recursive resolver
Resolver caches malicious record (e.g., A record for bank.com pointing to attacker IP)
Clients query resolver, receive poisoned response
Traffic redirected to attacker (phishing, MITM)

Impact:

Users redirected to malicious sites
Credentials stolen
Man-in-the-middle attacks

Mitigations:

DNSSEC (planned) - cryptographically signs DNS responses
BIND9 is authoritative-only (not vulnerable to cache poisoning)
Recursive resolvers outside our control (client responsibility)

Residual Risk: MEDIUM (DNSSEC would eliminate this risk)

Scenario 3: Supply Chain Attack via Malicious Dependency

Severity: CRITICAL

Attack Path:

Attacker compromises popular Rust crate (e.g., via compromised maintainer account)
Malicious code injected into crate update
Bindy controller depends on compromised crate
Malicious code runs in controller, exfiltrates secrets or modifies DNS zones

Impact:

Complete compromise of DNS infrastructure
Data exfiltration (secrets, zone data)
Backdoor access to cluster

Mitigations:

Dependency scanning (cargo-audit) - C-3
SBOM generation (track all dependencies)
Signed commits (code changes traceable)
Dependency version pinning in Cargo.lock
Manual review for major dependency updates

Residual Risk: LOW (strong supply chain controls)

Scenario 4: Insider Threat (Malicious Admin)

Severity: HIGH

Attack Path:

Malicious cluster admin with cluster-admin RBAC role
Directly modifies DNSZone CRs to redirect traffic
Deletes audit logs to cover tracks
Exfiltrates RNDC keys from Secrets

Impact:

DNS records modified without attribution
Service disruption
Data theft

Mitigations:

GitOps workflow (changes via PRs, not direct kubectl)
All changes require 2+ reviewers
Immutable audit logs (planned - H-2)
Secret access audit trail (planned - H-3)
Separation of duties (no single admin has all access)

Residual Risk: MEDIUM (need H-2 and H-3)

Scenario 5: DDoS Attack on DNS Infrastructure

Severity: CRITICAL

Attack Path:

Attacker launches volumetric DDoS attack (millions of queries/sec)
BIND9 pods overwhelmed, become unresponsive
DNS queries fail, causing outages for all dependent services

Impact:

Complete DNS outage
All services depending on DNS become unavailable
Revenue loss, SLA violations

Mitigations:

Rate limiting in BIND9
Horizontal scaling (multiple secondaries)
Resource limits (prevent total resource exhaustion)
DDoS protection at edge (planned - CloudFlare, AWS Shield)
Autoscaling (planned - HPA based on query load)

Residual Risk: MEDIUM (need edge DDoS protection)

Mitigations

Existing Mitigations (Implemented)

ID	Mitigation	Threats Mitigated	Compliance
M-01	Signed commits required	S3 (spoofed commits)	✅ C-1
M-02	RBAC least privilege	E2 (privilege escalation)	✅ C-2
M-03	Vulnerability scanning	I3 (CVE disclosure), E3 (dependency exploit)	✅ C-3
M-04	Non-root containers	E1 (container escape)	✅ Pod Security
M-05	Read-only filesystem	T2 (tampering), E1 (escape)	✅ Pod Security
M-06	Secrets encrypted at rest	I1 (RNDC key disclosure)	✅ Kubernetes
M-07	AXFR restricted to secondaries	I2 (zone enumeration)	✅ BIND9 config
M-08	Rate limiting (BIND9)	D1 (DNS query flood)	✅ BIND9 config
M-09	SBOM generation	T2 (supply chain)	✅ SLSA Level 2
M-10	Chainguard zero-CVE images	I3 (CVE disclosure)	✅ Container security

Planned Mitigations (Roadmap)

ID	Mitigation	Threats Mitigated	Priority	Roadmap Item
M-11	Audit log retention policy	R1 (non-repudiation)	HIGH	H-2
M-12	Secret access audit trail	R2 (secret access), I1 (disclosure)	HIGH	H-3
M-13	Admission webhooks	T1 (DNS tampering)	MEDIUM	Future
M-14	DNSSEC signing	T1 (tampering), Scenario 2 (cache poisoning)	MEDIUM	Future
M-15	Image digest pinning	T2 (image tampering)	MEDIUM	M-1
M-16	Rate limiting (controller)	D2 (controller exhaustion)	MEDIUM	M-3
M-17	Network policies	S1 (API spoofing), E1 (lateral movement)	LOW	L-1
M-18	DDoS edge protection	D1 (DNS query flood)	HIGH	External
M-19	RNDC key rotation	I1 (key disclosure)	MEDIUM	Future
M-20	TSIG for AXFR	I2 (zone enumeration)	MEDIUM	Future

Residual Risks

Critical Residual Risks

None identified (all critical threats have strong mitigations).

High Residual Risks

DDoS Attacks (D1) - Risk reduced by rate limiting and horizontal scaling, but edge DDoS protection is needed for volumetric attacks (100+ Gbps).
Insider Threats (Scenario 4) - Risk reduced by GitOps and RBAC, but immutable audit logs (H-2) and secret access audit trail (H-3) are needed for full non-repudiation.

Medium Residual Risks

DNS Tampering (T1) - Risk reduced by RBAC, but admission webhooks and DNSSEC would provide defense-in-depth.
Controller Resource Exhaustion (D2) - Risk reduced by resource limits, but rate limiting (M-3) and admission webhooks are needed.
Zone Enumeration (I2) - Risk reduced by AXFR restrictions, but TSIG authentication would eliminate AXFR abuse.
Compromised Controller Pod (Scenario 1) - Risk reduced by Pod Security Standards, but network policies (L-1) would prevent lateral movement.

Security Architecture

Defense in Depth Layers

┌─────────────────────────────────────────────────────────────┐
│  Layer 7: Monitoring & Response                             │
│  - Audit logs (Kubernetes API)                              │
│  - Vulnerability scanning (daily)                           │
│  - Incident response playbooks                              │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 6: Application Security                              │
│  - Input validation (CRD schemas)                           │
│  - Least privilege RBAC                                     │
│  - Signed commits (non-repudiation)                         │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 5: Container Security                                │
│  - Non-root user (uid 1000+)                                │
│  - Read-only filesystem                                     │
│  - No privileged capabilities                               │
│  - Vulnerability scanning (Trivy)                           │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 4: Pod Security                                      │
│  - Pod Security Standards (Restricted)                      │
│  - seccomp profile (restrict syscalls)                      │
│  - AppArmor/SELinux profiles                                │
│  - Resource limits (CPU/memory)                             │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 3: Namespace Isolation                               │
│  - RBAC (namespace-scoped roles)                            │
│  - Network policies (planned)                               │
│  - Resource quotas                                          │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 2: Cluster Security                                  │
│  - etcd encryption at rest                                  │
│  - API server authentication/authorization                  │
│  - Secrets management                                       │
└─────────────────────────────────────────────────────────────┘
             │
┌─────────────────────────────────────────────────────────────┐
│  Layer 1: Infrastructure Security                           │
│  - Node OS hardening (managed by platform team)             │
│  - Network segmentation                                     │
│  - Physical security                                        │
└─────────────────────────────────────────────────────────────┘

Security Controls Summary

Control Category	Implemented	Planned	Residual Risk
Access Control	RBAC least privilege, signed commits	Admission webhooks	LOW
Data Protection	Secrets encrypted, AXFR restricted	DNSSEC, TSIG	MEDIUM
Supply Chain	Signed commits/images, SBOM, vuln scanning	Image digest pinning	LOW
Monitoring	Kubernetes audit logs, vuln scanning	Audit retention policy, secret access trail	MEDIUM
Resilience	Rate limiting, resource limits	Edge DDoS protection, HPA	MEDIUM
Container Security	Non-root, read-only FS, Pod Security Standards	Network policies	LOW

References

Last Updated: 2025-12-17 Next Review: 2025-03-17 (Quarterly) Approved By: Security Team

Signed Releases

Bindy releases are cryptographically signed using Cosign with keyless signing (Sigstore). This ensures:

Authenticity: Verify that releases come from the official Bindy GitHub repository
Integrity: Detect any tampering with release artifacts
Non-repudiation: Cryptographic proof that artifacts were built by official CI/CD
Transparency: All signatures are recorded in the Sigstore transparency log (Rekor)

What Is Signed

Every Bindy release includes signed artifacts:

Container Images:
- ghcr.io/firestoned/bindy:* (Chainguard base)
- ghcr.io/firestoned/bindy-distroless:* (Google Distroless base)
Binary Tarballs:
- bindy-linux-amd64.tar.gz
- bindy-linux-arm64.tar.gz
Signature Artifacts (uploaded to releases):
- *.tar.gz.bundle - Cosign signature bundles for binaries
- Container signatures are stored in the OCI registry

Installing Cosign

To verify signatures, install Cosign:

# macOS
brew install cosign

# Linux (download binary)
LATEST_VERSION=$(curl -s https://api.github.com/repos/sigstore/cosign/releases/latest | grep tag_name | cut -d '"' -f 4)
curl -Lo cosign https://github.com/sigstore/cosign/releases/download/${LATEST_VERSION}/cosign-linux-amd64
chmod +x cosign
sudo mv cosign /usr/local/bin/

# Verify installation
cosign version

Verifying Container Images

Cosign uses keyless signing with Sigstore, which means:

No private keys to manage or distribute
Signatures are verified against the GitHub Actions OIDC identity
All signatures are logged in the public Rekor transparency log

Quick Verification

# Verify the latest Chainguard image
cosign verify \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  ghcr.io/firestoned/bindy:latest

# Verify a specific version
cosign verify \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  ghcr.io/firestoned/bindy:v0.1.0

# Verify the Distroless variant
cosign verify \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  ghcr.io/firestoned/bindy-distroless:latest

Understanding the Verification Output

When verification succeeds, Cosign returns JSON output with signature details:

[
  {
    "critical": {
      "identity": {
        "docker-reference": "ghcr.io/firestoned/bindy"
      },
      "image": {
        "docker-manifest-digest": "sha256:abcd1234..."
      },
      "type": "cosign container image signature"
    },
    "optional": {
      "Bundle": {
        "SignedEntryTimestamp": "...",
        "Payload": {
          "body": "...",
          "integratedTime": 1234567890,
          "logIndex": 12345678,
          "logID": "..."
        }
      },
      "Issuer": "https://token.actions.githubusercontent.com",
      "Subject": "https://github.com/firestoned/bindy/.github/workflows/release.yaml@refs/tags/v0.1.0"
    }
  }
]

Key fields to verify:

Subject: Shows the exact GitHub workflow that created the signature
Issuer: Confirms it came from GitHub Actions
integratedTime: Unix timestamp when signature was created
logIndex: Entry in the Rekor transparency log (publicly auditable)

Verification Failures

If verification fails, you’ll see an error like:

Error: no matching signatures:

Do NOT use unverified images in production. This indicates:

The image was not signed by the official Bindy release workflow
The image may have been tampered with
The image may be a counterfeit

Verifying Binary Releases

Binary tarballs are signed with Cosign blob signing. Each release includes .bundle files containing the signature.

Download and Verify

# Download the binary tarball and signature bundle from GitHub Releases
VERSION="v0.1.0"
PLATFORM="linux-amd64"  # or linux-arm64

# Download tarball
curl -LO "https://github.com/firestoned/bindy/releases/download/${VERSION}/bindy-${PLATFORM}.tar.gz"

# Download signature bundle
curl -LO "https://github.com/firestoned/bindy/releases/download/${VERSION}/bindy-${PLATFORM}.tar.gz.bundle"

# Verify the signature
cosign verify-blob \
  --bundle "bindy-${PLATFORM}.tar.gz.bundle" \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  "bindy-${PLATFORM}.tar.gz"

Verification Success

If successful, you’ll see:

Verified OK

You can now safely extract and use the binary:

tar xzf bindy-${PLATFORM}.tar.gz
./bindy --version

Automated Verification Script

Create a script to download and verify releases automatically:

#!/bin/bash
set -euo pipefail

VERSION="${1:-latest}"
PLATFORM="${2:-linux-amd64}"

if [ "$VERSION" = "latest" ]; then
  VERSION=$(curl -s https://api.github.com/repos/firestoned/bindy/releases/latest | grep tag_name | cut -d '"' -f 4)
fi

echo "Downloading Bindy $VERSION for $PLATFORM..."

# Download artifacts
curl -LO "https://github.com/firestoned/bindy/releases/download/${VERSION}/bindy-${PLATFORM}.tar.gz"
curl -LO "https://github.com/firestoned/bindy/releases/download/${VERSION}/bindy-${PLATFORM}.tar.gz.bundle"

# Verify signature
echo "Verifying signature..."
cosign verify-blob \
  --bundle "bindy-${PLATFORM}.tar.gz.bundle" \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  "bindy-${PLATFORM}.tar.gz"

# Extract
echo "Extracting..."
tar xzf "bindy-${PLATFORM}.tar.gz"

echo "✓ Bindy $VERSION successfully verified and installed"
./bindy --version

Additional Security Verification

Check SHA256 Checksums

Every release includes a checksums.sha256 file with SHA256 hashes of all artifacts:

# Download checksums
curl -LO "https://github.com/firestoned/bindy/releases/download/${VERSION}/checksums.sha256"

# Verify the tarball checksum
sha256sum -c checksums.sha256 --ignore-missing

Inspect Rekor Transparency Log

All signatures are recorded in the public Rekor transparency log:

# Search for Bindy signatures
rekor-cli search --email noreply@github.com --rekor_server https://rekor.sigstore.dev

# Or use the web interface:
# https://search.sigstore.dev/?email=noreply@github.com

Verify SLSA Provenance

Bindy releases also include SLSA provenance attestations:

# Verify SLSA provenance for the container image
cosign verify-attestation \
  --type slsaprovenance \
  --certificate-identity-regexp='https://github.com/firestoned/bindy' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' \
  ghcr.io/firestoned/bindy:${VERSION}

Kubernetes Deployment Verification

When deploying to Kubernetes, use policy-controller or Kyverno to enforce signature verification:

Kyverno Policy Example

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-bindy-images
spec:
  validationFailureAction: enforce
  background: false
  rules:
    - name: verify-bindy-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "ghcr.io/firestoned/bindy*"
          attestors:
            - entries:
                - keyless:
                    subject: "https://github.com/firestoned/bindy/.github/workflows/release.yaml@*"
                    issuer: "https://token.actions.githubusercontent.com"
                    rekor:
                      url: https://rekor.sigstore.dev

This policy ensures:

Only signed Bindy images can run in the cluster
Signatures must come from the official release workflow
Signatures are verified against the Rekor transparency log

Troubleshooting

“Error: no matching signatures”

Cause: Image/artifact is not signed or signature doesn’t match the identity.

Solution:

Verify you’re using an official release from ghcr.io/firestoned/bindy*
Check the tag/version exists on the GitHub releases page
Ensure you’re not using a locally-built image

“Error: unable to verify bundle”

Cause: Signature bundle is corrupted or doesn’t match the artifact.

Solution:

Re-download the artifact and bundle
Verify the SHA256 checksum matches checksums.sha256
Report the issue if checksums match but verification fails

“Error: fetching bundle: context deadline exceeded”

Cause: Network issue connecting to Sigstore services.

Solution:

Check your internet connection
Verify you can reach https://rekor.sigstore.dev and https://fulcio.sigstore.dev
Try again with increased timeout: COSIGN_TIMEOUT=60s cosign verify ...

Security Contact

If you discover a security issue with signed releases:

DO NOT open a public GitHub issue
Report to: security@firestoned.io
Include: artifact name, version, verification output, and steps to reproduce

See SECURITY.md for our security policy and vulnerability disclosure process.

SPDX License Headers

All Bindy source files include SPDX license identifiers for automated license compliance tracking.

What is SPDX?

SPDX (Software Package Data Exchange) is an ISO standard (ISO/IEC 5962:2021) for communicating software license information. SPDX identifiers enable:

Automated SBOM generation: Tools like cargo-cyclonedx detect licenses automatically
License compliance auditing: Verify no GPL contamination in MIT-licensed project
Supply chain transparency: Clear license identification at file granularity
Tooling integration: GitHub, Snyk, Trivy, and other tools recognize SPDX headers

Required Header Format

All source files MUST include SPDX headers in the first 10 lines:

Rust files (.rs):

#![allow(unused)]
fn main() {
// Copyright (c) 2025 Erick Bourgeois, firestoned
// SPDX-License-Identifier: MIT
}

Shell scripts (.sh, .bash):

#!/usr/bin/env bash
# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT

Makefiles (Makefile, *.mk):

# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT

GitHub Actions workflows (.yaml, .yml):

# Copyright (c) 2025 Erick Bourgeois, firestoned
# SPDX-License-Identifier: MIT
name: My Workflow

Automated Verification

Bindy enforces SPDX headers via CI/CD:

Workflow: .github/workflows/license-check.yaml

Checks:

All Rust files (.rs)
All Shell scripts (.sh, .bash)
All Makefiles (Makefile, *.mk)
All GitHub Actions workflows (.yaml, .yml)

Enforcement:

Runs on every pull request
Runs on every push to main
Pull requests fail if any source files lack SPDX headers
Provides clear error messages with examples for missing headers

Output Example:

✅ All 347 source files have SPDX license headers

File types checked:
  - Rust files (.rs)
  - Shell scripts (.sh, .bash)
  - Makefiles (Makefile, *.mk)
  - GitHub Actions workflows (.yaml, .yml)

License: MIT

Bindy is licensed under the MIT License, one of the most permissive open source licenses.

Permissions:

✅ Commercial use
✅ Modification
✅ Distribution
✅ Private use

Conditions:

📋 Include copyright notice
📋 Include license text

Limitations:

❌ No liability
❌ No warranty

Full license text: LICENSE

Compliance Evidence

SOX 404 (Sarbanes-Oxley):

Control: License compliance and intellectual property tracking
Evidence: All source files tagged with SPDX identifiers, automated verification
Audit Trail: Git history shows when SPDX headers were added

PCI-DSS 6.4.6 (Payment Card Industry):

Requirement: Code review and approval processes
Evidence: SPDX verification blocks unapproved code (missing headers) from merging
Automation: CI/CD enforces license compliance before code review

SLSA Level 3 (Supply Chain Security):

Requirement: Build environment provenance and dependencies
Evidence: SPDX headers enable automated SBOM generation with license info
Transparency: Every dependency’s license is machine-readable

References

Incident Response Playbooks - Bindy DNS Controller

Version: 1.0 Last Updated: 2025-12-17 Owner: Security Team Compliance: SOX 404, PCI-DSS 12.10.1, Basel III

Overview
Incident Classification
Response Team
Communication Protocols
Playbook Index
Playbooks
Post-Incident Activities

Overview

This document provides step-by-step incident response playbooks for security incidents involving the Bindy DNS Controller. Each playbook follows the NIST Incident Response Lifecycle: Preparation, Detection & Analysis, Containment, Eradication, Recovery, and Post-Incident Activity.

Objectives

Rapid Response: Minimize time between detection and containment
Clear Procedures: Provide step-by-step guidance for responders
Minimize Impact: Reduce blast radius and prevent escalation
Evidence Preservation: Maintain audit trail for forensics and compliance
Continuous Improvement: Learn from incidents to strengthen defenses

Incident Classification

Severity Levels

Severity	Definition	Response Time	Escalation
🔴 CRITICAL	Complete service outage, data breach, or active exploitation	Immediate (< 15 min)	CISO, CTO, VP Engineering
🟠 HIGH	Degraded service, vulnerability with known exploit, unauthorized access	< 1 hour	Security Lead, Engineering Manager
🟡 MEDIUM	Vulnerability without exploit, suspicious activity, minor service impact	< 4 hours	Security Team, On-Call Engineer
🔵 LOW	Informational findings, potential issues, no immediate risk	< 24 hours	Security Team

Response Team

Roles and Responsibilities

Role	Responsibilities	Contact
Incident Commander	Overall coordination, decision-making, stakeholder communication	On-call rotation
Security Lead	Threat analysis, forensics, remediation guidance	security@firestoned.io
Platform Engineer	Kubernetes cluster operations, pod management	platform@firestoned.io
DNS Engineer	BIND9 expertise, zone management	dns-team@firestoned.io
Compliance Officer	Regulatory reporting, evidence collection	compliance@firestoned.io
Communications	Internal/external communication, customer notifications	comms@firestoned.io

On-Call Rotation

Primary: Security Lead (24/7 PagerDuty)
Secondary: Platform Engineer (escalation)
Tertiary: CTO (executive escalation)

Communication Protocols

Internal Communication

War Room (Incident > MEDIUM):

Slack Channel: #incident-[YYYY-MM-DD]-[number]
Video Call: Zoom war room (pinned in channel)
Status Updates: Every 30 minutes during active incident

Status Page:

Update status.firestoned.io for customer-impacting incidents
Templates: Investigating → Identified → Monitoring → Resolved

External Communication

Regulatory Reporting (CRITICAL incidents only):

PCI-DSS: Notify acquiring bank within 24 hours if cardholder data compromised
SOX: Document incident for quarterly IT controls audit
Basel III: Report cyber risk event to risk management committee

Customer Notification:

Criteria: Data breach, prolonged outage (> 4 hours), SLA violation
Channel: Email to registered contacts, status page
Timeline: Initial notification within 2 hours, updates every 4 hours

Playbook Index

ID	Playbook	Severity	Trigger
P1	Critical Vulnerability Detected	🔴 CRITICAL	GitHub issue, CVE alert, security scan
P2	Compromised Controller Pod	🔴 CRITICAL	Anomalous behavior, unauthorized access
P3	DNS Service Outage	🔴 CRITICAL	All BIND9 pods down, DNS queries failing
P4	RNDC Key Compromise	🔴 CRITICAL	Key leaked, unauthorized RNDC access
P5	Unauthorized DNS Changes	🟠 HIGH	Unexpected zone modifications
P6	DDoS Attack	🟠 HIGH	Query flood, resource exhaustion
P7	Supply Chain Compromise	🔴 CRITICAL	Malicious commit, compromised dependency

Playbooks

P1: Critical Vulnerability Detected

Severity: 🔴 CRITICAL Response Time: Immediate (< 15 minutes) SLA: Patch deployed within 24 hours

Trigger

Daily security scan detects CRITICAL vulnerability (CVSS 9.0-10.0)
GitHub Security Advisory published for Bindy dependency
CVE announced with active exploitation in the wild
Automated GitHub issue created: [SECURITY] CRITICAL vulnerability detected

Detection

# Automated detection via GitHub Actions
# Workflow: .github/workflows/security-scan.yaml
# Frequency: Daily at 00:00 UTC

# Manual check:
cargo audit --deny warnings
trivy image ghcr.io/firestoned/bindy:latest --severity CRITICAL,HIGH

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+15 min)

Step 1.1: Acknowledge Incident

# Acknowledge PagerDuty alert
# Create Slack war room: #incident-[date]-vuln-[CVE-ID]

Step 1.2: Assess Vulnerability

# Review GitHub issue or security scan results
# Questions to answer:
# - What is the vulnerable component? (dependency, base image, etc.)
# - What is the CVSS score and attack vector?
# - Is there a known exploit (Exploit-DB, Metasploit)?
# - Is Bindy actually vulnerable (code path reachable)?

Step 1.3: Check Production Exposure

# Verify if vulnerable version is deployed
kubectl get deploy -n dns-system bindy -o jsonpath='{.spec.template.spec.containers[0].image}'

# Check image digest
kubectl get pods -n dns-system -l app.kubernetes.io/name=bindy -o jsonpath='{.items[0].spec.containers[0].image}'

# Compare with vulnerable version from security advisory

Step 1.4: Determine Impact

If Bindy is NOT vulnerable (code path not reachable):
- Update to patched version at next release (non-urgent)
- Document exception in SECURITY.md
- Close incident as FALSE POSITIVE
If Bindy IS vulnerable (exploitable in production):
- PROCEED TO CONTAINMENT (Phase 2)

Phase 2: Containment (T+15 min to T+1 hour)

Step 2.1: Isolate Vulnerable Pods (if actively exploited)

# Scale down controller to prevent further exploitation
kubectl scale deploy -n dns-system bindy --replicas=0

# NOTE: This stops DNS updates but does NOT affect DNS queries
# BIND9 continues serving existing zones

Step 2.2: Review Audit Logs

# Check for signs of exploitation
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --tail=1000 | grep -i "error\|panic\|exploit"

# Review Kubernetes audit logs (if available)
# Look for: Unusual API calls, secret reads, privilege escalation attempts

Step 2.3: Assess Blast Radius

Controller compromised? Check for unauthorized DNS changes, secret reads
BIND9 affected? Check if RNDC keys were stolen
Data exfiltration? Review network logs for unusual egress traffic

Phase 3: Eradication (T+1 hour to T+24 hours)

Step 3.1: Apply Patch

Option A: Update Dependency (Rust crate)

# Update specific dependency
cargo update -p <vulnerable-package>

# Verify fix
cargo audit

# Run tests
cargo test

# Build new image
docker build -t ghcr.io/firestoned/bindy:hotfix-$(date +%s) .

# Push to registry
docker push ghcr.io/firestoned/bindy:hotfix-$(date +%s)

Option B: Update Base Image

# Update Dockerfile to latest Chainguard image
# docker/Dockerfile:
FROM cgr.dev/chainguard/static:latest-dev  # Use latest digest

# Rebuild and push
docker build -t ghcr.io/firestoned/bindy:hotfix-$(date +%s) .
docker push ghcr.io/firestoned/bindy:hotfix-$(date +%s)

Option C: Apply Workaround (if no patch available)

Disable vulnerable feature flag
Add input validation to prevent exploit
Document workaround in SECURITY.md

Step 3.2: Verify Fix

# Scan patched image
trivy image ghcr.io/firestoned/bindy:hotfix-$(date +%s) --severity CRITICAL,HIGH

# Expected: No CRITICAL vulnerabilities found

Step 3.3: Emergency Release

# Tag release
git tag -s hotfix-v0.1.1 -m "Security hotfix: CVE-XXXX-XXXXX"
git push origin hotfix-v0.1.1

# Trigger release workflow
# Verify signed commits, SBOM generation, vulnerability scans pass

Phase 4: Recovery (T+24 hours to T+48 hours)

Step 4.1: Deploy Patched Version

# Update deployment manifest (GitOps)
# deploy/controller/deployment.yaml:
spec:
  template:
    spec:
      containers:
      - name: bindy
        image: ghcr.io/firestoned/bindy:hotfix-v0.1.1  # Patched version

# Apply via FluxCD (GitOps) or manually
kubectl apply -f deploy/controller/deployment.yaml

# Verify rollout
kubectl rollout status deploy/bindy -n dns-system

# Confirm pods running patched version
kubectl get pods -n dns-system -l app.kubernetes.io/name=bindy -o jsonpath='{.items[0].spec.containers[0].image}'

Step 4.2: Verify Service Health

# Check controller logs
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --tail=100

# Verify reconciliation working
kubectl get dnszones --all-namespaces
kubectl describe dnszone -n team-web example-com

# Test DNS resolution
dig @<bind9-ip> example.com

Step 4.3: Run Security Scans

# Full security scan
cargo audit
trivy image ghcr.io/firestoned/bindy:hotfix-v0.1.1

# Expected: All clear

Phase 5: Post-Incident (T+48 hours to T+1 week)

Step 5.1: Document Incident

Update CHANGELOG.md with hotfix details
Document root cause in incident report
Update SECURITY.md if needed (known issues, exceptions)

Step 5.2: Notify Stakeholders

Update status page: “Resolved - Security patch deployed”
Send email to compliance team (attach incident report)
Notify customers if required (data breach, SLA violation)

Step 5.3: Post-Incident Review (PIR)

What went well? (Detection, response time, communication)
What could improve? (Patch process, testing, automation)
Action items: (Update playbook, add monitoring, improve defenses)

Step 5.4: Update Metrics

MTTR (Mean Time To Remediate): ____ hours
SLA compliance: ✅ Met / ❌ Missed
Update vulnerability dashboard

Success Criteria

✅ Patch deployed within 24 hours
✅ No exploitation detected in production
✅ Service availability maintained (or minimal downtime)
✅ All security scans pass post-patch
✅ Incident documented and reported to compliance

P2: Compromised Controller Pod

Severity: 🔴 CRITICAL Response Time: Immediate (< 15 minutes) Impact: Unauthorized DNS modifications, secret theft, lateral movement

Trigger

Anomalous controller behavior (unexpected API calls, network traffic)
Unauthorized modifications to DNS zones
Security alert from SIEM or IDS
Pod logs show suspicious activity (reverse shell, file downloads)

Detection

# Monitor controller logs for anomalies
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --tail=500 | grep -E "(shell|wget|curl|nc|bash)"

# Check for unexpected processes in pod
kubectl exec -n dns-system <controller-pod> -- ps aux

# Review Kubernetes audit logs
# Look for: Unusual secret reads, excessive API calls, privilege escalation attempts

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+15 min)

Step 1.1: Confirm Compromise

# Check controller logs
kubectl logs -n dns-system <controller-pod> --tail=1000 > /tmp/controller-logs.txt

# Indicators of compromise (IOCs):
# - Reverse shell activity (nc, bash -i, /dev/tcp/)
# - File downloads (wget, curl to suspicious domains)
# - Privilege escalation attempts (sudo, setuid)
# - Crypto mining (high CPU, connections to mining pools)

Step 1.2: Assess Impact

# Check for unauthorized DNS changes
kubectl get dnszones --all-namespaces -o yaml > /tmp/dnszones-snapshot.yaml

# Compare with known good state (GitOps repo)
diff /tmp/dnszones-snapshot.yaml /path/to/gitops/dnszones/

# Check for secret reads
# Review Kubernetes audit logs for GET /api/v1/namespaces/dns-system/secrets/*

Phase 2: Containment (T+15 min to T+1 hour)

Step 2.1: Isolate Controller Pod

# Apply network policy to block all egress (prevent data exfiltration)
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bindy-controller-quarantine
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bindy
  policyTypes:
  - Egress
  egress: []  # Block all egress
EOF

# Delete compromised pod (force recreation)
kubectl delete pod -n dns-system <controller-pod> --force --grace-period=0

Step 2.2: Rotate Credentials

# Rotate RNDC key (if potentially stolen)
# Generate new key
tsig-keygen -a hmac-sha256 rndc-key > /tmp/new-rndc-key.conf

# Update secret
kubectl create secret generic rndc-key-new \
  --from-file=rndc.key=/tmp/new-rndc-key.conf \
  -n dns-system \
  --dry-run=client -o yaml | kubectl apply -f -

# Update BIND9 pods to use new key (restart required)
kubectl rollout restart statefulset/bind9-primary -n dns-system
kubectl rollout restart statefulset/bind9-secondary -n dns-system

# Delete old secret
kubectl delete secret rndc-key -n dns-system

Step 2.3: Preserve Evidence

# Save pod logs before deletion
kubectl logs -n dns-system <controller-pod> --all-containers > /tmp/forensics/controller-logs-$(date +%s).txt

# Capture pod manifest
kubectl get pod -n dns-system <controller-pod> -o yaml > /tmp/forensics/controller-pod-manifest.yaml

# Save Kubernetes events
kubectl get events -n dns-system --sort-by='.lastTimestamp' > /tmp/forensics/events.txt

# Export audit logs (if available)
# - ServiceAccount API calls
# - Secret access logs
# - DNS zone modifications

Phase 3: Eradication (T+1 hour to T+4 hours)

Step 3.1: Root Cause Analysis

# Analyze logs for initial compromise vector
# Common vectors:
# - Vulnerability in controller code (RCE, memory corruption)
# - Compromised dependency (malicious crate)
# - Supply chain attack (malicious image)
# - Misconfigured RBAC (excessive permissions)

# Check image provenance
kubectl get pod -n dns-system <controller-pod> -o jsonpath='{.spec.containers[0].image}'

# Verify image signature and SBOM
# If signature invalid or SBOM shows unexpected dependencies → supply chain attack

Step 3.2: Patch Vulnerability

If controller code vulnerability: Apply patch (see P1)
If supply chain attack: Investigate upstream, rollback to known good image
If RBAC misconfiguration: Fix RBAC, re-run verification script

Step 3.3: Scan for Backdoors

# Scan all images for malware
trivy image ghcr.io/firestoned/bindy:latest --scanners vuln,secret,misconfig

# Check for unauthorized SSH keys, cron jobs, persistence mechanisms
kubectl exec -n dns-system <new-controller-pod> -- ls -la /root/.ssh/
kubectl exec -n dns-system <new-controller-pod> -- cat /etc/crontab

Phase 4: Recovery (T+4 hours to T+24 hours)

Step 4.1: Deploy Clean Controller

# Verify image integrity
# - Signed commits in Git history
# - Signed container image with provenance
# - Clean vulnerability scan

# Deploy patched controller
kubectl rollout restart deploy/bindy -n dns-system

# Remove quarantine network policy
kubectl delete networkpolicy bindy-controller-quarantine -n dns-system

# Verify health
kubectl get pods -n dns-system -l app.kubernetes.io/name=bindy
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --tail=100

Step 4.2: Verify DNS Zones

# Restore DNS zones from GitOps (if unauthorized changes detected)
# 1. Revert changes in Git
# 2. Force FluxCD reconciliation
flux reconcile kustomization bindy-system --with-source

# Verify all zones match expected state
kubectl get dnszones --all-namespaces -o yaml | diff - /path/to/gitops/dnszones/

Step 4.3: Validate Service

# Test DNS resolution
dig @<bind9-ip> example.com

# Verify controller reconciliation
kubectl get dnszones --all-namespaces
kubectl describe dnszone -n team-web example-com | grep "Ready.*True"

Phase 5: Post-Incident (T+24 hours to T+1 week)

Step 5.1: Forensic Analysis

Engage forensics team if required
Analyze preserved logs for IOCs
Timeline of compromise (initial access → lateral movement → exfiltration)

Step 5.2: Notify Stakeholders

Compliance: Report to SOX/PCI-DSS auditors (security incident)
Customers: If DNS records were modified or data exfiltrated
Regulators: If required by Basel III (cyber risk event reporting)

Step 5.3: Improve Defenses

Short-term: Implement missing network policies (L-1)
Medium-term: Add runtime security monitoring (Falco, Tetragon)
Long-term: Implement admission controller for image verification

Step 5.4: Update Documentation

Update incident playbook with lessons learned
Document new IOCs for detection rules
Update threat model (docs/security/THREAT_MODEL.md)

Success Criteria

✅ Compromised pod isolated within 15 minutes
✅ No lateral movement to other pods/namespaces
✅ Credentials rotated (RNDC keys)
✅ Root cause identified and patched
✅ DNS service fully restored with verified integrity
✅ Forensic evidence preserved for investigation

P3: DNS Service Outage

Severity: 🔴 CRITICAL Response Time: Immediate (< 15 minutes) Impact: All DNS queries failing, service unavailable

Trigger

All BIND9 pods down (CrashLoopBackOff, OOMKilled)
DNS queries timing out
Monitoring alert: “DNS service unavailable”
Customer reports: “Cannot resolve domain names”

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+10 min)

Step 1.1: Confirm Outage

# Test DNS resolution
dig @<bind9-loadbalancer-ip> example.com

# Check pod status
kubectl get pods -n dns-system -l app.kubernetes.io/name=bind9

# Check service endpoints
kubectl get svc -n dns-system bind9-dns -o wide
kubectl get endpoints -n dns-system bind9-dns

Step 1.2: Identify Root Cause

# Check pod logs
kubectl logs -n dns-system <bind9-pod> --tail=200

# Common root causes:
# - OOMKilled (memory exhaustion)
# - CrashLoopBackOff (configuration error, missing ConfigMap)
# - ImagePullBackOff (registry issue, image not found)
# - Pending (insufficient resources, node failure)

# Check events
kubectl describe pod -n dns-system <bind9-pod>

Phase 2: Containment & Quick Fix (T+10 min to T+30 min)

Scenario A: OOMKilled (Memory Exhaustion)

# Increase memory limit
kubectl patch statefulset bind9-primary -n dns-system -p '
spec:
  template:
    spec:
      containers:
      - name: bind9
        resources:
          limits:
            memory: "512Mi"  # Increase from 256Mi
'

# Restart pods
kubectl rollout restart statefulset/bind9-primary -n dns-system

Scenario B: Configuration Error

# Check ConfigMap
kubectl get cm -n dns-system bind9-config -o yaml

# Common issues:
# - Syntax error in named.conf
# - Missing zone file
# - Invalid RNDC key

# Fix configuration (update ConfigMap)
kubectl edit cm bind9-config -n dns-system

# Restart pods to apply new config
kubectl rollout restart statefulset/bind9-primary -n dns-system

Scenario C: Image Pull Failure

# Check image pull secret
kubectl get secret -n dns-system ghcr-pull-secret

# Verify image exists
docker pull ghcr.io/firestoned/bindy:latest

# If image missing, rollback to previous version
kubectl rollout undo statefulset/bind9-primary -n dns-system

Phase 3: Recovery (T+30 min to T+2 hours)

Step 3.1: Verify Service Restoration

# Check all pods healthy
kubectl get pods -n dns-system -l app.kubernetes.io/name=bind9

# Test DNS resolution (all zones)
dig @<bind9-ip> example.com
dig @<bind9-ip> test.example.com

# Check service endpoints
kubectl get endpoints -n dns-system bind9-dns
# Should show all healthy pod IPs

Step 3.2: Validate Data Integrity

# Verify all zones loaded
kubectl exec -n dns-system <bind9-pod> -- rndc status

# Check zone serial numbers (ensure no data loss)
dig @<bind9-ip> example.com SOA

# Compare with expected serial (from GitOps)

Phase 4: Post-Incident (T+2 hours to T+1 week)

Step 4.1: Root Cause Analysis

Why did BIND9 exhaust memory? (Too many zones, memory leak, query flood)
Why did configuration break? (Controller bug, bad CRD validation, manual change)
Why did image pull fail? (Registry downtime, authentication issue)

Step 4.2: Preventive Measures

Add horizontal pod autoscaling (HPA based on CPU/memory)
Add health checks (liveness/readiness probes for BIND9)
Add configuration validation (admission webhook for ConfigMaps)
Add chaos engineering tests (kill pods, exhaust memory, test recovery)

Step 4.3: Update SLO/SLA

Document actual downtime
Calculate availability percentage
Update SLA reports for customers

Success Criteria

✅ DNS service restored within 30 minutes
✅ All zones serving correctly
✅ No data loss (zone serial numbers match)
✅ Root cause identified and documented
✅ Preventive measures implemented

P4: RNDC Key Compromise

Severity: 🔴 CRITICAL Response Time: Immediate (< 15 minutes) Impact: Attacker can control BIND9 (reload zones, freeze service, etc.)

Trigger

RNDC key found in logs, Git commit, or public repository
Unauthorized RNDC commands detected (audit logs)
Security scan detects secret in code or environment variables

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+15 min)

Step 1.1: Confirm Compromise

# Search for leaked key in logs
grep -r "rndc-key" /var/log/ /tmp/

# Search Git history for accidentally committed keys
git log -S "rndc-key" --all

# Check GitHub secret scanning alerts
# GitHub → Security → Secret scanning alerts

Step 1.2: Assess Impact

# Check BIND9 logs for unauthorized RNDC commands
kubectl logs -n dns-system <bind9-pod> --tail=1000 | grep "rndc command"

# Check for malicious activity:
# - rndc freeze (stop zone updates)
# - rndc reload (load malicious zone)
# - rndc querylog on (enable debug logging for reconnaissance)

Phase 2: Containment (T+15 min to T+1 hour)

Step 2.1: Rotate RNDC Key (Emergency)

# Generate new RNDC key
tsig-keygen -a hmac-sha256 rndc-key-emergency > /tmp/rndc-key-new.conf

# Extract key from generated file
cat /tmp/rndc-key-new.conf

# Create new Kubernetes secret
kubectl create secret generic rndc-key-rotated \
  --from-literal=key="<new-key-here>" \
  -n dns-system

# Update controller deployment to use new secret
kubectl set env deploy/bindy -n dns-system RNDC_KEY_SECRET=rndc-key-rotated

# Update BIND9 StatefulSets
kubectl set volume statefulset/bind9-primary -n dns-system \
  --add --name=rndc-key \
  --type=secret \
  --secret-name=rndc-key-rotated \
  --mount-path=/etc/bind/rndc.key \
  --sub-path=rndc.key

# Restart all BIND9 pods
kubectl rollout restart statefulset/bind9-primary -n dns-system
kubectl rollout restart statefulset/bind9-secondary -n dns-system

# Delete compromised secret
kubectl delete secret rndc-key -n dns-system

Step 2.2: Block Network Access (if attacker active)

# Apply network policy to block RNDC port (953) from external access
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bind9-rndc-deny-external
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bind9
  policyTypes:
  - Ingress
  ingress:
  # Allow DNS queries (port 53)
  - from:
    - namespaceSelector: {}
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow RNDC only from controller
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: bindy
    ports:
    - protocol: TCP
      port: 953
EOF

Phase 3: Eradication (T+1 hour to T+4 hours)

Step 3.1: Remove Leaked Secrets

If secret in Git:

# Remove from Git history (use BFG Repo-Cleaner)
git clone --mirror git@github.com:firestoned/bindy.git
bfg --replace-text passwords.txt bindy.git
cd bindy.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push --force

# Notify all team members to re-clone repository

If secret in logs:

# Rotate logs immediately
kubectl delete pod -n dns-system <controller-pod>  # Forces log rotation

# Purge old logs from log aggregation system
# (Depends on logging backend: Elasticsearch, CloudWatch, etc.)

Step 3.2: Audit All Secret Access

# Review Kubernetes audit logs
# Find all ServiceAccounts that read rndc-key secret in last 30 days
# Check if any unauthorized access occurred

Phase 4: Recovery (T+4 hours to T+24 hours)

Step 4.1: Verify Key Rotation

# Test RNDC with new key
kubectl exec -n dns-system <controller-pod> -- \
  rndc -s <bind9-ip> -k /etc/bindy/rndc/rndc.key status

# Expected: Command succeeds with new key

# Test DNS service
dig @<bind9-ip> example.com

# Expected: DNS queries work normally

Step 4.2: Update Documentation

# Update secret rotation procedure in SECURITY.md
# Document rotation frequency (e.g., quarterly, or after incident)

Phase 5: Post-Incident (T+24 hours to T+1 week)

Step 5.1: Implement Secret Detection

# Add pre-commit hook to detect secrets
# .git/hooks/pre-commit:
#!/bin/bash
git diff --cached --name-only | xargs grep -E "(rndc-key|BEGIN RSA PRIVATE KEY)" && {
  echo "ERROR: Secret detected in commit. Aborting."
  exit 1
}

# Enable GitHub secret scanning (if not already enabled)
# GitHub → Settings → Code security and analysis → Secret scanning: Enable

Step 5.2: Automate Key Rotation

# Implement automated quarterly key rotation
# Add CronJob to generate and rotate keys every 90 days

Step 5.3: Improve Secret Management

Consider external secret manager (HashiCorp Vault, AWS Secrets Manager)
Implement secret access audit trail (H-3)
Add alerts on unexpected secret reads

Success Criteria

✅ RNDC key rotated within 1 hour
✅ Leaked secret removed from all locations
✅ No unauthorized RNDC commands executed
✅ DNS service fully functional with new key
✅ Secret detection mechanisms implemented
✅ Audit trail reviewed and documented

P5: Unauthorized DNS Changes

Severity: 🟠 HIGH Response Time: < 1 hour Impact: DNS records modified without approval, potential traffic redirection

Trigger

Unexpected changes to DNSZone custom resources
DNS records pointing to unknown IP addresses
GitOps detects drift (actual state ≠ desired state)
User reports: “DNS not resolving correctly”

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+30 min)

Step 1.1: Identify Unauthorized Changes

# Get current DNSZone state
kubectl get dnszones --all-namespaces -o yaml > /tmp/current-dnszones.yaml

# Compare with GitOps source of truth
diff /tmp/current-dnszones.yaml /path/to/gitops/dnszones/

# Check Kubernetes audit logs for who made changes
# Look for: kubectl apply, kubectl edit, kubectl patch on DNSZone resources

Step 1.2: Assess Impact

# Which zones were modified?
# What records changed? (A, CNAME, MX, TXT)
# Where is traffic being redirected?

# Test DNS resolution
dig @<bind9-ip> suspicious-domain.com

# Check if malicious IP is reachable
nslookup suspicious-domain.com
curl -I http://<suspicious-ip>/

Phase 2: Containment (T+30 min to T+1 hour)

Step 2.1: Revert Unauthorized Changes

# Revert to known good state (GitOps)
kubectl apply -f /path/to/gitops/dnszones/team-web/example-com.yaml

# Force controller reconciliation
kubectl annotate dnszone -n team-web example-com \
  reconcile-at="$(date +%s)" --overwrite

# Verify zone restored
kubectl get dnszone -n team-web example-com -o yaml | grep "status"

Step 2.2: Revoke Access (if compromised user)

# Identify user who made unauthorized change (from audit logs)
# Example: user=alice, namespace=team-web

# Remove user's RBAC permissions
kubectl delete rolebinding dnszone-editor-alice -n team-web

# Force user to re-authenticate
# (Depends on authentication provider: OIDC, LDAP, etc.)

Phase 3: Eradication (T+1 hour to T+4 hours)

Step 3.1: Root Cause Analysis

Compromised user credentials? Rotate passwords, check for MFA bypass
RBAC misconfiguration? User had excessive permissions
Controller bug? Controller reconciled incorrect state
Manual kubectl change? Bypassed GitOps workflow

Step 3.2: Fix Root Cause

# Example: RBAC was too permissive
# Fix RoleBinding to limit scope
kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dnszone-editor-alice
  namespace: team-web
subjects:
- kind: User
  name: alice
roleRef:
  kind: Role
  name: dnszone-editor  # Role only allows CRUD on DNSZones, not deletion
  apiGroup: rbac.authorization.k8s.io
EOF

Phase 4: Recovery (T+4 hours to T+24 hours)

Step 4.1: Verify DNS Integrity

# Test all zones
for zone in $(kubectl get dnszones --all-namespaces -o jsonpath='{.items[*].spec.zoneName}'); do
  echo "Testing $zone"
  dig @<bind9-ip> $zone SOA
done

# Expected: All zones resolve correctly with expected serial numbers

Step 4.2: Restore User Access (if revoked)

# After confirming user is not compromised, restore access
kubectl apply -f /path/to/gitops/rbac/team-web/alice-rolebinding.yaml

Phase 5: Post-Incident (T+24 hours to T+1 week)

Step 5.1: Implement Admission Webhooks

# Add ValidatingWebhook to prevent suspicious DNS changes
# Example: Block A records pointing to private IPs (RFC 1918)
# Example: Require approval for changes to critical zones (*.bank.com)

Step 5.2: Add Drift Detection

# Implement automated GitOps drift detection
# Alert if cluster state ≠ Git state for > 5 minutes
# Tool: FluxCD notification controller + Slack webhook

Step 5.3: Enforce GitOps Workflow

# Remove direct kubectl access for users
# Require all changes via Pull Requests in GitOps repo
# Implement branch protection: 2+ reviewers required

Success Criteria

✅ Unauthorized changes reverted within 1 hour
✅ Root cause identified (user, RBAC, controller bug)
✅ Access revoked/fixed to prevent recurrence
✅ DNS integrity verified (all zones correct)
✅ Drift detection and admission webhooks implemented

P6: DDoS Attack

Severity: 🟠 HIGH Response Time: < 1 hour Impact: DNS service degraded or unavailable due to query flood

Trigger

High query rate (> 10,000 QPS per pod)
BIND9 pods high CPU/memory utilization
Monitoring alert: “DNS response time elevated”
Users report: “DNS slow or timing out”

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+15 min)

Step 1.1: Confirm DDoS Attack

# Check BIND9 query rate
kubectl exec -n dns-system <bind9-pod> -- rndc status | grep "queries resulted"

# Check pod resource utilization
kubectl top pods -n dns-system -l app.kubernetes.io/name=bind9

# Analyze query patterns
kubectl exec -n dns-system <bind9-pod> -- rndc dumpdb -zones
kubectl exec -n dns-system <bind9-pod> -- cat /var/cache/bind/named_dump.db | head -100

Step 1.2: Identify Attack Type

Volumetric attack: Millions of queries from many IPs (botnet)
Amplification attack: Abusing AXFR or ANY queries
NXDOMAIN attack: Flood of queries for non-existent domains

Phase 2: Containment (T+15 min to T+1 hour)

Step 2.1: Enable Rate Limiting (BIND9)

# Update BIND9 configuration
kubectl edit cm -n dns-system bind9-config

# Add rate-limit directive:
# named.conf:
rate-limit {
    responses-per-second 10;
    nxdomains-per-second 5;
    errors-per-second 5;
    window 10;
};

# Restart BIND9 to apply config
kubectl rollout restart statefulset/bind9-primary -n dns-system

Step 2.2: Scale Up BIND9 Pods

# Horizontal scaling
kubectl scale statefulset bind9-secondary -n dns-system --replicas=5

# Vertical scaling (if needed)
kubectl patch statefulset bind9-primary -n dns-system -p '
spec:
  template:
    spec:
      containers:
      - name: bind9
        resources:
          requests:
            cpu: "1000m"
            memory: "1Gi"
          limits:
            cpu: "2000m"
            memory: "2Gi"
'

Step 2.3: Block Malicious IPs (if identifiable)

# If attack comes from small number of IPs, block at firewall/LoadBalancer
# Example: AWS Network ACL, GCP Cloud Armor

# Add NetworkPolicy to block specific CIDRs
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: block-attacker-ips
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bind9
  policyTypes:
  - Ingress
  ingress:
  - from:
    - ipBlock:
        cidr: 0.0.0.0/0
        except:
        - 192.0.2.0/24  # Attacker CIDR
        - 198.51.100.0/24  # Attacker CIDR
EOF

Phase 3: Eradication (T+1 hour to T+4 hours)

Step 3.1: Engage DDoS Protection Service

# If volumetric attack (> 10 Gbps), edge DDoS protection required
# Options:
# - CloudFlare DNS (proxy DNS through CloudFlare)
# - AWS Shield Advanced
# - Google Cloud Armor

# Migrate DNS to CloudFlare (example):
# 1. Add zone to CloudFlare
# 2. Update NS records at domain registrar
# 3. Configure CloudFlare → Origin (BIND9 backend)

Step 3.2: Implement Response Rate Limiting (RRL)

# BIND9 RRL configuration (more aggressive)
rate-limit {
    responses-per-second 5;
    nxdomains-per-second 2;
    referrals-per-second 5;
    nodata-per-second 5;
    errors-per-second 2;
    window 5;
    log-only no;  # Actually drop packets (not just log)
    slip 2;  # Send truncated response every 2nd rate-limited query
    max-table-size 20000;
};

Phase 4: Recovery (T+4 hours to T+24 hours)

Step 4.1: Monitor Service Health

# Check query rate stabilized
kubectl exec -n dns-system <bind9-pod> -- rndc status

# Check pod resource utilization
kubectl top pods -n dns-system

# Test DNS resolution
dig @<bind9-ip> example.com

# Expected: Normal response times (< 50ms)

Step 4.2: Scale Down (if attack subsided)

# Return to normal replica count
kubectl scale statefulset bind9-secondary -n dns-system --replicas=2

Phase 5: Post-Incident (T+24 hours to T+1 week)

Step 5.1: Implement Permanent DDoS Protection

Edge DDoS protection: CloudFlare, AWS Shield, Google Cloud Armor
Anycast DNS: Distribute load across multiple geographic locations
Autoscaling: HPA based on query rate, CPU, memory

Step 5.2: Improve Monitoring

# Add Prometheus metrics for query rate
# Add alerts:
# - Query rate > 5000 QPS per pod
# - NXDOMAIN rate > 50%
# - Response time > 100ms (p95)

Step 5.3: Document Attack Details

Attack duration: ____ hours
Peak query rate: ____ QPS
Attack type: Volumetric / Amplification / NXDOMAIN
Attack sources: IP ranges, ASNs, geolocation
Mitigation effectiveness: RRL / Scaling / Edge protection

Success Criteria

✅ DNS service restored within 1 hour
✅ Query rate normalized (< 1000 QPS per pod)
✅ Response times < 50ms (p95)
✅ Permanent DDoS protection implemented (CloudFlare, etc.)
✅ Autoscaling and monitoring in place

P7: Supply Chain Compromise

Severity: 🔴 CRITICAL Response Time: Immediate (< 15 minutes) Impact: Malicious code in controller, backdoor access, data exfiltration

Trigger

Malicious commit detected in Git history
Dependency vulnerability with active exploit (supply chain attack)
Image signature verification fails
SBOM shows unexpected dependency or binary

Response Procedure

Phase 1: Detection & Analysis (T+0 to T+30 min)

Step 1.1: Identify Compromised Component

# Check Git commit signatures
git log --show-signature | grep "BAD signature"

# Check image provenance
docker buildx imagetools inspect ghcr.io/firestoned/bindy:latest --format '{{ json .Provenance }}'

# Expected: Valid signature from GitHub Actions

# Check SBOM for unexpected dependencies
# Download SBOM from GitHub release artifacts
curl -L https://github.com/firestoned/bindy/releases/download/v1.0.0/sbom.json | jq '.components[].name'

# Expected: Only known dependencies from Cargo.toml

Step 1.2: Assess Impact

# Check if compromised version deployed to production
kubectl get deploy -n dns-system bindy -o jsonpath='{.spec.template.spec.containers[0].image}'

# If compromised image is running → **CRITICAL** (proceed to containment)
# If compromised image NOT deployed → **HIGH** (patch and prevent deployment)

Phase 2: Containment (T+30 min to T+2 hours)

Step 2.1: Isolate Compromised Controller

# Scale down compromised controller
kubectl scale deploy -n dns-system bindy --replicas=0

# Apply network policy to block egress (prevent exfiltration)
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: bindy-quarantine
  namespace: dns-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: bindy
  policyTypes:
  - Egress
  egress: []
EOF

Step 2.2: Preserve Evidence

# Save pod logs
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --all-containers > /tmp/forensics/controller-logs.txt

# Save compromised image for analysis
docker pull ghcr.io/firestoned/bindy:compromised-tag
docker save ghcr.io/firestoned/bindy:compromised-tag > /tmp/forensics/compromised-image.tar

# Scan for malware
trivy image ghcr.io/firestoned/bindy:compromised-tag --scanners vuln,secret,misconfig

Step 2.3: Rotate All Credentials

# Rotate RNDC keys
# See P4: RNDC Key Compromise

# Rotate ServiceAccount tokens (if controller potentially stole them)
kubectl delete secret -n dns-system $(kubectl get secrets -n dns-system | grep bindy-token | awk '{print $1}')
kubectl rollout restart deploy/bindy -n dns-system  # Will generate new token

Phase 3: Eradication (T+2 hours to T+8 hours)

Step 3.1: Root Cause Analysis

# Identify how malicious code was introduced:
# - Compromised developer account?
# - Malicious dependency in Cargo.toml?
# - Compromised CI/CD pipeline?
# - Insider threat?

# Check Git history for unauthorized commits
git log --all --show-signature

# Check CI/CD logs for anomalies
# GitHub Actions → Workflow runs → Check for unusual activity

# Check dependency sources
cargo tree | grep -v "crates.io"
# Expected: All dependencies from crates.io (no git dependencies)

Step 3.2: Clean Git History (if malicious commit)

# Identify malicious commit
git log --all --oneline | grep "suspicious"

# Revert malicious commit
git revert <malicious-commit-sha>

# Force push (if malicious code not yet merged to main)
git push --force origin feature-branch

# If malicious code merged to main → Contact GitHub Security
# Request help with incident response and forensics

Step 3.3: Rebuild from Clean Source

# Checkout known good commit (before compromise)
git checkout <last-known-good-commit>

# Rebuild binaries
cargo build --release

# Rebuild container image
docker build -t ghcr.io/firestoned/bindy:clean-$(date +%s) .

# Scan for vulnerabilities
cargo audit
trivy image ghcr.io/firestoned/bindy:clean-$(date +%s)

# Expected: All clean

# Push to registry
docker push ghcr.io/firestoned/bindy:clean-$(date +%s)

Phase 4: Recovery (T+8 hours to T+24 hours)

Step 4.1: Deploy Clean Controller

# Update deployment manifest
kubectl set image deploy/bindy -n dns-system \
  bindy=ghcr.io/firestoned/bindy:clean-$(date +%s)

# Remove quarantine network policy
kubectl delete networkpolicy bindy-quarantine -n dns-system

# Verify health
kubectl get pods -n dns-system -l app.kubernetes.io/name=bindy
kubectl logs -n dns-system -l app.kubernetes.io/name=bindy --tail=100

Step 4.2: Verify Service Integrity

# Test DNS resolution
dig @<bind9-ip> example.com

# Verify all zones correct
kubectl get dnszones --all-namespaces -o yaml | diff - /path/to/gitops/dnszones/

# Expected: No drift

Phase 5: Post-Incident (T+24 hours to T+1 week)

Step 5.1: Implement Supply Chain Security

# Enable Dependabot security updates
# .github/dependabot.yml:
version: 2
updates:
  - package-ecosystem: "cargo"
    directory: "/"
    schedule:
      interval: "daily"
    open-pull-requests-limit: 10

# Pin dependencies by hash (Cargo.lock already does this)
# Verify Cargo.lock is committed to Git

# Implement image signing verification
# Add admission controller (Kyverno, OPA Gatekeeper) to verify image signatures before deployment

Step 5.2: Implement Code Review Enhancements

# Require 2+ reviewers for all PRs (already implemented)
# Add CODEOWNERS for sensitive files:
# .github/CODEOWNERS:
/Cargo.toml @security-team
/Cargo.lock @security-team
/Dockerfile @security-team
/.github/workflows/ @security-team

Step 5.3: Notify Stakeholders

Users: Email notification about supply chain incident
Regulators: Report to SOX/PCI-DSS auditors (security incident)
GitHub Security: Report compromised dependency or account

Step 5.4: Update Documentation

Document supply chain incident in threat model
Update supply chain security controls in SECURITY.md
Add supply chain attack scenarios to threat model

Success Criteria

✅ Compromised component identified within 30 minutes
✅ Malicious code removed from Git history
✅ Clean controller deployed within 24 hours
✅ All credentials rotated
✅ Supply chain security improvements implemented
✅ Stakeholders notified and incident documented

Post-Incident Activities

Post-Incident Review (PIR) Template

Incident ID: INC-YYYY-MM-DD-XXXX Severity: 🔴 / 🟠 / 🟡 / 🔵 Incident Commander: [Name] Date: [YYYY-MM-DD] Duration: [Detection to resolution]

Summary

[1-2 paragraph summary of incident]

Timeline

Time	Event	Action Taken
T+0	[Detection event]	[Action]
T+15min	[Analysis]	[Action]
T+1hr	[Containment]	[Action]
T+4hr	[Eradication]	[Action]
T+24hr	[Recovery]	[Action]

Root Cause

[Detailed root cause analysis]

What Went Well ✅

[Detection was fast]
[Playbook was clear]
[Team communication was effective]

What Could Improve ❌

[Monitoring gaps]
[Playbook outdated]
[Slow escalation]

Action Items

Action	Owner	Due Date	Status
[Implement network policies]	Platform Team	2025-01-15	🔄 In Progress
[Add monitoring alerts]	SRE Team	2025-01-10	✅ Complete
[Update playbook]	Security Team	2025-01-05	✅ Complete

Metrics

MTTD (Mean Time To Detect): [X] minutes
MTTR (Mean Time To Remediate): [X] hours
SLA Met: ✅ Yes / ❌ No
Downtime: [X] minutes
Customers Impacted: [N]

References

Last Updated: 2025-12-17 Next Review: 2025-03-17 (Quarterly) Approved By: Security Team

Vulnerability Management Policy

Version: 1.0 Last Updated: 2025-12-17 Owner: Security Team Compliance: PCI-DSS 6.2, SOX 404, Basel III Cyber Risk

Overview
Scope
Vulnerability Severity Levels
Remediation SLAs
Scanning Process
Remediation Process
Exception Process
Reporting and Metrics
Roles and Responsibilities
Compliance Requirements

Overview

This document defines the vulnerability management policy for the Bindy DNS Controller project. The policy ensures that security vulnerabilities in dependencies, container images, and source code are identified, tracked, and remediated in a timely manner to maintain compliance with PCI-DSS, SOX, and Basel III requirements.

Objectives

Identify vulnerabilities in all software components before deployment
Remediate vulnerabilities within defined SLAs based on severity
Track and report vulnerability metrics for compliance audits
Prevent deployment of code with CRITICAL/HIGH vulnerabilities
Maintain audit trail of vulnerability management activities

Scope

This policy applies to:

Rust dependencies (direct and transitive) listed in Cargo.lock
Container base images (Debian, Alpine, etc.)
Container runtime dependencies (libraries, binaries)
Development dependencies used in CI/CD pipelines
Third-party libraries and tools

Out of Scope

Kubernetes cluster vulnerabilities (managed by platform team)
Infrastructure vulnerabilities (managed by operations team)
Application logic vulnerabilities (covered by code review process)

Vulnerability Severity Levels

Vulnerabilities are classified using the Common Vulnerability Scoring System (CVSS v3) and mapped to severity levels:

🔴 CRITICAL (CVSS 9.0-10.0)

Definition: Vulnerabilities that can be exploited remotely without authentication and lead to:

Remote code execution (RCE)
Complete system compromise
Data exfiltration of sensitive information
Denial of service affecting multiple systems

Examples:

Unauthenticated RCE in web server
SQL injection with admin access
Memory corruption leading to arbitrary code execution

SLA: 24 hours

🟠 HIGH (CVSS 7.0-8.9)

Definition: Vulnerabilities that can be exploited with limited user interaction or authentication and lead to:

Privilege escalation
Unauthorized data access
Significant denial of service
Bypass of authentication/authorization controls

Examples:

Authenticated RCE
Cross-site scripting (XSS) with session hijacking
Path traversal allowing file read/write
Insecure deserialization

SLA: 7 days

🟡 MEDIUM (CVSS 4.0-6.9)

Definition: Vulnerabilities that require significant user interaction or specific conditions and lead to:

Limited information disclosure
Localized denial of service
Minor authorization bypass
Reduced system functionality

Examples:

Information disclosure (non-sensitive data)
CSRF with limited impact
Reflected XSS
Resource exhaustion (single process)

SLA: 30 days

🔵 LOW (CVSS 0.1-3.9)

Definition: Vulnerabilities with minimal impact that require significant preconditions:

Cosmetic issues
Minor information disclosure
Difficult-to-exploit conditions
No direct security impact

Examples:

Version disclosure
Clickjacking on non-critical pages
Minor configuration issues

SLA: 90 days or next release

Remediation SLAs

Severity	CVSS Score	Detection to Fix	Approval to Deploy	Exceptions
🔴 CRITICAL	9.0-10.0	24 hours	4 hours	CISO approval required
🟠 HIGH	7.0-8.9	7 days	1 business day	Security lead approval
🟡 MEDIUM	4.0-6.9	30 days	Next sprint	Team lead approval
🔵 LOW	0.1-3.9	90 days	Next release	Auto-approved

SLA Clock

Starts: When vulnerability is first detected by automated scan
Pauses: When risk acceptance or exception is granted
Stops: When patch is deployed to production OR exception is approved

SLA Escalation

If SLA is at risk of being missed:

T-50%: Notification to team lead
T-80%: Notification to security team
T-100%: Escalation to CISO and incident response team

Scanning Process

Automated Scanning

1. Continuous Integration (CI) Scanning

Frequency: Every PR and commit to main branch

Tools:

cargo audit for Rust dependencies
Trivy for container images

Process:

PR is opened or updated
CI workflow runs security scans
If CRITICAL/HIGH vulnerabilities found:
- CI fails
- PR is blocked from merging
- GitHub issue is created automatically
Developer must remediate before merge

Workflow: .github/workflows/pr.yaml

2. Scheduled Scanning

Frequency: Daily at 00:00 UTC

Tools:

cargo audit for dependencies
Trivy for published container images

Process:

Scan runs automatically via GitHub Actions
Results are uploaded to GitHub Security tab
If vulnerabilities found:
- GitHub issue is created with details
- Security team is notified
Vulnerabilities are tracked until remediation

Workflow: .github/workflows/security-scan.yaml

3. Release Scanning

Frequency: Every release tag

Tools:

cargo audit for final dependency snapshot
Trivy for release container image

Process:

Release is tagged
Security scans run before deployment
If CRITICAL/HIGH vulnerabilities found:
- Release fails
- Issue is created for emergency fix
Release proceeds only if all scans pass

Workflow: .github/workflows/release.yaml

Manual Scanning

Developers should run scans locally before committing:

# Scan Rust dependencies
cargo audit

# Scan container image
trivy image ghcr.io/firestoned/bindy:latest

Remediation Process

Step 1: Triage (Within 4 hours for CRITICAL, 24 hours for HIGH)

Verify vulnerability applies to Bindy:
- Check if vulnerable code path is used
- Verify affected version matches
- Assess exploitability in Bindy’s context
Assess impact:
- What data/systems are at risk?
- What is the attack vector?
- Is there a known exploit?
Determine remediation approach:
- Update dependency to patched version
- Apply workaround/mitigation
- Accept risk (if low impact)

Step 2: Remediation (Within SLA)

Option A: Update Dependency

# Update single dependency
cargo update -p <package-name>

# Verify fix
cargo audit

# Test
cargo test

Option B: Upgrade Major Version

# Update Cargo.toml
vim Cargo.toml  # Change version constraint

# Update lockfile
cargo update

# Test for breaking changes
cargo test

Option C: Apply Workaround

If no patch is available:

Disable vulnerable feature flag
Implement input validation
Add runtime checks
Document in SECURITY.md

Option D: Request Exception (See Exception Process)

Step 3: Verification

Run cargo audit to confirm vulnerability is resolved
Run cargo test to ensure no regressions
Run integration tests
Document fix in PR description

Step 4: Deployment

Create PR with fix
PR passes all CI checks (including security scans)
Code review and approval
Merge to main
Deploy to production
Close GitHub issue

Step 5: Post-Deployment

Verify vulnerability is resolved in production
Update metrics dashboard
Document lessons learned
Update runbooks if needed

Exception Process

When to Request an Exception

No patch available and vulnerability has low exploitability
Patch introduces breaking changes requiring extended migration
Vulnerability does not apply to Bindy’s use case
Compensating controls mitigate the risk

Exception Request Process

Create exception request (GitHub issue or security ticket):
- Vulnerability ID (CVE, RUSTSEC-ID)
- Severity and CVSS score
- Justification for exception
- Compensating controls
- Expiration date (max 90 days)
Approval required:
- CRITICAL: CISO approval
- HIGH: Security lead approval
- MEDIUM: Team lead approval
- LOW: Auto-approved

Document in SECURITY.md:

## Known Vulnerabilities (Risk Accepted)

### CVE-2024-XXXXX - <Package Name>
- **Severity:** HIGH
- **Affected Version:** 1.2.3
- **Status:** Risk Accepted
- **Justification:** Vulnerability requires local file system access, which is not available in Kubernetes pod security context.
- **Compensating Controls:** Pod security policy enforces readOnlyRootFilesystem=true
- **Expiration:** 2025-03-01
- **Approved By:** Jane Doe (Security Lead)
- **Date:** 2025-01-15

Review exceptions monthly:
- Check if patch is now available
- Verify compensating controls are still effective
- Renew or remediate before expiration

Reporting and Metrics

Weekly Report

Recipients: Development team, Security team

Contents:

New vulnerabilities detected
Vulnerabilities remediated
Open vulnerabilities by severity
SLA compliance percentage
Aging vulnerabilities (open >30 days)

Source: GitHub Security tab + automated report workflow

Monthly Report

Recipients: Management, Compliance team

Contents:

Vulnerability trends (month-over-month)
Mean time to remediate (MTTR) by severity
SLA compliance rate
Exception requests and approvals
Top 5 vulnerable dependencies
Compliance attestation

Source: Security metrics dashboard

Quarterly Report

Recipients: Executive team, Audit team

Contents:

Vulnerability management effectiveness
Policy compliance audit results
Risk acceptance report
Remediation process improvements
Compliance attestation (PCI-DSS, SOX, Basel III)

Source: Compliance reporting system

Key Metrics

Mean Time to Detect (MTTD): Time from CVE disclosure to detection in Bindy
- Target: <24 hours
Mean Time to Remediate (MTTR):
- CRITICAL: <24 hours
- HIGH: <7 days
- MEDIUM: <30 days
SLA Compliance Rate: Percentage of vulnerabilities remediated within SLA
- Target: >95%
Vulnerability Backlog: Open vulnerabilities by severity
- Target: Zero CRITICAL, <5 HIGH
Scan Coverage: Percentage of releases scanned
- Target: 100%

Roles and Responsibilities

Development Team

Run local security scans before committing
Remediate vulnerabilities assigned to them
Create PRs with security fixes
Test fixes for regressions
Document security changes in CHANGELOG

Security Team

Monitor daily scan results
Triage and assign vulnerabilities
Approve risk exceptions
Conduct weekly vulnerability reviews
Maintain this policy document
Report metrics to management

DevOps/SRE Team

Maintain CI/CD scanning infrastructure
Deploy security patches to production
Monitor for new container base image vulnerabilities
Coordinate emergency patching

Compliance Team

Review quarterly vulnerability reports
Validate SLA compliance for audits
Maintain audit trail documentation
Coordinate with external auditors

Compliance Requirements

PCI-DSS 6.2

Requirement: Protect all system components from known vulnerabilities by installing applicable security patches/updates.

Implementation:

Automated vulnerability scanning (cargo audit + Trivy)
Patch within SLA (CRITICAL: 24h, HIGH: 7d)
Audit trail of remediation activities
Quarterly vulnerability reports

Evidence:

GitHub Actions scan logs
Security dashboard showing zero CRITICAL vulnerabilities
CHANGELOG entries documenting patches
Exception approval records

SOX 404 - IT General Controls

Requirement: IT systems must have controls to identify and remediate security vulnerabilities.

Implementation:

Documented vulnerability management policy (this document)
Automated scanning in CI/CD pipeline
SLA-based remediation tracking
Monthly compliance reports

Evidence:

This policy document
CI/CD workflow configurations
GitHub issues tracking remediation
Monthly vulnerability management reports

Basel III - Operational/Cyber Risk

Requirement: Banks must manage cyber risk through preventive controls.

Implementation:

Preventive control: Block deployment of vulnerable code (CI gate)
Detective control: Daily scheduled scans
Corrective control: SLA-based remediation process
Risk acceptance: Exception process with approvals

Evidence:

Failed CI builds due to vulnerabilities
Scheduled scan results
Remediation SLA metrics
Exception approval documentation

References

Policy Review

This policy is reviewed and updated:

Quarterly: By security team
Annually: By compliance team
Ad-hoc: When compliance requirements change

Last Review: 2025-12-17 Next Review: 2025-03-17 Approved By: Security Team

Build Reproducibility Verification

Status: ✅ Implemented Compliance: SLSA Level 3, SOX 404 (Supply Chain), PCI-DSS 6.4.6 (Code Review) Last Updated: 2025-12-18 Owner: Security Team

Overview
SLSA Level 3 Requirements
Build Reproducibility Verification
Sources of Non-Determinism
Verification Process
Container Image Reproducibility
Continuous Verification
Troubleshooting

Overview

Build reproducibility (also called “deterministic builds” or “reproducible builds”) means that building the same source code twice produces bit-for-bit identical binaries. This is critical for:

Supply Chain Security: Verify released binaries match source code (detect tampering)
SLSA Level 3 Compliance: Required for software supply chain integrity
SOX 404 Compliance: Ensures change management controls are effective
Incident Response: Verify binaries in production match known-good builds

Why Reproducibility Matters

Attack Scenario (Without Reproducibility):

Attacker compromises CI/CD pipeline or build server
Injects malicious code during build process (e.g., backdoor in binary)
Source code in Git is clean, but distributed binary contains malware
Users cannot verify if binary matches source code

Defense (With Reproducibility):

Independent party rebuilds from source code
Compares hash of rebuilt binary with released binary
If hashes match → binary is authentic ✅
If hashes differ → binary was tampered with 🚨

Current Status

Bindy’s build process is mostly reproducible with the following exceptions:

Build Artifact	Reproducible?	Status
Rust binary (`target/release/bindy`)	✅ YES	Deterministic with Cargo.lock pinned
Container image (Chainguard)	⚠️ PARTIAL	Base image updates break reproducibility
Container image (Distroless)	⚠️ PARTIAL	Base image updates break reproducibility
CRD YAML files	✅ YES	Generated from Rust types (deterministic)
SBOM (Software Bill of Materials)	✅ YES	Generated from Cargo.lock (deterministic)

Goal: Achieve 100% reproducibility by pinning base image digests and using reproducible timestamps.

SLSA Level 3 Requirements

SLSA (Supply Chain Levels for Software Artifacts) Level 3 requires:

SLSA Requirement	Bindy Implementation	Status
Build provenance	✅ Signed commits, SBOM, container attestation	✅ Complete
Source integrity	✅ GPG/SSH signed commits, branch protection	✅ Complete
Build integrity	✅ Reproducible builds (this document)	✅ Complete
Hermetic builds	⚠️ Docker builds use network (cargo fetch)	⚠️ Partial
Build as code	✅ Dockerfile and Makefile in version control	✅ Complete
Verification	✅ Automated reproducibility checks in CI	✅ Complete

SLSA Level 3 Build Requirements

Reproducible: Same source + same toolchain = same binary
Hermetic: Build process has no network access (all deps pre-fetched)
Isolated: Build cannot access secrets or external state
Auditable: Build process fully documented and verifiable

Bindy’s Approach:

✅ Reproducible: Cargo.lock pins all dependencies, Dockerfile uses pinned base images
⚠️ Hermetic: Docker build uses network (acceptable for SLSA Level 2, working toward Level 3)
✅ Isolated: CI/CD builds in ephemeral containers, no persistent state
✅ Auditable: Build process in Makefile, Dockerfile, and GitHub Actions workflows

Build Reproducibility Verification

Prerequisites

To verify build reproducibility, you need:

Same source code: Exact commit hash (e.g., git checkout v0.1.0)
Same toolchain: Same Rust version (e.g., rustc 1.91.0)
Same dependencies: Same Cargo.lock (committed to Git)
Same build flags: Same optimization level, target triple, features

Step 1: Rebuild from Source

# Clone the repository
git clone https://github.com/firestoned/bindy.git
cd bindy

# Check out the exact release tag
git checkout v0.1.0

# Verify commit signature
git verify-commit v0.1.0

# Verify toolchain version matches release
rustc --version
# Expected: rustc 1.91.0 (stable 2024-10-17)

# Build release binary
cargo build --release --locked

# Calculate SHA-256 hash of binary
sha256sum target/release/bindy

Example Output:

abc123def456789... target/release/bindy

Step 2: Compare with Released Binary

# Download released binary from GitHub Releases
curl -LO https://github.com/firestoned/bindy/releases/download/v0.1.0/bindy-linux-amd64

# Calculate SHA-256 hash of released binary
sha256sum bindy-linux-amd64

Expected Output:

abc123def456789... bindy-linux-amd64

Verification:

✅ PASS - Hashes match → Binary is authentic and reproducible
🚨 FAIL - Hashes differ → Binary may be tampered or build is non-deterministic

Step 3: Investigate Hash Mismatch

If hashes differ, check the following:

# 1. Verify Rust toolchain version
rustc --version
cargo --version

# 2. Verify Cargo.lock is identical
git diff v0.1.0 -- Cargo.lock

# 3. Verify build flags
cargo build --release --locked --verbose | grep "Running.*rustc"

# 4. Check for timestamp differences
objdump -s -j .comment target/release/bindy

Common Causes of Non-Determinism:

Different Rust toolchain version
Modified Cargo.lock (dependency version mismatch)
Different build flags or features
Embedded timestamps in binary (see Sources of Non-Determinism)

Sources of Non-Determinism

1. Timestamps

Problem: Build timestamps embedded in binaries make them non-reproducible.

Sources in Rust:

env!("CARGO_PKG_VERSION") → OK (from Cargo.toml, deterministic)
env!("BUILD_DATE") → ❌ NON-DETERMINISTIC (changes every build)
File modification times (mtime) → ❌ NON-DETERMINISTIC

Fix:

#![allow(unused)]
fn main() {
// ❌ BAD - Embeds build timestamp
const BUILD_DATE: &str = env!("BUILD_DATE");

// ✅ GOOD - Use Git commit timestamp (deterministic)
const BUILD_DATE: &str = env!("VERGEN_GIT_COMMIT_TIMESTAMP");
}

Using vergen for Deterministic Build Info:

Add to Cargo.toml:

[build-dependencies]
vergen = { version = "8", features = ["git", "gitcl"] }

Create build.rs:

use vergen::EmitBuilder;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    EmitBuilder::builder()
        .git_commit_timestamp()  // Use Git commit timestamp (deterministic)
        .git_sha(false)          // Short Git SHA (deterministic)
        .emit()?;
    Ok(())
}

Use in main.rs:

#![allow(unused)]
fn main() {
const BUILD_DATE: &str = env!("VERGEN_GIT_COMMIT_TIMESTAMP");
const GIT_SHA: &str = env!("VERGEN_GIT_SHA");

println!("Bindy {} ({})", env!("CARGO_PKG_VERSION"), GIT_SHA);
println!("Built: {}", BUILD_DATE);
}

Why This Works:

Git commit timestamp is fixed for a given commit (never changes)
Independent builds of the same commit will use the same timestamp
Verifiable by anyone with access to the Git repository

2. Filesystem Order

Problem: Reading files in directory order is non-deterministic (depends on filesystem).

Example:

#![allow(unused)]
fn main() {
// ❌ BAD - Directory order is non-deterministic
for entry in std::fs::read_dir("zones")? {
    let file = entry?.path();
    process_zone(file);
}

// ✅ GOOD - Sort files before processing
let mut files: Vec<_> = std::fs::read_dir("zones")?
    .collect::<Result<_, _>>()?;
files.sort_by_key(|e| e.path());
for entry in files {
    process_zone(entry.path());
}
}

3. HashMap Iteration Order

Problem: Rust HashMap iteration order is randomized for security (hash DoS protection).

Example:

#![allow(unused)]
fn main() {
use std::collections::HashMap;

// ❌ BAD - HashMap iteration order is non-deterministic
let mut zones = HashMap::new();
zones.insert("example.com", "10.0.0.1");
zones.insert("test.com", "10.0.0.2");

for (zone, ip) in &zones {
    println!("{} -> {}", zone, ip);  // Order is random!
}

// ✅ GOOD - Use BTreeMap for deterministic iteration
use std::collections::BTreeMap;

let mut zones = BTreeMap::new();
zones.insert("example.com", "10.0.0.1");
zones.insert("test.com", "10.0.0.2");

for (zone, ip) in &zones {
    println!("{} -> {}", zone, ip);  // Sorted order (deterministic)
}
}

When This Matters:

Generating configuration files (BIND9 named.conf)
Serializing data to JSON/YAML
Logging or printing debug output that’s included in build artifacts

4. Parallelism and Race Conditions

Problem: Parallel builds may produce different results if intermediate files are generated in different orders.

Example:

#![allow(unused)]
fn main() {
// ❌ BAD - Parallel iterators may produce non-deterministic output
use rayon::prelude::*;

let output = zones.par_iter()
    .map(|zone| generate_config(zone))
    .collect::<Vec<_>>()
    .join("\n");  // Order depends on which thread finishes first!

// ✅ GOOD - Sort after parallel processing
let mut output = zones.par_iter()
    .map(|zone| generate_config(zone))
    .collect::<Vec<_>>();
output.sort();  // Deterministic order
let output = output.join("\n");
}

5. Base Image Updates (Container Images)

Problem: Docker base images update frequently, breaking reproducibility.

Example:

# ❌ BAD - Uses latest version (non-reproducible)
FROM cgr.dev/chainguard/static:latest

# ✅ GOOD - Pin to specific digest
FROM cgr.dev/chainguard/static:latest@sha256:abc123def456...

How to Pin Base Image Digest:

# Get current digest
docker pull cgr.dev/chainguard/static:latest
docker inspect cgr.dev/chainguard/static:latest | jq -r '.[0].RepoDigests[0]'
# Output: cgr.dev/chainguard/static:latest@sha256:abc123def456...

# Update Dockerfile
sed -i 's|cgr.dev/chainguard/static:latest|cgr.dev/chainguard/static:latest@sha256:abc123def456...|' docker/Dockerfile.chainguard

Trade-Off:

✅ Pro: Reproducible builds (same base image every time)
⚠️ Con: No automatic security updates (must manually update digest)

Recommended Approach:

Pin digest for releases (v0.1.0, v0.2.0, etc.) → Reproducibility
Use latest for development builds → Automatic security updates
Update base image digest monthly or after CVE disclosures

Verification Process

Automated Verification (CI/CD)

Goal: Rebuild every release and verify the binary hash matches the released artifact.

GitHub Actions Workflow:

# .github/workflows/verify-reproducibility.yaml
name: Verify Build Reproducibility

on:
  release:
    types: [published]
  workflow_dispatch:
    inputs:
      tag:
        description: 'Git tag to verify (e.g., v0.1.0)'
        required: true

jobs:
  verify-reproducibility:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout source code
        uses: actions/checkout@v4
        with:
          ref: ${{ github.event.inputs.tag || github.event.release.tag_name }}

      - name: Install Rust toolchain
        uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: 1.91.0  # Match release toolchain

      - name: Rebuild binary
        run: cargo build --release --locked

      - name: Calculate hash of rebuilt binary
        id: rebuilt-hash
        run: |
          HASH=$(sha256sum target/release/bindy | awk '{print $1}')
          echo "hash=$HASH" >> $GITHUB_OUTPUT
          echo "Rebuilt binary hash: $HASH"

      - name: Download released binary
        run: |
          TAG=${{ github.event.inputs.tag || github.event.release.tag_name }}
          curl -LO https://github.com/firestoned/bindy/releases/download/$TAG/bindy-linux-amd64

      - name: Calculate hash of released binary
        id: released-hash
        run: |
          HASH=$(sha256sum bindy-linux-amd64 | awk '{print $1}')
          echo "hash=$HASH" >> $GITHUB_OUTPUT
          echo "Released binary hash: $HASH"

      - name: Compare hashes
        run: |
          REBUILT="${{ steps.rebuilt-hash.outputs.hash }}"
          RELEASED="${{ steps.released-hash.outputs.hash }}"

          if [ "$REBUILT" == "$RELEASED" ]; then
            echo "✅ PASS: Hashes match - Build is reproducible"
            exit 0
          else
            echo "🚨 FAIL: Hashes differ - Build is NOT reproducible"
            echo "Rebuilt:  $REBUILT"
            echo "Released: $RELEASED"
            exit 1
          fi

      - name: Upload verification report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: reproducibility-report
          path: |
            target/release/bindy
            bindy-linux-amd64

When to Run:

✅ Automatically: After every release (GitHub Actions release event)
✅ Manually: On-demand for any Git tag (workflow_dispatch)
✅ Scheduled: Monthly verification of latest release

Manual Verification (External Auditors)

Goal: Allow external auditors to independently verify builds without access to CI/CD.

Verification Script (scripts/verify-build.sh):

#!/usr/bin/env bash
# Verify build reproducibility for a Bindy release
#
# Usage:
#   ./scripts/verify-build.sh v0.1.0
#
# Requirements:
#   - Git
#   - Rust toolchain (rustc 1.91.0)
#   - curl, sha256sum

set -euo pipefail

TAG="${1:-}"
if [ -z "$TAG" ]; then
  echo "Usage: $0 <git-tag>"
  echo "Example: $0 v0.1.0"
  exit 1
fi

echo "============================================"
echo "Verifying build reproducibility for $TAG"
echo "============================================"

# 1. Check out the source code
echo ""
echo "[1/6] Checking out source code..."
git fetch --tags
git checkout "$TAG"
git verify-commit "$TAG" || {
  echo "⚠️  WARNING: Commit signature verification failed"
}

# 2. Verify Rust toolchain version
echo ""
echo "[2/6] Verifying Rust toolchain..."
EXPECTED_RUSTC="rustc 1.91.0"
ACTUAL_RUSTC=$(rustc --version)
if [[ "$ACTUAL_RUSTC" != "$EXPECTED_RUSTC"* ]]; then
  echo "⚠️  WARNING: Rust version mismatch"
  echo "   Expected: $EXPECTED_RUSTC"
  echo "   Actual:   $ACTUAL_RUSTC"
  echo "   Continuing anyway..."
fi

# 3. Rebuild binary
echo ""
echo "[3/6] Building release binary..."
cargo build --release --locked

# 4. Calculate hash of rebuilt binary
echo ""
echo "[4/6] Calculating hash of rebuilt binary..."
REBUILT_HASH=$(sha256sum target/release/bindy | awk '{print $1}')
echo "   Rebuilt hash: $REBUILT_HASH"

# 5. Download released binary
echo ""
echo "[5/6] Downloading released binary..."
RELEASE_URL="https://github.com/firestoned/bindy/releases/download/$TAG/bindy-linux-amd64"
curl -sL -o bindy-released "$RELEASE_URL"

# 6. Calculate hash of released binary
echo ""
echo "[6/6] Calculating hash of released binary..."
RELEASED_HASH=$(sha256sum bindy-released | awk '{print $1}')
echo "   Released hash: $RELEASED_HASH"

# Compare hashes
echo ""
echo "============================================"
echo "VERIFICATION RESULT"
echo "============================================"
if [ "$REBUILT_HASH" == "$RELEASED_HASH" ]; then
  echo "✅ PASS: Hashes match"
  echo ""
  echo "The released binary is reproducible and matches the source code."
  echo "This confirms the binary was built from the tagged commit without tampering."
  exit 0
else
  echo "🚨 FAIL: Hashes differ"
  echo ""
  echo "Rebuilt:  $REBUILT_HASH"
  echo "Released: $RELEASED_HASH"
  echo ""
  echo "The released binary does NOT match the rebuilt binary."
  echo "Possible causes:"
  echo "  - Different Rust toolchain version"
  echo "  - Non-deterministic build process"
  echo "  - Binary tampering (SECURITY INCIDENT)"
  echo ""
  echo "Next steps:"
  echo "  1. Verify Rust toolchain: rustc --version"
  echo "  2. Check build.rs for timestamps or randomness"
  echo "  3. Contact security@firestoned.io if tampering suspected"
  exit 1
fi

Make executable:

chmod +x scripts/verify-build.sh

Usage:

./scripts/verify-build.sh v0.1.0

Container Image Reproducibility

Challenge: Docker Layers are Non-Deterministic

Docker images are harder to reproduce than binaries because:

Base image updates (even with same tag, digest changes)
File timestamps in layers (mtime)
Layer order affects final hash
Docker build cache affects output

Solution: Use `SOURCE_DATE_EPOCH` for Reproducible Timestamps

Dockerfile Best Practices:

# docker/Dockerfile.chainguard
# Pin base image digest for reproducibility
ARG BASE_IMAGE_DIGEST=sha256:abc123def456...
FROM cgr.dev/chainguard/static:latest@${BASE_IMAGE_DIGEST}

# Use SOURCE_DATE_EPOCH for reproducible timestamps
ARG SOURCE_DATE_EPOCH
ENV SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH}

# Copy binary (built with same SOURCE_DATE_EPOCH)
COPY --chmod=755 target/release/bindy /usr/local/bin/bindy

USER nonroot:nonroot
ENTRYPOINT ["/usr/local/bin/bindy"]

Build with Reproducible Timestamp:

# Get Git commit timestamp (deterministic)
export SOURCE_DATE_EPOCH=$(git log -1 --format=%ct)

# Build container image
docker build \
  --build-arg SOURCE_DATE_EPOCH=$SOURCE_DATE_EPOCH \
  --build-arg BASE_IMAGE_DIGEST=sha256:abc123def456... \
  -t ghcr.io/firestoned/bindy:v0.1.0 \
  -f docker/Dockerfile.chainguard \
  .

Verify Image Reproducibility:

# Build image twice
docker build ... -t bindy:build1
docker build ... -t bindy:build2

# Compare image digests
docker inspect bindy:build1 | jq -r '.[0].Id'
docker inspect bindy:build2 | jq -r '.[0].Id'

# If digests match → Reproducible ✅
# If digests differ → Non-deterministic 🚨

Multi-Stage Build for Reproducibility

Recommended Pattern:

# Stage 1: Build binary (reproducible)
FROM rust:1.91-alpine AS builder
WORKDIR /build

# Copy dependency manifests
COPY Cargo.toml Cargo.lock ./

# Pre-fetch dependencies (layer cached, reproducible)
RUN cargo fetch --locked

# Copy source code
COPY src/ ./src/
COPY build.rs ./

# Build binary with reproducible timestamp
ARG SOURCE_DATE_EPOCH
ENV SOURCE_DATE_EPOCH=${SOURCE_DATE_EPOCH}
RUN cargo build --release --locked --offline

# Stage 2: Runtime image (reproducible with pinned base)
ARG BASE_IMAGE_DIGEST=sha256:abc123def456...
FROM cgr.dev/chainguard/static:latest@${BASE_IMAGE_DIGEST}

# Copy binary from builder
COPY --from=builder --chmod=755 /build/target/release/bindy /usr/local/bin/bindy

USER nonroot:nonroot
ENTRYPOINT ["/usr/local/bin/bindy"]

Why This Works:

Layer 1 (dependencies): Deterministic (Cargo.lock pinned)
Layer 2 (source code): Deterministic (Git commit)
Layer 3 (build): Deterministic (SOURCE_DATE_EPOCH)
Layer 4 (runtime): Deterministic (pinned base image digest)

Continuous Verification

Daily Verification Checks

Goal: Catch non-determinism regressions early (before releases).

Scheduled GitHub Actions:

# .github/workflows/reproducibility-check.yaml
name: Reproducibility Check

on:
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM UTC
  push:
    branches:
      - main

jobs:
  build-twice:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: dtolnay/rust-toolchain@stable
        with:
          toolchain: 1.91.0

      # Build 1
      - name: Build binary (attempt 1)
        run: cargo build --release --locked

      - name: Calculate hash (attempt 1)
        id: hash1
        run: |
          HASH=$(sha256sum target/release/bindy | awk '{print $1}')
          echo "hash=$HASH" >> $GITHUB_OUTPUT
          mv target/release/bindy bindy-build1

      # Clean build directory
      - name: Clean build artifacts
        run: cargo clean

      # Build 2
      - name: Build binary (attempt 2)
        run: cargo build --release --locked

      - name: Calculate hash (attempt 2)
        id: hash2
        run: |
          HASH=$(sha256sum target/release/bindy | awk '{print $1}')
          echo "hash=$HASH" >> $GITHUB_OUTPUT
          mv target/release/bindy bindy-build2

      # Compare
      - name: Verify reproducibility
        run: |
          HASH1="${{ steps.hash1.outputs.hash }}"
          HASH2="${{ steps.hash2.outputs.hash }}"

          if [ "$HASH1" == "$HASH2" ]; then
            echo "✅ PASS: Builds are reproducible"
            exit 0
          else
            echo "🚨 FAIL: Builds are NOT reproducible"
            echo "Build 1: $HASH1"
            echo "Build 2: $HASH2"

            # Show differences
            objdump -s bindy-build1 > build1.dump
            objdump -s bindy-build2 > build2.dump
            diff -u build1.dump build2.dump || true

            exit 1
          fi

When to Alert:

✅ Daily check PASS: No action needed
🚨 Daily check FAIL: Alert security team, investigate non-determinism

Troubleshooting

Build Hash Mismatch Debugging

Step 1: Verify Toolchain

# Check Rust version
rustc --version
cargo --version

# Check installed targets
rustup show

# Check default toolchain
rustup default

Expected:

rustc 1.91.0 (stable 2024-10-17)
cargo 1.91.0

Step 2: Compare Build Metadata

# Extract build metadata from binary
strings target/release/bindy | grep -E "(rustc|cargo|VERGEN)"

# Compare with released binary
strings bindy-released | grep -E "(rustc|cargo|VERGEN)"

Look for:

Different Rust version strings
Different Git commit SHAs
Embedded timestamps

Step 3: Disassemble and Diff

# Disassemble both binaries
objdump -d target/release/bindy > rebuilt.asm
objdump -d bindy-released > released.asm

# Diff assembly code
diff -u rebuilt.asm released.asm | head -n 100

Common Patterns:

Timestamp differences in .rodata section
Different symbol addresses (ASLR-related, cosmetic)
Random padding bytes

Step 4: Check for Timestamps

# Search for ISO 8601 timestamps in binary
strings target/release/bindy | grep -E "[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}"

# Search for Unix timestamps
strings target/release/bindy | grep -E "^[0-9]{10}$"

If found: Update source code to use VERGEN_GIT_COMMIT_TIMESTAMP instead of env!("BUILD_DATE")

Container Image Hash Mismatch

Step 1: Verify Base Image Digest

# Get current base image digest
docker pull cgr.dev/chainguard/static:latest
docker inspect cgr.dev/chainguard/static:latest | jq -r '.[0].RepoDigests[0]'

# Compare with Dockerfile
grep "FROM cgr.dev/chainguard/static" docker/Dockerfile.chainguard

If digests differ: Update Dockerfile to pin correct digest

Step 2: Check Layer Timestamps

# Extract image layers
docker save bindy:v0.1.0 | tar -xv

# Check layer timestamps
tar -tvzf <layer-hash>.tar.gz | head -n 20

Look for:

Recent timestamps (should all match SOURCE_DATE_EPOCH)
Different file mtimes between builds

Step 3: Rebuild with Verbose Output

# Rebuild with verbose Docker output
docker build --no-cache --progress=plain \
  --build-arg SOURCE_DATE_EPOCH=$(git log -1 --format=%ct) \
  -t bindy:debug \
  -f docker/Dockerfile.chainguard \
  . 2>&1 | tee build.log

# Compare build logs
diff -u build1.log build2.log

References

Reproducible Builds Project - Best practices and tools
SLSA Framework - Supply Chain Levels for Software Artifacts
vergen Crate - Deterministic build info from Git
Docker SOURCE_DATE_EPOCH - Reproducible timestamps
Rust Reproducible Builds - Cargo.lock and reproducibility
PCI-DSS 6.4.6 - Code Review and Change Management
SOX 404 - IT General Controls (Change Management)

Last Updated: 2025-12-18 Next Review: 2026-03-18 (Quarterly)

Secret Access Audit Trail

Status: ✅ Implemented Compliance: SOX 404 (Access Controls), PCI-DSS 7.1.2 (Least Privilege), Basel III (Cyber Risk) Last Updated: 2025-12-18 Owner: Security Team

Overview
Secret Access Monitoring
Audit Policy Configuration
Audit Queries
Alerting Rules
Compliance Requirements
Incident Response

Overview

This document describes Bindy’s secret access audit trail implementation, which provides:

Comprehensive Logging: All secret access (get, list, watch) is logged via Kubernetes audit logs
Immutable Storage: Audit logs stored in S3 with WORM (Object Lock) for tamper-proof retention
Real-Time Alerting: Prometheus/Alertmanager alerts on anomalous secret access patterns
Compliance Queries: Pre-built queries for SOX 404, PCI-DSS, and Basel III audit reviews
Retention: 7-year retention (SOX 404 requirement) with 90-day active storage (Elasticsearch)

Secrets Covered

Bindy audit logging covers all Kubernetes Secrets in the dns-system namespace:

Secret Name	Purpose	Access Pattern
`rndc-key-*`	RNDC authentication keys for BIND9 control	Controller reads on reconciliation (every 5 minutes)
`tls-cert-*`	TLS certificates for DNS-over-TLS/HTTPS	BIND9 pods read on startup
Custom secrets	User-defined secrets for DNS credentials	Varies by use case

Compliance Mapping

Framework	Requirement	How We Comply
SOX 404	IT General Controls - Access Control	Audit logs show who accessed secrets and when (7-year retention)
PCI-DSS 7.1.2	Restrict access to privileged user IDs	RBAC limits secret access to controller (read-only) + audit trail
PCI-DSS 10.2.1	Audit log all access to cardholder data	Secret access logged with user, timestamp, action, outcome
Basel III	Cyber Risk - Access Monitoring	Real-time alerting on anomalous secret access, quarterly reviews

Secret Access Monitoring

What is Logged

Every secret access operation generates an audit log entry with:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "a4b5c6d7-e8f9-0a1b-2c3d-4e5f6a7b8c9d",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/dns-system/secrets/rndc-key-primary",
  "verb": "get",
  "user": {
    "username": "system:serviceaccount:dns-system:bindy-controller",
    "uid": "abc123",
    "groups": ["system:serviceaccounts", "system:serviceaccounts:dns-system"]
  },
  "sourceIPs": ["10.244.1.15"],
  "userAgent": "bindy/v0.1.0 (linux/amd64) kubernetes/abc123",
  "objectRef": {
    "resource": "secrets",
    "namespace": "dns-system",
    "name": "rndc-key-primary",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "code": 200
  },
  "requestReceivedTimestamp": "2025-12-18T12:34:56.789Z",
  "stageTimestamp": "2025-12-18T12:34:56.790Z"
}

Key Fields for Auditing

Field	Description	Audit Use Case
`user.username`	ServiceAccount or user who accessed the secret	Who accessed the secret
`sourceIPs`	Pod IP or client IP that made the request	Where the request came from
`objectRef.name`	Secret name (e.g., `rndc-key-primary`)	What secret was accessed
`verb`	Action performed (`get`, `list`, `watch`)	How the secret was accessed
`responseStatus.code`	HTTP status code (200 = success, 403 = denied)	Outcome of the access attempt
`requestReceivedTimestamp`	When the request was made	When the access occurred
`userAgent`	Client application (e.g., `bindy/v0.1.0`)	Which application accessed the secret

Audit Policy Configuration

Kubernetes Audit Policy

The audit policy is configured in /etc/kubernetes/audit-policy.yaml on the Kubernetes control plane.

Relevant Section for Secret Access (H-3 Requirement):

apiVersion: audit.k8s.io/v1
kind: Policy
metadata:
  name: bindy-secret-access-audit
rules:
  # ============================================================================
  # H-3: Secret Access Audit Trail
  # ============================================================================

  # Log ALL secret access in dns-system namespace (read operations)
  - level: Metadata
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["secrets"]
    namespaces: ["dns-system"]
    omitStages:
      - "RequestReceived"  # Only log after response is sent

  # Log ALL secret modifications (should be DENIED by RBAC, but log anyway)
  - level: RequestResponse
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: ""
        resources: ["secrets"]
    namespaces: ["dns-system"]
    omitStages:
      - "RequestReceived"

  # Log secret access failures (403 Forbidden)
  # This catches unauthorized access attempts
  - level: Metadata
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
    resources:
      - group: ""
        resources: ["secrets"]
    namespaces: ["dns-system"]
    omitStages:
      - "RequestReceived"

Audit Log Rotation

Audit logs are rotated and forwarded using Fluent Bit:

# /etc/fluent-bit/fluent-bit.conf
[INPUT]
    Name              tail
    Path              /var/log/kubernetes/audit.log
    Parser            json
    Tag               kube.audit
    Refresh_Interval  5
    Mem_Buf_Limit     50MB
    Skip_Long_Lines   On

[FILTER]
    Name    grep
    Match   kube.audit
    Regex   objectRef.resource secrets

[OUTPUT]
    Name                s3
    Match               kube.audit
    bucket              bindy-audit-logs
    region              us-east-1
    store_dir           /var/log/fluent-bit/s3
    total_file_size     100M
    upload_timeout      10m
    use_put_object      On
    s3_key_format       /audit/secrets/%Y/%m/%d/$UUID.json.gz
    compression         gzip

Key Points:

Audit logs filtered to only include secret access (objectRef.resource secrets)
Uploaded to S3 in /audit/secrets/ prefix for easy querying
Compressed with gzip (10:1 compression ratio)
WORM protection via S3 Object Lock (see AUDIT_LOG_RETENTION.md)

Audit Queries

Pre-Built Queries for Compliance Reviews

These queries are designed for use in Elasticsearch (Kibana) or direct S3 queries (Athena).

Q1: All Secret Access by ServiceAccount (Last 90 Days)

Use Case: SOX 404 quarterly access review

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.namespace": "dns-system" } },
        { "range": { "requestReceivedTimestamp": { "gte": "now-90d" } } }
      ]
    }
  },
  "aggs": {
    "by_service_account": {
      "terms": {
        "field": "user.username.keyword",
        "size": 50
      },
      "aggs": {
        "by_secret": {
          "terms": {
            "field": "objectRef.name.keyword",
            "size": 20
          },
          "aggs": {
            "access_count": {
              "value_count": {
                "field": "auditID"
              }
            }
          }
        }
      }
    }
  },
  "size": 0
}

Expected Output:

{
  "aggregations": {
    "by_service_account": {
      "buckets": [
        {
          "key": "system:serviceaccount:dns-system:bindy-controller",
          "doc_count": 25920,
          "by_secret": {
            "buckets": [
              {
                "key": "rndc-key-primary",
                "doc_count": 12960,
                "access_count": { "value": 12960 }
              },
              {
                "key": "rndc-key-secondary-1",
                "doc_count": 6480,
                "access_count": { "value": 6480 }
              }
            ]
          }
        }
      ]
    }
  }
}

Interpretation:

Controller accessed rndc-key-primary 12,960 times in 90 days
Expected: ~144 times/day (reconciliation every 10 minutes = 6 times/hour × 24 hours)
12,960 / 90 days = 144 accesses/day ✅ NORMAL

Q2: Secret Access by Non-Controller ServiceAccounts

Use Case: Detect unauthorized secret access (should be ZERO)

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.namespace": "dns-system" } }
      ],
      "must_not": [
        { "term": { "user.username.keyword": "system:serviceaccount:dns-system:bindy-controller" } }
      ]
    }
  },
  "sort": [
    { "requestReceivedTimestamp": { "order": "desc" } }
  ],
  "size": 100
}

Expected Output: 0 hits (only controller should access secrets)

If non-zero: 🚨 ALERT - Unauthorized secret access detected, trigger incident response (see INCIDENT_RESPONSE.md)

Q3: Failed Secret Access Attempts (403 Forbidden)

Use Case: Detect brute-force attacks or misconfigurations

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.namespace": "dns-system" } },
        { "term": { "responseStatus.code": 403 } }
      ]
    }
  },
  "aggs": {
    "by_user": {
      "terms": {
        "field": "user.username.keyword",
        "size": 50
      },
      "aggs": {
        "by_secret": {
          "terms": {
            "field": "objectRef.name.keyword",
            "size": 20
          }
        }
      }
    }
  },
  "sort": [
    { "requestReceivedTimestamp": { "order": "desc" } }
  ],
  "size": 100
}

Expected Output: Low volume (< 10/day) for misconfigured pods or during upgrades

If high volume (> 100/day): 🚨 ALERT - Potential brute-force attack, investigate source IPs

Q4: Secret Access Outside Business Hours

Use Case: Detect after-hours access (potential insider threat)

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.namespace": "dns-system" } }
      ],
      "should": [
        {
          "range": {
            "requestReceivedTimestamp": {
              "gte": "now/d",
              "lte": "now/d+8h",
              "time_zone": "America/New_York"
            }
          }
        },
        {
          "range": {
            "requestReceivedTimestamp": {
              "gte": "now/d+18h",
              "lte": "now/d+24h",
              "time_zone": "America/New_York"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "aggs": {
    "by_hour": {
      "date_histogram": {
        "field": "requestReceivedTimestamp",
        "calendar_interval": "hour",
        "time_zone": "America/New_York"
      }
    }
  },
  "size": 100
}

Expected Output: Consistent volume (automated reconciliation runs 24/7)

Anomalies:

Sudden spike in after-hours access → 🚨 Investigate source IPs and ServiceAccounts
Human users accessing secrets after hours → 🚨 Verify with change management records

Q5: Specific Secret Access History (e.g., `rndc-key-primary`)

Use Case: Compliance audit - “Show me all access to RNDC key in Q4 2025”

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.name.keyword": "rndc-key-primary" } },
        { "term": { "objectRef.namespace": "dns-system" } },
        {
          "range": {
            "requestReceivedTimestamp": {
              "gte": "2025-10-01T00:00:00Z",
              "lte": "2025-12-31T23:59:59Z"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "access_by_day": {
      "date_histogram": {
        "field": "requestReceivedTimestamp",
        "calendar_interval": "day"
      },
      "aggs": {
        "by_service_account": {
          "terms": {
            "field": "user.username.keyword",
            "size": 10
          }
        }
      }
    }
  },
  "sort": [
    { "requestReceivedTimestamp": { "order": "asc" } }
  ],
  "size": 10000
}

Expected Output: Daily access pattern showing controller accessing key ~144 times/day

Export for Auditors:

# Export to CSV for external auditors
curl -X POST "https://elasticsearch:9200/bindy-audit-*/_search?scroll=5m" \
  -H 'Content-Type: application/json' \
  -d @query-q5.json | \
  jq -r '.hits.hits[]._source | [
    .requestReceivedTimestamp,
    .user.username,
    .objectRef.name,
    .verb,
    .responseStatus.code,
    .sourceIPs[0]
  ] | @csv' > secret-access-q4-2025.csv

Alerting Rules

Prometheus Alerting for Secret Access Anomalies

Prerequisites:

Prometheus configured to scrape audit logs from Elasticsearch
Alertmanager configured for email/Slack/PagerDuty notifications

Alert: Unauthorized Secret Access

# /etc/prometheus/rules/bindy-secret-access.yaml
groups:
  - name: bindy_secret_access
    interval: 1m
    rules:
      # CRITICAL: Non-controller ServiceAccount accessed secrets
      - alert: UnauthorizedSecretAccess
        expr: |
          sum(rate(kubernetes_audit_event_total{
            objectRef_resource="secrets",
            objectRef_namespace="dns-system",
            user_username!~"system:serviceaccount:dns-system:bindy-controller"
          }[5m])) > 0
        for: 1m
        labels:
          severity: critical
          compliance: "SOX-404,PCI-DSS-7.1.2"
        annotations:
          summary: "Unauthorized secret access detected in dns-system namespace"
          description: |
            ServiceAccount {{ $labels.user_username }} accessed secret {{ $labels.objectRef_name }}.
            This violates least privilege RBAC policy (only bindy-controller should access secrets).

            Investigate immediately:
            1. Check source IP: {{ $labels.sourceIP }}
            2. Review audit logs for full context
            3. Verify RBAC policy is applied correctly
            4. Follow incident response: docs/security/INCIDENT_RESPONSE.md#p4
          runbook_url: "https://github.com/firestoned/bindy/blob/main/docs/security/INCIDENT_RESPONSE.md#p4-rndc-key-compromise"

      # HIGH: Excessive secret access (potential compromised controller)
      - alert: ExcessiveSecretAccess
        expr: |
          sum(rate(kubernetes_audit_event_total{
            objectRef_resource="secrets",
            objectRef_namespace="dns-system",
            user_username="system:serviceaccount:dns-system:bindy-controller"
          }[5m])) > 10
        for: 10m
        labels:
          severity: warning
          compliance: "SOX-404"
        annotations:
          summary: "Controller accessing secrets at abnormally high rate"
          description: |
            Bindy controller is accessing secrets at {{ $value }}/sec (expected: ~0.5/sec).
            This may indicate:
            - Reconciliation loop bug (rapid retries)
            - Compromised controller pod
            - Performance issue causing excessive reconciliations

            Actions:
            1. Check controller logs for errors
            2. Verify reconciliation requeue times are correct
            3. Check for BIND9 pod restart loops
          runbook_url: "https://github.com/firestoned/bindy/blob/main/docs/troubleshooting.md"

      # MEDIUM: Failed secret access attempts (brute force detection)
      - alert: FailedSecretAccessAttempts
        expr: |
          sum(rate(kubernetes_audit_event_total{
            objectRef_resource="secrets",
            objectRef_namespace="dns-system",
            responseStatus_code="403"
          }[5m])) > 1
        for: 5m
        labels:
          severity: warning
          compliance: "PCI-DSS-10.2.1"
        annotations:
          summary: "Multiple failed secret access attempts detected"
          description: |
            {{ $value }} failed secret access attempts per second.
            This may indicate:
            - Misconfigured pod trying to access secrets without RBAC
            - Attacker probing for secrets
            - RBAC policy change breaking legitimate access

            Actions:
            1. Review audit logs to identify source ServiceAccount/IP
            2. Verify RBAC policy is correct
            3. Check for recent RBAC changes
          runbook_url: "https://github.com/firestoned/bindy/blob/main/docs/security/SECRET_ACCESS_AUDIT.md#q3-failed-secret-access-attempts-403-forbidden"

Alertmanager Routing

# /etc/alertmanager/config.yaml
route:
  group_by: ['alertname', 'severity']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'security-team'
  routes:
    # CRITICAL alerts go to PagerDuty + Slack
    - match:
        severity: critical
      receiver: 'pagerduty-security'
      continue: true
    - match:
        severity: critical
      receiver: 'slack-security'

receivers:
  - name: 'security-team'
    email_configs:
      - to: 'security@firestoned.io'
        from: 'alertmanager@firestoned.io'
        smarthost: 'smtp.sendgrid.net:587'

  - name: 'pagerduty-security'
    pagerduty_configs:
      - service_key: '<PagerDuty Integration Key>'
        description: '{{ .GroupLabels.alertname }}: {{ .Annotations.summary }}'

  - name: 'slack-security'
    slack_configs:
      - api_url: '<Slack Webhook URL>'
        channel: '#security-alerts'
        title: '🚨 {{ .GroupLabels.alertname }}'
        text: |
          *Severity:* {{ .Labels.severity }}
          *Compliance:* {{ .Labels.compliance }}

          {{ .Annotations.description }}

          *Runbook:* {{ .Annotations.runbook_url }}

Compliance Requirements

SOX 404 - IT General Controls

Control Objective: Ensure only authorized users access sensitive secrets

How We Comply:

SOX 404 Requirement	Bindy Implementation	Evidence
Access logs for all privileged accounts	✅ Kubernetes audit logs capture all secret access	Query Q1 (quarterly review)
Logs retained for 7 years	✅ S3 Glacier with WORM (Object Lock)	AUDIT_LOG_RETENTION.md
Quarterly access reviews	✅ Run Query Q1, review access patterns	Scheduled Kibana report
Separation of duties (no single person can access + modify)	✅ Controller has read-only access (cannot create/update/delete)	RBAC policy verification

Quarterly Review Process:

Week 1 of each quarter (Jan, Apr, Jul, Oct):
- Security team runs Query Q1 (All Secret Access by ServiceAccount)
- Export results to CSV for offline review
- Verify only bindy-controller accessed secrets
Anomaly Investigation:
- If non-controller access detected → Run Query Q2, follow incident response
- If excessive access detected → Run Query Q3, check for reconciliation loop bugs
Document Review:
- Create quarterly access review report (template below)
- File report in docs/compliance/access-reviews/YYYY-QN.md
- Retain for 7 years (SOX requirement)

Quarterly Review Report Template:

# Secret Access Review - Q4 2025

**Reviewer:** [Name]
**Date:** 2025-12-31
**Period:** 2025-10-01 to 2025-12-31 (90 days)

## Summary
- **Total secret access events:** 25,920
- **ServiceAccounts with access:** 1 (bindy-controller)
- **Secrets accessed:** 2 (rndc-key-primary, rndc-key-secondary-1)
- **Unauthorized access:** 0 ✅
- **Failed access attempts:** 12 (misconfigured test pod)

## Findings
- ✅ **PASS** - Only authorized ServiceAccount (bindy-controller) accessed secrets
- ✅ **PASS** - Access frequency matches expected reconciliation rate (~144/day)
- ⚠️ **MINOR** - 12 failed attempts from test pod (fixed on 2025-11-15)

## Actions
- None required - all access authorized and expected

## Approval
- **Reviewed by:** [Security Manager]
- **Approved by:** [CISO]
- **Date:** 2025-12-31

PCI-DSS 7.1.2 - Restrict Access to Privileged User IDs

Requirement: Limit access to system components and cardholder data to only those individuals whose job requires such access.

How We Comply:

PCI-DSS Requirement	Bindy Implementation	Evidence
Least privilege access	✅ Only `bindy-controller` ServiceAccount can read secrets	RBAC policy (`deploy/rbac/`)
No modify/delete permissions	✅ Controller CANNOT create/update/patch/delete secrets	RBAC policy verification script
Audit trail for all access	✅ Kubernetes audit logs capture all secret access	Query Q1, Q5
Regular access reviews	✅ Quarterly reviews using pre-built queries	Quarterly review reports

Annual PCI-DSS Audit Evidence:

Provide auditors with:

RBAC Policy: deploy/rbac/clusterrole.yaml (shows read-only secret access)
RBAC Verification: deploy/rbac/verify-rbac.sh output (proves no modify permissions)
Audit Logs: Query Q5 results for last 365 days (shows all access)
Quarterly Reviews: 4 quarterly review reports (proves regular monitoring)

PCI-DSS 10.2.1 - Audit Logs for Access to Cardholder Data

Requirement: Implement automated audit trails for all system components to reconstruct events.

How We Comply:

PCI-DSS 10.2.1 Requirement	Bindy Implementation	Evidence
User identification	✅ Audit logs include `user.username` (ServiceAccount)	Query results show ServiceAccount
Type of event	✅ Audit logs include `verb` (get, list, watch)	Query results show action
Date and time	✅ Audit logs include `requestReceivedTimestamp` (ISO 8601 UTC)	Query results show timestamp
Success/failure indication	✅ Audit logs include `responseStatus.code` (200, 403, etc.)	Query Q3 shows failed attempts
Origination of event	✅ Audit logs include `sourceIPs` (pod IP)	Query results show source IP
Identity of affected data	✅ Audit logs include `objectRef.name` (secret name)	Query results show secret name

Basel III - Cyber Risk Management

Principle: Banks must have robust cyber risk management frameworks including access monitoring and incident response.

How We Comply:

Basel III Requirement	Bindy Implementation	Evidence
Access monitoring	✅ Real-time Prometheus alerts on unauthorized access	Alerting rules
Incident response	✅ Playbooks for secret compromise (P4)	INCIDENT_RESPONSE.md
Audit trail	✅ Immutable audit logs (S3 WORM)	AUDIT_LOG_RETENTION.md
Quarterly risk reviews	✅ Quarterly secret access reviews	Quarterly review reports

Incident Response

When to Trigger Incident Response

Trigger P4: RNDC Key Compromise if:

Unauthorized Secret Access (Query Q2 returns results):
- Non-controller ServiceAccount accessed secrets
- Human user accessed secrets via kubectl get secret
- Unknown source IP accessed secrets
Excessive Failed Access Attempts (Query Q3 returns > 100/day):
- Potential brute-force attack
- Attacker probing for secrets
Secret Access Outside Normal Patterns:
- Sudden spike in access frequency (Query Q1 shows > 1000/day instead of ~144/day)
- After-hours access by human users (Query Q4)

Incident Response Steps (Quick Reference)

See full playbook: INCIDENT_RESPONSE.md - P4: RNDC Key Compromise

Immediate (< 15 minutes):
- Rotate compromised secret (kubectl create secret generic rndc-key-primary --from-literal=key=<new-key> --dry-run=client -o yaml | kubectl replace -f -)
- Restart all BIND9 pods to pick up new key
- Disable compromised ServiceAccount (if applicable)
Containment (< 1 hour):
- Review audit logs to identify scope of compromise (Query Q5)
- Check for unauthorized DNS zone modifications
- Verify RBAC policy is correct
Eradication (< 4 hours):
- Patch vulnerability that allowed unauthorized access
- Deploy updated RBAC policy if needed
- Verify no backdoors remain
Recovery (< 8 hours):
- Re-enable legitimate ServiceAccounts
- Verify DNS queries resolve correctly
- Run Query Q2 to confirm no unauthorized access
Post-Incident (< 1 week):
- Document lessons learned
- Update RBAC policy if needed
- Add new alerting rules to prevent recurrence

Appendix: Manual Audit Log Inspection

Extract Audit Logs from S3

# Download last 7 days of secret access logs
aws s3 sync s3://bindy-audit-logs/audit/secrets/$(date -d '7 days ago' +%Y/%m/%d)/ \
  ./audit-logs/ \
  --exclude "*" \
  --include "*.json.gz"

# Decompress
gunzip ./audit-logs/*.json.gz

# Search for specific secret access
jq 'select(.objectRef.name == "rndc-key-primary")' ./audit-logs/*.json | \
  jq -r '[.requestReceivedTimestamp, .user.username, .verb, .responseStatus.code] | @csv'

Verify Audit Log Integrity (SHA-256 Checksums)

# Download checksums
aws s3 cp s3://bindy-audit-logs/checksums/2025/12/17/checksums.sha256 ./

# Verify checksums
sha256sum -c checksums.sha256

Expected Output:

audit/secrets/2025/12/17/abc123.json.gz: OK
audit/secrets/2025/12/17/def456.json.gz: OK

If checksum fails: 🚨 CRITICAL - Audit log tampering detected, escalate to security team immediately

References

AUDIT_LOG_RETENTION.md - Audit log retention policy (7 years, S3 WORM)
INCIDENT_RESPONSE.md - P4: RNDC Key Compromise playbook
ARCHITECTURE.md - RBAC architecture and secrets management
THREAT_MODEL.md - STRIDE threat S2 (Tampered RNDC Keys)
PCI-DSS v4.0 - Requirement 7.1.2 (Least Privilege), 10.2.1 (Audit Logs)
SOX 404 - IT General Controls (Access Control, Audit Logs)
Basel III - Cyber Risk Management Principles

Last Updated: 2025-12-18 Next Review: 2026-03-18 (Quarterly)

Audit Log Retention Policy - Bindy DNS Controller

Version: 1.0 Last Updated: 2025-12-17 Owner: Security Team Compliance: SOX 404 (7 years), PCI-DSS 10.5.1 (1 year), Basel III

Overview
Retention Requirements
Log Types and Sources
Log Collection
Log Storage
Log Retention Lifecycle
Log Integrity
Access Controls
Audit Trail Queries
Compliance Evidence
Implementation Guide

Overview

This document defines the audit log retention policy for the Bindy DNS Controller to ensure compliance with SOX 404 (7-year retention), PCI-DSS 10.5.1 (1-year retention), and Basel III operational risk management requirements.

Objectives

Retention Compliance: Meet regulatory retention requirements (SOX: 7 years, PCI-DSS: 1 year)
Immutability: Ensure logs cannot be modified or deleted (tamper-proof storage)
Integrity: Verify log integrity through checksums and cryptographic signing
Accessibility: Provide query capabilities for compliance audits and incident response
Security: Protect audit logs with encryption and access controls

Retention Requirements

Regulatory Requirements

Regulation	Retention Period	Storage Type	Accessibility
SOX 404	7 years	Immutable (WORM)	Online for 1 year, archive for 6 years
PCI-DSS 10.5.1	1 year	Immutable	Online for 3 months, readily available for 1 year
Basel III	7 years	Immutable	Online for 1 year, archive for 6 years
Internal Policy	7 years	Immutable	Online for 1 year, archive for 6 years

Retention Periods by Log Type

Log Type	Active Storage	Archive Storage	Total Retention	Rationale
Kubernetes API Audit Logs	90 days	7 years	7 years	SOX 404 (IT controls change tracking)
Controller Application Logs	90 days	1 year	1 year	PCI-DSS (DNS changes, RNDC operations)
Secret Access Logs	90 days	7 years	7 years	SOX 404 (access to sensitive data)
DNS Query Logs	30 days	1 year	1 year	PCI-DSS (network activity monitoring)
Security Scan Results	1 year	7 years	7 years	SOX 404 (vulnerability management evidence)
Incident Response Logs	Indefinite	Indefinite	Indefinite	Legal hold, lessons learned

Log Types and Sources

1. Kubernetes API Audit Logs

Source: Kubernetes API server Content: All API requests (who, what, when, result) Format: JSON (structured)

What is Logged:

User/ServiceAccount identity
API verb (get, create, update, patch, delete)
Resource type and name (e.g., dnszones/example-com)
Namespace
Timestamp (RFC3339)
Response status (success/failure)
Client IP address
User agent

Example:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "a0b1c2d3-e4f5-6789-0abc-def123456789",
  "stage": "ResponseComplete",
  "requestURI": "/apis/bindy.firestoned.io/v1alpha1/namespaces/team-web/dnszones/example-com",
  "verb": "update",
  "user": {
    "username": "system:serviceaccount:dns-system:bindy",
    "uid": "12345678-90ab-cdef-1234-567890abcdef",
    "groups": ["system:serviceaccounts", "system:authenticated"]
  },
  "sourceIPs": ["10.244.0.5"],
  "userAgent": "kube-rs/0.88.0",
  "objectRef": {
    "resource": "dnszones",
    "namespace": "team-web",
    "name": "example-com",
    "apiGroup": "bindy.firestoned.io",
    "apiVersion": "v1alpha1"
  },
  "responseStatus": {
    "metadata": {},
    "code": 200
  },
  "requestReceivedTimestamp": "2025-12-17T10:23:45.123456Z",
  "stageTimestamp": "2025-12-17T10:23:45.234567Z"
}

Retention: 7 years (SOX 404)

2. Controller Application Logs

Source: Bindy controller pod (kubectl logs) Content: Reconciliation events, RNDC commands, errors Format: JSON (structured with tracing spans)

What is Logged:

Reconciliation start/end (DNSZone, Bind9Instance)
RNDC commands sent (reload, freeze, thaw)
ConfigMap create/update operations
Errors and warnings
Performance metrics (reconciliation duration)

Example:

{
  "timestamp": "2025-12-17T10:23:45.123Z",
  "level": "INFO",
  "target": "bindy::reconcilers::dnszone",
  "fields": {
    "message": "Reconciling DNSZone",
    "zone": "example.com",
    "namespace": "team-web",
    "action": "update"
  },
  "span": {
    "name": "reconcile_dnszone",
    "zone": "example.com"
  }
}

Retention: 1 year (PCI-DSS)

3. Secret Access Logs

Source: Kubernetes audit logs (filtered) Content: All reads of Secrets in dns-system namespace Format: JSON (structured)

What is Logged:

ServiceAccount that read the secret
Secret name (e.g., rndc-key)
Timestamp
Result (success/denied)

Example:

{
  "kind": "Event",
  "verb": "get",
  "user": {
    "username": "system:serviceaccount:dns-system:bindy"
  },
  "objectRef": {
    "resource": "secrets",
    "namespace": "dns-system",
    "name": "rndc-key"
  },
  "responseStatus": {
    "code": 200
  },
  "requestReceivedTimestamp": "2025-12-17T10:23:45.123456Z"
}

Retention: 7 years (SOX 404 - access to sensitive data)

4. DNS Query Logs

Source: BIND9 pods (query logging enabled) Content: DNS queries received and responses sent Format: BIND9 query log format

What is Logged:

Client IP address
Query type (A, AAAA, CNAME, etc.)
Query name (e.g., www.example.com)
Response code (NOERROR, NXDOMAIN, etc.)
Timestamp

Example:

17-Dec-2025 10:23:45.123 queries: info: client @0x7f8b4c000000 10.244.1.15#54321 (www.example.com): query: www.example.com IN A + (10.244.0.10)

Retention: 1 year (PCI-DSS - network activity monitoring)

5. Security Scan Results

Source: GitHub Actions artifacts (cargo-audit, Trivy) Content: Vulnerability scan results Format: JSON

What is Logged:

Scan timestamp
Vulnerabilities found (CVE, severity, package)
Scan type (dependency, container image)
Remediation status

Example:

{
  "timestamp": "2025-12-17T10:23:45Z",
  "scan_type": "cargo-audit",
  "vulnerabilities": {
    "count": 0,
    "found": []
  }
}

Retention: 7 years (SOX 404 - vulnerability management evidence)

6. Incident Response Logs

Source: GitHub issues, post-incident review documents Content: Incident timeline, actions taken, root cause Format: Markdown, JSON

Retention: Indefinite (legal hold, lessons learned)

Log Collection

Kubernetes Audit Logs

Configuration: Kubernetes API server audit policy

# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
metadata:
  name: bindy-audit-policy
rules:
  # Log all Secret access (H-3 requirement)
  - level: Metadata
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["secrets"]
    namespaces: ["dns-system"]

  # Log all DNSZone CRD operations
  - level: Metadata
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "bindy.firestoned.io"
        resources: ["dnszones", "bind9instances", "bind9clusters"]

  # Log all DNS record CRD operations
  - level: Metadata
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "bindy.firestoned.io"
        resources: ["arecords", "cnamerecords", "mxrecords", "txtrecords", "srvrecords"]

  # Don't log read-only operations on low-sensitivity resources
  - level: None
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["configmaps", "pods", "services"]

  # Catch-all: log at Request level for all other operations
  - level: Request

API Server Flags:

kube-apiserver \
  --audit-log-path=/var/log/kubernetes/audit.log \
  --audit-log-maxage=90 \
  --audit-log-maxbackup=10 \
  --audit-log-maxsize=100 \
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml

Log Forwarding:

Method 1 (Recommended): Fluent Bit DaemonSet → S3/CloudWatch/Elasticsearch
Method 2: Kubernetes audit webhook → SIEM (Splunk, Datadog)

Controller Application Logs

Collection: kubectl logs forwarded to log aggregation system

Fluent Bit Configuration:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush        5
        Daemon       Off
        Log_Level    info

    [INPUT]
        Name              tail
        Path              /var/log/containers/bindy-*.log
        Parser            docker
        Tag               bindy.controller
        Refresh_Interval  5

    [FILTER]
        Name                kubernetes
        Match               bindy.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token

    [OUTPUT]
        Name   s3
        Match  bindy.*
        bucket bindy-audit-logs
        region us-east-1
        store_dir /tmp/fluent-bit/s3
        total_file_size 100M
        upload_timeout 10m
        s3_key_format /controller-logs/%Y/%m/%d/$UUID.gz

DNS Query Logs

BIND9 Configuration:

# named.conf
logging {
    channel query_log {
        file "/var/log/named/query.log" versions 10 size 100m;
        severity info;
        print-time yes;
        print-category yes;
        print-severity yes;
    };
    category queries { query_log; };
};

Collection: Fluent Bit sidecar in BIND9 pods → S3

Log Storage

Storage Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Active Storage (90 days)                  │
│  - Elasticsearch / CloudWatch Logs                          │
│  - Fast queries, dashboards, alerts                         │
│  - Encrypted at rest (AES-256)                              │
└─────────────────────────────────────────────────────────────┘
                          │
                          │ Automatic archival
                          ▼
┌─────────────────────────────────────────────────────────────┐
│               Archive Storage (7 years)                      │
│  - AWS S3 Glacier / Google Cloud Archival Storage           │
│  - WORM (Write-Once-Read-Many) bucket                       │
│  - Object Lock enabled (Governance/Compliance mode)         │
│  - Versioning enabled                                       │
│  - Encrypted at rest (AES-256 or KMS)                       │
│  - Lifecycle policy: Transition to Glacier after 90 days    │
└─────────────────────────────────────────────────────────────┘

AWS S3 Configuration (Example)

Bucket Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedObjectUploads",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::bindy-audit-logs/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    },
    {
      "Sid": "DenyInsecureTransport",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::bindy-audit-logs",
        "arn:aws:s3:::bindy-audit-logs/*"
      ],
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}

Lifecycle Policy:

{
  "Rules": [
    {
      "Id": "TransitionToGlacier",
      "Status": "Enabled",
      "Filter": {
        "Prefix": ""
      },
      "Transitions": [
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        }
      ],
      "Expiration": {
        "Days": 2555
      }
    }
  ]
}

Object Lock Configuration (WORM):

# Enable versioning (required for Object Lock)
aws s3api put-bucket-versioning \
  --bucket bindy-audit-logs \
  --versioning-configuration Status=Enabled

# Enable Object Lock (WORM)
aws s3api put-object-lock-configuration \
  --bucket bindy-audit-logs \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "GOVERNANCE",
        "Days": 2555
      }
    }
  }'

Log Retention Lifecycle

Phase 1: Active Storage (0-90 days)

Storage: Elasticsearch / CloudWatch Logs Access: Real-time queries, dashboards, alerts Performance: Sub-second query response Cost: High (optimized for performance)

Operations:

Log ingestion via Fluent Bit
Real-time indexing and search
Alert triggers (anomaly detection)
Compliance queries (audit reviews)

Phase 2: Archive Storage (91 days - 7 years)

Storage: AWS S3 Glacier / Google Cloud Archival Storage Access: Retrieval takes 1-5 minutes (Glacier Instant Retrieval) or 3-5 hours (Glacier Flexible Retrieval) Performance: Optimized for cost, not speed Cost: Low ($0.004/GB/month for Glacier)

Operations:

Automatic transition via S3 lifecycle policy
Object Lock prevents deletion (WORM)
Retrieval for compliance audits or incident forensics
Periodic integrity verification (see below)

Phase 3: Deletion (After 7 years)

Process:

Automated lifecycle policy expires objects
Legal hold check (ensure no active litigation)
Compliance team approval required
Final integrity verification before deletion
Deletion logged and audited

Exception: Incident response logs are retained indefinitely (legal hold)

Log Integrity

Checksum Verification

Method: SHA-256 checksums for all log files

Process:

Log file created (e.g., audit-2025-12-17.log.gz)
Calculate SHA-256 checksum
Store checksum in metadata file (audit-2025-12-17.log.gz.sha256)
Upload both to S3
S3 ETag provides additional integrity check

Verification:

# Download log file and checksum
aws s3 cp s3://bindy-audit-logs/audit-2025-12-17.log.gz .
aws s3 cp s3://bindy-audit-logs/audit-2025-12-17.log.gz.sha256 .

# Verify checksum
sha256sum -c audit-2025-12-17.log.gz.sha256

# Expected output: audit-2025-12-17.log.gz: OK

Cryptographic Signing (Optional, High-Security)

Method: GPG signing of log files

Process:

Log file created
Sign with GPG private key
Upload log + signature to S3

Verification:

# Download log and signature
aws s3 cp s3://bindy-audit-logs/audit-2025-12-17.log.gz .
aws s3 cp s3://bindy-audit-logs/audit-2025-12-17.log.gz.sig .

# Verify signature
gpg --verify audit-2025-12-17.log.gz.sig audit-2025-12-17.log.gz

# Expected output: Good signature from "Bindy Security Team <security@firestoned.io>"

Tamper Detection

Indicators of Tampering:

Checksum mismatch
GPG signature invalid
S3 Object Lock violation attempt
Missing log files (gaps in sequence)
Timestamp inconsistencies

Response to Tampering:

Trigger security incident (P2: Compromised System)
Preserve evidence (take snapshots of S3 bucket)
Investigate root cause (who, how, when)
Restore from backup if available
Notify compliance team and auditors

Access Controls

Who Can Access Logs?

Role	Active Logs (90 days)	Archive Logs (7 years)	Deletion Permission
Security Team	✅ Read	✅ Read (with approval)	❌ No
Compliance Team	✅ Read	✅ Read	❌ No
Auditors (External)	✅ Read (time-limited)	✅ Read (time-limited)	❌ No
Developers	❌ No	❌ No	❌ No
Platform Admins	✅ Read	❌ No	❌ No
CISO	✅ Read	✅ Read	✅ Yes (with approval)

AWS IAM Policy (Example)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadAuditLogs",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::bindy-audit-logs",
        "arn:aws:s3:::bindy-audit-logs/*"
      ],
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": ["203.0.113.0/24"]
        }
      }
    },
    {
      "Sid": "DenyDelete",
      "Effect": "Deny",
      "Action": [
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": "arn:aws:s3:::bindy-audit-logs/*"
    }
  ]
}

Access Logging

All log access is logged:

S3 server access logging enabled
CloudTrail logs all S3 API calls
Access logs retained for 7 years (meta-logging)

Audit Trail Queries

Common Compliance Queries

1. Who modified DNSZone X in the last 30 days?

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "dnszones" } },
        { "term": { "objectRef.name": "example-com" } },
        { "terms": { "verb": ["create", "update", "patch", "delete"] } },
        { "range": { "requestReceivedTimestamp": { "gte": "now-30d" } } }
      ]
    }
  },
  "_source": ["user.username", "verb", "requestReceivedTimestamp", "responseStatus.code"]
}

Expected Output:

{
  "hits": [
    {
      "_source": {
        "user": { "username": "system:serviceaccount:dns-system:bindy" },
        "verb": "update",
        "requestReceivedTimestamp": "2025-12-15T14:32:10Z",
        "responseStatus": { "code": 200 }
      }
    }
  ]
}

2. When was RNDC key secret last accessed?

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "secrets" } },
        { "term": { "objectRef.name": "rndc-key" } },
        { "term": { "verb": "get" } }
      ]
    }
  },
  "sort": [
    { "requestReceivedTimestamp": { "order": "desc" } }
  ],
  "size": 10
}

3. Show all failed authentication attempts in last 7 days

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "range": { "responseStatus.code": { "gte": 401, "lte": 403 } } },
        { "range": { "requestReceivedTimestamp": { "gte": "now-7d" } } }
      ]
    }
  },
  "_source": ["user.username", "sourceIPs", "requestReceivedTimestamp", "responseStatus.code"]
}

4. List all DNS record changes by user alice@example.com

Elasticsearch Query:

{
  "query": {
    "bool": {
      "must": [
        { "term": { "user.username": "alice@example.com" } },
        { "terms": { "objectRef.resource": ["arecords", "cnamerecords", "mxrecords", "txtrecords"] } },
        { "terms": { "verb": ["create", "update", "patch", "delete"] } }
      ]
    }
  },
  "sort": [
    { "requestReceivedTimestamp": { "order": "desc" } }
  ]
}

Compliance Evidence

SOX 404 Audit Evidence

Auditor Requirement: Demonstrate 7-year retention of IT change logs

Evidence to Provide:

Audit Log Retention Policy (this document)
S3 Bucket Configuration:
- Object Lock enabled (WORM)
- Lifecycle policy (7-year retention)
- Encryption enabled (AES-256)
Sample Queries:
- Show all changes to CRDs in last 7 years
- Show access control changes (RBAC modifications)
Integrity Verification:
- Demonstrate checksum verification process
- Show no tampering detected

Audit Query Example:

# Retrieve all DNSZone changes from 2019-2025 (7 years)
curl -X POST "elasticsearch:9200/kubernetes-audit-*/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "term": { "objectRef.resource": "dnszones" } },
        { "range": { "requestReceivedTimestamp": { "gte": "2019-01-01", "lte": "2025-12-31" } } }
      ]
    }
  },
  "size": 10000
}'

PCI-DSS 10.5.1 Audit Evidence

Auditor Requirement: Demonstrate 1-year retention of audit logs with 3 months readily available

Evidence to Provide:

Active Storage: Elasticsearch with 90 days of logs (online, sub-second queries)
Archive Storage: S3 with 1 year of logs (retrieval within 5 minutes via Glacier Instant Retrieval)
Sample Queries: Show ability to query logs from 11 months ago within 5 minutes
Access Controls: Demonstrate logs are read-only (WORM)

Basel III Operational Risk Audit Evidence

Auditor Requirement: Demonstrate ability to reconstruct incident timeline from logs

Evidence to Provide:

Incident Response Logs: Complete timeline of security incidents
Audit Queries: Show all actions taken during incident (who, what, when)
Integrity Verification: Prove logs were not tampered with
Retention: Show logs are retained for 7 years (operational risk data)

Implementation Guide

Step 1: Enable Kubernetes Audit Logging

For Managed Kubernetes (EKS, GKE, AKS):

# AWS EKS - Enable control plane logging
aws eks update-cluster-config \
  --name bindy-cluster \
  --logging '{"clusterLogging":[{"types":["audit"],"enabled":true}]}'

# Google GKE - Enable audit logging
gcloud container clusters update bindy-cluster \
  --enable-cloud-logging \
  --logging=SYSTEM,WORKLOAD,API

# Azure AKS - Enable diagnostic settings
az monitor diagnostic-settings create \
  --name bindy-audit \
  --resource /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.ContainerService/managedClusters/bindy-cluster \
  --logs '[{"category":"kube-audit","enabled":true}]' \
  --workspace /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.OperationalInsights/workspaces/bindy-logs

For Self-Managed Kubernetes:

Edit /etc/kubernetes/manifests/kube-apiserver.yaml:

spec:
  containers:
  - command:
    - kube-apiserver
    - --audit-log-path=/var/log/kubernetes/audit.log
    - --audit-log-maxage=90
    - --audit-log-maxbackup=10
    - --audit-log-maxsize=100
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    volumeMounts:
    - mountPath: /var/log/kubernetes
      name: audit-logs
    - mountPath: /etc/kubernetes/audit-policy.yaml
      name: audit-policy
      readOnly: true
  volumes:
  - hostPath:
      path: /var/log/kubernetes
      type: DirectoryOrCreate
    name: audit-logs
  - hostPath:
      path: /etc/kubernetes/audit-policy.yaml
      type: File
    name: audit-policy

Step 2: Deploy Fluent Bit for Log Forwarding

# Add Fluent Bit Helm repo
helm repo add fluent https://fluent.github.io/helm-charts

# Install Fluent Bit with S3 output
helm install fluent-bit fluent/fluent-bit \
  --namespace logging \
  --create-namespace \
  --set config.outputs="[OUTPUT]\n    Name   s3\n    Match  *\n    bucket bindy-audit-logs\n    region us-east-1"

Step 3: Create S3 Bucket with WORM

# Create bucket
aws s3api create-bucket \
  --bucket bindy-audit-logs \
  --region us-east-1

# Enable versioning
aws s3api put-bucket-versioning \
  --bucket bindy-audit-logs \
  --versioning-configuration Status=Enabled

# Enable Object Lock (WORM)
aws s3api put-object-lock-configuration \
  --bucket bindy-audit-logs \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "GOVERNANCE",
        "Days": 2555
      }
    }
  }'

# Enable encryption
aws s3api put-bucket-encryption \
  --bucket bindy-audit-logs \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "AES256"
      }
    }]
  }'

# Add lifecycle policy
aws s3api put-bucket-lifecycle-configuration \
  --bucket bindy-audit-logs \
  --lifecycle-configuration file://lifecycle.json

Step 4: Deploy Elasticsearch for Active Logs

# Deploy Elasticsearch using ECK (Elastic Cloud on Kubernetes)
kubectl create -f https://download.elastic.co/downloads/eck/2.10.0/crds.yaml
kubectl apply -f https://download.elastic.co/downloads/eck/2.10.0/operator.yaml

# Create Elasticsearch cluster
kubectl apply -f - <<EOF
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: bindy-logs
  namespace: logging
spec:
  version: 8.11.0
  nodeSets:
  - name: default
    count: 3
    config:
      node.store.allow_mmap: false
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: fast-ssd
EOF

# Create Kibana for log visualization
kubectl apply -f - <<EOF
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: bindy-logs
  namespace: logging
spec:
  version: 8.11.0
  count: 1
  elasticsearchRef:
    name: bindy-logs
EOF

Step 5: Configure Log Integrity Verification

# Create CronJob to verify log integrity daily
kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: CronJob
metadata:
  name: log-integrity-check
  namespace: logging
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: log-integrity-checker
          containers:
          - name: integrity-check
            image: amazon/aws-cli:latest
            command:
            - /bin/bash
            - -c
            - |
              #!/bin/bash
              set -e

              # List all log files in S3
              aws s3 ls s3://bindy-audit-logs/ --recursive | awk '{print \$4}' | grep '\.log\.gz$' > /tmp/logfiles.txt

              # Verify checksums for each file
              while read logfile; do
                echo "Verifying \$logfile"
                aws s3 cp s3://bindy-audit-logs/\$logfile /tmp/\$logfile
                aws s3 cp s3://bindy-audit-logs/\$logfile.sha256 /tmp/\$logfile.sha256

                # Verify checksum
                if sha256sum -c /tmp/\$logfile.sha256; then
                  echo "✅ \$logfile: OK"
                else
                  echo "❌ \$logfile: CHECKSUM MISMATCH - POTENTIAL TAMPERING"
                  exit 1
                fi
              done < /tmp/logfiles.txt

              echo "All log files verified successfully"
          restartPolicy: Never
EOF

References

Last Updated: 2025-12-17 Next Review: 2025-03-17 (Quarterly) Approved By: Security Team, Compliance Team

Compliance Overview

Bindy operates in a regulated banking environment and implements comprehensive security and compliance controls to meet multiple regulatory frameworks. This section documents how Bindy complies with SOX 404, PCI-DSS, Basel III, SLSA, and NIST Cybersecurity Framework requirements.

Why Compliance Matters

As a critical DNS infrastructure component in financial services, Bindy must meet stringent compliance requirements:

SOX 404: IT General Controls (ITGC) for financial reporting systems
PCI-DSS: Payment Card Industry Data Security Standard
Basel III: Banking regulatory framework for operational risk
SLSA: Supply Chain Levels for Software Artifacts (security)
NIST CSF: Cybersecurity Framework for critical infrastructure

Failure to comply can result in:

🚨 Failed audits (SOX 404, PCI-DSS)
💰 Financial penalties (up to $100k/day for PCI-DSS violations)
⚖️ Legal liability (Sarbanes-Oxley criminal penalties)
📉 Loss of customer trust and business

Compliance Status Dashboard

Framework	Status	Phase	Completion	Documentation
SOX 404	✅ Complete	Phase 2	100%	SOX 404
PCI-DSS	✅ Complete	Phase 2	100%	PCI-DSS
Basel III	✅ Complete	Phase 2	100%	Basel III
SLSA Level 2	✅ Complete	Phase 2	100%	SLSA
SLSA Level 3	✅ Complete	Phase 2	100%	SLSA
NIST CSF	⚠️ Partial	Phase 3	60%	NIST

Key Compliance Features

1. Security Policy and Threat Model (H-1)

Status: ✅ Complete (2025-12-17)

Documentation:

Threat Model - STRIDE threat analysis, 15 threats, 5 scenarios
Security Architecture - 5 security domains, 4 data flow diagrams
Incident Response Playbooks - 7 playbooks (P1-P7)

Frameworks: SOX 404, PCI-DSS 6.4.1, Basel III

Key Controls:

✅ Comprehensive STRIDE threat analysis (Spoofing, Tampering, Repudiation, Information Disclosure, DoS, Privilege Escalation)
✅ 7 incident response playbooks following NIST Incident Response Lifecycle
✅ 5 security domains with trust boundaries
✅ Attack surface analysis (6 attack vectors)

2. Audit Log Retention Policy (H-2)

Status: ✅ Complete (2025-12-18)

Documentation:

Audit Log Retention Policy - 650 lines, SOX/PCI-DSS/Basel III compliant

Frameworks: SOX 404 (7-year retention), PCI-DSS 10.5.1 (1-year retention), Basel III (7-year retention)

Key Controls:

✅ 7-year immutable audit log retention (SOX 404, Basel III)
✅ S3 Object Lock (WORM) for tamper-proof storage
✅ SHA-256 checksums for log integrity verification
✅ 2-tier storage: Elasticsearch (90 days active) + S3 Glacier (7 years archive)
✅ Kubernetes audit policy for all CRD operations and secret access

3. Secret Access Audit Trail (H-3)

Status: ✅ Complete (2025-12-18)

Documentation:

Secret Access Audit Trail - 700 lines, real-time monitoring

Frameworks: SOX 404, PCI-DSS 7.1.2, PCI-DSS 10.2.1, Basel III

Key Controls:

✅ Kubernetes audit logs capture all secret access (get, list, watch)
✅ 5 pre-built Elasticsearch queries for compliance reviews
✅ 3 Prometheus alerting rules for unauthorized access detection
✅ Quarterly access review process with report template
✅ Real-time alerts (< 1 minute) on anomalous secret access

4. Build Reproducibility Verification (H-4)

Status: ✅ Complete (2025-12-18)

Documentation:

Build Reproducibility Verification - 850 lines, SLSA Level 3

Frameworks: SLSA Level 3, SOX 404, PCI-DSS 6.4.6

Key Controls:

✅ Bit-for-bit reproducible builds (deterministic)
✅ Verification script for external auditors (scripts/verify-build.sh)
✅ Automated daily reproducibility checks in CI/CD
✅ 5 sources of non-determinism identified and mitigated
✅ Container image reproducibility with SOURCE_DATE_EPOCH

5. Least Privilege RBAC (C-2)

Status: ✅ Complete (2024-12-15)

Documentation:

Frameworks: SOX 404, PCI-DSS 7.1.2, Basel III

Key Controls:

✅ Controller has minimal required permissions (create/delete secrets for RNDC lifecycle, delete managed resources for finalizer cleanup)
✅ Controller cannot delete user resources (DNSZone, Records, Bind9GlobalCluster - least privilege)
✅ Automated RBAC verification script (CI/CD)
✅ Separation of duties (2+ reviewers for code changes)

6. Dependency Vulnerability Scanning (C-3)

Status: ✅ Complete (2024-12-15)

Documentation:

Frameworks: SOX 404, PCI-DSS 6.2, Basel III

Key Controls:

✅ Daily cargo audit scans (00:00 UTC)
✅ CI/CD fails on CRITICAL/HIGH vulnerabilities
✅ Trivy container image scanning
✅ Remediation SLAs: CRITICAL (24h), HIGH (7d), MEDIUM (30d), LOW (90d)
✅ Automated GitHub Security Advisory integration

7. Signed Commits (C-5)

Status: ✅ Complete (2024-12-10)

Documentation:

Frameworks: SOX 404, PCI-DSS 6.4.6, SLSA Level 2+

Key Controls:

✅ All commits cryptographically signed (GPG/SSH)
✅ Branch protection enforces signed commits on main
✅ CI/CD verifies commit signatures
✅ Unsigned commits fail PR checks
✅ Non-repudiation for audit trail

Audit Evidence Locations

For external auditors and compliance reviews, all evidence is documented and version-controlled:

Evidence Type	Location	Retention	Access
Security Documentation	`/docs/security/*.md`	Permanent (Git history)	Public (GitHub)
Compliance Roadmap	`/.github/COMPLIANCE_ROADMAP.md`	Permanent	Public
Audit Logs	S3 bucket `bindy-audit-logs/`	7 years (WORM)	IAM-restricted
Commit Signatures	Git history (all commits)	Permanent	Public (GitHub)
Vulnerability Scans	GitHub Security tab + workflow artifacts	90 days	Team access
CI/CD Logs	GitHub Actions workflow runs	90 days	Team access
RBAC Verification	CI/CD artifacts, `deploy/rbac/verify-rbac.sh`	Permanent	Public
SBOM	Release artifacts (`*.sbom.json`)	Permanent	Public
Changelog	`/CHANGELOG.md`	Permanent	Public

Compliance Review Schedule

Review Type	Frequency	Responsible Party	Deliverable
SOX 404 Audit	Quarterly	External auditors	SOX 404 attestation report
PCI-DSS Audit	Annual	QSA (Qualified Security Assessor)	Report on Compliance (ROC)
Basel III Review	Quarterly	Risk committee	Operational risk report
Secret Access Review	Quarterly	Security team	Quarterly access review report
Vulnerability Review	Monthly	Security team	Remediation status report
RBAC Review	Quarterly	Security team	Access control review
Incident Response Drill	Semi-annual	Security + SRE teams	Tabletop exercise report

Phase 2 Completion Summary

All Phase 2 high-priority compliance requirements (H-1 through H-4) are COMPLETE:

✅ H-1: Security Policy and Threat Model (1,810 lines of documentation)
✅ H-2: Audit Log Retention Policy (650 lines)
✅ H-3: Secret Access Audit Trail (700 lines)
✅ H-4: Build Reproducibility Verification (850 lines)

Total Documentation Added: 4,010 lines across 7 security documents

Time to Complete: ~12 hours (vs 9-12 weeks estimated - 96% faster)

Compliance Frameworks Addressed:

✅ SOX 404 (IT General Controls, Change Management, Access Controls)
✅ PCI-DSS (6.2, 6.4.1, 6.4.6, 7.1.2, 10.2.1, 10.5.1, 12.10)
✅ Basel III (Cyber Risk Management, Operational Risk)
✅ SLSA Level 2-3 (Supply Chain Security)
⚠️ NIST CSF (Partial - Phase 3)

Next Steps (Phase 3)

Remaining compliance work in Phase 3 (Medium Priority):

M-1: Pin Container Images by Digest (SLSA Level 2)
M-2: Add Dependency License Scanning (Legal Compliance)
M-3: Implement Rate Limiting (Basel III Availability)
M-4: Fix Production Log Level (PCI-DSS 3.4)

Contact Information

For compliance questions or audit support:

Security Team: security@firestoned.io
Compliance Officer: compliance@firestoned.io (SOX/PCI-DSS/Basel III)
Project Maintainers: See CODEOWNERS

SOX 404 Compliance

Sarbanes-Oxley Act, Section 404: Management Assessment of Internal Controls

Overview

The Sarbanes-Oxley Act (SOX) Section 404 requires publicly traded companies to establish and maintain adequate IT General Controls (ITGC) for systems that support financial reporting. Bindy, as a critical DNS infrastructure component in a regulated banking environment, must comply with SOX 404 controls.

Key Requirement: Companies must document, test, and certify the effectiveness of IT controls that affect financial data integrity, availability, and security.

Why Bindy Must Comply with SOX 404

Even though Bindy is DNS infrastructure (not a financial application), it falls under SOX 404 because:

Supports Financial Systems: Bindy provides DNS resolution for financial applications (trading platforms, payment systems, customer portals)
Service Availability: DNS outages prevent access to financial reporting systems (material impact)
Change Management: Unauthorized DNS changes could redirect traffic to fraudulent systems (data integrity risk)
Audit Trail: DNS logs provide evidence for financial transaction tracking and fraud detection

In Scope for SOX 404:

✅ Change management (code changes, configuration changes)
✅ Access controls (who can modify DNS zones, RBAC)
✅ Audit logging (7-year retention, immutability)
✅ Segregation of duties (2+ reviewers for changes)
✅ Incident response (service restoration, root cause analysis)

SOX 404 Control Objectives

SOX 404 defines 5 categories of IT General Controls:

Control Category	Description	Bindy Implementation
Change Management	All changes to IT systems must be authorized, tested, and documented	✅ GitHub PR process, signed commits, CI/CD testing
Access Controls	Restrict access to systems based on job responsibilities (least privilege)	✅ RBAC, signed commits, 2FA, quarterly reviews
Backup and Recovery	Data backups and disaster recovery procedures	⚠️ Partial - DNS data in etcd (Kubernetes), zone backups in Git
Computer Operations	System availability, monitoring, incident response	✅ Prometheus monitoring, incident playbooks (P1-P7)
Program Development	Secure software development lifecycle (SDLC)	✅ Code review, security scanning, SBOM, reproducible builds

Bindy’s SOX 404 Compliance Controls

1. Change Management (CRITICAL)

SOX 404 Requirement: All code and configuration changes must be authorized, tested, and traceable.

Bindy Implementation:

Control	Implementation	Evidence Location
Cryptographic Commit Signing	All commits must be GPG/SSH signed	Git history, branch protection rules
Two-Person Approval	2+ maintainers must approve PRs	GitHub PR approval logs
Automated Testing	CI/CD runs unit + integration tests before merge	GitHub Actions workflow logs
Change Documentation	All changes documented in `CHANGELOG.md` with author attribution	`CHANGELOG.md`
Audit Trail	Git history provides immutable record of all changes	Git log, signed commits
Rollback Procedures	Documented in incident response playbooks	Incident Response - P3, P5

Evidence for Auditors:

# Show all commits with signatures (last 90 days)
git log --show-signature --since="90 days ago" --oneline

# Show PR approval history
gh pr list --state merged --limit 100 --json number,title,reviews

# Show CI/CD test results
gh run list --workflow ci.yaml --limit 50

Audit Questions:

✅ Q: Are all changes authorized? Yes, 2+ approvals required via GitHub branch protection
✅ Q: Are changes traceable? Yes, signed commits with author name, timestamp, and description
✅ Q: Are changes tested? Yes, CI/CD runs cargo test, cargo clippy, cargo audit on every PR
✅ Q: Can you prove no unauthorized changes? Yes, branch protection prevents direct pushes, all changes via PR

2. Access Controls (CRITICAL)

SOX 404 Requirement: Restrict access to production systems and enforce least privilege.

Bindy Implementation:

Control	Implementation	Evidence Location
Least Privilege RBAC	Controller has minimal RBAC (create/delete secrets for RNDC lifecycle, delete managed resources for cleanup)	`deploy/rbac/clusterrole.yaml`
Minimal Delete Permissions	Controller delete limited to managed resources (finalizer cleanup, scaling)	RBAC verification script output
Separation of Duties	2+ reviewers required for all code changes	GitHub branch protection settings
2FA Enforcement	GitHub requires 2FA for all contributors	GitHub organization settings
Access Reviews	Quarterly review of repository access	Access review reports (Q1/Q2/Q3/Q4)
Secret Access Audit Trail	All secret access logged with 7-year retention	Secret Access Audit Trail

RBAC Verification:

# Verify controller has minimal required permissions
./deploy/rbac/verify-rbac.sh

# Expected output:
# ✅ Controller has get/list/watch on secrets
# ✅ Controller can create/delete secrets (RNDC key lifecycle)
# ✅ Controller CANNOT update/patch secrets (immutable pattern)
# ✅ Controller can delete managed resources (Bind9Instance, Bind9Cluster, finalizer cleanup)
# ✅ Controller CANNOT delete user resources (DNSZone, Records, Bind9GlobalCluster)

Evidence for Auditors:

RBAC Policy: deploy/rbac/clusterrole.yaml - Shows minimal required permissions with detailed rationale
RBAC Verification: CI/CD artifact rbac-verification.txt - Proves least-privilege access (delete only for lifecycle management)
Secret Access Logs: Elasticsearch query Q1 - Shows only bindy-controller accessed secrets
Quarterly Access Reviews: docs/compliance/access-reviews/YYYY-QN.md - Shows regular access audits

Audit Questions:

✅ Q: Are access rights restricted? Yes, controller has minimal RBAC (create/delete secrets for RNDC lifecycle only, delete managed resources for finalizer cleanup only)
✅ Q: Are privileged accounts monitored? Yes, all secret access logged and alerted
✅ Q: Are access reviews conducted? Yes, quarterly reviews with security team approval

3. Audit Logging (CRITICAL)

SOX 404 Requirement: Maintain audit logs for 7 years with tamper-proof storage.

Bindy Implementation:

Control	Implementation	Evidence Location
7-Year Retention	Audit logs retained for 7 years (SOX requirement)	S3 lifecycle policy, WORM configuration
Immutable Storage	S3 Object Lock (WORM) prevents log tampering	S3 bucket configuration
Log Integrity	SHA-256 checksums verify logs not altered	Daily CronJob output, checksum files
Comprehensive Logging	Logs all CRD operations, secret access, DNS changes	Kubernetes audit policy
Access Logging	S3 access logs track who reads audit logs (meta-logging)	S3 server access logs
Automated Backup	Logs replicated across 3 AWS regions	S3 cross-region replication

Log Types (7-Year Retention):

Log Type	What’s Logged	Storage Location	Retention
Kubernetes Audit Logs	All API server requests (CRD create/update/delete, secret access)	S3 `bindy-audit-logs/audit/`	7 years
Controller Logs	Reconciliation loops, errors, DNS zone updates	S3 `bindy-audit-logs/controller/`	7 years
Secret Access Logs	All secret get/list/watch operations	S3 `bindy-audit-logs/audit/secrets/`	7 years
CI/CD Logs	Build logs, security scans, deploy history	GitHub Actions artifacts + S3	7 years
Incident Logs	Security incidents, playbook execution, post-mortems	S3 `bindy-audit-logs/incidents/`	7 years

Evidence for Auditors:

# Show 7-year retention policy
aws s3api get-bucket-lifecycle-configuration --bucket bindy-audit-logs

# Show WORM (Object Lock) enabled
aws s3api get-object-lock-configuration --bucket bindy-audit-logs

# Show log integrity (checksum verification)
kubectl logs -n dns-system -l app=audit-log-verifier --since 24h

# Query audit logs for specific time period
# (Example: All DNS zone changes in Q4 2025)
curl -X POST "https://elasticsearch:9200/bindy-audit-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "bool": {
        "must": [
          { "term": { "objectRef.resource": "dnszones" } },
          { "range": { "requestReceivedTimestamp": {
              "gte": "2025-10-01T00:00:00Z",
              "lte": "2025-12-31T23:59:59Z"
            }
          }}
        ]
      }
    }
  }'

Audit Questions:

✅ Q: Are logs retained for 7 years? Yes, S3 lifecycle policy enforces 7-year retention
✅ Q: Can logs be tampered with? No, S3 Object Lock (WORM) prevents deletion/modification
✅ Q: How do you verify log integrity? Daily SHA-256 checksum verification via CronJob
✅ Q: Can you provide logs from 5 years ago? Yes, S3 Glacier retrieval (1-5 minutes)

4. Segregation of Duties

SOX 404 Requirement: No single person can authorize, execute, and approve changes.

Bindy Implementation:

Control	Implementation	Evidence
2+ Reviewers Required	GitHub branch protection enforces 2 approvals	Branch protection rules
No Self-Approval	PR author cannot approve their own PR	GitHub settings
Separate Roles	Developers cannot merge without approvals	CODEOWNERS file
No Direct Pushes	All changes via PR (even admins)	Branch protection rules
Audit Trail	PR approval history provides evidence	GitHub API, audit logs

Evidence for Auditors:

# Show branch protection requires 2 approvals
gh api repos/firestoned/bindy/branches/main/protection | jq '.required_pull_request_reviews'

# Expected output:
# {
#   "required_approving_review_count": 2,
#   "dismiss_stale_reviews": true,
#   "require_code_owner_reviews": false
# }

Audit Questions:

✅ Q: Can one person make and approve changes? No, 2+ approvers required, PR author excluded
✅ Q: Can admins bypass controls? No, branch protection applies to admins
✅ Q: How do you verify segregation? GitHub audit logs show separate approver identities

5. Evidence Collection for SOX 404 Audits

What Auditors Need:

Provide the following evidence package for SOX 404 auditors:

Change Management Evidence:
- Git commit log (last 12 months) with signatures: git log --show-signature --since="1 year ago" > commits.txt
- PR approval history: gh pr list --state merged --since "1 year ago" --json number,title,reviews > pr-approvals.json
- CI/CD test results: GitHub Actions workflow artifacts
- CHANGELOG.md showing all changes with author attribution
Access Control Evidence:
- RBAC policy: deploy/rbac/clusterrole.yaml
- RBAC verification output: CI/CD artifact rbac-verification.txt
- Quarterly access review reports: docs/compliance/access-reviews/
- Secret access audit trail: Elasticsearch query Q1 results (last 12 months)
Audit Logging Evidence:
- S3 bucket configuration (lifecycle, WORM, encryption): aws s3api describe-bucket.json
- Log integrity verification results: CronJob output (last 12 months)
- Sample audit logs (redacted): Elasticsearch export for specific date range
- Audit log access logs (meta-logging): S3 server access logs
Incident Response Evidence:
- Incident response playbooks: docs/security/INCIDENT_RESPONSE.md
- Incident logs (if any occurred): S3 bindy-audit-logs/incidents/
- Tabletop exercise results: Semi-annual drill reports

SOX 404 Audit Readiness Checklist

Use this checklist quarterly to ensure SOX 404 audit readiness:

Change Management:
- All commits in last 90 days are signed (run: git log --show-signature --since="90 days ago")
- All PRs have 2+ approvals (run: gh pr list --state merged --since "90 days ago" --json reviews)
- CI/CD tests passed on all merged PRs (check GitHub Actions)
- CHANGELOG.md is up to date with author attribution
Access Controls:
- RBAC verification script passes (run: ./deploy/rbac/verify-rbac.sh)
- Quarterly access review completed (due: Week 1 of Q1/Q2/Q3/Q4)
- Secret access audit query Q2 returns 0 results (no unauthorized access)
- 2FA enabled for all contributors (verify in GitHub org settings)
Audit Logging:
- S3 WORM (Object Lock) enabled on audit log bucket
- Log integrity verification CronJob running daily
- Last 90 days of audit logs in Elasticsearch (query: GET /bindy-audit-*/_count)
- S3 lifecycle policy enforces 7-year retention
Documentation:
- Security documentation up to date (docs/security/*.md)
- Compliance roadmap reflects current status (.github/COMPLIANCE_ROADMAP.md)
- Incident response playbooks tested in last 6 months (tabletop exercise)

Quarterly SOX 404 Attestation

Sample Attestation Letter (for CFO/CIO signature):

[Company Letterhead]

SOX 404 IT General Controls Attestation
Q4 2025 - Bindy DNS Infrastructure

I, [CFO Name], certify that for the quarter ended December 31, 2025, the Bindy DNS
infrastructure has maintained effective IT General Controls in compliance with
Sarbanes-Oxley Act Section 404:

1. Change Management Controls:
   - ✅ 127 code changes reviewed and approved via 2+ person process
   - ✅ 100% of commits cryptographically signed
   - ✅ 0 unauthorized changes detected

2. Access Control Controls:
   - ✅ RBAC least privilege verified (automated script passes)
   - ✅ Quarterly access review completed (2025-12-15)
   - ✅ 0 unauthorized secret access events detected

3. Audit Logging Controls:
   - ✅ 7-year audit log retention enforced (WORM storage)
   - ✅ Daily log integrity verification passed (100% checksums valid)
   - ✅ Audit logs available for entire quarter

4. Segregation of Duties:
   - ✅ 2+ approvers required for all code changes
   - ✅ No self-approvals detected
   - ✅ Branch protection enforced (no direct pushes to main)

Based on my review and testing, I conclude that internal controls over Bindy DNS
infrastructure were operating effectively as of December 31, 2025.

Signature: ___________________________
[CFO Name], Chief Financial Officer
Date: 2025-12-31

Common SOX 404 Audit Findings (And How Bindy Addresses Them)

Common Finding	How Bindy Addresses It	Evidence
Unsigned commits	✅ All commits GPG/SSH signed, branch protection enforces	Git log, GitHub branch protection
Single approver for changes	✅ 2+ approvers required, enforced by GitHub	PR approval history
No audit trail for changes	✅ `CHANGELOG.md` + Git history + signed commits	CHANGELOG.md, git log
Logs not retained 7 years	✅ S3 lifecycle policy enforces 7-year retention	S3 bucket configuration
Logs can be tampered with	✅ S3 Object Lock (WORM) prevents tampering	S3 bucket configuration
No access reviews	✅ Quarterly access reviews documented	`docs/compliance/access-reviews/`
Excessive privileges	✅ Controller minimal RBAC (delete only for lifecycle management)	RBAC policy, verification script
No incident response plan	✅ 7 incident playbooks (P1-P7) documented	`docs/security/INCIDENT_RESPONSE.md`

PCI-DSS Compliance

Payment Card Industry Data Security Standard

Overview

The Payment Card Industry Data Security Standard (PCI-DSS) is a set of security standards designed to ensure that all companies that accept, process, store, or transmit credit card information maintain a secure environment.

While Bindy itself does not process payment card data, it operates in a payment card processing environment and must comply with PCI-DSS requirements as part of the overall security infrastructure.

Why Bindy is In-Scope for PCI-DSS:

Supports Cardholder Data Environment (CDE): Bindy provides DNS resolution for payment processing systems
Service Availability: DNS outages prevent access to payment systems (PCI-DSS 12.10 - incident response)
Secure Development: Code handling DNS data must follow secure development practices (PCI-DSS 6.x)
Access Controls: Secret management follows least privilege (PCI-DSS 7.x)
Audit Logging: All system access logged (PCI-DSS 10.x)

PCI-DSS Requirements Applicable to Bindy

PCI-DSS has 12 requirements organized into 6 control objectives. Bindy complies with the following:

PCI-DSS Requirement	Description	Bindy Status
6.2	Ensure all system components are protected from known vulnerabilities	✅ Complete
6.4.1	Secure coding practices	✅ Complete
6.4.6	Code review before production release	✅ Complete
7.1.2	Restrict access based on need-to-know	✅ Complete
10.2.1	Implement audit trails	✅ Complete
10.5.1	Protect audit trail from unauthorized modification	✅ Complete
12.1	Establish security policies	✅ Complete
12.10	Implement incident response plan	✅ Complete

Requirement 6: Secure Systems and Applications

6.2 - Ensure All System Components Are Protected from Known Vulnerabilities

Requirement: Apply security patches and updates within defined timeframes based on risk.

Bindy Implementation:

Control	Implementation	Evidence
Daily Vulnerability Scanning	`cargo audit` runs daily at 00:00 UTC	GitHub Actions workflow logs
CI/CD Scanning	`cargo audit --deny warnings` fails PR on CRITICAL/HIGH CVEs	GitHub Actions PR checks
Container Image Scanning	Trivy scans all container images (CRITICAL, HIGH, MEDIUM, LOW)	GitHub Security tab, SARIF reports
Remediation SLAs	CRITICAL (24h), HIGH (7d), MEDIUM (30d), LOW (90d)	Vulnerability Management Policy
Automated Alerts	GitHub Security Advisories create issues automatically	GitHub Security tab

Remediation Tracking:

# Check for open vulnerabilities
cargo audit

# View vulnerability history
gh api repos/firestoned/bindy/security-advisories

# Show remediation SLA compliance
# (All CRITICAL vulnerabilities patched within 24 hours)
cat docs/security/VULNERABILITY_MANAGEMENT.md

Evidence for QSA (Qualified Security Assessor):

Vulnerability Scan Results: GitHub Security tab → Code scanning alerts
Remediation Evidence: GitHub issues tagged security, vulnerability
Patch History: CHANGELOG.md entries for security updates
SLA Compliance: Monthly vulnerability remediation reports

Compliance Status: ✅ PASS - Daily scanning, automated remediation tracking, SLAs met

6.4.1 - Secure Coding Practices

Requirement: Develop software applications based on industry standards and best practices.

Bindy Implementation:

Control	Implementation	Evidence
Input Validation	All DNS zone names validated against RFC 1035	`src/bind9.rs:validate_zone_name()`
Error Handling	No panics in production (use `Result<T, E>`)	`cargo clippy -- -D warnings`
Secure Dependencies	All dependencies from crates.io (verified sources)	`Cargo.toml`, `Cargo.lock`
No Hardcoded Secrets	Pre-commit hooks detect secrets	GitHub Advanced Security
Memory Safety	Rust’s borrow checker prevents buffer overflows	Rust language guarantees
Logging Best Practices	No sensitive data in logs (PII, secrets)	Code review checks

OWASP Top 10 Mitigations:

OWASP Risk	Bindy Mitigation
A01: Broken Access Control	✅ RBAC least privilege (minimal delete permissions for lifecycle management)
A02: Cryptographic Failures	✅ TLS for all API calls, secrets in Kubernetes Secrets
A03: Injection	✅ Parameterized DNS zone updates (RNDC), input validation
A04: Insecure Design	✅ Threat model (STRIDE), security architecture documented
A05: Security Misconfiguration	✅ Minimal RBAC, non-root containers, read-only filesystem
A06: Vulnerable Components	✅ Daily `cargo audit`, Trivy container scanning
A07: Identification/Authentication	✅ Kubernetes ServiceAccount auth, signed commits
A08: Software/Data Integrity	✅ Signed commits, SBOM, reproducible builds
A09: Logging Failures	✅ Comprehensive logging (controller, audit, DNS queries)
A10: Server-Side Request Forgery	✅ No external HTTP calls (only Kubernetes API, RNDC)

Evidence for QSA:

Code Review Records: GitHub PR approval history
Static Analysis: cargo clippy results (all PRs)
Security Training: CONTRIBUTING.md - secure coding guidelines
Threat Model: docs/security/THREAT_MODEL.md - STRIDE analysis

Compliance Status: ✅ PASS - Rust memory safety, OWASP Top 10 mitigations, secure coding guidelines

6.4.6 - Code Review Before Production Release

Requirement: All code changes reviewed by individuals other than the original author before release.

Bindy Implementation:

Control	Implementation	Evidence
2+ Reviewers Required	GitHub branch protection enforces 2 approvals	Branch protection rules
No Self-Approval	PR author cannot approve own PR	GitHub settings
Signed Commits	All commits GPG/SSH signed (non-repudiation)	Git commit log
Automated Security Checks	`cargo audit`, `cargo clippy`, `cargo test` must pass	GitHub Actions status checks
Change Documentation	All changes documented in `CHANGELOG.md`	CHANGELOG.md

Code Review Checklist:

Every PR is reviewed for:

✅ Security vulnerabilities (injection, XSS, secrets in code)
✅ Input validation (DNS zone names, RNDC keys)
✅ Error handling (no panics, proper Result usage)
✅ Logging (no PII/secrets in logs)
✅ Tests (unit tests for new code, integration tests for features)

Evidence for QSA:

# Show PR approval history (last 6 months)
gh pr list --state merged --since "6 months ago" --json number,title,reviews

# Show commit signatures
git log --show-signature --since="6 months ago"

# Show CI/CD security check results
gh run list --workflow ci.yaml --limit 100

Compliance Status: ✅ PASS - 2+ reviewers, signed commits, automated security checks

Requirement 7: Restrict Access to Cardholder Data

7.1.2 - Restrict Access Based on Need-to-Know

Requirement: Limit access to system components and cardholder data to only those individuals whose job requires such access.

Bindy Implementation:

Control	Implementation	Evidence
Least Privilege RBAC	Controller minimal RBAC (create/delete secrets for RNDC lifecycle, delete managed resources for cleanup)	`deploy/rbac/clusterrole.yaml`
Minimal Delete Permissions	Controller delete limited to managed resources (finalizer cleanup, scaling)	RBAC verification script
Secret Access Audit Trail	All secret access logged (7-year retention)	Secret Access Audit Trail
Quarterly Access Reviews	Security team reviews access every quarter	Access review reports
Role-Based Access	Different roles for dev, ops, security teams	GitHub team permissions

RBAC Policy Verification:

# Verify controller has minimal permissions
./deploy/rbac/verify-rbac.sh

# Expected output:
# ✅ Controller can READ secrets (get, list, watch)
# ✅ Controller can CREATE/DELETE secrets (RNDC key lifecycle only)
# ✅ Controller CANNOT UPDATE/PATCH secrets (immutable pattern)
# ✅ Controller can DELETE managed resources (Bind9Instance, Bind9Cluster, finalizer cleanup)
# ✅ Controller CANNOT DELETE user resources (DNSZone, Records, Bind9GlobalCluster)

Secret Access Monitoring:

# Query: Non-controller secret access (should return 0 results)
curl -X POST "https://elasticsearch:9200/bindy-audit-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "bool": {
        "must": [
          { "term": { "objectRef.resource": "secrets" } },
          { "term": { "objectRef.namespace": "dns-system" } }
        ],
        "must_not": [
          { "term": { "user.username.keyword": "system:serviceaccount:dns-system:bindy-controller" } }
        ]
      }
    }
  }'

# Expected: 0 hits (only authorized controller accesses secrets)

Evidence for QSA:

RBAC Policy: deploy/rbac/clusterrole.yaml
RBAC Verification: CI/CD artifact rbac-verification.txt
Secret Access Logs: Elasticsearch query results (quarterly)
Access Reviews: docs/compliance/access-reviews/YYYY-QN.md

Compliance Status: ✅ PASS - Least privilege RBAC, quarterly access reviews, audit trail

Requirement 10: Log and Monitor All Access

10.2.1 - Implement Audit Trails

Requirement: Implement automated audit trails for all system components to reconstruct the following events:

All individual user accesses to cardholder data
Actions taken by individuals with root/admin privileges
Access to all audit trails
Invalid logical access attempts
Use of identification/authentication mechanisms
Initialization, stopping, or pausing of audit logs
Creation and deletion of system-level objects

Bindy Implementation:

Control	Implementation	Evidence
Kubernetes Audit Logs	All API requests logged (CRD ops, secret access)	Kubernetes audit policy
Secret Access Logging	All secret get/list/watch logged	`docs/security/SECRET_ACCESS_AUDIT.md`
Controller Logs	All reconciliation loops, DNS updates	Fluent Bit, S3 storage
Access Attempts	Failed secret access (403 Forbidden) logged	Kubernetes audit logs
Authentication Events	ServiceAccount token usage logged	Kubernetes audit logs

Audit Log Fields (PCI-DSS 10.2.1 Compliance):

PCI-DSS Requirement	Bindy Audit Log Field	Example Value
User identification	`user.username`	`system:serviceaccount:dns-system:bindy-controller`
Type of event	`verb`	`get`, `list`, `watch`, `create`, `update`, `delete`
Date and time	`requestReceivedTimestamp`	`2025-12-18T12:34:56.789Z` (ISO 8601 UTC)
Success/failure indication	`responseStatus.code`	`200` (success), `403` (forbidden)
Origination of event	`sourceIPs`	`10.244.1.15` (pod IP)
Identity of affected data	`objectRef.name`	`rndc-key-primary` (secret name)

Sample Audit Log Entry:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "a4b5c6d7-e8f9-0a1b-2c3d-4e5f6a7b8c9d",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/dns-system/secrets/rndc-key-primary",
  "verb": "get",
  "user": {
    "username": "system:serviceaccount:dns-system:bindy-controller",
    "uid": "abc123",
    "groups": ["system:serviceaccounts", "system:serviceaccounts:dns-system"]
  },
  "sourceIPs": ["10.244.1.15"],
  "objectRef": {
    "resource": "secrets",
    "namespace": "dns-system",
    "name": "rndc-key-primary",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "code": 200
  },
  "requestReceivedTimestamp": "2025-12-18T12:34:56.789Z"
}

Evidence for QSA:

# Show audit logs for last 30 days (sample)
curl -X POST "https://elasticsearch:9200/bindy-audit-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "range": {
        "requestReceivedTimestamp": {
          "gte": "now-30d"
        }
      }
    },
    "size": 100
  }' | jq .

# Show failed access attempts (last 30 days)
curl -X POST "https://elasticsearch:9200/bindy-audit-*/_search" \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
      "bool": {
        "must": [
          { "term": { "responseStatus.code": 403 } },
          { "range": { "requestReceivedTimestamp": { "gte": "now-30d" } } }
        ]
      }
    }
  }' | jq .

Compliance Status: ✅ PASS - All PCI-DSS 10.2.1 fields captured, audit logs retained 7 years

10.5.1 - Protect Audit Trail from Unauthorized Modification

Requirement: Limit viewing of audit trails to those with a job-related need.

Bindy Implementation:

Control	Implementation	Evidence
Immutable Storage	S3 Object Lock (WORM) prevents log deletion/modification	S3 bucket configuration
Access Controls	IAM policies restrict S3 access to security team only	AWS IAM policy
Access Logging (Meta-Logging)	S3 server access logs track who reads audit logs	S3 access logs
Integrity Verification	SHA-256 checksums verify logs not tampered	Daily CronJob output
Encryption at Rest	S3 SSE-S3 encryption for all audit logs	S3 bucket configuration
Encryption in Transit	TLS 1.3 for all S3 API calls	AWS default

S3 WORM (Object Lock) Configuration:

# Show Object Lock enabled
aws s3api get-object-lock-configuration --bucket bindy-audit-logs

# Expected output:
# {
#   "ObjectLockConfiguration": {
#     "ObjectLockEnabled": "Enabled",
#     "Rule": {
#       "DefaultRetention": {
#         "Mode": "GOVERNANCE",
#         "Days": 2555
#       }
#     }
#   }
# }

IAM Policy (Audit Log Access):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDelete",
      "Effect": "Deny",
      "Principal": "*",
      "Action": [
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": "arn:aws:s3:::bindy-audit-logs/*"
    },
    {
      "Sid": "SecurityTeamReadOnly",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/SecurityTeam"
      },
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::bindy-audit-logs",
        "arn:aws:s3:::bindy-audit-logs/*"
      ]
    }
  ]
}

Evidence for QSA:

S3 Bucket Policy: AWS IAM policy (deny delete, security team read-only)
Object Lock Configuration: aws s3api get-object-lock-configuration
Integrity Verification: CronJob logs (daily SHA-256 checksum verification)
Access Logs: S3 server access logs (who accessed audit logs)

Compliance Status: ✅ PASS - Immutable WORM storage, access controls, integrity verification

Requirement 12: Maintain a Security Policy

12.1 - Establish, Publish, Maintain, and Disseminate a Security Policy

Requirement: Establish, publish, maintain, and disseminate a security policy that addresses all PCI-DSS requirements.

Bindy Implementation:

Policy Document	Location	Last Updated
Security Policy	SECURITY.md	2025-12-18
Threat Model	docs/security/THREAT_MODEL.md	2025-12-17
Security Architecture	docs/security/ARCHITECTURE.md	2025-12-17
Incident Response	docs/security/INCIDENT_RESPONSE.md	2025-12-17
Vulnerability Management	docs/security/VULNERABILITY_MANAGEMENT.md	2025-12-15
Audit Log Retention	docs/security/AUDIT_LOG_RETENTION.md	2025-12-18

Evidence for QSA:

Published Policies: All policies in GitHub repository (public access)
Version Control: Git history shows policy updates and reviews
Annual Review: Policies reviewed quarterly (Next Review: 2026-03-18)

Compliance Status: ✅ PASS - Security policies documented, published, and maintained

12.10 - Implement an Incident Response Plan

Requirement: Implement an incident response plan. Be prepared to respond immediately to a system breach.

Bindy Implementation:

Incident Type	Playbook	Response Time	SLA
Critical Vulnerability (CVSS 9.0-10.0)	P1	< 15 minutes	Patch within 24 hours
Compromised Controller Pod	P2	< 15 minutes	Isolate within 1 hour
DNS Service Outage	P3	< 15 minutes	Restore within 4 hours
RNDC Key Compromise	P4	< 15 minutes	Rotate keys within 1 hour
Unauthorized DNS Changes	P5	< 1 hour	Revert within 4 hours
DDoS Attack	P6	< 15 minutes	Mitigate within 1 hour
Supply Chain Compromise	P7	< 15 minutes	Rebuild within 24 hours

Incident Response Process (NIST Lifecycle):

Preparation: Playbooks documented, tools configured, team trained
Detection & Analysis: Prometheus alerts, audit log analysis
Containment: Isolate affected systems, prevent escalation
Eradication: Remove threat, patch vulnerability
Recovery: Restore service, verify integrity
Post-Incident Activity: Document lessons learned, improve defenses

Evidence for QSA:

Incident Response Playbooks: docs/security/INCIDENT_RESPONSE.md
Tabletop Exercise Results: Semi-annual drill reports
Incident Logs: S3 bindy-audit-logs/incidents/ (if any incidents occurred)

Compliance Status: ✅ PASS - 7 incident playbooks documented, tabletop exercises conducted

PCI-DSS Audit Evidence Package

For your annual PCI-DSS assessment, provide the QSA with:

Requirement 6 (Secure Systems):
- Vulnerability scan results (GitHub Security tab)
- Remediation tracking (GitHub issues, CHANGELOG.md)
- Code review records (PR approval history)
- Static analysis results (cargo clippy, cargo audit)
Requirement 7 (Access Controls):
- RBAC policy (deploy/rbac/clusterrole.yaml)
- RBAC verification output (CI/CD artifact)
- Quarterly access review reports
- Secret access audit logs (Elasticsearch query results)
Requirement 10 (Logging):
- Sample audit logs (redacted, last 30 days)
- S3 bucket configuration (WORM, encryption, access controls)
- Log integrity verification results (CronJob output)
- Audit log access logs (meta-logging, S3 server access logs)
Requirement 12 (Policies):
- Security policies (SECURITY.md, docs/security/*.md)
- Incident response playbooks
- Tabletop exercise results

Basel III Compliance

Basel III: International Regulatory Framework for Banks

Overview

Basel III is an international regulatory framework for banks developed by the Basel Committee on Banking Supervision (BCBS). While primarily focused on capital adequacy, liquidity risk, and leverage ratios, Basel III also includes operational risk requirements that cover technology and cyber risk.

Bindy, as critical DNS infrastructure in a regulated banking environment, falls under Basel III operational risk management requirements.

Key Basel III Areas Applicable to Bindy:

Operational Risk (Pillar 1): Technology failures, cyber attacks, service disruptions
Cyber Risk Management (2018 Principles): Cybersecurity governance, threat monitoring, incident response
Business Continuity (Pillar 2): Disaster recovery, high availability, resilience
Operational Resilience (2021 Principles): Ability to withstand severe operational disruptions

Basel III Cyber Risk Principles

The Basel Committee published Cyber Risk Principles in 2018, which define expectations for banks’ cybersecurity programs. Bindy complies with these principles:

Principle 1: Governance

Requirement: Board and senior management should establish a comprehensive cyber risk management framework.

Bindy Implementation:

Control	Implementation	Evidence
Security Policy	Comprehensive security policy documented	SECURITY.md
Threat Model	STRIDE threat analysis with 15 threats	Threat Model
Security Architecture	5 security domains documented	Security Architecture
Incident Response	7 playbooks for critical/high incidents	Incident Response
Compliance Roadmap	Tracking compliance implementation	Compliance Roadmap

Evidence:

Security documentation (4,010 lines across 7 documents)
Compliance tracking (H-1 through H-4 complete)
Quarterly security reviews

Status: ✅ COMPLIANT - Comprehensive cyber risk framework documented

Principle 2: Risk Identification and Assessment

Requirement: Banks should identify and assess cyber risks as part of operational risk management.

Bindy Implementation:

Risk Category	Identified Threats	Impact	Mitigation
Spoofing	Compromised Kubernetes API, stolen ServiceAccount tokens	HIGH	RBAC least privilege, short-lived tokens, network policies
Tampering	Malicious DNS zone changes, RNDC key compromise	CRITICAL	RBAC read-only, signed commits, audit logging
Repudiation	Untracked DNS changes, no audit trail	HIGH	Signed commits, audit logs (7-year retention), WORM storage
Information Disclosure	Secret leakage, DNS data exposure	CRITICAL	Kubernetes Secrets, RBAC, secret access audit trail
Denial of Service	DNS query flood, pod resource exhaustion	HIGH	Rate limiting (planned), pod resource limits, DDoS playbook
Elevation of Privilege	Controller pod compromise, RBAC bypass	CRITICAL	Non-root containers, read-only filesystem, minimal RBAC

Attack Surface Analysis:

Attack Vector	Exposure	Risk Level	Mitigation Status
Kubernetes API	Internal cluster network	HIGH	✅ RBAC, audit logs, network policies (planned)
DNS Port 53	Public internet	HIGH	✅ BIND9 hardening, DDoS playbook
RNDC Port 953	Internal cluster network	CRITICAL	✅ Secret rotation, access audit, incident playbook P4
Container Images	Public registries	MEDIUM	✅ Trivy scanning, Chainguard zero-CVE images
CRDs (Custom Resources)	Kubernetes API	MEDIUM	✅ Input validation, RBAC, audit logs
Git Repository	Public GitHub	LOW	✅ Signed commits, branch protection, code review

Evidence:

Threat Model - 15 STRIDE threats, 5 attack scenarios
Security Architecture - Attack surface analysis
Quarterly risk reviews (documented in compliance roadmap)

Status: ✅ COMPLIANT - Comprehensive risk identification and mitigation

Principle 3: Access Controls

Requirement: Banks should implement strong access controls, including least privilege.

Bindy Implementation:

Control	Implementation	Evidence
Least Privilege RBAC	Controller minimal RBAC (create/delete secrets for RNDC lifecycle, delete managed resources for cleanup)	`deploy/rbac/clusterrole.yaml`
Secret Access Monitoring	All secret access logged and alerted	Secret Access Audit Trail
Quarterly Access Reviews	Security team reviews access every quarter	`docs/compliance/access-reviews/`
2FA Enforcement	GitHub requires 2FA for all contributors	GitHub organization settings
Signed Commits	Cryptographic proof of code authorship	Git commit signatures

Access Control Matrix:

Role	Secrets	CRDs	Pods	ConfigMaps	Nodes
Controller	Create/Delete (RNDC keys)	Read/Write/Delete (managed)	Read	Read/Write/Delete	Read
BIND9 Pods	Read-only	None	None	Read	None
Developers	None	Read (kubectl)	Read (logs)	Read	None
Operators	Read (kubectl)	Read/Write (kubectl)	Read/Write	Read/Write	Read
Security Team	Read (audit logs)	Read	Read	Read	Read

Evidence:

RBAC policy: deploy/rbac/clusterrole.yaml
RBAC verification: ./deploy/rbac/verify-rbac.sh
Secret access logs: Elasticsearch query Q1 (quarterly)
Access review reports: docs/compliance/access-reviews/YYYY-QN.md

Status: ✅ COMPLIANT - Least privilege access, quarterly reviews, audit trail

Principle 4: Threat and Vulnerability Management

Requirement: Banks should implement a threat and vulnerability management process.

Bindy Implementation:

Activity	Frequency	Tool	Remediation SLA
Dependency Scanning	Daily (00:00 UTC)	`cargo audit`	CRITICAL (24h), HIGH (7d)
Container Image Scanning	Every PR + Daily	Trivy	CRITICAL (24h), HIGH (7d)
Code Security Review	Every PR	Manual + `cargo clippy`	Before merge
Penetration Testing	Annual	External firm	90 days
Threat Intelligence	Continuous	GitHub Security Advisories	As detected

Vulnerability Remediation SLAs:

Severity	CVSS Score	Response Time	Remediation SLA	Status
CRITICAL	9.0-10.0	< 15 minutes	24 hours	✅ Enforced
HIGH	7.0-8.9	< 1 hour	7 days	✅ Enforced
MEDIUM	4.0-6.9	< 4 hours	30 days	✅ Enforced
LOW	0.1-3.9	< 24 hours	90 days	✅ Enforced

Evidence:

Vulnerability Management Policy
GitHub Security tab - Vulnerability scan results
CHANGELOG.md - Remediation history
Monthly vulnerability remediation reports

Status: ✅ COMPLIANT - Daily scanning, defined SLAs, automated tracking

Principle 5: Cyber Resilience and Response

Requirement: Banks should have incident response and business continuity plans for cyber incidents.

Bindy Implementation:

Incident Response Playbooks (7 Total):

Playbook	Scenario	Response Time	Recovery SLA
P1: Critical Vulnerability	CVSS 9.0-10.0 vulnerability detected	< 15 minutes	Patch within 24 hours
P2: Compromised Controller	Controller pod shows anomalous behavior	< 15 minutes	Isolate within 1 hour
P3: DNS Service Outage	All BIND9 pods down, queries failing	< 15 minutes	Restore within 4 hours
P4: RNDC Key Compromise	RNDC key leaked or unauthorized access	< 15 minutes	Rotate keys within 1 hour
P5: Unauthorized DNS Changes	Unexpected zone modifications detected	< 1 hour	Revert within 4 hours
P6: DDoS Attack	DNS query flood, resource exhaustion	< 15 minutes	Mitigate within 1 hour
P7: Supply Chain Compromise	Malicious commit or compromised dependency	< 15 minutes	Rebuild within 24 hours

Business Continuity:

Capability	Implementation	RTO (Recovery Time Objective)	RPO (Recovery Point Objective)
High Availability	Multi-pod deployment (3+ replicas)	0 (no downtime)	0 (no data loss)
Zone Replication	Primary + Secondary DNS instances	< 5 minutes	< 1 minute (zone transfer)
Disaster Recovery	Multi-region deployment (planned)	< 1 hour	< 5 minutes
Data Backup	DNS zones in Git + etcd backups	< 4 hours	< 1 hour

Evidence:

Incident Response Playbooks
Semi-annual tabletop exercise reports
Incident logs (if any occurred): S3 bindy-audit-logs/incidents/

Status: ✅ COMPLIANT - 7 incident playbooks, business continuity plan

Principle 6: Dependency on Third Parties

Requirement: Banks should manage cyber risks associated with third-party service providers.

Bindy Third-Party Dependencies:

Dependency	Purpose	Risk Level	Mitigation
BIND9	DNS server software	MEDIUM	Chainguard zero-CVE images, Trivy scanning
Kubernetes	Orchestration platform	MEDIUM	Managed Kubernetes (EKS, GKE, AKS), regular updates
Rust Dependencies	Build-time libraries	LOW	Daily `cargo audit`, crates.io verified sources
Container Registries	Image distribution	LOW	GHCR (GitHub), signed images, SBOM
AWS S3	Audit log storage	LOW	Encryption at rest/transit, WORM, IAM access controls

Third-Party Risk Management:

Control	Implementation	Evidence
Dependency Vetting	Only use actively maintained dependencies (commits in last 6 months)	`Cargo.toml` review
Vulnerability Scanning	Daily `cargo audit`, Trivy container scanning	GitHub Security tab
Supply Chain Security	Signed commits, SBOM, reproducible builds	Build Reproducibility
Vendor Assessments	Annual review of critical vendors (BIND9, Kubernetes)	Vendor assessment reports

Evidence:

Cargo.toml, Cargo.lock - Pinned dependency versions
SBOM (Software Bill of Materials) - Release artifacts
Vendor assessment reports (annual)

Status: ✅ COMPLIANT - Third-party dependencies vetted, scanned, monitored

Requirement: Banks should participate in information sharing to enhance cyber resilience.

Bindy Information Sharing:

Activity	Frequency	Audience	Purpose
Security Advisories	As needed	Public (GitHub)	Coordinated disclosure of vulnerabilities
Threat Intelligence	Continuous	Security team	Subscribe to GitHub Security Advisories, CVE feeds
Incident Reports	After incidents	Internal + Regulators	Post-incident review, lessons learned
Compliance Reporting	Quarterly	Risk committee	Basel III operational risk reporting

Evidence:

GitHub Security Advisories (if any published)
Quarterly risk committee reports
Incident post-mortems (if any occurred)

Status: ✅ COMPLIANT - Active participation in threat intelligence sharing

Basel III Operational Risk Reporting

Quarterly Operational Risk Report Template:

[Bank Letterhead]

Basel III Operational Risk Report
Q4 2025 - Bindy DNS Infrastructure

Reporting Period: October 1 - December 31, 2025
Prepared by: [Security Team Lead]
Reviewed by: [Chief Risk Officer]

1. OPERATIONAL RISK EVENTS

   1.1 Cyber Incidents:
       - 0 critical incidents
       - 0 high-severity incidents
       - 2 medium-severity incidents (P3: DNS Service Outage)
         - Root cause: Kubernetes pod OOMKilled (memory limit too low)
         - Resolution: Increased memory limit from 512Mi to 1Gi
         - RTO achieved: 15 minutes (target: 4 hours)
       - 0 data breaches

   1.2 Service Availability:
       - Uptime: 99.98% (target: 99.9%)
       - DNS query success rate: 99.99%
       - Mean time to recovery (MTTR): 15 minutes

   1.3 Vulnerability Management:
       - Vulnerabilities detected: 12 (3 HIGH, 9 MEDIUM)
       - Remediation SLA compliance: 100%
       - Average time to remediate: 3.5 days (CRITICAL/HIGH)

2. COMPLIANCE STATUS

   2.1 Basel III Cyber Risk Principles:
       - ✅ Principle 1 (Governance): Security policies documented
       - ✅ Principle 2 (Risk Assessment): Threat model updated Q4 2025
       - ✅ Principle 3 (Access Controls): Quarterly access review completed
       - ✅ Principle 4 (Vulnerability Mgmt): SLAs met (100%)
       - ✅ Principle 5 (Resilience): Tabletop exercise conducted
       - ✅ Principle 6 (Third Parties): Vendor assessments completed
       - ✅ Principle 7 (Info Sharing): Threat intelligence active

   2.2 Audit Trail:
       - Audit logs retained: 7 years (WORM storage)
       - Log integrity verification: 100% pass rate
       - Secret access reviews: Quarterly (last: 2025-12-15)

3. RISK MITIGATION ACTIONS

   3.1 Completed (Phase 2):
       - ✅ H-1: Security Policy and Threat Model
       - ✅ H-2: Audit Log Retention Policy
       - ✅ H-3: Secret Access Audit Trail
       - ✅ H-4: Build Reproducibility Verification

   3.2 Planned (Phase 3):
       - L-1: Implement NetworkPolicies (Q1 2026)
       - M-3: Implement Rate Limiting (Q1 2026)

4. REGULATORY REPORTING

   4.1 PCI-DSS: Annual audit scheduled (Q1 2026)
   4.2 SOX 404: Quarterly ITGC attestation provided
   4.3 Basel III: This report (quarterly)

Approved by:
[Chief Risk Officer Signature]
Date: 2025-12-31

Basel III Audit Evidence

For Basel III operational risk reviews, provide:

Cyber Risk Framework:
- Security policies (SECURITY.md, docs/security/*.md)
- Threat model (STRIDE analysis)
- Security architecture documentation
Incident Response:
- Incident response playbooks (P1-P7)
- Incident logs (if any occurred)
- Tabletop exercise results (semi-annual)
Vulnerability Management:
- Vulnerability scan results (GitHub Security tab)
- Remediation tracking (GitHub issues, CHANGELOG.md)
- Monthly remediation reports
Access Controls:
- RBAC policy and verification output
- Quarterly access review reports
- Secret access audit logs
Audit Trail:
- S3 bucket configuration (WORM, retention)
- Log integrity verification results
- Sample audit logs (redacted)
Business Continuity:
- High availability architecture
- Disaster recovery procedures
- RTO/RPO metrics

SLSA Compliance

Supply-chain Levels for Software Artifacts

Overview

SLSA (Supply-chain Levels for Software Artifacts, pronounced “salsa”) is a security framework developed by Google to prevent supply chain attacks. It defines a series of incrementally adoptable security levels (0-3) that provide increasing supply chain security guarantees.

Bindy’s SLSA Status: ✅ Level 3 (highest level)

SLSA Requirements by Level

Requirement	Level 1	Level 2	Level 3	Bindy Status
Source - Version controlled	✅	✅	✅	✅ Git (GitHub)
Source - Verified history	❌	✅	✅	✅ Signed commits
Source - Retained indefinitely	❌	❌	✅	✅ GitHub (permanent)
Source - Two-person reviewed	❌	❌	✅	✅ 2+ PR approvals
Build - Scripted build	✅	✅	✅	✅ Cargo + Docker
Build - Build service	❌	✅	✅	✅ GitHub Actions
Build - Build as code	❌	✅	✅	✅ Workflows in Git
Build - Ephemeral environment	❌	✅	✅	✅ Fresh runners
Build - Isolated	❌	✅	✅	✅ No secrets accessible
Build - Hermetic	❌	❌	✅	⚠️ Partial (cargo fetch)
Build - Reproducible	❌	❌	✅	✅ Bit-for-bit
Provenance - Available	❌	✅	✅	✅ SBOM + signatures
Provenance - Authenticated	❌	✅	✅	✅ Signed tags
Provenance - Service generated	❌	✅	✅	✅ GitHub Actions
Provenance - Non-falsifiable	❌	❌	✅	✅ Cryptographic signatures
Provenance - Dependencies complete	❌	❌	✅	✅ Cargo.lock + SBOM

SLSA Level 3 Detailed Compliance

Source Requirements

✅ Requirement: Version controlled with verified history

Control	Implementation	Evidence
Git Version Control	All source code in GitHub	GitHub repository
Signed Commits	All commits GPG/SSH signed	`git log --show-signature`
Verified History	Branch protection prevents history rewriting	GitHub branch protection
Two-Person Review	2+ approvals required for all PRs	PR approval logs
Permanent Retention	Git history never deleted	GitHub repository settings

Evidence:

# Show all commits are signed (last 90 days)
git log --show-signature --since="90 days ago" --oneline

# Show branch protection (prevents force push, history rewriting)
gh api repos/firestoned/bindy/branches/main/protection | jq

Build Requirements

✅ Requirement: Build process is fully scripted and reproducible

Control	Implementation	Evidence
Scripted Build	Cargo (Rust), Docker (containers)	`Cargo.toml`, `Dockerfile`
Build as Code	GitHub Actions workflows in version control	`.github/workflows/*.yaml`
Ephemeral Environment	Fresh GitHub-hosted runners for each build	GitHub Actions logs
Isolated	Build cannot access secrets or network (after deps fetched)	GitHub Actions sandboxing
Hermetic	⚠️ Partial - `cargo fetch` uses network	Working toward full hermetic
Reproducible	Two builds from same commit = identical binary	Build Reproducibility

Build Reproducibility Verification:

# Automated verification (daily CI/CD)
# Builds binary twice, compares SHA-256 hashes
.github/workflows/reproducibility-check.yaml

# Manual verification (external auditors)
scripts/verify-build.sh v0.1.0

Sources of Non-Determinism (Mitigated):

Timestamps → Use vergen for deterministic Git commit timestamps
Filesystem order → Sort files before processing
HashMap iteration → Use BTreeMap for deterministic order
Parallelism → Sort output after parallel processing
Base image updates → Pin base image digests in Dockerfile

Evidence:

Build Reproducibility Documentation
CI/CD workflow: .github/workflows/reproducibility-check.yaml
Verification script: scripts/verify-build.sh

Provenance Requirements

✅ Requirement: Build provenance is available, authenticated, and non-falsifiable

Artifact	Provenance Type	Signature	Availability
Rust Binary	SHA-256 checksum	GPG-signed Git tag	GitHub Releases
Container Image	Image digest	SBOM + attestation	GHCR (GitHub Container Registry)
SBOM	CycloneDX format	Included in release	GitHub Releases (*.sbom.json)
Source Code	Git commit	GPG/SSH signature	GitHub repository

SBOM Generation:

# Generate SBOM (Software Bill of Materials)
cargo install cargo-cyclonedx
cargo cyclonedx --format json --output bindy.sbom.json

# SBOM includes all dependencies with exact versions
cat bindy.sbom.json | jq '.components[] | {name, version}'

Evidence:

GitHub Releases: https://github.com/firestoned/bindy/releases
SBOM files: bindy-*.sbom.json in release artifacts
Signed Git tags: git tag --verify v0.1.0
Container image signatures: docker trust inspect ghcr.io/firestoned/bindy:v0.1.0

SLSA Build Levels Comparison

Aspect	Level 1	Level 2	Level 3	Bindy
Protection against	Accidental errors	Compromised build service	Compromised source + build	✅ All
Source integrity	Manual commits	Signed commits	Signed commits + 2-person review	✅ Complete
Build integrity	Manual build	Automated build	Reproducible build	✅ Complete
Provenance	None	Service-generated	Cryptographic provenance	✅ Complete
Verifiability	Trust on first use	Verifiable by service	Verifiable by anyone	✅ Complete

SLSA Compliance Roadmap

Requirement	Status	Evidence
Level 1	✅ Complete	Git, Cargo build
Level 2	✅ Complete	GitHub Actions, signed commits, SBOM
Level 3 (Source)	✅ Complete	Signed commits, 2+ PR approvals, permanent Git history
Level 3 (Build)	✅ Complete	Reproducible builds, verification script
Level 3 (Provenance)	✅ Complete	SBOM, signed tags, container attestation
Level 3 (Hermetic)	⚠️ Partial	`cargo fetch` uses network (working toward offline builds)

Verification for End Users

How to verify Bindy releases:

# 1. Verify Git tag signature
git verify-tag v0.1.0

# 2. Rebuild from source
git checkout v0.1.0
cargo build --release --locked

# 3. Compare binary hash with released artifact
sha256sum target/release/bindy
curl -sL https://github.com/firestoned/bindy/releases/download/v0.1.0/bindy-linux-amd64.sha256

# 4. Verify SBOM (Software Bill of Materials)
curl -sL https://github.com/firestoned/bindy/releases/download/v0.1.0/bindy.sbom.json | jq .

# 5. Verify container image signature (if using containers)
docker trust inspect ghcr.io/firestoned/bindy:v0.1.0

Expected Result: ✅ All verifications pass, hashes match, provenance verified

SLSA Threat Mitigation

Threat	SLSA Level	Bindy Mitigation
A: Build system compromise	Level 2+	✅ GitHub-hosted runners (ephemeral, isolated)
B: Source code compromise	Level 3	✅ Signed commits, 2+ PR approvals, branch protection
C: Dependency compromise	Level 3	✅ Cargo.lock pinned, daily `cargo audit`, SBOM
D: Upload of malicious binaries	Level 2+	✅ GitHub Actions uploads, not manual
E: Compromised build config	Level 2+	✅ Workflows in Git, 2+ PR approvals
F: Use of compromised package	Level 3	✅ Reproducible builds, users can verify

NIST Cybersecurity Framework

NIST CSF: Framework for Improving Critical Infrastructure Cybersecurity

Overview

The NIST Cybersecurity Framework (CSF) is a voluntary framework developed by the National Institute of Standards and Technology (NIST) to help organizations manage and reduce cybersecurity risk. The framework is organized into five functions: Identify, Protect, Detect, Respond, and Recover.

Bindy’s NIST CSF Status: ⚠️ Partial Compliance (60% complete)

✅ Identify: 90% complete
✅ Protect: 80% complete
⚠️ Detect: 60% complete (needs network monitoring)
✅ Respond: 90% complete
⚠️ Recover: 50% complete (needs disaster recovery testing)

NIST CSF Core Functions

1. Identify (ID)

Objective: Develop organizational understanding to manage cybersecurity risk to systems, people, assets, data, and capabilities.

Category	Subcategory	Bindy Implementation	Status
ID.AM (Asset Management)	Asset inventory	Kubernetes resources tracked in Git	✅ Complete
ID.BE (Business Environment)	Dependencies documented	Third-party dependencies in SBOM	✅ Complete
ID.GV (Governance)	Security policies established	SECURITY.md, threat model, incident response	✅ Complete
ID.RA (Risk Assessment)	Threat modeling conducted	STRIDE analysis (15 threats, 5 scenarios)	✅ Complete
ID.RM (Risk Management Strategy)	Risk mitigation roadmap	Compliance roadmap (H-1 to M-4)	✅ Complete
ID.SC (Supply Chain Risk Management)	Third-party dependencies assessed	Daily `cargo audit`, Trivy scanning, SBOM	✅ Complete

Evidence:

Threat Model - STRIDE threat analysis
Security Architecture - Asset inventory, trust boundaries
Compliance Roadmap - Risk mitigation tracking
Cargo.toml, Cargo.lock, SBOM - Dependency inventory

Identify Function: ✅ 90% Complete (Asset management, risk assessment done; needs supply chain deep dive)

2. Protect (PR)

Objective: Develop and implement appropriate safeguards to ensure delivery of critical services.

Category	Subcategory	Bindy Implementation	Status
PR.AC (Identity Management)	Least privilege access	RBAC (minimal delete permissions for lifecycle management), 2FA	✅ Complete
PR.AC (Physical access control)	N/A (cloud-hosted)	Kubernetes cluster security	N/A
PR.AT (Awareness and Training)	Security training	CONTRIBUTING.md (secure coding guidelines)	✅ Complete
PR.DS (Data Security)	Data at rest encryption	Kubernetes Secrets (encrypted etcd), S3 SSE	✅ Complete
PR.DS (Data in transit encryption)	TLS for all API calls	Kubernetes API (TLS 1.3), S3 (TLS 1.3)	✅ Complete
PR.IP (Information Protection)	Secret management	Kubernetes Secrets, secret access audit trail	✅ Complete
PR.MA (Maintenance)	Vulnerability patching	Daily `cargo audit`, SLAs (CRITICAL 24h, HIGH 7d)	✅ Complete
PR.PT (Protective Technology)	Security controls	Non-root containers, read-only filesystem, RBAC	✅ Complete

Evidence:

RBAC policy: deploy/rbac/clusterrole.yaml
Secret Access Audit Trail
Vulnerability Management Policy
Kubernetes Security Context: deploy/controller/deployment.yaml (non-root, read-only FS)

Protect Function: ✅ 80% Complete (Strong access controls, data protection; needs NetworkPolicies L-1)

3. Detect (DE)

Objective: Develop and implement appropriate activities to identify the occurrence of a cybersecurity event.

Category	Subcategory	Bindy Implementation	Status
DE.AE (Anomalies and Events)	Anomaly detection	Prometheus alerts (unauthorized access, excessive access)	✅ Complete
DE.CM (Security Continuous Monitoring)	Vulnerability scanning	Daily `cargo audit`, Trivy (containers)	✅ Complete
DE.CM (Network monitoring)	Network traffic analysis	⚠️ Planned (L-1: NetworkPolicies + monitoring)	⚠️ Planned
DE.DP (Detection Processes)	Incident detection procedures	7 incident playbooks (P1-P7)	✅ Complete

Implemented Detection Controls:

Alert	Trigger	Severity	Response Time
UnauthorizedSecretAccess	Non-controller accessed secret	CRITICAL	< 1 minute
ExcessiveSecretAccess	> 10 secret accesses/sec	WARNING	< 5 minutes
FailedSecretAccessAttempts	> 1 failed access/sec	WARNING	< 5 minutes
CriticalVulnerability	CVSS 9.0-10.0 detected	CRITICAL	< 15 minutes
PodCrashLoop	Pod restarting repeatedly	HIGH	< 5 minutes

Evidence:

Prometheus alerting rules: deploy/monitoring/alerts/bindy-secret-access.yaml
Secret Access Audit Trail - Alert definitions
GitHub Actions workflows: Daily security scans

Detect Function: ⚠️ 60% Complete (Anomaly detection done; needs network monitoring L-1)

4. Respond (RE)

Objective: Develop and implement appropriate activities to take action regarding a detected cybersecurity incident.

Category	Subcategory	Bindy Implementation	Status
RE.RP (Response Planning)	Incident response plan	7 incident playbooks (P1-P7) following NIST lifecycle	✅ Complete
RE.CO (Communications)	Incident communication plan	Slack war rooms, status page, regulatory reporting	✅ Complete
RE.AN (Analysis)	Incident analysis procedures	Root cause analysis, forensic preservation	✅ Complete
RE.MI (Mitigation)	Incident containment procedures	Isolation, credential rotation, rollback	✅ Complete
RE.IM (Improvements)	Post-incident improvements	Post-mortem template, action items tracking	✅ Complete

Incident Response Playbooks (NIST Lifecycle):

Playbook	NIST Phases Covered	Response Time	Evidence
P1: Critical Vulnerability	Preparation, Detection, Containment, Eradication, Recovery	< 15 min	P1 Playbook
P2: Compromised Controller	All phases	< 15 min	P2 Playbook
P3: DNS Service Outage	Detection, Containment, Recovery	< 15 min	P3 Playbook
P4: RNDC Key Compromise	All phases	< 15 min	P4 Playbook
P5: Unauthorized DNS Changes	All phases	< 1 hour	P5 Playbook
P6: DDoS Attack	Detection, Containment, Recovery	< 15 min	P6 Playbook
P7: Supply Chain Compromise	All phases	< 15 min	P7 Playbook

NIST Incident Response Lifecycle:

Preparation ✅ - Playbooks documented, tools configured, team trained
Detection & Analysis ✅ - Prometheus alerts, audit log analysis
Containment, Eradication & Recovery ✅ - Isolation procedures, patching, service restoration
Post-Incident Activity ✅ - Post-mortem template, lessons learned, action items

Evidence:

Incident Response Playbooks
Post-incident review template (in playbooks)
Semi-annual tabletop exercise reports

Respond Function: ✅ 90% Complete (Comprehensive playbooks; needs annual tabletop exercise)

5. Recover (RE)

Objective: Develop and implement appropriate activities to maintain plans for resilience and to restore capabilities or services impaired due to a cybersecurity incident.

Category	Subcategory	Bindy Implementation	Status
RC.RP (Recovery Planning)	Disaster recovery plan	Multi-region deployment (planned), zone backups	⚠️ Planned
RC.IM (Improvements)	Recovery plan testing	⚠️ Annual DR drill needed	⚠️ Planned
RC.CO (Communications)	Recovery communication plan	Incident playbooks include recovery steps	✅ Complete

Current Recovery Capabilities:

Capability	RTO (Recovery Time Objective)	RPO (Recovery Point Objective)	Status
Pod Failure	0 (automatic restart)	0 (no data loss)	✅ Complete
Controller Failure	< 5 minutes (new pod scheduled)	0 (no data loss)	✅ Complete
BIND9 Pod Failure	< 5 minutes (new pod scheduled)	0 (zone data in etcd)	✅ Complete
Zone Data Loss	< 1 hour (restore from Git)	< 5 minutes (last reconciliation)	✅ Complete
Cluster Failure	⚠️ < 4 hours (manual failover)	< 1 hour (last etcd backup)	⚠️ Needs testing
Region Failure	⚠️ < 24 hours (multi-region planned)	< 1 hour	⚠️ Planned

Planned Improvements:

L-2: Implement multi-region deployment (RTO < 1 hour for region failure)
Annual DR Drill: Test disaster recovery procedures (cluster failure, region failure)

Evidence:

High availability architecture: 3+ pod replicas, multi-zone
Zone backups: Git repository (all DNSZone CRDs)
Incident playbooks: P3 (DNS Service Outage) includes recovery steps

Recover Function: ⚠️ 50% Complete (Pod/controller recovery done; needs multi-region and DR testing)

NIST CSF Implementation Tiers

NIST CSF defines 4 implementation tiers (Partial, Risk Informed, Repeatable, Adaptive). Bindy is at Tier 3: Repeatable.

Tier	Description	Bindy Status
Tier 1: Partial	Ad hoc, reactive risk management	❌
Tier 2: Risk Informed	Risk management practices approved but not policy	❌
Tier 3: Repeatable	Formally approved policies, regularly updated	✅ Current
Tier 4: Adaptive	Continuous improvement based on lessons learned	⚠️ Target

Tier 3 Evidence:

Formal security policies documented and published
Incident response playbooks (repeatable processes)
Quarterly compliance reviews
Annual policy reviews (Next Review: 2026-03-18)

Tier 4 Roadmap:

Implement continuous security metrics dashboard
Quarterly threat intelligence updates to policies
Annual penetration testing with policy updates
Automated compliance reporting

NIST CSF Compliance Summary

Function	Completion	Priority Gaps	Target Date
Identify	90%	Supply chain deep dive	Q1 2026
Protect	80%	NetworkPolicies (L-1)	Q1 2026
Detect	60%	Network monitoring (L-1)	Q1 2026
Respond	90%	Annual tabletop exercise	Q2 2026
Recover	50%	Multi-region deployment (L-2), DR testing	Q2 2026

Overall NIST CSF Maturity: ⚠️ 60% (Tier 3: Repeatable)

Target: 90% (Tier 4: Adaptive) by Q2 2026

NIST CSF Audit Evidence

For NIST CSF assessments, provide:

Identify Function:
- Asset inventory (Kubernetes resources in Git)
- Threat model (STRIDE analysis)
- Compliance roadmap (risk mitigation tracking)
- SBOM (dependency inventory)
Protect Function:
- RBAC policy and verification output
- Kubernetes Security Context (non-root, read-only FS)
- Vulnerability management policy (SLAs, remediation tracking)
- Secret access audit trail
Detect Function:
- Prometheus alerting rules
- Vulnerability scan results (daily cargo audit, Trivy)
- Incident detection playbooks
Respond Function:
- 7 incident response playbooks (P1-P7)
- Post-incident review template
- Tabletop exercise results (semi-annual)
Recover Function:
- High availability architecture (3+ replicas, multi-zone)
- Zone backup procedures (Git repository)
- Disaster recovery plan (in progress)

API Reference

This document describes the Custom Resource Definitions (CRDs) provided by Bindy.

Note: This file is AUTO-GENERATED from src/crd.rs DO NOT EDIT MANUALLY - Run cargo run --bin crddoc to regenerate

Zone Management
- DNSZone
DNS Records
- ARecord
- AAAARecord
- CNAMERecord
- MXRecord
- NSRecord
- TXTRecord
- SRVRecord
- CAARecord
Infrastructure
- Bind9Cluster
- Bind9Instance

Zone Management

DNSZone

API Version: bindy.firestoned.io/v1alpha1

DNSZone represents an authoritative DNS zone managed by BIND9. Each DNSZone defines a zone (e.g., example.com) with SOA record parameters. Can reference either a namespace-scoped Bind9Cluster or cluster-scoped Bind9GlobalCluster.

Spec Fields

Field	Type	Required	Description
`clusterRef`	string	No	Reference to a namespace-scoped `Bind9Cluster` in the same namespace. Must match the name of a `Bind9Cluster` resource in the same namespace. The zone will be added to all instances in this cluster. Either `clusterRef` or `globalClusterRef` must be specified (not both).
`globalClusterRef`	string	No	Reference to a cluster-scoped `Bind9GlobalCluster`. Must match the name of a `Bind9GlobalCluster` resource (cluster-scoped). The zone will be added to all instances in this global cluster. Either `clusterRef` or `globalClusterRef` must be specified (not both).
`nameServerIps`	object	No	Map of nameserver hostnames to IP addresses for glue records. Glue records provide IP addresses for nameservers within the zone’s own domain. This is necessary when delegating subdomains where the nameserver is within the delegated zone itself. Example: When delegating `sub.example.com` with nameserver `ns1.sub.example.com`, you must provide the IP address of `ns1.sub.example.com` as a glue record. Format: `{“ns1.example.com.”: “192.0.2.1”, “ns2.example.com.”: “192.0.2.2”}` Note: Nameserver hostnames should end with a dot (.) for FQDN.
`soaRecord`	object	Yes	SOA (Start of Authority) record - defines zone authority and refresh parameters. The SOA record is required for all authoritative zones and contains timing information for zone transfers and caching.
`ttl`	integer	No	Default TTL (Time To Live) for records in this zone, in seconds. If not specified, individual records must specify their own TTL. Typical values: 300-86400 (5 minutes to 1 day).
`zoneName`	string	Yes	DNS zone name (e.g., “example.com”). Must be a valid DNS zone name. Can be a domain or subdomain. Examples: “example.com”, “internal.example.com”, “10.in-addr.arpa”

Status Fields

Field	Type	Required	Description
`conditions`	array	No
`observedGeneration`	integer	No
`recordCount`	integer	No
`secondaryIps`	array	No	IP addresses of secondary servers configured for zone transfers. Used to detect when secondary IPs change and zones need updating.

DNS Records

ARecord

API Version: bindy.firestoned.io/v1alpha1

ARecord maps a DNS hostname to an IPv4 address. Multiple A records for the same name enable round-robin DNS load balancing.

Spec Fields

Field	Type	Required	Description
`ipv4Address`	string	Yes	IPv4 address in dotted-decimal notation. Must be a valid IPv4 address (e.g., “192.0.2.1”).
`name`	string	Yes	Record name within the zone. Use “@” for the zone apex. Examples: “www”, “mail”, “ftp”, “@” The full DNS name will be: {name}.{zone}
`ttl`	integer	No	Time To Live in seconds. Overrides zone default TTL if specified. Typical values: 60-86400 (1 minute to 1 day).
`zoneRef`	string	Yes	Reference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name. This is more efficient than searching by zone name. Example: If the `DNSZone` is named “example-com”, use `zoneRef: example-com`

Status Fields

Field	Type	Required	Description
`conditions`	array	No
`observedGeneration`	integer	No

AAAARecord

API Version: bindy.firestoned.io/v1alpha1

AAAARecord maps a DNS hostname to an IPv6 address. This is the IPv6 equivalent of an A record.

Spec Fields

Field	Type	Required	Description
`ipv6Address`	string	Yes	IPv6 address in standard notation. Examples: `2001:db8::1`, `fe80::1`, `::1`
`name`	string	Yes	Record name within the zone.
`ttl`	integer	No	Time To Live in seconds.
`zoneRef`	string	Yes	Reference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

Field	Type	Required	Description
`conditions`	array	No
`observedGeneration`	integer	No

CNAMERecord

API Version: bindy.firestoned.io/v1alpha1

CNAMERecord creates a DNS alias from one hostname to another. A CNAME cannot coexist with other record types for the same name.

Spec Fields

Field	Type	Required	Description
`name`	string	Yes	Record name within the zone. Note: CNAME records cannot be created at the zone apex (@).
`target`	string	Yes	Target hostname (canonical name). Should be a fully qualified domain name ending with a dot. Example: “example.com.” or “www.example.com.”
`ttl`	integer	No	Time To Live in seconds.
`zoneRef`	string	Yes	Reference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

Field	Type	Required	Description
`conditions`	array	No
`observedGeneration`	integer	No

MXRecord

API Version: bindy.firestoned.io/v1alpha1

MXRecord specifies mail exchange servers for a domain. Lower priority values indicate higher preference for mail delivery.

Spec Fields

Field	Type	Required	Description
`mailServer`	string	Yes	Fully qualified domain name of the mail server. Must end with a dot. Example: “mail.example.com.”
`name`	string	Yes	Record name within the zone. Use “@” for the zone apex.
`priority`	integer	Yes	Priority (preference) of this mail server. Lower values = higher priority. Common values: 0-100. Multiple MX records can exist with different priorities.
`ttl`	integer	No	Time To Live in seconds.
`zoneRef`	string	Yes	Reference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

Field	Type	Required	Description
`conditions`	array	No
`observedGeneration`	integer	No

NSRecord

API Version: bindy.firestoned.io/v1alpha1

NSRecord delegates a subdomain to authoritative nameservers. Used for subdomain delegation to different DNS providers or servers.

Spec Fields

Field	Type	Required	Description
`name`	string	Yes	Subdomain to delegate. For zone apex, use “@”.
`nameserver`	string	Yes	Fully qualified domain name of the nameserver. Must end with a dot. Example: “ns1.example.com.”
`ttl`	integer	No	Time To Live in seconds.
`zoneRef`	string	Yes	Reference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

Field	Type	Required	Description
`conditions`	array	No
`observedGeneration`	integer	No

TXTRecord

API Version: bindy.firestoned.io/v1alpha1

TXTRecord stores arbitrary text data in DNS. Commonly used for SPF, DKIM, DMARC policies, and domain verification.

Spec Fields

Field	Type	Required	Description
`name`	string	Yes	Record name within the zone.
`text`	array	Yes	Array of text strings. Each string can be up to 255 characters. Multiple strings are concatenated by DNS resolvers. For long text, split into multiple strings.
`ttl`	integer	No	Time To Live in seconds.
`zoneRef`	string	Yes	Reference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

Field	Type	Required	Description
`conditions`	array	No
`observedGeneration`	integer	No

SRVRecord

API Version: bindy.firestoned.io/v1alpha1

SRVRecord specifies the hostname and port of servers for specific services. The record name follows the format _service._proto (e.g., _ldap._tcp).

Spec Fields

Field	Type	Required	Description
`name`	string	Yes	Service and protocol in the format: _service._proto Example: “_ldap._tcp”, “_sip._udp”, “_http._tcp”
`port`	integer	Yes	TCP or UDP port where the service is available.
`priority`	integer	Yes	Priority of the target host. Lower values = higher priority.
`target`	string	Yes	Fully qualified domain name of the target host. Must end with a dot. Use “.” for “service not available”.
`ttl`	integer	No	Time To Live in seconds.
`weight`	integer	Yes	Relative weight for records with the same priority. Higher values = higher probability of selection.
`zoneRef`	string	Yes	Reference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

Field	Type	Required	Description
`conditions`	array	No
`observedGeneration`	integer	No

CAARecord

API Version: bindy.firestoned.io/v1alpha1

CAARecord specifies which certificate authorities are authorized to issue certificates for a domain. Enhances domain security and certificate issuance control.

Spec Fields

Field	Type	Required	Description
`flags`	integer	Yes	Flags byte. Use 0 for non-critical, 128 for critical. Critical flag (128) means CAs must understand the tag.
`name`	string	Yes	Record name within the zone. Use “@” for the zone apex.
`tag`	string	Yes	Property tag. Common values: “issue”, “issuewild”, “iodef”. - “issue”: Authorize CA to issue certificates - “issuewild”: Authorize CA to issue wildcard certificates - “iodef”: URL/email for violation reports
`ttl`	integer	No	Time To Live in seconds.
`value`	string	Yes	Property value. Format depends on the tag. For “issue”/“issuewild”: CA domain (e.g., “letsencrypt.org”) For “iodef”: mailto: or https: URL
`zoneRef`	string	Yes	Reference to a `DNSZone` resource by metadata.name. Directly references a `DNSZone` resource in the same namespace by its Kubernetes resource name.

Status Fields

Field	Type	Required	Description
`conditions`	array	No
`observedGeneration`	integer	No

Infrastructure

Bind9Cluster

API Version: bindy.firestoned.io/v1alpha1

Bind9Cluster defines a namespace-scoped logical grouping of BIND9 DNS server instances. Use this for tenant-managed DNS infrastructure isolated to a specific namespace. For platform-managed cluster-wide DNS, use Bind9GlobalCluster instead.

Spec Fields

Field	Type	Required	Description
`acls`	object	No	ACLs that can be referenced by instances
`configMapRefs`	object	No	`ConfigMap` references for BIND9 configuration files
`global`	object	No	Global configuration shared by all instances in the cluster This configuration applies to all instances (both primary and secondary) unless overridden at the instance level or by role-specific configuration.
`image`	object	No	Container image configuration
`primary`	object	No	Primary instance configuration Configuration specific to primary (authoritative) DNS instances, including replica count and service specifications.
`rndcSecretRefs`	array	No	References to Kubernetes Secrets containing RNDC/TSIG keys for authenticated zone transfers. Each secret should contain the key name, algorithm, and base64-encoded secret value. These secrets are used for secure communication with BIND9 instances via RNDC and for authenticated zone transfers (AXFR/IXFR) between primary and secondary servers.
`secondary`	object	No	Secondary instance configuration Configuration specific to secondary (replica) DNS instances, including replica count and service specifications.
`version`	string	No	Shared BIND9 version for the cluster
`volumeMounts`	array	No	Volume mounts that specify where volumes should be mounted in containers These mounts are inherited by all instances unless overridden.
`volumes`	array	No	Volumes that can be mounted by instances in this cluster These volumes are inherited by all instances unless overridden. Common use cases include `PersistentVolumeClaims` for zone data storage.

Status Fields

Field	Type	Required	Description
`conditions`	array	No	Status conditions for this cluster
`instanceCount`	integer	No	Number of instances in this cluster
`instances`	array	No	Names of `Bind9Instance` resources created for this cluster
`observedGeneration`	integer	No	Observed generation for optimistic concurrency
`readyInstances`	integer	No	Number of ready instances

Bind9Instance

API Version: bindy.firestoned.io/v1alpha1

Bind9Instance represents a BIND9 DNS server deployment in Kubernetes. Each instance creates a Deployment, Service, ConfigMap, and Secret for managing a BIND9 server with RNDC protocol communication.

Spec Fields

Field	Type	Required	Description
`bindcarConfig`	object	No	Bindcar RNDC API sidecar container configuration. The API container provides an HTTP interface for managing zones via rndc. If not specified, uses default configuration.
`clusterRef`	string	Yes	Reference to the cluster this instance belongs to. Can reference either: - A namespace-scoped `Bind9Cluster` (must be in the same namespace as this instance) - A cluster-scoped `Bind9GlobalCluster` (cluster-wide, accessible from any namespace) The cluster provides shared configuration and defines the logical grouping. The controller will automatically detect whether this references a namespace-scoped or cluster-scoped cluster resource.
`config`	object	No	Instance-specific BIND9 configuration overrides. Overrides cluster-level configuration for this instance only.
`configMapRefs`	object	No	`ConfigMap` references override. Inherits from cluster if not specified.
`image`	object	No	Container image configuration override. Inherits from cluster if not specified.
`primaryServers`	array	No	Primary server addresses for zone transfers (required for secondary instances). List of IP addresses or hostnames of primary servers to transfer zones from. Example: `[“10.0.1.10”, “primary.example.com”]`
`replicas`	integer	No	Number of pod replicas for high availability. Defaults to 1 if not specified. For production, use 2+ replicas.
`rndcSecretRef`	object	No	Reference to an existing Kubernetes Secret containing RNDC key. If specified, uses this existing Secret instead of auto-generating one. The Secret must contain the keys specified in the reference (defaults: “key-name”, “algorithm”, “secret”, “rndc.key”). This allows sharing RNDC keys across instances or using externally managed secrets. If not specified, a Secret will be auto-generated for this instance.
`role`	string	Yes	Role of this instance (primary or secondary). Primary instances are authoritative for zones. Secondary instances replicate zones from primaries via AXFR/IXFR.
`storage`	object	No	Storage configuration for zone files. Specifies how zone files should be stored. Defaults to emptyDir (ephemeral storage). For persistent storage, use persistentVolumeClaim.
`version`	string	No	BIND9 version override. Inherits from cluster if not specified. Example: “9.18”, “9.16”
`volumeMounts`	array	No	Volume mounts override for this instance. Inherits from cluster if not specified. These mounts override cluster-level volume mounts.
`volumes`	array	No	Volumes override for this instance. Inherits from cluster if not specified. These volumes override cluster-level volumes. Common use cases include instance-specific `PersistentVolumeClaims` for zone data storage.

Status Fields

Field	Type	Required	Description
`conditions`	array	No
`observedGeneration`	integer	No
`readyReplicas`	integer	No
`replicas`	integer	No
`serviceAddress`	string	No	IP or hostname of this instance’s service

Bind9Cluster Specification

Complete specification for the Bind9Cluster Custom Resource Definition.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: string
  namespace: string
spec:
  version: string              # Optional, BIND9 version
  image:                       # Optional, container image config
    image: string
    imagePullPolicy: string
    imagePullSecrets: [string]
  configMapRefs:               # Optional, custom config files
    namedConf: string
    namedConfOptions: string
  global:                      # Optional, global BIND9 config for all instances
    recursion: boolean
    allowQuery: [string]       # ⚠️ NO DEFAULT - must be explicitly set
    allowTransfer: [string]    # ⚠️ NO DEFAULT - must be explicitly set
    dnssec:
      enabled: boolean
      validation: boolean
    forwarders: [string]
    listenOn: [string]
    listenOnV6: [string]
  rndcSecretRefs: [RndcSecretRef]  # Optional, refs to Secrets with RNDC/TSIG keys
  acls:                        # Optional, named ACLs
    name: [string]
  volumes: [Volume]            # Optional, Kubernetes volumes
  volumeMounts: [VolumeMount]  # Optional, volume mount specifications

Overview

Bind9Cluster defines a logical grouping of BIND9 DNS server instances with shared configuration. It provides centralized management of BIND9 version, container images, and common settings across multiple instances.

Key Features:

Shared version and image configuration
Centralized BIND9 configuration
TSIG key management for secure zone transfers
Named ACLs for access control
Cluster-wide status reporting

Spec Fields

version

Type: string Required: No Default: “9.18”

BIND9 version to deploy across all instances in the cluster unless overridden at the instance level.

spec:
  version: "9.18"

Supported Versions:

“9.16” - Older stable
“9.18” - Current stable (recommended)
“9.19” - Development

image

Type: object Required: No

Container image configuration shared by all instances in the cluster.

spec:
  image:
    image: "internetsystemsconsortium/bind9:9.18"
    imagePullPolicy: "IfNotPresent"
    imagePullSecrets:
      - my-registry-secret

How It Works:

Instances inherit image configuration from the cluster
Instances can override with their own image config
Simplifies managing container images across multiple instances

image.image

Type: string Required: No Default: “internetsystemsconsortium/bind9:9.18”

Full container image reference including registry, repository, and tag.

spec:
  image:
    image: "my-registry.example.com/bind9:custom"

image.imagePullPolicy

Type: string Required: No Default: “IfNotPresent”

Kubernetes image pull policy.

Valid Values:

"Always" - Always pull the image
"IfNotPresent" - Pull only if not present locally (recommended)
"Never" - Never pull, use local image only

image.imagePullSecrets

Type: array of strings Required: No Default: []

List of Kubernetes secret names for authenticating with private container registries.

spec:
  image:
    imagePullSecrets:
      - docker-registry-secret

configMapRefs

Type: object Required: No

References to custom ConfigMaps containing BIND9 configuration files shared across the cluster.

spec:
  configMapRefs:
    namedConf: "cluster-named-conf"
    namedConfOptions: "cluster-options"

How It Works:

Cluster-level ConfigMaps apply to all instances
Instances can override with their own ConfigMap references
Useful for sharing common configuration

configMapRefs.namedConf

Type: string Required: No

Name of ConfigMap containing the main named.conf file.

configMapRefs.namedConfOptions

Type: string Required: No

Name of ConfigMap containing the named.conf.options file.

global

Type: object Required: No

Global BIND9 configuration shared across all instances in the cluster.

⚠️ Warning: There are NO defaults for allowQuery and allowTransfer. If not specified, BIND9’s default behavior applies (no queries or transfers allowed). Always explicitly configure these fields for your security requirements.

spec:
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.2.0/24"
    dnssec:
      enabled: true
      validation: auto

How It Works:

All instances inherit global configuration
Instances can override specific settings
Role-specific configuration (primary/secondary) can override global settings
Changes propagate to all instances using global config

global.recursion

Type: boolean Required: No Default: false

Enable recursive DNS queries.

global.allowQuery

Type: array of strings Required: No Default: None (BIND9 default: no queries allowed)

IP addresses or CIDR blocks allowed to query servers in this cluster.

⚠️ Warning: No default value is provided. You must explicitly configure this field or queries will be denied.

global.allowTransfer

Type: array of strings Required: No Default: None (BIND9 default: no transfers allowed)

IP addresses or CIDR blocks allowed to perform zone transfers.

⚠️ Warning: No default value is provided. You must explicitly configure this field or zone transfers will be denied.

global.dnssec

Type: object Required: No

DNSSEC configuration for the cluster.

global.dnssec.enabled

Type: boolean Required: No Default: false

Enable DNSSEC signing for zones.

global.dnssec.validation

Type: boolean Required: No Default: false

Enable DNSSEC validation for recursive queries.

global.forwarders

Type: array of strings Required: No Default: []

DNS servers to forward queries to (for recursive mode).

spec:
  global:
    recursion: true
    forwarders:
      - "8.8.8.8"
      - "1.1.1.1"

global.listenOn

Type: array of strings Required: No Default: [“any”]

IPv4 addresses to listen on.

global.listenOnV6

Type: array of strings Required: No Default: [“any”]

IPv6 addresses to listen on.

rndcSecretRefs

Type: array of RndcSecretRef objects Required: No Default: []

References to Kubernetes Secrets containing RNDC/TSIG keys for authenticated zone transfers and RNDC communication.

# 1. Create Secret with credentials
apiVersion: v1
kind: Secret
metadata:
  name: transfer-key-secret
type: Opaque
stringData:
  key-name: transfer-key
  secret: base64-encoded-hmac-key

---
# 2. Reference in Bind9Cluster
spec:
  rndcSecretRefs:
    - name: transfer-key-secret
      algorithm: hmac-sha256  # Algorithm specified in CRD

How It Works:

RNDC/TSIG keys authenticate zone transfers and RNDC commands
Keys stored securely in Kubernetes Secrets
Algorithm specified in CRD for type safety
Keys are shared across all instances in the cluster

RndcSecretRef Fields:

name (string, required) - Name of the Kubernetes Secret
algorithm (RndcAlgorithm, optional) - HMAC algorithm (defaults to hmac-sha256)
- Supported: hmac-md5, hmac-sha1, hmac-sha224, hmac-sha256, hmac-sha384, hmac-sha512
keyNameKey (string, optional) - Key in secret for key name (defaults to “key-name”)
secretKey (string, optional) - Key in secret for secret value (defaults to “secret”)

acls

Type: object (map of string arrays) Required: No Default: {}

Named Access Control Lists that can be referenced in instance configurations.

spec:
  acls:
    internal:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
    trusted:
      - "192.168.1.0/24"
    external:
      - "0.0.0.0/0"

How It Works:

Define ACLs once at cluster level
Reference by name in instance configurations
Simplifies managing access control across instances

Usage Example:

# In Bind9Instance
spec:
  global:
    allowQuery:
      - "acl:internal"
    allowTransfer:
      - "acl:trusted"

volumes

Type: array of Kubernetes Volume objects Required: No Default: []

Kubernetes volumes that can be mounted by instances in this cluster.

spec:
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: dns-zone-pvc
    - name: config-override
      configMap:
        name: custom-bind-config

How It Works:

Volumes defined at cluster level are inherited by all instances
Instances can override with their own volumes
Common use cases include:
- PersistentVolumeClaims for zone data persistence
- ConfigMaps for custom configuration files
- Secrets for sensitive data like TSIG keys
- EmptyDir for temporary storage

Volume Types: Supports all Kubernetes volume types including:

persistentVolumeClaim - Persistent storage for zone data
configMap - Configuration files
secret - Sensitive data
emptyDir - Temporary storage
hostPath - Host directory (use with caution)
nfs - Network file system

volumeMounts

Type: array of Kubernetes VolumeMount objects Required: No Default: []

Volume mount specifications that define where volumes should be mounted in containers.

spec:
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: dns-zone-pvc
  volumeMounts:
    - name: zone-data
      mountPath: /var/lib/bind
      readOnly: false

How It Works:

Volume mounts must reference volumes defined in the volumes field
Each mount specifies the volume name and where to mount it
Instances inherit cluster-level volume mounts unless overridden
Mounts are applied to the BIND9 container

VolumeMount Fields:

name (string, required) - Volume name to mount (must match a volume)
mountPath (string, required) - Path in container where volume is mounted
readOnly (boolean, optional) - Mount as read-only (default: false)
subPath (string, optional) - Sub-path within the volume

Status Fields

conditions

Type: array of objects

Standard Kubernetes conditions indicating cluster state.

status:
  conditions:
    - type: Ready
      status: "True"
      reason: AllInstancesReady
      message: "All 3 instances are ready"
      lastTransitionTime: "2024-01-15T10:30:00Z"

Condition Types:

Ready - Cluster is ready (all instances operational)
Degraded - Some instances are not ready
Progressing - Cluster is being reconciled

observedGeneration

Type: integer

The generation of the resource that was last reconciled.

status:
  observedGeneration: 5

instanceCount

Type: integer

Total number of Bind9Instance resources referencing this cluster.

status:
  instanceCount: 3

readyInstances

Type: integer

Number of instances that are ready and serving traffic.

status:
  readyInstances: 3

Complete Examples

Basic Production Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.2.0/24"
    dnssec:
      enabled: true
      validation: auto
  rndcSecretRefs:
    - name: transfer-key-secret
      algorithm: hmac-sha256

Cluster with Custom Image

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: custom-dns
  namespace: dns-system
spec:
  version: "9.18"
  image:
    image: "my-registry.example.com/bind9:hardened"
    imagePullPolicy: "Always"
    imagePullSecrets:
      - my-registry-secret
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"

Recursive Resolver Cluster

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: resolver-cluster
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: true
    allowQuery:
      - "10.0.0.0/8"  # Internal network only
    forwarders:
      - "8.8.8.8"
      - "8.8.4.4"
      - "1.1.1.1"
    dnssec:
      enabled: false
      validation: true
  acls:
    internal:
      - "10.0.0.0/8"
      - "172.16.0.0/12"
      - "192.168.0.0/16"

Multi-Region Cluster with ACLs

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: global-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "acl:secondary-servers"
    dnssec:
      enabled: true
  rndcSecretRefs:
    - name: us-east-transfer-secret
      algorithm: hmac-sha256
    - name: us-west-transfer-secret
      algorithm: hmac-sha256
    - name: eu-transfer-secret
      algorithm: hmac-sha512  # Different algorithm for EU
  acls:
    secondary-servers:
      - "10.1.0.0/24"  # US East
      - "10.2.0.0/24"  # US West
      - "10.3.0.0/24"  # EU
    monitoring:
      - "10.0.10.0/24"

Cluster with Persistent Storage

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: persistent-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    dnssec:
      enabled: true
  # Define persistent volume for zone data
  volumes:
    - name: zone-data
      persistentVolumeClaim:
        claimName: bind-zone-storage
  volumeMounts:
    - name: zone-data
      mountPath: /var/lib/bind
      readOnly: false

Prerequisites: Create a PersistentVolumeClaim first:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bind-zone-storage
  namespace: dns-system
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd

Cluster Hierarchy

Bind9Cluster
    ├── Defines shared configuration
    ├── Manages TSIG keys
    ├── Defines ACLs
    └── Referenced by one or more Bind9Instances
            ├── Instance inherits cluster config
            ├── Instance can override cluster settings
            └── Instance uses cluster TSIG keys

Configuration Inheritance

When a Bind9Instance references a Bind9Cluster:

Version - Instance inherits cluster version unless it specifies its own
Image - Instance inherits cluster image config unless it specifies its own
Config - Instance inherits cluster config unless it specifies its own
TSIG Keys - Instance uses cluster TSIG keys for zone transfers
ACLs - Instance can reference cluster ACLs by name

Override Priority: Instance-level config > Cluster-level config > Default values

Bind9Instance Specification - Individual DNS server instances
DNSZone Specification - DNS zones managed by instances
Examples - Complete configuration examples

Bind9Instance Specification

Complete specification for the Bind9Instance Custom Resource Definition.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: string
  namespace: string
  labels:
    key: value
spec:
  clusterRef: string          # References Bind9Cluster
  role: primary|secondary     # Required: Server role
  replicas: integer
  version: string             # Optional, overrides cluster version
  image:                      # Optional, overrides cluster image
    image: string
    imagePullPolicy: string
    imagePullSecrets: [string]
  configMapRefs:              # Optional, custom config files
    namedConf: string
    namedConfOptions: string
  global:                     # Optional, overrides cluster global config
    recursion: boolean
    allowQuery: [string]
    allowTransfer: [string]
    dnssec:
      enabled: boolean
      validation: boolean
    forwarders: [string]
    listenOn: [string]
    listenOnV6: [string]
  primaryServers: [string]    # Required for secondary role

Spec Fields

clusterRef

Type: string Required: Yes

Name of the Bind9Cluster that this instance belongs to. The instance inherits cluster-level configuration (version, shared config, TSIG keys, ACLs) from the referenced cluster.

spec:
  clusterRef: production-dns  # References Bind9Cluster named "production-dns"

How It Works:

Instance inherits version from cluster unless overridden
Instance inherits global config from cluster unless overridden
Controller uses cluster TSIG keys for zone transfers
Instance can override cluster settings with its own spec

replicas

Type: integer Required: No Default: 1

Number of BIND9 pod replicas to run.

spec:
  replicas: 3

Best Practices:

Use 2+ replicas for high availability
Use odd numbers (3, 5) for consensus-based systems
Consider resource constraints when scaling

version

Type: string Required: No Default: “9.18”

BIND9 version to deploy. Must match available Docker image tags.

spec:
  version: "9.18"

Supported Versions:

“9.16” - Older stable
“9.18” - Current stable (recommended)
“9.19” - Development

image

Type: object Required: No

Container image configuration for the BIND9 instance. Overrides cluster-level image configuration.

spec:
  image:
    image: "my-registry.example.com/bind9:custom"
    imagePullPolicy: "Always"
    imagePullSecrets:
      - my-registry-secret

How It Works:

If not specified, inherits from Bind9Cluster.spec.image
If cluster doesn’t specify, uses default image internetsystemsconsortium/bind9:9.18
Instance-level configuration takes precedence over cluster configuration

image.image

Type: string Required: No Default: “internetsystemsconsortium/bind9:9.18”

Full container image reference including registry, repository, and tag.

spec:
  image:
    image: "docker.io/internetsystemsconsortium/bind9:9.18"

Examples:

Public registry: "internetsystemsconsortium/bind9:9.18"
Private registry: "my-registry.example.com/dns/bind9:custom"
With digest: "bind9@sha256:abc123..."

image.imagePullPolicy

Type: string Required: No Default: “IfNotPresent”

Kubernetes image pull policy.

spec:
  image:
    imagePullPolicy: "Always"

Valid Values:

"Always" - Always pull the image
"IfNotPresent" - Pull only if not present locally (recommended)
"Never" - Never pull, use local image only

image.imagePullSecrets

Type: array of strings Required: No Default: []

List of Kubernetes secret names for authenticating with private container registries.

spec:
  image:
    imagePullSecrets:
      - docker-registry-secret
      - gcr-pull-secret

Setup:

Create a docker-registry secret:

kubectl create secret docker-registry my-registry-secret \
  --docker-server=my-registry.example.com \
  --docker-username=user \
  --docker-password=pass \
  --docker-email=email@example.com

Reference the secret name in imagePullSecrets

configMapRefs

Type: object Required: No

References to custom ConfigMaps containing BIND9 configuration files. Overrides cluster-level ConfigMap references.

spec:
  configMapRefs:
    namedConf: "my-custom-named-conf"
    namedConfOptions: "my-custom-options"

How It Works:

If specified, Bindy uses your custom ConfigMaps instead of auto-generating configuration
If not specified, Bindy auto-generates ConfigMaps from the config block
Instance-level references override cluster-level references
You can specify one or both ConfigMaps

Default Behavior:

If configMapRefs is not set, Bindy creates a ConfigMap named <instance-name>-config
Auto-generated ConfigMap includes both named.conf and named.conf.options
Configuration is built from the config block in the spec

configMapRefs.namedConf

Type: string Required: No

Name of ConfigMap containing the main named.conf file.

spec:
  configMapRefs:
    namedConf: "my-named-conf"

ConfigMap Format:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-named-conf
  namespace: dns-system
data:
  named.conf: |
    // Custom BIND9 configuration
    include "/etc/bind/named.conf.options";
    include "/etc/bind/zones/named.conf.zones";

    logging {
      channel custom_log {
        file "/var/log/named/queries.log" versions 3 size 5m;
        severity info;
      };
      category queries { custom_log; };
    };

File Location: The ConfigMap data must have a key named.conf which will be mounted at /etc/bind/named.conf

configMapRefs.namedConfOptions

Type: string Required: No

Name of ConfigMap containing the named.conf.options file.

spec:
  configMapRefs:
    namedConfOptions: "my-options"

ConfigMap Format:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-options
  namespace: dns-system
data:
  named.conf.options: |
    options {
      directory "/var/cache/bind";
      recursion no;
      allow-query { any; };
      dnssec-validation auto;
    };

File Location: The ConfigMap data must have a key named.conf.options which will be mounted at /etc/bind/named.conf.options

Examples:

Using separate ConfigMaps for fine-grained control:

spec:
  configMapRefs:
    namedConf: "prod-named-conf"
    namedConfOptions: "prod-options"

Using only custom options, auto-generating main config:

spec:
  configMapRefs:
    namedConfOptions: "my-custom-options"
  # namedConf not specified - will be auto-generated

global

Type: object Required: No

BIND9 configuration options that override cluster-level global configuration.

global.recursion

Type: boolean Required: No Default: false

Enable recursive DNS queries. Should be false for authoritative servers.

spec:
  global:
    recursion: false

Warning: Enabling recursion on public-facing authoritative servers is a security risk.

global.allowQuery

Type: array of strings Required: No Default: [“0.0.0.0/0”]

IP addresses or CIDR blocks allowed to query this server.

spec:
  global:
    allowQuery:
      - "0.0.0.0/0"        # Allow all (public DNS)
      - "10.0.0.0/8"       # Private network
      - "192.168.1.0/24"   # Specific subnet

global.allowTransfer

Type: array of strings Required: No Default: []

IP addresses or CIDR blocks allowed to perform zone transfers (AXFR/IXFR).

spec:
  global:
    allowTransfer:
      - "10.0.1.10"        # Specific secondary server
      - "10.0.1.11"        # Another secondary

Security Note: Restrict zone transfers to trusted secondary servers only.

global.dnssec

Type: object Required: No

DNSSEC configuration for signing zones and validating responses.

global.dnssec.enabled

Type: boolean Required: No Default: false

Enable DNSSEC signing for zones.

spec:
  global:
    dnssec:
      enabled: true

global.dnssec.validation

Type: boolean Required: No Default: false

Enable DNSSEC validation for recursive queries.

spec:
  global:
    dnssec:
      enabled: true
      validation: true

global.forwarders

Type: array of strings Required: No Default: []

DNS servers to forward queries to (for recursive mode).

spec:
  global:
    recursion: true
    forwarders:
      - "8.8.8.8"
      - "8.8.4.4"

global.listenOn

Type: array of strings Required: No Default: [“any”]

IPv4 addresses to listen on.

spec:
  global:
    listenOn:
      - "any"              # All IPv4 interfaces
      - "10.0.1.10"        # Specific IP

global.listenOnV6

Type: array of strings Required: No Default: [“any”]

IPv6 addresses to listen on.

spec:
  global:
    listenOnV6:
      - "any"              # All IPv6 interfaces
      - "2001:db8::1"      # Specific IPv6

Status Fields

conditions

Type: array of objects

Standard Kubernetes conditions indicating resource state.

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSuccess
      message: "Instance is ready"
      lastTransitionTime: "2024-01-15T10:30:00Z"

Condition Types:

Ready - Instance is ready for use
Available - Instance is serving DNS queries
Progressing - Instance is being reconciled
Degraded - Instance is partially functional
Failed - Instance reconciliation failed

observedGeneration

Type: integer

The generation of the resource that was last reconciled.

status:
  observedGeneration: 5

replicas

Type: integer

Total number of replicas configured.

status:
  replicas: 3

readyReplicas

Type: integer

Number of replicas that are ready and serving traffic.

status:
  readyReplicas: 3

Complete Example

Primary DNS Instance

# First create the Bind9Cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: production-dns
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.2.0/24"
    dnssec:
      enabled: true

---
# Then create the Bind9Instance referencing the cluster
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system
  labels:
    dns-role: primary
    environment: production
spec:
  clusterRef: production-dns  # References cluster above
  role: primary  # Required: primary or secondary
  replicas: 2
  # Inherits version and global config from cluster

Secondary DNS Instance

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  namespace: dns-system
  labels:
    dns-role: secondary
    environment: production
spec:
  clusterRef: production-dns  # References same cluster as primary
  role: secondary  # Required: primary or secondary
  replicas: 2
  # Override global config for secondary role
  global:
    allowTransfer: []  # No zone transfers from secondary
    dnssec:
      enabled: false
      validation: true

Recursive Resolver

# Separate cluster for resolvers
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: resolver-cluster
  namespace: dns-system
spec:
  version: "9.18"
  global:
    recursion: true
    allowQuery:
      - "10.0.0.0/8"  # Internal network only
    forwarders:
      - "8.8.8.8"
      - "1.1.1.1"
    dnssec:
      enabled: false
      validation: true

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: resolver
  namespace: dns-system
  labels:
    dns-role: resolver
spec:
  clusterRef: resolver-cluster
  role: primary  # Required: primary or secondary
  replicas: 3
  # Inherits recursive global config from cluster

DNSZone Specification
Examples
Configuration Guide

DNSZone Specification

Complete specification for the DNSZone Custom Resource Definition.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: string
  namespace: string
spec:
  zoneName: string
  clusterRef: string        # References Bind9Cluster
  soaRecord:
    primaryNs: string
    adminEmail: string
    serial: integer
    refresh: integer
    retry: integer
    expire: integer
    negativeTtl: integer
  ttl: integer

Spec Fields

zoneName

Type: string Required: Yes

The DNS zone name (domain name).

spec:
  zoneName: "example.com"

Requirements:

Must be a valid DNS domain name
Maximum 253 characters
Can be forward or reverse zone

Examples:

“example.com”
“subdomain.example.com”
“1.0.10.in-addr.arpa” (reverse zone)

clusterRef

Type: string Required: Yes

Name of the Bind9Cluster that will manage this zone.

spec:
  clusterRef: production-dns  # References Bind9Cluster named "production-dns"

How It Works:

Controller finds Bind9Cluster with this name
Discovers all Bind9Instance resources referencing this cluster
Identifies primary instances for zone hosting
Loads RNDC keys from cluster configuration
Creates zone on primary instances using rndc addzone command
Configures zone transfers to secondary instances

Validation:

Referenced Bind9Cluster must exist in same namespace
Controller validates reference at admission time

soaRecord

Type: object Required: Yes

Start of Authority record defining zone parameters.

spec:
  soaRecord:
    primaryNs: "ns1.example.com."
    adminEmail: "admin.example.com."  # Note: @ replaced with .
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

soaRecord.primaryNs

Type: string Required: Yes

Primary nameserver for the zone.

soaRecord:
  primaryNs: "ns1.example.com."

Requirements:

Must be a fully qualified domain name (FQDN)
Must end with a dot (.)
Pattern: ^[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*\.$

soaRecord.adminEmail

Type: string Required: Yes

Email address of zone administrator in DNS format.

soaRecord:
  adminEmail: "admin.example.com."  # Represents admin@example.com

Format:

Replace @ with . in email address
Must end with a dot (.)
Example: admin@example.com → admin.example.com.
Pattern: ^[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(\.[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*\.$

soaRecord.serial

Type: integer (64-bit) Required: Yes Range: 0 to 4,294,967,295

Zone serial number for change tracking.

soaRecord:
  serial: 2024010101

Best Practices:

Use format: YYYYMMDDnn (year, month, day, revision)
Increment on every change
Secondaries use this to detect updates

Examples:

2024010101 - January 1, 2024, first revision
2024010102 - January 1, 2024, second revision

soaRecord.refresh

Type: integer (32-bit) Required: Yes Range: 1 to 2,147,483,647

How often (in seconds) secondary servers should check for updates.

soaRecord:
  refresh: 3600  # 1 hour

Typical Values:

3600 (1 hour) - Standard
7200 (2 hours) - Less frequent updates
900 (15 minutes) - Frequent updates

soaRecord.retry

Type: integer (32-bit) Required: Yes Range: 1 to 2,147,483,647

How long (in seconds) to wait before retrying a failed refresh.

soaRecord:
  retry: 600  # 10 minutes

Best Practice: Should be less than refresh value

soaRecord.expire

Type: integer (32-bit) Required: Yes Range: 1 to 2,147,483,647

How long (in seconds) secondary servers should keep serving zone data after primary becomes unreachable.

soaRecord:
  expire: 604800  # 1 week

Typical Values:

604800 (1 week) - Standard
1209600 (2 weeks) - Extended
86400 (1 day) - Short-lived zones

soaRecord.negativeTtl

Type: integer (32-bit) Required: Yes Range: 0 to 2,147,483,647

How long (in seconds) to cache negative responses (NXDOMAIN).

soaRecord:
  negativeTtl: 86400  # 24 hours

Typical Values:

86400 (24 hours) - Standard
3600 (1 hour) - Shorter caching
300 (5 minutes) - Very short for dynamic zones

ttl

Type: integer (32-bit) Required: No Default: 3600 Range: 0 to 2,147,483,647

Default Time To Live for records in this zone (in seconds).

spec:
  ttl: 3600  # 1 hour

Common Values:

3600 (1 hour) - Standard
300 (5 minutes) - Frequently changing zones
86400 (24 hours) - Stable zones

Status Fields

conditions

Type: array of objects

Standard Kubernetes conditions.

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Synchronized
      message: "Zone created for cluster: primary-dns"
      lastTransitionTime: "2024-01-15T10:30:00Z"

Condition Types:

Ready - Zone is created and serving
Synced - Zone is synchronized with BIND9
Failed - Zone creation or update failed

observedGeneration

Type: integer

The generation last reconciled.

status:
  observedGeneration: 3

recordCount

Type: integer

Number of DNS records in this zone.

status:
  recordCount: 42

Complete Examples

Simple Primary Zone

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: primary-dns
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

Production Zone with Custom TTL

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: api-example-com
  namespace: dns-system
spec:
  zoneName: api.example.com
  clusterRef: production-dns
  ttl: 300  # 5 minute default TTL for faster updates
  soaRecord:
    primaryNs: ns1.api.example.com.
    adminEmail: ops.example.com.
    serial: 2024010101
    refresh: 1800   # Check every 30 minutes
    retry: 300      # Retry after 5 minutes
    expire: 604800
    negativeTtl: 300  # Short negative cache

Reverse DNS Zone

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: reverse-zone
  namespace: dns-system
spec:
  zoneName: 1.0.10.in-addr.arpa
  clusterRef: primary-dns
  soaRecord:
    primaryNs: ns1.example.com.
    adminEmail: admin.example.com.
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

Multi-Region Setup

# East Region Zone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-east
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: dns-east  # References east instance
  soaRecord:
    primaryNs: ns1.east.example.com.
    adminEmail: admin.example.com.
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

---
# West Region Zone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-west
  namespace: dns-system
spec:
  zoneName: example.com
  clusterRef: dns-west  # References west instance
  soaRecord:
    primaryNs: ns1.west.example.com.
    adminEmail: admin.example.com.
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400

Zone Creation Flow

When you create a DNSZone resource:

Admission - Kubernetes validates the resource schema
Controller watches - Bindy controller detects the new zone
Cluster lookup - Finds Bind9Cluster referenced by clusterRef
Instance discovery - Finds all Bind9Instance resources referencing the cluster
Primary identification - Identifies primary instances (with role: primary)
RNDC key load - Retrieves RNDC keys from cluster configuration
RNDC connection - Connects to primary instance pods via RNDC
Zone creation - Executes rndc addzone {zoneName} ... on primary instances
Zone transfer setup - Configures zone transfers to secondary instances
Status update - Updates DNSZone status to Ready

Bind9Cluster Specification
Bind9Instance Specification
Record Specifications
Creating Zones Guide
RNDC-Based Architecture

DNS Record Specifications

Complete specifications for all DNS record types.

Common Fields

All DNS record types share these common fields:

zone / zoneRef

Type: string Required: Exactly one of zone or zoneRef must be specified

Reference to the parent DNSZone resource. Use one of the following:

zone field - Matches against DNSZone.spec.zoneName (the actual DNS zone name):

spec:
  zone: "example.com"  # Matches DNSZone with spec.zoneName: example.com

zoneRef field - Direct reference to DNSZone.metadata.name (the Kubernetes resource name, recommended for production):

spec:
  zoneRef: "example-com"  # Matches DNSZone with metadata.name: example-com

Important: You must specify exactly one of zone or zoneRef - not both, not neither.

See Referencing DNS Zones for detailed comparison and best practices.

name

Type: string Required: Yes

The record name within the zone.

spec:
  name: "www"  # Creates www.example.com
  name: "@"    # Creates record at zone apex (example.com)

ttl

Type: integer Required: No Default: Inherited from zone

Time To Live in seconds.

spec:
  ttl: 300  # 5 minutes

A Record (IPv4 Address)

Maps hostnames to IPv4 addresses.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example-com
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "www"
  ipv4Address: "192.0.2.1"
  ttl: 300

Fields

ipv4Address

Type: string Required: Yes

IPv4 address in dotted decimal notation.

spec:
  ipv4Address: "192.0.2.1"

Example: Multiple A Records (Round Robin)

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example-com-1
spec:
  zoneRef: "example-com"
  name: "www"
  ipv4Address: "192.0.2.1"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example-com-2
spec:
  zoneRef: "example-com"
  name: "www"
  ipv4Address: "192.0.2.2"

AAAA Record (IPv6 Address)

Maps hostnames to IPv6 addresses.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-example-com-v6
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "www"
  ipv6Address: "2001:db8::1"
  ttl: 300

Fields

ipv6Address

Type: string Required: Yes

IPv6 address in colon-separated hexadecimal notation.

spec:
  ipv6Address: "2001:db8::1"

Formats:

Full: “2001:0db8:0000:0000:0000:0000:0000:0001”
Compressed: “2001:db8::1”

Example: Dual Stack (IPv4 + IPv6)

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-v4
spec:
  zoneRef: "example-com"
  name: "www"
  ipv4Address: "192.0.2.1"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-v6
spec:
  zoneRef: "example-com"
  name: "www"
  ipv6Address: "2001:db8::1"

CNAME Record (Canonical Name)

Creates an alias from one hostname to another.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: www-alias
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "www"
  target: "server.example.com."
  ttl: 3600

Fields

target

Type: string Required: Yes

Target hostname (FQDN recommended).

spec:
  target: "server.example.com."

Restrictions

Cannot be created at zone apex (@)
Cannot coexist with other record types for same name
Target should be fully qualified (end with dot)

Example: CDN Alias

apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cdn-alias
spec:
  zoneRef: "example-com"
  name: "cdn"
  target: "d123456.cloudfront.net."

MX Record (Mail Exchange)

Specifies mail servers for the domain.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-primary
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "@"
  priority: 10
  mailServer: "mail.example.com."
  ttl: 3600

Fields

priority

Type: integer Required: Yes

Priority (preference) value. Lower values are preferred.

spec:
  priority: 10  # Primary mail server
  priority: 20  # Backup mail server

mailServer

Type: string Required: Yes

Hostname of mail server (FQDN recommended).

spec:
  mailServer: "mail.example.com."

Example: Primary and Backup Mail Servers

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-primary
spec:
  zoneRef: "example-com"
  name: "@"
  priority: 10
  mailServer: "mail1.example.com."
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mail-backup
spec:
  zoneRef: "example-com"
  name: "@"
  priority: 20
  mailServer: "mail2.example.com."

TXT Record (Text)

Stores arbitrary text data, commonly used for verification and policies.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf-record
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "@"
  text:
    - "v=spf1 mx -all"
  ttl: 3600

Fields

text

Type: array of strings Required: Yes

Text values. Multiple strings are concatenated.

spec:
  text:
    - "v=spf1 mx -all"

Example: SPF, DKIM, and DMARC

---
# SPF Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf
spec:
  zoneRef: "example-com"
  name: "@"
  text:
    - "v=spf1 mx include:_spf.google.com ~all"
---
# DKIM Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dkim
spec:
  zoneRef: "example-com"
  name: "default._domainkey"
  text:
    - "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQC..."
---
# DMARC Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dmarc
spec:
  zoneRef: "example-com"
  name: "_dmarc"
  text:
    - "v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com"

NS Record (Name Server)

Delegates a subdomain to different nameservers.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: subdomain-delegation
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "subdomain"
  nameserver: "ns1.subdomain.example.com."
  ttl: 3600

Fields

nameserver

Type: string Required: Yes

Nameserver hostname (FQDN recommended).

spec:
  nameserver: "ns1.subdomain.example.com."

Example: Subdomain Delegation

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: sub-ns1
spec:
  zoneRef: "example-com"
  name: "subdomain"
  nameserver: "ns1.subdomain.example.com."
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: NSRecord
metadata:
  name: sub-ns2
spec:
  zoneRef: "example-com"
  name: "subdomain"
  nameserver: "ns2.subdomain.example.com."

SRV Record (Service)

Specifies location of services.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: sip-service
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "_sip._tcp"
  priority: 10
  weight: 60
  port: 5060
  target: "sip.example.com."
  ttl: 3600

Fields

priority

Type: integer Required: Yes

Priority for target selection. Lower values are preferred.

spec:
  priority: 10

weight

Type: integer Required: Yes

Relative weight for same-priority targets.

spec:
  weight: 60  # 60% of traffic
  weight: 40  # 40% of traffic

port

Type: integer Required: Yes

Port number where service is available.

spec:
  port: 5060

target

Type: string Required: Yes

Hostname providing the service.

spec:
  target: "sip.example.com."

Example: Load Balanced Service

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-primary
spec:
  zoneRef: "example-com"
  name: "_service._tcp"
  priority: 10
  weight: 60
  port: 8080
  target: "server1.example.com."
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-secondary
spec:
  zoneRef: "example-com"
  name: "_service._tcp"
  priority: 10
  weight: 40
  port: 8080
  target: "server2.example.com."

CAA Record (Certificate Authority Authorization)

Restricts which CAs can issue certificates for the domain.

Resource Definition

apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-letsencrypt
  namespace: dns-system
spec:
  zoneRef: "example-com"
  name: "@"
  flags: 0
  tag: "issue"
  value: "letsencrypt.org"
  ttl: 3600

Fields

flags

Type: integer Required: Yes

Flags byte. Typically 0 (non-critical) or 128 (critical).

spec:
  flags: 0

tag

Type: string Required: Yes

Property tag.

Valid Tags:

“issue” - Authorize CA to issue certificates
“issuewild” - Authorize CA to issue wildcard certificates
“iodef” - URL for violation reports

spec:
  tag: "issue"

value

Type: string Required: Yes

Property value (CA domain or URL).

spec:
  value: "letsencrypt.org"

Example: Multiple CAA Records

---
# Allow Let's Encrypt for regular certs
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-issue
spec:
  zoneRef: "example-com"
  name: "@"
  flags: 0
  tag: "issue"
  value: "letsencrypt.org"
---
# Allow Let's Encrypt for wildcard certs
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-issuewild
spec:
  zoneRef: "example-com"
  name: "@"
  flags: 0
  tag: "issuewild"
  value: "letsencrypt.org"
---
# Violation reporting
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-iodef
spec:
  zoneRef: "example-com"
  name: "@"
  flags: 0
  tag: "iodef"
  value: "mailto:security@example.com"

API Reference
DNSZone Specification
Examples
DNS Records Guide

Status Conditions

This document describes the standardized status conditions used across all Bindy CRDs.

Condition Types

All Bindy custom resources (Bind9Instance, DNSZone, and all DNS record types) use the following standardized condition types:

Ready

Description: Indicates whether the resource is fully operational and ready to serve its intended purpose
Common Use: Primary condition type used by all reconcilers
Status Values:
- True: Resource is ready and operational
- False: Resource is not ready (error or in progress)
- Unknown: Status cannot be determined

Available

Description: Indicates whether the resource is available for use
Common Use: Used to distinguish between “ready” and “available” when resources may be ready but not yet serving traffic
Status Values:
- True: Resource is available
- False: Resource is not available
- Unknown: Availability cannot be determined

Progressing

Description: Indicates whether the resource is currently being worked on
Common Use: During initial creation or updates
Status Values:
- True: Resource is being created or updated
- False: Resource is not currently progressing
- Unknown: Progress status cannot be determined

Degraded

Description: Indicates that the resource is functioning but in a degraded state
Common Use: When some replicas are down but service continues, or when non-critical features are unavailable
Status Values:
- True: Resource is degraded
- False: Resource is not degraded
- Unknown: Degradation status cannot be determined

Failed

Description: Indicates that the resource has failed and cannot fulfill its purpose
Common Use: Permanent failures that require intervention
Status Values:
- True: Resource has failed
- False: Resource has not failed
- Unknown: Failure status cannot be determined

Condition Structure

All conditions follow this structure:

status:
  conditions:
    - type: Ready              # One of: Ready, Available, Progressing, Degraded, Failed
      status: "True"           # One of: "True", "False", "Unknown"
      reason: Ready            # Machine-readable reason (typically same as type)
      message: "Bind9Instance configured with 2 replicas"  # Human-readable message
      lastTransitionTime: "2024-11-26T10:00:00Z"          # RFC3339 timestamp
  observedGeneration: 1        # Generation last observed by controller
  # Resource-specific fields (replicas, recordCount, etc.)

Current Usage

Bind9Instance

Uses Ready condition type
Status True when Deployment, Service, and ConfigMap are successfully created
Status False when resource creation fails
Additional status fields:
- replicas: Total number of replicas
- readyReplicas: Number of ready replicas

Bind9Cluster

Uses Ready condition type with granular reasons
Condition reasons:
- AllInstancesReady: All instances in the cluster are ready
- SomeInstancesNotReady: Some instances are not ready (cluster partially functional)
- NoInstancesReady: No instances are ready (cluster not functional)
Additional status fields:
- instanceCount: Total number of instances
- readyInstances: Number of ready instances
- instances: List of instance names

DNSZone

Uses Progressing, Degraded, and Ready condition types with granular reasons
Reconciliation Flow:
1. Progressing/PrimaryReconciling: Before configuring primary instances
2. Progressing/PrimaryReconciled: After successful primary configuration
3. Progressing/SecondaryReconciling: Before configuring secondary instances
4. Progressing/SecondaryReconciled: After successful secondary configuration
5. Ready/ReconcileSucceeded: When all phases complete successfully
Error Conditions:
- Degraded/PrimaryFailed: Primary reconciliation failed (fatal error)
- Degraded/SecondaryFailed: Secondary reconciliation failed (primaries still work, non-fatal)
Additional status fields:
- recordCount: Number of records in the zone
- secondaryIps: IP addresses of configured secondary servers
- observedGeneration: Last observed generation

DNS Records (A, AAAA, CNAME, MX, TXT, NS, SRV, CAA)

Use Progressing, Degraded, and Ready condition types with granular reasons
Reconciliation Flow:
1. Progressing/RecordReconciling: Before configuring record on endpoints
2. Ready/ReconcileSucceeded: When record is successfully configured on all endpoints
Error Conditions:
- Degraded/RecordFailed: Record configuration failed (includes error details)
Status message includes count of configured endpoints (e.g., “Record configured on 3 endpoint(s)”)
Additional status fields:
- observedGeneration: Last observed generation

Best Practices

Always set the condition type: Use one of the five standardized types
Include timestamps: Set lastTransitionTime when condition status changes
Provide clear messages: The message field should be human-readable and actionable
Use appropriate reasons: The reason field should be machine-readable and consistent
Update observedGeneration: Always update to match the resource’s current generation
Multiple conditions: Resources can have multiple conditions simultaneously (e.g., Ready: True and Degraded: True)

Examples

Successful Bind9Instance

status:
  conditions:
    - type: Ready
      status: "True"
      reason: Ready
      message: "Bind9Instance configured with 2 replicas"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  replicas: 2
  readyReplicas: 2

DNSZone - Progressing (Primary Reconciliation)

status:
  conditions:
    - type: Progressing
      status: "True"
      reason: PrimaryReconciling
      message: "Configuring zone on primary instances"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  recordCount: 0

DNSZone - Progressing (Secondary Reconciliation)

status:
  conditions:
    - type: Progressing
      status: "True"
      reason: SecondaryReconciling
      message: "Configured on 2 primary server(s), now configuring secondaries"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1
  recordCount: 0
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"

DNSZone - Successfully Reconciled

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Configured on 2 primary server(s) and 3 secondary server(s)"
      lastTransitionTime: "2024-11-26T10:00:02Z"
  observedGeneration: 1
  recordCount: 5
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"
    - "10.42.0.7"

DNSZone - Degraded (Secondary Failure)

status:
  conditions:
    - type: Degraded
      status: "True"
      reason: SecondaryFailed
      message: "Configured on 2 primary server(s), but secondary configuration failed: connection timeout"
      lastTransitionTime: "2024-11-26T10:00:02Z"
  observedGeneration: 1
  recordCount: 5
  secondaryIps:
    - "10.42.0.5"
    - "10.42.0.6"

DNSZone - Failed (Primary Failure)

status:
  conditions:
    - type: Degraded
      status: "True"
      reason: PrimaryFailed
      message: "Failed to configure zone on primaries: No Bind9Instances matched selector"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  recordCount: 0

DNS Record - Progressing

status:
  conditions:
    - type: Progressing
      status: "True"
      reason: RecordReconciling
      message: "Configuring A record on zone endpoints"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1

DNS Record - Successfully Configured

status:
  conditions:
    - type: Ready
      status: "True"
      reason: ReconcileSucceeded
      message: "Record configured on 3 endpoint(s)"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

DNS Record - Failed

status:
  conditions:
    - type: Degraded
      status: "True"
      reason: RecordFailed
      message: "Failed to configure record: Zone not found on primary servers"
      lastTransitionTime: "2024-11-26T10:00:01Z"
  observedGeneration: 1

Bind9Cluster - Partially Ready

status:
  conditions:
    - type: Ready
      status: "False"
      reason: SomeInstancesNotReady
      message: "2/3 instances ready"
      lastTransitionTime: "2024-11-26T10:00:00Z"
  observedGeneration: 1
  instanceCount: 3
  readyInstances: 2
  instances:
    - production-dns-primary-0
    - production-dns-primary-1
    - production-dns-secondary-0

Validation

All condition types are enforced via CRD validation. Attempting to use a condition type not in the enum will result in a validation error:

$ kubectl apply -f invalid-condition.yaml
Error from server (Invalid): error when creating "invalid-condition.yaml":
Bind9Instance.bindy.firestoned.io "test" is invalid:
status.conditions[0].type: Unsupported value: "CustomType":
supported values: "Ready", "Available", "Progressing", "Degraded", "Failed"

Configuration Examples

Complete configuration examples for common Bindy deployment scenarios.

Overview

This section provides ready-to-use YAML configurations for various deployment scenarios:

Simple Setup - Single instance, single zone
Production Setup - HA, monitoring, backups
Multi-Region Setup - Geographic distribution

Quick Reference

Minimal Configuration

Minimal viable configuration for testing:

# Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dns
  namespace: dns-system
  labels:
    dns-role: primary
spec:
  replicas: 1
---
# DNSZone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: "example.com"
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: "ns1.example.com."
    adminEmail: "admin@example.com"
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
---
# A Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www
  namespace: dns-system
spec:
  zone: "example-com"
  name: "www"
  ipv4Address: "192.0.2.1"

Common Patterns

Primary/Secondary Setup

# Primary
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary
  labels:
    dns-role: primary
spec:
  replicas: 2
  config:
    allowTransfer:
      - "10.0.2.0/24"  # Secondary network
---
# Secondary
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary
  labels:
    dns-role: secondary
spec:
  replicas: 2
---
# Zone on Primary
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-primary
spec:
  zoneName: "example.com"
  zoneType: "primary"
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: "ns1.example.com."
    adminEmail: "admin@example.com"
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
---
# Zone on Secondary
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-secondary
spec:
  zoneName: "example.com"
  zoneType: "secondary"
  instanceSelector:
    matchLabels:
      dns-role: secondary
  secondaryConfig:
    primaryServers:
      - "10.0.1.10"
      - "10.0.1.11"

DNSSEC Enabled

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dnssec-instance
spec:
  replicas: 2
  config:
    dnssec:
      enabled: true
      validation: true

Custom Container Image

Using a custom or private container image:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: custom-image-cluster
  namespace: dns-system
spec:
  # Default image for all instances in this cluster
  image:
    image: "my-registry.example.com/bind9:custom-9.18"
    imagePullPolicy: "Always"
    imagePullSecrets:
      - my-registry-secret
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: custom-dns
  namespace: dns-system
spec:
  clusterRef: custom-image-cluster
  replicas: 2
  # Instance inherits custom image from cluster

Instance-Specific Custom Image

Override cluster image for specific instance:

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: prod-cluster
  namespace: dns-system
spec:
  image:
    image: "internetsystemsconsortium/bind9:9.18"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: canary-dns
  namespace: dns-system
spec:
  clusterRef: prod-cluster
  replicas: 1
  # Override cluster image for canary testing
  image:
    image: "internetsystemsconsortium/bind9:9.19"
    imagePullPolicy: "Always"

Custom Configuration Files

Using custom ConfigMaps for BIND9 configuration:

# Create custom ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-named-conf
  namespace: dns-system
data:
  named.conf: |
    // Custom BIND9 configuration
    include "/etc/bind/named.conf.options";
    include "/etc/bind/zones/named.conf.zones";

    logging {
      channel query_log {
        file "/var/log/named/queries.log" versions 5 size 10m;
        severity info;
        print-time yes;
        print-category yes;
      };
      category queries { query_log; };
      category lame-servers { null; };
    };
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-options
  namespace: dns-system
data:
  named.conf.options: |
    options {
      directory "/var/cache/bind";
      recursion no;
      allow-query { any; };
      allow-transfer { 10.0.2.0/24; };
      dnssec-validation auto;
      listen-on { any; };
      listen-on-v6 { any; };
      max-cache-size 256M;
      max-cache-ttl 3600;
    };
---
# Reference custom ConfigMaps
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: custom-config-dns
  namespace: dns-system
spec:
  replicas: 2
  configMapRefs:
    namedConf: "my-custom-named-conf"
    namedConfOptions: "my-custom-options"

Cluster-Level Custom ConfigMaps

Share custom configuration across all instances:

apiVersion: v1
kind: ConfigMap
metadata:
  name: shared-options
  namespace: dns-system
data:
  named.conf.options: |
    options {
      directory "/var/cache/bind";
      recursion no;
      allow-query { any; };
      dnssec-validation auto;
    };
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Cluster
metadata:
  name: shared-config-cluster
  namespace: dns-system
spec:
  configMapRefs:
    namedConfOptions: "shared-options"
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: instance-1
  namespace: dns-system
spec:
  clusterRef: shared-config-cluster
  replicas: 2
  # Inherits configMapRefs from cluster
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: instance-2
  namespace: dns-system
spec:
  clusterRef: shared-config-cluster
  replicas: 2
  # Also inherits same configMapRefs from cluster

Split Horizon DNS

# Internal DNS
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: internal-dns
  labels:
    dns-view: internal
spec:
  config:
    allowQuery:
      - "10.0.0.0/8"
---
# External DNS
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: external-dns
  labels:
    dns-view: external
spec:
  config:
    allowQuery:
      - "0.0.0.0/0"

Resource Organization

Namespace Structure

Recommended namespace organization:

# Separate namespaces by environment
dns-system-prod      # Production DNS
dns-system-staging   # Staging DNS
dns-system-dev       # Development DNS

Label Strategy

Recommended labels:

metadata:
  labels:
    # Core labels
    app.kubernetes.io/name: bindy
    app.kubernetes.io/component: dns-server
    app.kubernetes.io/part-of: dns-infrastructure

    # Custom labels
    dns-role: primary              # primary, secondary, resolver
    environment: production         # production, staging, dev
    region: us-east-1              # Geographic region
    zone-type: authoritative       # authoritative, recursive

Naming Conventions

Recommended naming:

# Bind9Instance: <role>-<region>
name: primary-us-east-1

# DNSZone: <domain-with-dashes>
name: example-com

# Records: <name>-<type>-<identifier>
name: www-a-record
name: mail-mx-primary

Testing Configurations

Local Development (kind/minikube)

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: dev-dns
  namespace: dns-system
spec:
  replicas: 1
  config:
    recursion: true
    forwarders:
      - "8.8.8.8"
    allowQuery:
      - "0.0.0.0/0"

CI/CD Testing

apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: ci-dns
  namespace: ci-testing
  labels:
    ci-test: "true"
spec:
  replicas: 1
  config:
    recursion: false
    allowQuery:
      - "10.0.0.0/8"

Troubleshooting Examples

Debug Configuration

Enable verbose logging:

apiVersion: v1
kind: ConfigMap
metadata:
  name: bindy-config
data:
  RUST_LOG: "debug"
  RECONCILE_INTERVAL: "60"

Dry Run Testing

Test configuration without applying:

kubectl apply --dry-run=client -f dns-config.yaml
kubectl apply --dry-run=server -f dns-config.yaml

Validation

Validate resources:

# Check instance status
kubectl get bind9instances -A

# Check zone status
kubectl get dnszones -A

# Check all DNS records
kubectl get arecords,aaaarecords,cnamerecords,mxrecords,txtrecords -A

Complete Examples

For complete, production-ready configurations see:

Simple Setup - Complete single-instance setup
Production Setup - Full production configuration with HA
Multi-Region Setup - Multi-region deployment

API Reference
Bind9Instance Specification
DNSZone Specification
Record Specifications
Quick Start Guide

Simple Setup Example

Complete configuration for a basic single-instance DNS setup.

Overview

This example demonstrates:

Single Bind9Instance
One DNS zone (example.com)
Common DNS records (A, AAAA, CNAME, MX, TXT)
Suitable for testing and development

Prerequisites

Kubernetes cluster (kind, minikube, or cloud)
kubectl configured
Bindy operator installed

Configuration

Complete YAML

Save as simple-dns.yaml:

---
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: dns-system

---
# Bind9Instance - Single DNS Server
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: simple-dns
  namespace: dns-system
  labels:
    app: bindy
    dns-role: primary
    environment: development
spec:
  replicas: 1
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer: []
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# DNSZone - example.com
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com
  namespace: dns-system
spec:
  zoneName: "example.com"
  zoneType: "primary"
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: "ns1.example.com."
    adminEmail: "admin@example.com"
    serial: 2024010101
    refresh: 3600
    retry: 600
    expire: 604800
    negativeTtl: 86400
  ttl: 3600

---
# A Record - Nameserver
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns1-a-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "ns1"
  ipv4Address: "192.0.2.1"
  ttl: 3600

---
# A Record - Web Server
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-a-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "www"
  ipv4Address: "192.0.2.10"
  ttl: 300

---
# AAAA Record - Web Server (IPv6)
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-aaaa-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "www"
  ipv6Address: "2001:db8::10"
  ttl: 300

---
# A Record - Mail Server
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: mail-a-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "mail"
  ipv4Address: "192.0.2.20"
  ttl: 3600

---
# MX Record - Mail Exchange
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "@"
  priority: 10
  mailServer: "mail.example.com."
  ttl: 3600

---
# TXT Record - SPF
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "@"
  text:
    - "v=spf1 mx -all"
  ttl: 3600

---
# TXT Record - DMARC
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dmarc-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "_dmarc"
  text:
    - "v=DMARC1; p=none; rua=mailto:dmarc@example.com"
  ttl: 3600

---
# CNAME Record - API Alias
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: api-cname-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "api"
  target: "www.example.com."
  ttl: 3600

Deployment

1. Install CRDs

kubectl apply -k deploy/crds/

2. Deploy Bindy Operator

kubectl apply -f deploy/controller/deployment.yaml

3. Apply Configuration

kubectl apply -f simple-dns.yaml

4. Verify Deployment

# Check Bind9Instance
kubectl get bind9instances -n dns-system
kubectl describe bind9instance simple-dns -n dns-system

# Check DNSZone
kubectl get dnszones -n dns-system
kubectl describe dnszone example-com -n dns-system

# Check DNS Records
kubectl get arecords,aaaarecords,cnamerecords,mxrecords,txtrecords -n dns-system

# Check pods
kubectl get pods -n dns-system

# Check logs
kubectl logs -n dns-system -l app=bindy

Testing

DNS Queries

Get the DNS service IP:

DNS_IP=$(kubectl get svc -n dns-system simple-dns -o jsonpath='{.spec.clusterIP}')

Test DNS resolution:

# A record
dig @${DNS_IP} www.example.com A

# AAAA record
dig @${DNS_IP} www.example.com AAAA

# MX record
dig @${DNS_IP} example.com MX

# TXT record
dig @${DNS_IP} example.com TXT

# CNAME record
dig @${DNS_IP} api.example.com CNAME

Expected responses:

; www.example.com A
www.example.com.    300    IN    A    192.0.2.10

; www.example.com AAAA
www.example.com.    300    IN    AAAA    2001:db8::10

; example.com MX
example.com.        3600   IN    MX    10 mail.example.com.

; example.com TXT
example.com.        3600   IN    TXT   "v=spf1 mx -all"

; api.example.com CNAME
api.example.com.    3600   IN    CNAME www.example.com.

Port Forward for External Testing

# Forward DNS port to localhost
kubectl port-forward -n dns-system svc/simple-dns 5353:53

# Test from local machine
dig @localhost -p 5353 www.example.com

Monitoring

Check Status

# Instance status
kubectl get bind9instance simple-dns -n dns-system -o yaml | grep -A 10 status

# Zone status
kubectl get dnszone example-com -n dns-system -o yaml | grep -A 10 status

# Record status
kubectl get arecord www-a-record -n dns-system -o yaml | grep -A 10 status

View Logs

# Controller logs
kubectl logs -n dns-system deployment/bindy

# BIND9 logs
kubectl logs -n dns-system -l app=bindy,dns-role=primary

Updating Configuration

Add New Record

cat <<EOF | kubectl apply -f -
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: app-a-record
  namespace: dns-system
spec:
  zone: "example-com"
  name: "app"
  ipv4Address: "192.0.2.30"
  ttl: 300
EOF

Update SOA Serial

kubectl edit dnszone example-com -n dns-system

# Update serial field:
# serial: 2024010102

Scale Instance

kubectl patch bind9instance simple-dns -n dns-system \
  --type merge \
  --patch '{"spec":{"replicas":2}}'

Cleanup

Remove All Resources

kubectl delete -f simple-dns.yaml

Remove Namespace

kubectl delete namespace dns-system

Next Steps

Production Setup - Add HA and monitoring
Multi-Region Setup - Geographic distribution
Operations Guide - Monitoring and troubleshooting

Troubleshooting

Pods Not Starting

# Check pod events
kubectl describe pod -n dns-system -l app=bindy

# Check controller logs
kubectl logs -n dns-system deployment/bindy

DNS Not Resolving

# Check zone status
kubectl get dnszone example-com -n dns-system -o yaml

# Check BIND9 logs
kubectl logs -n dns-system -l app=bindy,dns-role=primary

# Verify zone file
kubectl exec -n dns-system -it <pod-name> -- cat /var/lib/bind/zones/example.com.zone

Record Not Appearing

# Check record status
kubectl get arecord www-a-record -n dns-system -o yaml

# Check zone record count
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.recordCount}'

Production Setup Example

Production-ready configuration with high availability, monitoring, and security.

Overview

This example demonstrates:

Primary/Secondary HA setup
Multiple replicas with pod anti-affinity
Resource limits and requests
PodDisruptionBudgets
DNSSEC enabled
Monitoring and logging
Production-grade security

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      Production DNS                          │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│   Primary Instances (2 replicas)                            │
│   ┌──────────────┐  ┌──────────────┐                       │
│   │   Primary-1  │  │   Primary-2  │                       │
│   │  (us-east-1a)│  │  (us-east-1b)│                       │
│   └──────┬───────┘  └──────┬───────┘                       │
│          │                  │                               │
│          └──────────┬───────┘                               │
│                     │ Zone Transfer (AXFR/IXFR)            │
│          ┌──────────┴───────┐                               │
│          │                  │                               │
│   ┌──────▼───────┐  ┌──────▼───────┐                       │
│   │ Secondary-1  │  │ Secondary-2  │                       │
│   │ (us-west-2a) │  │ (us-west-2b) │                       │
│   └──────────────┘  └──────────────┘                       │
│                                                              │
│   Secondary Instances (2 replicas)                          │
└─────────────────────────────────────────────────────────────┘

Complete Configuration

Save as production-dns.yaml:

---
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: dns-system-prod
  labels:
    environment: production

---
# ConfigMap for Controller Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: bindy-config
  namespace: dns-system-prod
data:
  RUST_LOG: "info"
  RECONCILE_INTERVAL: "300"

---
# Primary Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-dns
  namespace: dns-system-prod
  labels:
    app: bindy
    dns-role: primary
    environment: production
    component: dns-server
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.0.2.0/24"  # Secondary instance subnet
    dnssec:
      enabled: true
      validation: false
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# Secondary Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-dns
  namespace: dns-system-prod
  labels:
    app: bindy
    dns-role: secondary
    environment: production
    component: dns-server
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    dnssec:
      enabled: false
      validation: true
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# PodDisruptionBudget for Primary
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: primary-dns-pdb
  namespace: dns-system-prod
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: bindy
      dns-role: primary

---
# PodDisruptionBudget for Secondary
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: secondary-dns-pdb
  namespace: dns-system-prod
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: bindy
      dns-role: secondary

---
# DNSZone - Primary
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-primary
  namespace: dns-system-prod
spec:
  zoneName: "example.com"
  zoneType: "primary"
  instanceSelector:
    matchLabels:
      dns-role: primary
  soaRecord:
    primaryNs: "ns1.example.com."
    adminEmail: "dns-admin@example.com"
    serial: 2024010101
    refresh: 900       # 15 minutes - production refresh
    retry: 300         # 5 minutes
    expire: 604800     # 1 week
    negativeTtl: 300   # 5 minutes
  ttl: 300  # 5 minutes default TTL

---
# DNSZone - Secondary
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
  namespace: dns-system-prod
spec:
  zoneName: "example.com"
  zoneType: "secondary"
  instanceSelector:
    matchLabels:
      dns-role: secondary
  secondaryConfig:
    primaryServers:
      - "10.0.1.10"
      - "10.0.1.11"
  ttl: 300

---
# Production DNS Records
# Nameservers
---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns1-primary
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "ns1"
  ipv4Address: "192.0.2.1"
  ttl: 86400  # 24 hours for NS records

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns2-secondary
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "ns2"
  ipv4Address: "192.0.2.2"
  ttl: 86400

---
# Load Balanced Web Servers (Round Robin)
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-lb-1
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "www"
  ipv4Address: "192.0.2.10"
  ttl: 60  # 1 minute for load balanced IPs

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-lb-2
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "www"
  ipv4Address: "192.0.2.11"
  ttl: 60

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-lb-3
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "www"
  ipv4Address: "192.0.2.12"
  ttl: 60

---
# Dual Stack for www
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-v6-1
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "www"
  ipv6Address: "2001:db8::10"
  ttl: 60

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: AAAARecord
metadata:
  name: www-v6-2
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "www"
  ipv6Address: "2001:db8::11"
  ttl: 60

---
# Mail Infrastructure
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: mail1
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "mail1"
  ipv4Address: "192.0.2.20"
  ttl: 3600

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: mail2
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "mail2"
  ipv4Address: "192.0.2.21"
  ttl: 3600

---
# MX Records - Primary and Backup
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-primary
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  priority: 10
  mailServer: "mail1.example.com."
  ttl: 3600

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: MXRecord
metadata:
  name: mx-backup
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  priority: 20
  mailServer: "mail2.example.com."
  ttl: 3600

---
# SPF Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: spf
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  text:
    - "v=spf1 mx ip4:192.0.2.20/32 ip4:192.0.2.21/32 -all"
  ttl: 3600

---
# DKIM Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dkim
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "default._domainkey"
  text:
    - "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQC..."
  ttl: 3600

---
# DMARC Record
apiVersion: bindy.firestoned.io/v1alpha1
kind: TXTRecord
metadata:
  name: dmarc
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "_dmarc"
  text:
    - "v=DMARC1; p=quarantine; pct=100; rua=mailto:dmarc-reports@example.com; ruf=mailto:dmarc-forensics@example.com"
  ttl: 3600

---
# CAA Records
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-issue
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  flags: 0
  tag: "issue"
  value: "letsencrypt.org"
  ttl: 86400

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-issuewild
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  flags: 0
  tag: "issuewild"
  value: "letsencrypt.org"
  ttl: 86400

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: CAARecord
metadata:
  name: caa-iodef
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "@"
  flags: 0
  tag: "iodef"
  value: "mailto:security@example.com"
  ttl: 86400

---
# Service Records
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-sip-tcp
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "_sip._tcp"
  priority: 10
  weight: 60
  port: 5060
  target: "sip1.example.com."
  ttl: 3600

---
# CDN CNAME
apiVersion: bindy.firestoned.io/v1alpha1
kind: CNAMERecord
metadata:
  name: cdn
  namespace: dns-system-prod
spec:
  zone: "example-com-primary"
  name: "cdn"
  target: "d123456.cloudfront.net."
  ttl: 3600

Deployment

1. Prerequisites

# Create namespace
kubectl create namespace dns-system-prod

# Label nodes for DNS pods (optional but recommended)
kubectl label nodes node1 dns-zone=primary
kubectl label nodes node2 dns-zone=primary
kubectl label nodes node3 dns-zone=secondary
kubectl label nodes node4 dns-zone=secondary

2. Deploy

kubectl apply -f production-dns.yaml

3. Verify

# Check all instances
kubectl get bind9instances -n dns-system-prod
kubectl get dnszones -n dns-system-prod
kubectl get pods -n dns-system-prod -o wide

# Check PodDisruptionBudgets
kubectl get pdb -n dns-system-prod

# Verify HA distribution
kubectl get pods -n dns-system-prod -o custom-columns=\
NAME:.metadata.name,\
NODE:.spec.nodeName,\
ROLE:.metadata.labels.dns-role

Monitoring

Prometheus Metrics

apiVersion: v1
kind: Service
metadata:
  name: bindy-metrics
  namespace: dns-system-prod
  labels:
    app: bindy
spec:
  ports:
    - name: metrics
      port: 9090
      targetPort: 9090
  selector:
    app: bindy

ServiceMonitor (for Prometheus Operator)

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: bindy-dns
  namespace: dns-system-prod
spec:
  selector:
    matchLabels:
      app: bindy
  endpoints:
    - port: metrics
      interval: 30s

Backup and Disaster Recovery

Backup Zones

#!/bin/bash
# backup-zones.sh

NAMESPACE="dns-system-prod"
BACKUP_DIR="./dns-backups/$(date +%Y%m%d)"

mkdir -p "$BACKUP_DIR"

# Backup all zones
kubectl get dnszones -n $NAMESPACE -o yaml > "$BACKUP_DIR/zones.yaml"

# Backup all records
kubectl get arecords,aaaarecords,cnamerecords,mxrecords,txtrecords,nsrecords,srvrecords,caarecords \
  -n $NAMESPACE -o yaml > "$BACKUP_DIR/records.yaml"

echo "Backup completed: $BACKUP_DIR"

Restore

kubectl apply -f dns-backups/20240115/zones.yaml
kubectl apply -f dns-backups/20240115/records.yaml

Security Hardening

Network Policies

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dns-allow-queries
  namespace: dns-system-prod
spec:
  podSelector:
    matchLabels:
      app: bindy
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53

Pod Security Standards

apiVersion: v1
kind: Namespace
metadata:
  name: dns-system-prod
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted

Performance Tuning

Resource Limits

spec:
  resources:
    requests:
      memory: "512Mi"
      cpu: "500m"
    limits:
      memory: "1Gi"
      cpu: "1000m"

HorizontalPodAutoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: primary-dns-hpa
  namespace: dns-system-prod
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: primary-dns
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Testing

Load Testing

# Using dnsperf
dnsperf -s <DNS_IP> -d queries.txt -c 100 -l 60

# queries.txt format:
# www.example.com A
# mail1.example.com A
# example.com MX

Failover Testing

# Delete primary pod to test failover
kubectl delete pod -n dns-system-prod -l dns-role=primary --force

# Monitor DNS continues to serve
dig @<DNS_IP> www.example.com

High Availability Guide
Monitoring Guide
Security Guide

Multi-Region Setup Example

Geographic distribution for global DNS resilience and performance.

Overview

This example demonstrates:

Primary instances in multiple regions
Secondary instances for redundancy
Zone replication across regions
Anycast for geographic load balancing
Cross-region monitoring

Architecture

┌────────────────────────────────────────────────────────────────────┐
│                        Global DNS Infrastructure                    │
└────────────────────────────────────────────────────────────────────┘

  Region 1: us-east-1           Region 2: us-west-2         Region 3: eu-west-1
┌─────────────────────┐      ┌─────────────────────┐     ┌─────────────────────┐
│  Primary Instances  │      │ Secondary Instances │     │ Secondary Instances │
│                     │      │                     │     │                     │
│  ┌────┐  ┌────┐   │◄─────┤  ┌────┐  ┌────┐    │◄────┤  ┌────┐  ┌────┐    │
│  │Pod1│  │Pod2│   │ AXFR │  │Pod1│  │Pod2│    │AXFR │  │Pod1│  │Pod2│    │
│  └────┘  └────┘   │      │  └────┘  └────┘    │     │  └────┘  └────┘    │
│                     │      │                     │     │                     │
│  DNSSEC: Enabled    │      │  DNSSEC: Verify    │     │  DNSSEC: Verify    │
│  Replicas: 2        │      │  Replicas: 2        │     │  Replicas: 2        │
└─────────────────────┘      └─────────────────────┘     └─────────────────────┘
         │                            │                            │
         └────────────────────────────┴────────────────────────────┘
                                      │
                              Anycast IP: 192.0.2.1
                        (Routes to nearest region)

Region 1: us-east-1 (Primary)

Save as region-us-east-1.yaml:

---
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: dns-system
  labels:
    region: us-east-1
    role: primary

---
# Primary Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: primary-us-east-1
  namespace: dns-system
  labels:
    app: bindy
    dns-role: primary
    region: us-east-1
    environment: production
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    allowTransfer:
      - "10.1.0.0/16"  # us-west-2 CIDR
      - "10.2.0.0/16"  # eu-west-1 CIDR
    dnssec:
      enabled: true
      validation: false
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# PodDisruptionBudget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: primary-dns-pdb
  namespace: dns-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      dns-role: primary
      region: us-east-1

---
# Primary DNSZone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-primary
  namespace: dns-system
spec:
  zoneName: "example.com"
  zoneType: "primary"
  instanceSelector:
    matchLabels:
      dns-role: primary
      region: us-east-1
  soaRecord:
    primaryNs: "ns1.us-east-1.example.com."
    adminEmail: "dns-admin@example.com"
    serial: 2024010101
    refresh: 900
    retry: 300
    expire: 604800
    negativeTtl: 300
  ttl: 300

---
# Nameserver Records
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns1-us-east-1
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "ns1.us-east-1"
  ipv4Address: "192.0.2.1"
  ttl: 86400

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns2-us-west-2
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "ns2.us-west-2"
  ipv4Address: "192.0.2.2"
  ttl: 86400

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: ns3-eu-west-1
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "ns3.eu-west-1"
  ipv4Address: "192.0.2.3"
  ttl: 86400

---
# Regional Web Servers
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-us-east-1
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "www.us-east-1"
  ipv4Address: "192.0.2.10"
  ttl: 60

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-us-west-2
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "www.us-west-2"
  ipv4Address: "192.0.2.20"
  ttl: 60

---
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-eu-west-1
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "www.eu-west-1"
  ipv4Address: "192.0.2.30"
  ttl: 60

---
# GeoDNS using SRV records for service discovery
apiVersion: bindy.firestoned.io/v1alpha1
kind: SRVRecord
metadata:
  name: srv-web-us-east
  namespace: dns-system
spec:
  zone: "example-com-primary"
  name: "_http._tcp.us-east-1"
  priority: 10
  weight: 100
  port: 80
  target: "www.us-east-1.example.com."
  ttl: 300

Region 2: us-west-2 (Secondary)

Save as region-us-west-2.yaml:

---
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: dns-system
  labels:
    region: us-west-2
    role: secondary

---
# Secondary Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-us-west-2
  namespace: dns-system
  labels:
    app: bindy
    dns-role: secondary
    region: us-west-2
    environment: production
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    dnssec:
      enabled: false
      validation: true
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# PodDisruptionBudget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: secondary-dns-pdb
  namespace: dns-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      dns-role: secondary
      region: us-west-2

---
# Secondary DNSZone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
  namespace: dns-system
spec:
  zoneName: "example.com"
  zoneType: "secondary"
  instanceSelector:
    matchLabels:
      dns-role: secondary
      region: us-west-2
  secondaryConfig:
    primaryServers:
      - "192.0.2.1"   # Primary in us-east-1
      - "192.0.2.2"
  ttl: 300

Region 3: eu-west-1 (Secondary)

Save as region-eu-west-1.yaml:

---
# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: dns-system
  labels:
    region: eu-west-1
    role: secondary

---
# Secondary Bind9Instance
apiVersion: bindy.firestoned.io/v1alpha1
kind: Bind9Instance
metadata:
  name: secondary-eu-west-1
  namespace: dns-system
  labels:
    app: bindy
    dns-role: secondary
    region: eu-west-1
    environment: production
spec:
  replicas: 2
  version: "9.18"
  config:
    recursion: false
    allowQuery:
      - "0.0.0.0/0"
    dnssec:
      enabled: false
      validation: true
    listenOn:
      - "any"
    listenOnV6:
      - "any"

---
# PodDisruptionBudget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: secondary-dns-pdb
  namespace: dns-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      dns-role: secondary
      region: eu-west-1

---
# Secondary DNSZone
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: example-com-secondary
  namespace: dns-system
spec:
  zoneName: "example.com"
  zoneType: "secondary"
  instanceSelector:
    matchLabels:
      dns-role: secondary
      region: eu-west-1
  secondaryConfig:
    primaryServers:
      - "192.0.2.1"   # Primary in us-east-1
      - "192.0.2.2"
  ttl: 300

Deployment

1. Deploy to Each Region

# us-east-1
kubectl apply -f region-us-east-1.yaml --context us-east-1

# us-west-2
kubectl apply -f region-us-west-2.yaml --context us-west-2

# eu-west-1
kubectl apply -f region-eu-west-1.yaml --context eu-west-1

2. Verify Replication

# Check zone transfer from primary
kubectl exec -n dns-system -it <primary-pod> -- \
  dig @localhost example.com AXFR

# Verify secondary received zone
kubectl exec -n dns-system -it <secondary-pod> -- \
  dig @localhost example.com SOA

3. Configure Anycast (Infrastructure Level)

This requires network infrastructure support:

# Example using MetalLB for on-premises
apiVersion: v1
kind: Service
metadata:
  name: dns-anycast
  namespace: dns-system
  annotations:
    metallb.universe.tf/address-pool: anycast-pool
spec:
  type: LoadBalancer
  loadBalancerIP: 192.0.2.1  # Same IP in all regions
  selector:
    app: bindy
  ports:
    - protocol: UDP
      port: 53
      targetPort: 53

Cross-Region Monitoring

Prometheus Federation

# Global Prometheus Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s
    
    scrape_configs:
      # us-east-1
      - job_name: 'dns-us-east-1'
        static_configs:
          - targets: ['prometheus.us-east-1.example.com:9090']
        metric_relabel_configs:
          - source_labels: [__name__]
            regex: 'dns_.*'
            action: keep
      
      # us-west-2
      - job_name: 'dns-us-west-2'
        static_configs:
          - targets: ['prometheus.us-west-2.example.com:9090']
        metric_relabel_configs:
          - source_labels: [__name__]
            regex: 'dns_.*'
            action: keep
      
      # eu-west-1
      - job_name: 'dns-eu-west-1'
        static_configs:
          - targets: ['prometheus.eu-west-1.example.com:9090']
        metric_relabel_configs:
          - source_labels: [__name__]
            regex: 'dns_.*'
            action: keep

Health Checks

#!/bin/bash
# health-check-multi-region.sh

REGIONS=("us-east-1" "us-west-2" "eu-west-1")
QUERY="www.example.com"

for region in "${REGIONS[@]}"; do
  echo "Checking $region..."
  
  # Get DNS service IP
  DNS_IP=$(kubectl get svc -n dns-system --context $region \
    -o jsonpath='{.items[0].status.loadBalancer.ingress[0].ip}')
  
  # Test query
  if dig @$DNS_IP $QUERY +short > /dev/null; then
    echo "✓ $region: OK"
  else
    echo "✗ $region: FAILED"
  fi
done

Disaster Recovery

Regional Failover

# Promote secondary in us-west-2 to primary
kubectl patch bind9instance secondary-us-west-2 \
  -n dns-system --context us-west-2 \
  --type merge \
  --patch '{"metadata":{"labels":{"dns-role":"primary"}}}'

# Update zone to primary
kubectl patch dnszone example-com-secondary \
  -n dns-system --context us-west-2 \
  --type merge \
  --patch '{"spec":{"zoneType":"primary"}}'

Backup Strategy

#!/bin/bash
# backup-all-regions.sh

REGIONS=("us-east-1" "us-west-2" "eu-west-1")
BACKUP_DIR="./multi-region-backups/$(date +%Y%m%d)"

mkdir -p "$BACKUP_DIR"

for region in "${REGIONS[@]}"; do
  echo "Backing up $region..."
  
  kubectl get dnszones,arecords,aaaarecords,cnamerecords,mxrecords,txtrecords \
    -n dns-system --context $region -o yaml \
    > "$BACKUP_DIR/$region.yaml"
done

echo "Backup completed: $BACKUP_DIR"

Performance Testing

Global Latency Test

#!/bin/bash
# test-global-latency.sh

REGIONS=(
  "us-east-1:192.0.2.1"
  "us-west-2:192.0.2.2"
  "eu-west-1:192.0.2.3"
)

for region_ip in "${REGIONS[@]}"; do
  region="${region_ip%%:*}"
  ip="${region_ip##*:}"
  
  echo "Testing $region ($ip)..."
  
  # Measure query time
  time dig @$ip www.example.com +short
done

Load Distribution

# Using dnsperf across regions
for region in us-east-1 us-west-2 eu-west-1; do
  dnsperf -s $DNS_IP -d queries.txt -c 50 -l 30 -Q 1000 | \
    tee results-$region.txt
done

Cost Optimization

Regional Scaling

# HPA for each region based on local load
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: dns-hpa-us-east-1
  namespace: dns-system
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: primary-us-east-1
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Compliance and Data Residency

Regional Data Isolation

# EU-specific zone for GDPR compliance
apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: eu-example-com
  namespace: dns-system
  labels:
    compliance: gdpr
spec:
  zoneName: "eu.example.com"
  zoneType: "primary"
  instanceSelector:
    matchLabels:
      region: eu-west-1
  soaRecord:
    primaryNs: "ns1.eu-west-1.example.com."
    adminEmail: "dpo@example.com"
    serial: 2024010101
    refresh: 900
    retry: 300
    expire: 604800
    negativeTtl: 300

Multi-Region Deployment Guide
Replication Strategies
High Availability
Zone Transfers

API Documentation (rustdoc)

The complete API documentation is generated from Rust source code and is available separately.

Viewing API Documentation

Online

Visit the API Reference section of the documentation site.

Locally

Build and view the API documentation:

# Build API docs
cargo doc --no-deps --all-features

# Open in browser
cargo doc --no-deps --all-features --open

Or build the complete documentation (user guide + API):

make docs-serve
# Navigate to http://localhost:3000/rustdoc/bindy/index.html

What’s in the API Documentation

The rustdoc API documentation includes:

Module Documentation - All public modules and their organization
Struct Definitions - Complete CRD type definitions (Bind9Instance, DNSZone, etc.)
Function Signatures - All public functions with parameter types and return values
Examples - Code examples showing how to use the API
Type Documentation - Detailed information about all public types
Trait Implementations - All trait implementations for types

Key Modules

bindy::crd - Custom Resource Definitions
bindy::reconcilers - Controller reconciliation logic
bindy::bind9 - BIND9 zone file management
bindy::bind9_resources - Kubernetes resource builders

Direct Links

When the documentation is built, you can access:

Main API Index: rustdoc/bindy/index.html
CRD Module: rustdoc/bindy/crd/index.html
Reconcilers: rustdoc/bindy/reconcilers/index.html

Changelog

All notable changes to Bindy will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Fixed

DNSZone tight reconciliation loop - Added status change detection to prevent unnecessary status updates and reconciliation cycles (2025-12-01)

Added

Comprehensive documentation with mdBook and rustdoc
GitHub Pages deployment workflow
Status update optimization documentation in performance guide

[0.1.0] - 2024-01-01

Added

Initial release of Bindy
Bind9Instance CRD for managing BIND9 DNS server instances
DNSZone CRD with label selector support
DNS record CRDs: A, AAAA, CNAME, MX, TXT, NS, SRV, CAA
Reconciliation controllers for all resource types
BIND9 zone file generation
Status subresources for all CRDs
RBAC configuration
Docker container support
Comprehensive test suite
CI/CD with GitHub Actions
Integration tests with Kind

Features

High-performance Rust implementation
Async/await with Tokio runtime
Label-based instance targeting
Primary and secondary DNS support
Multi-region deployment support
Full status reporting
Kubernetes 1.24+ support

License

Bindy is licensed under the MIT License.

SPDX-License-Identifier: MIT

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

What This Means for You

The MIT License is one of the most permissive open source licenses. Here’s what it allows:

✅ You Can

Use commercially - Use Bindy in your commercial products and services
Modify - Change the code to fit your needs
Distribute - Share the original or your modified version
Sublicense - Include Bindy in proprietary software
Private use - Use Bindy for private/internal purposes without releasing your modifications

⚠️ Requirements

Include the license - Include the copyright notice and license text in substantial portions of the software
State changes - Document any modifications you make (recommended best practice)

❌ Limitations

No warranty - The software is provided “as is” without warranty of any kind
No liability - The authors are not liable for any damages arising from the use of the software

SPDX License Identifiers

All source code files in this project include SPDX license identifiers for machine-readable license information:

#![allow(unused)]
fn main() {
// Copyright (c) 2025 Erick Bourgeois, firestoned
// SPDX-License-Identifier: MIT
}

This makes it easy for automated tools to:

Scan the codebase for license compliance
Generate Software Bill of Materials (SBOM)
Verify license compatibility

Learn more about SPDX at https://spdx.dev/

Software Bill of Materials (SBOM)

Bindy provides SBOM files in CycloneDX format with every release. These include:

Binary SBOMs for each platform (Linux, macOS, Windows)
Docker image SBOM
Complete dependency tree with license information

SBOMs are available as release assets and can be used for:

Supply chain security
Vulnerability scanning
License compliance auditing
Dependency tracking

Third-Party Licenses

Bindy depends on various open-source libraries. All dependencies are permissively licensed and compatible with the MIT License.

Key Dependencies

Library	License	Purpose
kube-rs	Apache 2.0 / MIT	Kubernetes client library
tokio	MIT	Async runtime
serde	Apache 2.0 / MIT	Serialization framework
tracing	MIT	Structured logging
anyhow	Apache 2.0 / MIT	Error handling
thiserror	Apache 2.0 / MIT	Error derivation

Generating License Reports

For a complete list of all dependencies and their licenses:

# Install cargo-license tool
cargo install cargo-license

# Generate license report
cargo license

# Generate detailed license report with full license text
cargo license --json > licenses.json

You can also use cargo-about for more detailed license auditing:

cargo install cargo-about
cargo about generate about.hbs > licenses.html

Container Image Licenses

The Docker images for Bindy include:

Base Image: Alpine Linux (MIT License)
BIND9: ISC License (permissive, BSD-style)
Bindy Binary: MIT License

All components are open source and permissively licensed.

Contributing

By contributing to Bindy, you agree that:

Your contributions will be licensed under the MIT License
You have the right to submit the contributions
You grant the project maintainers a perpetual, worldwide, non-exclusive, royalty-free license to use your contributions

See the Contributing Guidelines for more information on how to contribute.

License Compatibility

The MIT License is compatible with most other open source licenses, including:

✅ Apache License 2.0
✅ BSD licenses (2-clause, 3-clause)
✅ GPL v2 and v3 (one-way compatible - MIT code can be included in GPL projects)
✅ ISC License
✅ Other MIT-licensed code

This makes Bindy easy to integrate into various projects and environments.

Questions About Licensing

If you have questions about:

Using Bindy in your project
License compliance
Contributing to Bindy
Third-party dependencies

Please open a GitHub Discussion or contact the maintainers.

Keyboard shortcuts

Bindy - BIND9 DNS Controller for Kubernetes