Common Issues

Solutions to frequently encountered problems.

Bind9Instance Issues

Pods Not Starting

Symptom: Bind9Instance created but pods not running

Diagnosis:

kubectl get pods -n dns-system -l instance=primary-dns
kubectl describe pod -n dns-system <pod-name>

Common Causes:

  1. Image pull errors - Check image name and registry access
  2. Resource limits - Insufficient CPU/memory on nodes
  3. RBAC issues - ServiceAccount lacks permissions

Solution:

# Check events
kubectl get events -n dns-system

# Fix resource limits
kubectl edit bind9instance primary-dns -n dns-system
# Increase resources.requests and resources.limits

# Verify RBAC
kubectl auth can-i create deployments \
  --as=system:serviceaccount:dns-system:bindy
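
If the cause is the image or node resources rather than RBAC, these standard checks narrow it down quickly (label selector and namespace taken from the diagnosis above):

# Show the waiting reason for stuck containers (e.g. ImagePullBackOff)
kubectl get pods -n dns-system -l instance=primary-dns \
  -o jsonpath='{.items[*].status.containerStatuses[*].state.waiting.reason}'

# See how much CPU/memory is already allocated on each node
kubectl describe nodes | grep -A 8 "Allocated resources"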

ConfigMap Not Created

Symptom: ConfigMap missing for Bind9Instance

Diagnosis:

kubectl get configmap -n dns-system
kubectl logs -n dns-system deployment/bindy | grep ConfigMap

Solution:

# Check controller logs for errors
kubectl logs -n dns-system deployment/bindy --tail=50

# Delete and recreate instance
kubectl delete bind9instance primary-dns -n dns-system
kubectl apply -f instance.yaml
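
After recreating the instance, confirm the controller produced the ConfigMap and that it contains a BIND9 configuration. The exact ConfigMap name depends on your instance name, so substitute it below:

# Watch the controller reconcile the new instance
kubectl logs -n dns-system deployment/bindy -f | grep -i configmap

# Inspect the generated configuration
kubectl get configmap -n dns-system
kubectl describe configmap <configmap-name> -n dns-system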

DNSZone Issues

No Instances Match Selector

Symptom: DNSZone status shows “No Bind9Instances matched selector”

Diagnosis:

kubectl get bind9instances -n dns-system --show-labels
kubectl get dnszone example-com -n dns-system -o yaml | yq '.spec.instanceSelector'

Solution:

# Verify labels on instances
kubectl label bind9instance primary-dns dns-role=primary -n dns-system

# Or update zone selector
kubectl edit dnszone example-com -n dns-system
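
Assuming instanceSelector behaves like a standard Kubernetes label selector, every key/value it lists must be present on the instance. Printing both side by side makes a mismatch easy to spot:

# Labels the zone is selecting on
kubectl get dnszone example-com -n dns-system -o jsonpath='{.spec.instanceSelector}'

# Labels actually present on the instance
kubectl get bind9instance primary-dns -n dns-system -o jsonpath='{.metadata.labels}'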

Zone File Not Created

Symptom: Zone exists but no zone file in BIND9

Diagnosis:

kubectl exec -n dns-system deployment/primary-dns -- ls -la /var/lib/bind/zones/
kubectl logs -n dns-system deployment/bindy | grep "example-com"

Solution:

# Check if zone reconciliation succeeded
kubectl describe dnszone example-com -n dns-system

# Trigger reconciliation by updating zone
kubectl annotate dnszone example-com reconcile=true -n dns-system
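
If the zone file exists but BIND9 refuses to load it, validating the file directly can surface syntax or SOA problems. This assumes the container image ships the standard named-checkzone utility:

# Validate the generated zone file inside the primary pod
kubectl exec -n dns-system deployment/primary-dns -- \
  named-checkzone example.com /var/lib/bind/zones/example.com.zone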

DNS Record Issues

DNSZone Not Found

Symptom: Controller logs show “DNSZone not found” errors for a zone that exists

Example Error:

ERROR Failed to find DNSZone for zone 'internal-local' in namespace 'dns-system'

Root Cause: Mismatch between how the record references the zone and the actual DNSZone fields.

Diagnosis:

# Check what the record is trying to reference
kubectl get arecord www-example -n dns-system -o yaml | grep -A2 spec:

# Check available DNSZones
kubectl get dnszones -n dns-system

# Check the DNSZone details
kubectl get dnszone example-com -n dns-system -o yaml

Understanding the Problem:

DNS records can reference zones using two different fields:

  1. zone field - Matches against DNSZone.spec.zoneName (the actual DNS zone name like example.com)
  2. zoneRef field - Matches against DNSZone.metadata.name (the Kubernetes resource name like example-com)

Common mistakes:

  • Using zone: internal-local when spec.zoneName: internal.local (dots vs dashes)
  • Using zone: example-com when it should be zone: example.com
  • Using zoneRef: example.com when it should be zoneRef: example-com

Solution:

Option 1: Use zone field with the actual DNS zone name

spec:
  zone: example.com  # Must match DNSZone spec.zoneName
  name: www

Option 2: Use zoneRef field with the resource name (recommended)

spec:
  zoneRef: example-com  # Must match DNSZone metadata.name
  name: www

Example Fix:

Given this DNSZone:

apiVersion: bindy.firestoned.io/v1alpha1
kind: DNSZone
metadata:
  name: internal-local      # ← Resource name
  namespace: dns-system
spec:
  zoneName: internal.local  # ← Actual zone name

Wrong:

spec:
  zone: internal-local  # ✗ This looks for spec.zoneName = "internal-local"

Correct:

# Method 1: Use actual zone name
spec:
  zone: internal.local  # ✓ Matches spec.zoneName

# Method 2: Use resource name (more efficient)
spec:
  zoneRef: internal-local  # ✓ Matches metadata.name

Verification:

# After fixing, check the record reconciles
kubectl describe arecord www-example -n dns-system

# Should see no errors in events
kubectl get events -n dns-system --sort-by='.lastTimestamp' | tail -10

See Records Guide - Referencing DNS Zones for more details.

Record Not Appearing in Zone

Symptom: ARecord created but not in zone file

Diagnosis:

# Check record status
kubectl get arecord www-example -n dns-system -o yaml

# Check zone file
kubectl exec -n dns-system deployment/primary-dns -- cat /var/lib/bind/zones/example.com.zone

Solution:

# Verify zone reference is correct (use zone or zoneRef)
kubectl get arecord www-example -n dns-system -o yaml | grep -E 'zone:|zoneRef:'

# Check available DNSZones
kubectl get dnszones -n dns-system

# Update if incorrect - use zone (matches spec.zoneName) or zoneRef (matches metadata.name)
kubectl edit arecord www-example -n dns-system
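
Instead of an interactive edit, the reference can also be fixed non-interactively with a merge patch; the zoneRef value here is the example resource name used throughout this page:

# Point the record at the DNSZone resource by name
kubectl patch arecord www-example -n dns-system --type=merge \
  -p '{"spec":{"zoneRef":"example-com"}}'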

DNS Query Not Resolving

Symptom: dig/nslookup fails to resolve

Diagnosis:

# Get DNS service IP
SERVICE_IP=$(kubectl get svc primary-dns -n dns-system -o jsonpath='{.spec.clusterIP}')

# Test query
dig @$SERVICE_IP www.example.com

# Check BIND9 logs
kubectl logs -n dns-system -l instance=primary-dns | tail -20

Solutions:

  1. Record doesn’t exist (a sketch of record.yaml follows this list):

    kubectl get arecords -n dns-system
    kubectl apply -f record.yaml

  2. Zone not loaded:

    kubectl logs -n dns-system -l instance=primary-dns | grep "loaded serial"

  3. Network policy blocking:

    kubectl get networkpolicies -n dns-system
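
The record.yaml referenced in the first item might look like the sketch below. The apiVersion and zoneRef come from the examples earlier on this page, but the field holding the IP address is an assumption, so check the ARecord CRD for the exact name:

kubectl apply -f - <<'EOF'
apiVersion: bindy.firestoned.io/v1alpha1
kind: ARecord
metadata:
  name: www-example
  namespace: dns-system
spec:
  zoneRef: example-com   # matches DNSZone metadata.name
  name: www
  # the address field name below is assumed; verify against the ARecord CRD
  ipAddress: 192.0.2.10
EOF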

Zone Transfer Issues

Secondary Not Receiving Transfers

Symptom: Secondary instance not getting zone updates

Diagnosis:

# Check secondary logs
kubectl logs -n dns-system -l dns-role=secondary | grep transfer

# Check if zone has secondary IPs configured
kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.secondaryIps}'

# Check if secondaries are discovered
kubectl get bind9instance -n dns-system -l role=secondary -o jsonpath='{.items[*].status.podIP}'

Automatic Configuration:

As of v0.1.0, Bindy automatically discovers secondary IPs and configures zone transfers:

  • Secondary pods are discovered via Kubernetes API using label selectors (role=secondary)
  • Primary zones are configured with also-notify and allow-transfer directives
  • Secondary IPs are stored in DNSZone.status.secondaryIps for tracking
  • When secondary pods restart/reschedule and get new IPs, zones are automatically updated
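
The result is visible in the zone stanza BIND9 loads on the primary. The config path below is an assumption (adjust it to wherever your image writes the generated configuration); the IPs are the example secondary pod IPs used elsewhere on this page:

# Inspect the zone stanza on the primary
kubectl exec -n dns-system deployment/primary-dns -- \
  grep -A 6 'zone "example.com"' /etc/bind/named.conf.local

# Illustrative output:
#   zone "example.com" {
#       type primary;
#       file "/var/lib/bind/zones/example.com.zone";
#       also-notify { 10.244.1.5; 10.244.2.8; };
#       allow-transfer { 10.244.1.5; 10.244.2.8; };
#   };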

Manual Verification:

# Check if zone has secondary IPs in status
kubectl get dnszone example-com -n dns-system -o yaml | yq '.status.secondaryIps'

# Expected output: List of secondary pod IPs
# - 10.244.1.5
# - 10.244.2.8

# Verify zone configuration on primary
kubectl exec -n dns-system deployment/primary-dns -- \
  curl -s localhost:8080/api/zones/example.com | jq '.alsoNotify, .allowTransfer'

If Automatic Configuration Fails:

  1. Verify secondary instances are labeled correctly:

    kubectl get bind9instance -n dns-system -o yaml | yq '.items[].metadata.labels'
    
    # Expected labels for secondaries:
    # role: secondary
    # cluster: <cluster-name>
    
  2. Check DNSZone reconciler logs:

    kubectl logs -n dns-system deployment/bindy | grep "secondary"
    
  3. Verify network connectivity:

    # Test AXFR from secondary to primary
    kubectl exec -n dns-system deployment/secondary-dns -- \
      dig @primary-dns-service AXFR example.com
    

Recovery After Secondary Pod Restart:

When secondary pods are rescheduled and get new IPs:

  1. Detection: The reconciler automatically detects the IP change at the next reconciliation (typically within 5-10 minutes)
  2. Update: Zones are deleted and recreated with new secondary IPs
  3. Transfer: Zone transfers resume automatically with new IPs

Manual Trigger (if needed):

# Force reconciliation by updating zone annotation
kubectl annotate dnszone example-com -n dns-system \
  reconcile.bindy.firestoned.io/trigger="$(date +%s)" --overwrite
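
After the trigger, confirm the status picked up the new pod IPs (same field as in the diagnosis above):

kubectl get dnszone example-com -n dns-system -o jsonpath='{.status.secondaryIps}'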

Performance Issues

High Query Latency

Symptom: DNS queries taking too long

Diagnosis:

# Test query time
time dig @$SERVICE_IP example.com

# Check resource usage
kubectl top pods -n dns-system -l instance=primary-dns

Solutions:

  1. Increase resources:

    spec:
      resources:
        limits:
          cpu: "1000m"
          memory: "1Gi"

  2. Add more replicas:

    spec:
      replicas: 3

  3. Enable caching (if appropriate for your use case)
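
After applying any of these changes, a quick latency sample makes the improvement measurable; SERVICE_IP is the cluster IP obtained in the diagnosis above:

# Run ten queries and print only the measured query time
for i in $(seq 1 10); do
  dig @$SERVICE_IP example.com +noall +stats | grep "Query time"
done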

RBAC Issues

Forbidden Errors in Logs

Symptom: Controller logs show “Forbidden” errors

Diagnosis:

kubectl logs -n dns-system deployment/bindy | grep Forbidden

# Check permissions
kubectl auth can-i create deployments \
  --as=system:serviceaccount:dns-system:bindy \
  -n dns-system

Solution:

# Reapply RBAC
kubectl apply -f deploy/rbac/

# Verify ClusterRoleBinding
kubectl get clusterrolebinding bindy-rolebinding -o yaml

# Restart controller
kubectl rollout restart deployment/bindy -n dns-system
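
To see the full set of permissions the controller's ServiceAccount actually has, rather than probing one verb at a time, list them:

kubectl auth can-i --list \
  --as=system:serviceaccount:dns-system:bindy \
  -n dns-system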

Next Steps