Kubernetes Production Secrets: Advanced Patterns for 2025
Cap
8 min read
kubernetes, production, devops, scalability, monitoring
Real-world lessons from operating 50+ clusters across multiple environments
🎯 The 2025 Kubernetes Reality
After 5 years of running Kubernetes in production, managing 15,000+ pods across 50+ clusters, here are the patterns that separate successful deployments from disasters.
🔧 Advanced Resource Management
1. Dynamic Resource Allocation with VPA
# Production VPA configuration that actually works
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-service-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  updatePolicy:
    updateMode: "Auto"
    minReplicas: 3
  resourcePolicy:
    containerPolicies:
    - containerName: api-container
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2
        memory: 4Gi
      controlledResources: ["cpu", "memory"]
      controlledValues: RequestsAndLimits
---
# Custom resource recommendations based on traffic patterns
apiVersion: v1
kind: ConfigMap
metadata:
  name: resource-profiles
data:
  low-traffic.yaml: |
    cpu: 200m
    memory: 256Mi
  medium-traffic.yaml: |
    cpu: 500m
    memory: 512Mi
  high-traffic.yaml: |
    cpu: 1
    memory: 1Gi
  peak-traffic.yaml: |
    cpu: 2
    memory: 2Gi
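Before trusting "Auto" mode on a new service, it can help to run a VPA in recommendation-only mode for a while and compare its suggestions against the traffic profiles above. A minimal sketch, with illustrative names:

# Recommendation-only VPA: reports suggested requests without evicting pods
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: new-service-vpa          # hypothetical name
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: new-service            # hypothetical workload
  updatePolicy:
    updateMode: "Off"            # recommendations only, no pod updates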
2. Multi-Tier Node Allocation Strategy
# Node affinity for different workload tiers
apiVersion: apps/v1
kind: Deployment
metadata:
  name: critical-service
spec:
  replicas: 5
  selector:
    matchLabels:
      app: critical-service
  template:
    metadata:
      labels:
        app: critical-service
    spec:
      nodeSelector:
        node-tier: "critical"
        spot-instance: "false"
      tolerations:
      - key: "critical-only"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values: ["critical-service"]
            topologyKey: kubernetes.io/hostname
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            preference:
              matchExpressions:
              - key: instance-type
                operator: In
                values: ["c5.2xlarge", "c5.4xlarge"]
      containers:
      - name: critical-service
        image: critical-service:1.2.3   # illustrative image
---
# Background jobs on spot instances
apiVersion: batch/v1
kind: Job
metadata:
  name: data-processing
spec:
  template:
    spec:
      nodeSelector:
        node-tier: "background"
        spot-instance: "true"
      tolerations:
      - key: "spot-instance"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
      containers:
      - name: data-processing
        image: data-processing:1.0.0    # illustrative image
      restartPolicy: OnFailure
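These tolerations and node selectors only work if the node pools actually carry the matching labels and taints. With managed node groups that is usually configured on the cloud provider's node pool rather than on individual nodes; a minimal sketch of what the resulting Node object is expected to look like (node name is illustrative):

apiVersion: v1
kind: Node
metadata:
  name: critical-node-1          # illustrative; real names come from the provider
  labels:
    node-tier: "critical"
    spot-instance: "false"
spec:
  taints:
  - key: "critical-only"
    value: "true"
    effect: "NoSchedule"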
🔒 Security Hardening Patterns
1. Zero-Trust Network Policies
# Default deny-all baseline
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
---
# Microservice communication policy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-service-policy
spec:
  podSelector:
    matchLabels:
      app: api-service
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: ingress-system   # automatic namespace label
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: TCP
      port: 53
    - protocol: UDP
      port: 53
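One side effect of the default-deny baseline worth calling out: anything not explicitly allowed is blocked, including metrics scraping. A minimal sketch of an allow rule for Prometheus, assuming it runs in a namespace named monitoring and scrapes a port such as 9090:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-prometheus-scrape
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-service
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: monitoring   # assumed monitoring namespace
    ports:
    - protocol: TCP
      port: 9090                                    # assumed metrics port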
2. Pod Security Standards Implementation
# Restricted pod security configuration
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
spec:
  selector:
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        runAsGroup: 65534
        fsGroup: 65534
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: app
        image: secure-app:1.0.0   # illustrative image
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - name: tmp-volume
          mountPath: /tmp
          readOnly: false
        - name: cache-volume
          mountPath: /app/cache
          readOnly: false
      volumes:
      - name: tmp-volume
        emptyDir: {}
      - name: cache-volume
        emptyDir: {}
📊 Observability & Monitoring
1. Custom Metrics with Prometheus
# ServiceMonitor for custom application metrics
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: api-service-metrics
  labels:
    app: api-service
spec:
  selector:
    matchLabels:
      app: api-service
  endpoints:
  - port: metrics
    interval: 15s
    path: /metrics
    relabelings:
    - sourceLabels: [__meta_kubernetes_pod_name]
      targetLabel: pod
    - sourceLabels: [__meta_kubernetes_namespace]
      targetLabel: namespace
---
# PrometheusRule for alerting
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: api-service-alerts
spec:
  groups:
  - name: api-service.rules
    rules:
    - alert: HighErrorRate
      expr: sum(rate(http_requests_total{status=~"5.."}[5m])) by (service) / sum(rate(http_requests_total[5m])) by (service) > 0.05
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: "High error rate detected"
        description: "Error rate is {{ $value | humanizePercentage }} for {{ $labels.service }}"
    - alert: ResponseTimeHigh
      expr: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, service)) > 0.5
      for: 3m
      labels:
        severity: critical
      annotations:
        summary: "High response time"
        description: "95th percentile latency is {{ $value }}s"
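The ServiceMonitor selects Services by label and scrapes a port by name ("metrics"), so the application's Service needs a matching label and a named metrics port. A minimal sketch (port numbers are assumptions):

apiVersion: v1
kind: Service
metadata:
  name: api-service
  labels:
    app: api-service
spec:
  selector:
    app: api-service
  ports:
  - name: http
    port: 8080
    targetPort: 8080
  - name: metrics        # name must match the ServiceMonitor endpoint port
    port: 9090
    targetPort: 9090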
2. Distributed Tracing Configuration
# OpenTelemetry Collector configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-config
data:
  config.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
      jaeger:
        protocols:
          grpc:
            endpoint: 0.0.0.0:14250
    processors:
      batch:
        timeout: 1s
        send_batch_size: 1024
      memory_limiter:
        check_interval: 1s   # required by the memory_limiter processor
        limit_mib: 512
    exporters:
      jaeger:
        endpoint: jaeger-collector:14250
        tls:
          insecure: true
      prometheus:
        endpoint: "0.0.0.0:8889"
    service:
      pipelines:
        traces:
          receivers: [otlp, jaeger]
          processors: [memory_limiter, batch]
          exporters: [jaeger]
        metrics:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [prometheus]
---
# Sidecar injection for automatic instrumentation
apiVersion: apps/v1
kind: Deployment
metadata:
  name: instrumented-app
spec:
  selector:
    matchLabels:
      app: instrumented-app
  template:
    metadata:
      labels:
        app: instrumented-app
      annotations:
        sidecar.opentelemetry.io/inject: "true"
    spec:
      containers:
      - name: app
        image: api-service:1.0.0   # illustrative image
        env:
        - name: OTEL_SERVICE_NAME
          value: "api-service"
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: "http://localhost:4317"
⚡ High-Performance Patterns
1. Advanced Horizontal Pod Autoscaler
# Multi-metric HPA with custom metrics
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: advanced-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 3
  maxReplicas: 100
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: custom_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
  - type: Object
    object:
      metric:
        name: queue_messages_ready
      describedObject:
        apiVersion: v1
        kind: Service
        name: rabbitmq
      target:
        type: Value
        value: "50"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 50
        periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 10
        periodSeconds: 60
      selectPolicy: Max
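The custom_requests_per_second pods metric is not built in; it assumes something like prometheus-adapter is serving the custom metrics API. A hedged sketch of an adapter rule deriving it from http_requests_total (metric and label names are assumptions):

# Fragment of a prometheus-adapter rules config
rules:
- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "^http_requests_total$"
    as: "custom_requests_per_second"
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'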
2. Cluster Autoscaler Optimization
# Cluster Autoscaler is tuned through its command-line flags; the
# cluster-autoscaler-status ConfigMap is only written by it for status.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
    spec:
      containers:
      - name: cluster-autoscaler
        image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.30.0   # match your cluster version
        command:
        - ./cluster-autoscaler
        - --nodes=10:200:default-node-group   # min:max:group (group name illustrative)
        - --scale-down-delay-after-add=10m
        - --scale-down-unneeded-time=5m
        - --skip-nodes-with-local-storage=false
        - --skip-nodes-with-system-pods=false
---
# Priority class for critical workloads
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "High priority class for critical services"
---
# Pod disruption budget
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-service-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api-service
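A PriorityClass only takes effect once workloads reference it via priorityClassName. A minimal sketch of a deployment that uses the class and matches the PDB selector above (image is illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      priorityClassName: high-priority
      containers:
      - name: api-service
        image: api-service:1.0.0   # illustrative image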
🔄 GitOps & Deployment Strategies
1. Advanced Blue-Green Deployment
# Blue-Green with Argo Rollouts
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api-service
spec:
  replicas: 10
  strategy:
    blueGreen:
      activeService: api-service-active
      previewService: api-service-preview
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 30
      prePromotionAnalysis:
        templates:
        - templateName: success-rate
        args:
        - name: service-name
          value: api-service-preview
      postPromotionAnalysis:
        templates:
        - templateName: success-rate
        args:
        - name: service-name
          value: api-service-active
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      containers:
      - name: api-service
        image: myapp:latest
        ports:
        - containerPort: 8080
---
# Analysis template for automated quality gates
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
  - name: service-name
  metrics:
  - name: success-rate
    interval: 60s
    successCondition: result[0] >= 0.95
    failureLimit: 3
    provider:
      prometheus:
        address: http://prometheus:9090
        query: |
          sum(rate(http_requests_total{service="{{args.service-name}}",status!~"5.."}[5m])) /
          sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))
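The Rollout references api-service-active and api-service-preview by name; Argo Rollouts rewrites their selectors during promotion, but the Services themselves have to exist. A minimal sketch (ports assumed):

apiVersion: v1
kind: Service
metadata:
  name: api-service-active
spec:
  selector:
    app: api-service       # Argo Rollouts adds a pod-template-hash selector at runtime
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: api-service-preview
spec:
  selector:
    app: api-service
  ports:
  - port: 80
    targetPort: 8080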
💾 Stateful Application Patterns
1. Advanced StatefulSet Configuration
# Production database StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-primary
spec:
  serviceName: postgres-primary
  replicas: 1
  selector:
    matchLabels:
      app: postgres-primary
  template:
    metadata:
      labels:
        app: postgres-primary
    spec:
      initContainers:
      - name: postgres-init
        image: postgres:15
        command:
        - /bin/bash
        - -c
        - |
          if [ ! -f /var/lib/postgresql/data/postgresql.conf ]; then
            initdb -D /var/lib/postgresql/data
            echo "host replication replicator 0.0.0.0/0 md5" >> /var/lib/postgresql/data/pg_hba.conf
          fi
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
      containers:
      - name: postgres
        image: postgres:15
        env:
        - name: POSTGRES_DB
          value: myapp
        - name: POSTGRES_USER
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: username
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
        - name: POSTGRES_REPLICATION_USER
          value: replicator
        - name: POSTGRES_REPLICATION_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: replication-password
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
        - name: postgres-config
          mountPath: /etc/postgresql/postgresql.conf
          subPath: postgresql.conf
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - pg_isready -U $POSTGRES_USER -d $POSTGRES_DB
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - pg_isready -U $POSTGRES_USER -d $POSTGRES_DB
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: postgres-config
        configMap:
          name: postgres-config
  volumeClaimTemplates:
  - metadata:
      name: postgres-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "fast-ssd"
      resources:
        requests:
          storage: 100Gi
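The StatefulSet mounts a postgres-config ConfigMap that is not shown above; a minimal sketch, with illustrative (not recommended) tuning values:

apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
data:
  postgresql.conf: |
    listen_addresses = '*'
    max_connections = 200
    shared_buffers = 1GB       # illustrative; size to the container's memory request
    wal_level = replica
    max_wal_senders = 5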
🚨 Production Lessons Learned
Key Metrics That Matter
# Essential monitoring queries
# 1. Pod restart frequency (indicates instability)
increase(kube_pod_container_status_restarts_total[1h]) > 5
# 2. Memory pressure detection
(container_memory_usage_bytes / container_spec_memory_limit_bytes) > 0.9
# 3. Node resource exhaustion
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) > 0.85
# 4. Persistent volume space
(kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.8
# 5. API server latency
histogram_quantile(0.99, sum(rate(apiserver_request_duration_seconds_bucket[5m])) by (le, verb)) > 1
Common Anti-Patterns to Avoid
# ❌ DON'T: Resource limits without requests
resources:
  limits:
    memory: "1Gi"
  # With only limits set, requests silently default to the limits, over-reserving capacity

# ✅ DO: Always set both requests and limits
resources:
  requests:
    memory: "512Mi"
    cpu: "250m"
  limits:
    memory: "1Gi"
    cpu: "500m"

# ❌ DON'T: Running as root
securityContext:
  runAsUser: 0

# ✅ DO: Use non-root user
securityContext:
  runAsUser: 65534
  runAsNonRoot: true
  readOnlyRootFilesystem: true
🎯 2025 Production Checklist
Before Every Deployment (a minimal skeleton ticking these boxes follows the checklists):
- Resource requests/limits defined
- Readiness/liveness probes configured
- Security context properly set
- Network policies in place
- Monitoring/alerting configured
- Pod disruption budgets created
- Backup and disaster recovery tested
Monthly Reviews:
- Resource utilization analysis
- Security vulnerability scans
- Performance baseline updates
- Cost optimization opportunities
- Capacity planning adjustments
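For reference, a minimal deployment skeleton that ticks the per-deployment boxes above; names, endpoints, and values are illustrative, not prescriptions:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: checklist-app              # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checklist-app
  template:
    metadata:
      labels:
        app: checklist-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: app
        image: checklist-app:1.0.0   # illustrative image
        resources:
          requests:
            cpu: "250m"
            memory: "512Mi"
          limits:
            cpu: "500m"
            memory: "1Gi"
        readinessProbe:
          httpGet:
            path: /healthz           # assumed health endpoint
            port: 8080
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 10
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL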
Running Kubernetes in production is an ongoing journey of optimization, monitoring, and continuous improvement. These patterns have saved us countless midnight pages and prevented multi-million-dollar outages.
Cap
Senior Golang Backend & Web3 Developer with 10+ years of experience building scalable systems and blockchain solutions.