If you run Kubernetes in the EU, NIS2 is part of your day-to-day now. The directive's transposition deadline passed in October 2024, and member states are enforcing it through their own national laws. I have spent the last few months hardening real clusters against these requirements, so this post is the practical version of what I learned.
This is not legal advice. It is the technical checklist I wish I had from day one.
## What NIS2 Means for K8s Operators (The Short Version)
NIS2 expands the original NIS directive. If your organization is classified as an essential or important entity (energy, transport, health, digital infrastructure, and others), you need to show:
- Risk management measures for your IT systems
- Incident detection and reporting (24 hours for early warning, 72 hours for a full report)
- Supply chain security
- Business continuity planning
- Audit logging with retention
Kubernetes is involved in all of that. Here is how I tackle each area.
## 1. Network Policies: Stop the Lateral Movement
By default, pods can usually talk to each other freely. For compliance, that is a problem. NIS2 expects segmentation and access control.
I start every namespace with default deny:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```
Then I allow only the flows I actually need:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
```
I always test the policy instead of assuming it works:
```bash
# Quick test - this should time out if deny-all is working
kubectl run test-pod --rm -it --image=busybox --restart=Never \
  -n production -- wget -qO- --timeout=3 http://api:8080
```
Also check your CNI. Flannel does not enforce NetworkPolicies. Calico, Cilium, and Weave do.
## 2. RBAC Audit: Who Can Do What
NIS2 requires least-privilege access control. In Kubernetes, that means regular RBAC audits.
I usually start by finding overprivileged service accounts:
```bash
# List all ClusterRoleBindings that grant cluster-admin
kubectl get clusterrolebindings -o json | \
  jq -r '.items[] | select(.roleRef.name=="cluster-admin") |
    .metadata.name + " -> " + (.subjects[]? | .kind + "/" + .name)'
```
Then I check for risky permissions:
```bash
# Find roles that can exec into pods (potential data exfiltration)
kubectl get roles,clusterroles -A -o json | \
  jq -r '.items[] | select(.rules[]? |
    .resources[]? == "pods/exec") | .metadata.name'
```
I keep a small RBAC review script and run it monthly:
```bash
#!/bin/bash
echo "=== Cluster Admin Bindings ==="
kubectl get clusterrolebindings -o json | \
  jq -r '.items[] | select(.roleRef.name=="cluster-admin") | .metadata.name'

echo -e "\n=== Wildcard Permissions ==="
kubectl get clusterroles -o json | \
  jq -r '.items[] | select(.rules[]? |
    (.verbs[]? == "*") or (.resources[]? == "*")) | .metadata.name'

echo -e "\n=== Service Accounts with Secrets Access ==="
# ClusterRoles have no namespace, so default it; otherwise jq errors on null + string
kubectl get roles,clusterroles -A -o json | \
  jq -r '.items[] | select(.rules[]? |
    .resources[]? == "secrets") |
    (.metadata.namespace // "cluster-wide") + "/" + .metadata.name'
```
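When the audit turns something up, the fix is usually a narrowly scoped Role in place of a broad binding. A minimal sketch — the role, service account, and namespace names here are placeholders, not from any real cluster:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: app-deployer            # hypothetical role name
  namespace: production
rules:
  # Only the verbs and resources this account actually needs
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-deployer-binding
  namespace: production
subjects:
  - kind: ServiceAccount
    name: ci-deployer           # hypothetical service account
    namespace: production
roleRef:
  kind: Role
  name: app-deployer
  apiGroup: rbac.authorization.k8s.io
```

The point is the shape: no wildcards, namespace-scoped, bound to exactly one service account.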
## 3. Image Scanning with Trivy
Supply chain security is a core NIS2 requirement. I need to know what is inside every image and catch critical issues before production.
I use Trivy both for ad hoc checks and broader cluster scans:
```bash
# Scan a specific image
trivy image --severity HIGH,CRITICAL your-registry.io/app:latest

# Scan all images currently running in your cluster
kubectl get pods -A -o jsonpath='{range .items[*]}{range .spec.containers[*]}{.image}{"\n"}{end}{end}' | \
  sort -u | while read -r img; do
    echo "Scanning: $img"
    trivy image --severity HIGH,CRITICAL --quiet "$img"
  done
```
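The same gate belongs in CI, so vulnerable images never reach the registry in the first place. A sketch using the Trivy GitHub Action — the workflow fragment and image name are assumptions to adapt to your own pipeline:

```yaml
# .github/workflows/build.yml (fragment, hypothetical)
- name: Scan image before push
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: your-registry.io/app:${{ github.sha }}
    severity: HIGH,CRITICAL
    exit-code: "1"          # non-zero exit fails the job on findings
    ignore-unfixed: true    # skip vulnerabilities with no available fix
```

`ignore-unfixed` keeps the gate actionable: the build only fails on issues a rebuild can actually resolve.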
For continuous coverage, I deploy Trivy Operator:
```bash
helm repo add aqua https://aquasecurity.github.io/helm-charts/
helm install trivy-operator aqua/trivy-operator \
  --namespace trivy-system \
  --create-namespace \
  --set trivy.severity="HIGH\,CRITICAL"   # comma must be escaped for --set
```
That gives me VulnerabilityReport CRDs for workloads:
```bash
# Check vulnerability reports
kubectl get vulnerabilityreports -A \
  -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}: Critical={.report.summary.criticalCount} High={.report.summary.highCount}{"\n"}{end}'
```
## 4. Runtime Security with Falco
NIS2 expects real-time detection, not only preventive controls. Falco watches syscalls and alerts on suspicious behavior inside containers.
I deploy it with Helm:
```bash
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm install falco falcosecurity/falco \
  --namespace falco \
  --create-namespace \
  --set falcosidekick.enabled=true \
  --set falcosidekick.config.slack.webhookurl="https://hooks.slack.com/services/YOUR/WEBHOOK" \
  --set driver.kind=ebpf
```
Default Falco rules are a good start, and I add extra rules for the risks I care about under NIS2:
```yaml
# custom-rules.yaml
- rule: Sensitive File Access in Container
  desc: Detect reads of sensitive files that could indicate data exfiltration
  condition: >
    container and open_read and
    (fd.name startswith /etc/shadow or
     fd.name startswith /etc/passwd or
     fd.name startswith /run/secrets)
  output: >
    Sensitive file opened in container
    (file=%fd.name user=%user.name container=%container.name
    image=%container.image.repository pod=%k8s.pod.name ns=%k8s.ns.name)
  priority: WARNING
  tags: [nis2, data_access]

- rule: Unexpected Outbound Connection
  desc: Container making connections to unexpected external IPs
  condition: >
    container and evt.type=connect and fd.typechar=4 and
    fd.ip != "0.0.0.0" and not fd.snet in (rfc_1918_addresses)
  output: >
    Unexpected outbound connection
    (dest=%fd.name user=%user.name container=%container.name
    pod=%k8s.pod.name ns=%k8s.ns.name)
  priority: NOTICE
  tags: [nis2, network]
```
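The rules file does not load itself; with the Helm chart I feed it in through the chart's `customRules` value. A hedged sketch of the values file — I am assuming that value name from the chart, the file name is arbitrary, and the rule is abbreviated for space:

```yaml
# falco-custom-values.yaml (hypothetical file name) -- apply with:
#   helm upgrade falco falcosecurity/falco -n falco --reuse-values -f falco-custom-values.yaml
customRules:
  custom-rules.yaml: |-
    - rule: Sensitive File Access in Container
      desc: Detect reads of sensitive files that could indicate data exfiltration
      condition: >
        container and open_read and
        (fd.name startswith /etc/shadow or
         fd.name startswith /run/secrets)
      output: Sensitive file opened in container (file=%fd.name pod=%k8s.pod.name)
      priority: WARNING
      tags: [nis2, data_access]
```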
## 5. Policy Enforcement with Kyverno
I also need guardrails that stop insecure manifests before they hit the cluster. Kyverno works well as an admission controller and uses standard Kubernetes YAML.
Install it:
```bash
helm repo add kyverno https://kyverno.github.io/kyverno/
helm install kyverno kyverno/kyverno \
  --namespace kyverno \
  --create-namespace
```
These are the baseline policies I enforce for NIS2:
```yaml
# Require non-root containers
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-non-root
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-non-root
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Running as root is not allowed for NIS2 compliance"
        pattern:
          spec:
            containers:
              - securityContext:
                  runAsNonRoot: true
---
# Require image pull from approved registries only
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce
  rules:
    - name: validate-registries
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must come from approved registries"
        pattern:
          spec:
            containers:
              - image: "your-registry.io/*"
---
# Require resource limits (prevents noisy neighbor DoS)
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-limits
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Resource limits are required"
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    memory: "?*"
                    cpu: "?*"
```
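For reference, a workload that satisfies all three policies looks roughly like this (the name, namespace, and image tag are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api                               # placeholder name
  namespace: production
spec:
  containers:
    - name: api
      image: your-registry.io/app:1.4.2   # approved registry
      securityContext:
        runAsNonRoot: true                # satisfies require-non-root
      resources:
        limits:                           # satisfies require-resource-limits
          memory: "256Mi"
          cpu: "500m"
```

Anything missing one of these fields is rejected at admission, before it ever runs.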
## 6. Audit Logging and Retention
NIS2 expects logs that support incident investigation. Kubernetes audit logs tell me who did what, and when.
I define an audit policy like this:
```yaml
# audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Secrets at Metadata level only, so secret values never land in the log
  # (first matching rule wins, so this must come before the rule below)
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Log all changes to workloads and config
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["pods", "configmaps"]
      - group: "apps"
        resources: ["deployments", "statefulsets", "daemonsets"]
    verbs: ["create", "update", "patch", "delete"]
  # Log authentication decisions
  - level: Metadata
    resources:
      - group: "authentication.k8s.io"
  # Log RBAC changes
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
  # Skip noisy read-only requests to reduce volume
  - level: None
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
        resources: ["events"]
```
To enable it on a kubeadm cluster, I update /etc/kubernetes/manifests/kube-apiserver.yaml:

```yaml
spec:
  containers:
    - command:
        - kube-apiserver
        - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
        - --audit-log-path=/var/log/kubernetes/audit.log
        - --audit-log-maxage=90
        - --audit-log-maxbackup=10
        - --audit-log-maxsize=100
```

The policy file and log directory also need hostPath volumes and matching volumeMounts in the same manifest, or the static pod will fail to start.
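Once events are flowing, many investigation questions become a jq one-liner over the log. For example, who changed or deleted secrets — the path is the one configured above, and the field names follow the Kubernetes audit event schema:

```shell
# Each audit log line is one JSON event; keep completed writes to secrets
jq -r 'select(.stage == "ResponseComplete"
              and .objectRef.resource == "secrets"
              and (.verb == "delete" or .verb == "update" or .verb == "patch"))
       | "\(.stageTimestamp) \(.user.username) \(.verb) \(.objectRef.namespace)/\(.objectRef.name)"' \
  /var/log/kubernetes/audit.log
```

The output is one line per write, ready to paste into an incident timeline.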
For centralized logging, I ship audit logs to my SIEM. I use Fluent Bit:
```bash
helm repo add fluent https://fluent.github.io/helm-charts
# config.outputs is multi-line INI, so pass it as a values file rather than --set
cat > fluent-bit-values.yaml <<'EOF'
config:
  outputs: |
    [OUTPUT]
        Name  es
        Match *
        Host  elasticsearch.logging.svc
        Port  9200
        Index k8s-audit
        Retry_Limit 5
EOF
helm install fluent-bit fluent/fluent-bit -n logging --create-namespace -f fluent-bit-values.yaml
```
NIS2 does not define one exact retention window, but many national implementations expect 12 to 18 months. I set ILM policy in Elasticsearch to match that.
## 7. Incident Response: Meeting the 24h/72h Deadlines
NIS2 reporting timelines are strict:
- 24 hours: Early warning to the national CSIRT after you become aware of a significant incident
- 72 hours: Full incident notification with initial assessment
- 1 month: Final report
That means alerting has to work well. This is how I connect Falco to an incident workflow:
```yaml
# falcosidekick values for incident alerting
config:
  slack:
    webhookurl: "https://hooks.slack.com/services/YOUR/WEBHOOK"
    minimumpriority: "warning"
  webhook:
    address: "https://your-incident-api.internal/falco"
    minimumpriority: "critical"
  pagerduty:
    routingkey: "YOUR_ROUTING_KEY"
    minimumpriority: "critical"
```
I also run Prometheus alerts for cluster-level anomalies:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: nis2-security-alerts
  namespace: monitoring
spec:
  groups:
    - name: nis2-compliance
      rules:
        - alert: UnauthorizedAPIAccess
          # 401/403 spikes on write verbs suggest someone probing RBAC
          expr: |
            sum(rate(apiserver_request_total{verb=~"create|update|delete",code=~"401|403"}[5m])) > 10
          for: 2m
          labels:
            severity: critical
            compliance: nis2
          annotations:
            summary: "High rate of unauthorized API calls detected"
            runbook: "Check audit logs, may require NIS2 24h notification"
        - alert: PodSecurityViolation
          # assumes falco-exporter is scraping Falco and exposing falco_events
          expr: |
            sum(increase(falco_events{priority=~"Critical|Emergency"}[10m])) > 0
          for: 1m
          labels:
            severity: critical
            compliance: nis2
          annotations:
            summary: "Falco detected critical security event"
```
I document the incident process and make sure everyone knows exactly what to do. I also keep an early warning template ready so someone can submit it quickly within 24 hours if needed.
## 8. Supply Chain Security: SBOMs and Signed Images
NIS2 explicitly covers supply chain risk. For containers, I focus on two controls: SBOMs and signature verification.
### Generate SBOMs with Syft
```bash
# Generate SBOM for an image
syft your-registry.io/app:latest -o spdx-json > app-sbom.spdx.json

# Attach SBOM to your image in the registry
cosign attach sbom --sbom app-sbom.spdx.json your-registry.io/app:latest
```
### Sign Images with Cosign/Sigstore
```bash
# Generate a keypair (do this once, store the private key securely)
cosign generate-key-pair

# Sign your image after building
cosign sign --key cosign.key your-registry.io/app:latest

# Verify the signature
cosign verify --key cosign.pub your-registry.io/app:latest
```
For keyless signing in CI/CD, I use Sigstore Fulcio:
```bash
# In your CI pipeline - signs with OIDC identity, no keys to manage
cosign sign your-registry.io/app:latest
```
Then I enforce signature checks with Kyverno:
```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signatures
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "your-registry.io/*"
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      YOUR_COSIGN_PUBLIC_KEY_HERE
                      -----END PUBLIC KEY-----
```
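If CI signs keylessly instead, the attestor block changes shape. A hedged sketch of just that fragment, assuming GitHub Actions OIDC — the subject pattern and issuer are assumptions to adapt to your identity provider:

```yaml
attestors:
  - entries:
      - keyless:
          subject: "https://github.com/your-org/*"    # CI workflow identity (assumption)
          issuer: "https://token.actions.githubusercontent.com"
          rekor:
            url: https://rekor.sigstore.dev
```

The verification then pins images to a CI identity rather than a long-lived key, which removes the key-rotation problem entirely.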
## The Compliance Checklist
Here is the condensed checklist I keep on hand:
- Default-deny NetworkPolicies in all namespaces
- RBAC audit done, no wildcard cluster-admin floating around
- Image scanning in CI/CD pipeline (Trivy)
- Continuous scanning in-cluster (Trivy Operator)
- Runtime detection deployed (Falco)
- Admission policies enforced (Kyverno or OPA Gatekeeper)
- Kubernetes audit logging enabled with 12+ month retention
- Centralized logging with Fluent Bit or similar
- Alerting pipeline with on-call rotation
- Incident response runbook with NIS2 reporting templates
- SBOM generation for all production images
- Image signing in CI/CD
- Signature verification in admission control
- Regular penetration testing (at least annually)
## Wrapping Up
I do not treat NIS2 as a one-time checkbox. In practice, it is continuous hardening, monitoring, and response work. Kubernetes already has strong tools for this, and most of the effort is in configuration and operations, not custom code.
If I had to prioritize, I would start with network policies, RBAC review, and image scanning. Then I would add runtime detection and admission policies. After that, I would tighten logging and incident response.
The 24-hour reporting requirement is what usually causes pain. If the alerting and response path are not ready before an incident, the timeline gets very hard to meet.
If you are running through the same process, I would be interested to hear what worked for your environment.