Summary
The postmortem investigates why a Kubernetes DaemonSet could not expose a node label as an environment variable for its pods.
The expected behavior was that the INSTANCE variable would contain the value of the label grafana-map assigned to each node, but the deployment failed because that label was not correctly referenced in the pod spec.
Root Cause
- The `fieldRef` was incorrectly pointed to `spec.nodeName.grafana-map.value`, which is not a valid field path. `spec.nodeName` returns the node’s name, not its labels.
- There was no fallback logic for nodes lacking the `grafana-map` label, leading to empty or `null` values.
Why This Happens in Real Systems
- Misunderstanding of `fieldRef` capabilities – developers often assume they can reach arbitrary node labels through `fieldRef`.
- Label explosion – many clusters iterate labels for telemetry but forget the correct syntax.
- Lack of validation tooling – IaC pipelines rarely flag that `spec.nodeName.grafana-map.value` is unsupported.
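The missing validation step can be approximated with a small pre-deploy check. The sketch below is illustrative only: the allow-list is an assumption based on the downward API documentation for environment variables and should be checked against the Kubernetes version in use, and the helper names `is_supported_field_path` and `find_bad_field_paths` are hypothetical.

```python
import re

# Assumed allow-list of fieldRef paths usable in env vars; verify it against
# the downward API documentation for your Kubernetes version.
SUPPORTED_FIELD_PATHS = {
    "metadata.name",
    "metadata.namespace",
    "metadata.uid",
    "spec.nodeName",
    "spec.serviceAccountName",
    "status.hostIP",
    "status.podIP",
    "status.podIPs",
}

# metadata.labels['<key>'] and metadata.annotations['<key>'] are also accepted,
# but they read the Pod's own metadata, not the node's.
SUBSCRIPT_PATH = re.compile(r"^metadata\.(labels|annotations)\['[^']+'\]$")


def is_supported_field_path(field_path: str) -> bool:
    """Return True if the path is in the allow-list or is a label/annotation lookup."""
    return field_path in SUPPORTED_FIELD_PATHS or bool(SUBSCRIPT_PATH.match(field_path))


def find_bad_field_paths(pod_spec: dict) -> list:
    """Return (container, env var, fieldPath) tuples for unsupported paths."""
    bad = []
    for container in pod_spec.get("containers", []):
        for env in container.get("env", []):
            path = env.get("valueFrom", {}).get("fieldRef", {}).get("fieldPath")
            if path is not None and not is_supported_field_path(path):
                bad.append((container.get("name"), env.get("name"), path))
    return bad
```

A CI job could load the manifest, pass `spec.template.spec` to `find_bad_field_paths`, and fail the build when the returned list is not empty.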
Real-World Impact
- Runtime errors in Grafana dashboards that expected the `INSTANCE` value.
- Increased debugging time because logs contained `INSTANCE=` with no value.
- Potential data loss if dashboards filtered data by node label values that were never set.
Example or Code
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: grafana-daemon
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:latest
          env:
            - name: INSTANCE
              valueFrom:
                fieldRef:
                  # Valid syntax, but this resolves against the Pod's own labels, not the node's.
                  fieldPath: metadata.labels['grafana-map']
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  # Supported: the name of the node the pod is scheduled on.
                  fieldPath: spec.nodeName
How Senior Engineers Fix It
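- Use a supported field path. `metadata.labels['grafana-map']` is valid syntax, but it reads the pod's own label; the downward API does not expose node labels. To get the node's label, expose `spec.nodeName` as `NODE_NAME` and look the label up at startup (for example, in an init container that queries the Kubernetes API), or have a controller copy the label onto the pod.
- Add a fallback for nodes without the label. A single `env` entry cannot combine `value` and `valueFrom`, so apply the default in the entrypoint or application code to avoid empty variables.
- Validate DaemonSet manifests before rollout, for example with `kubectl apply --dry-run=server -f <manifest>` so the API server checks the pod template, or with Open Policy Agent checks.
- Confirm the labels exist by running `kubectl describe node <node>`.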
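The fallback for nodes without the `grafana-map` label can be handled in the container's startup code. The following is a minimal sketch, assuming `INSTANCE` and `NODE_NAME` are injected as environment variables as in the manifest above. The fallback order (label value, then node name, then a fixed default), the default `"unknown"`, and the function name `resolve_instance` are illustrative assumptions, not part of the original deployment.

```python
import os

DEFAULT_INSTANCE = "unknown"  # hypothetical default; choose one your dashboards can filter on


def resolve_instance(environ=None, default=DEFAULT_INSTANCE) -> str:
    """Pick a non-empty instance identifier.

    Order: INSTANCE (the label value), then NODE_NAME, then the default.
    Whitespace-only values count as empty.
    """
    env = os.environ if environ is None else environ
    instance = env.get("INSTANCE", "").strip()
    if instance:
        return instance
    node_name = env.get("NODE_NAME", "").strip()
    return node_name or default


if __name__ == "__main__":
    # Log the resolved value so an empty label is visible at startup.
    print(f"INSTANCE={resolve_instance()}")
```

Falling back to the node name keeps dashboards keyed on something unique per node. A fixed default instead makes unlabeled nodes easy to spot in a single filter.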
Why Juniors Miss It
- Assuming `fieldRef` can read any node metadata instead of the documented fields.
- Over-reliance on auto-completion in IDEs, which may not expose label references.
- Skipping documentation review and missing the explanation about `metadata.labels` vs. `spec.nodeName`.