Summary
When multiple pods share a Linux PID namespace (most commonly via `hostPID: true`), `PodContainerByPIDNs` returns an arbitrary pod for the shared namespace inode, causing spans to be decorated with the wrong pod's k8s metadata (`k8s.pod.name`, `k8s.namespace.name`, `k8s.deployment.name`, `service.name`, `service.namespace`, `service.instance.id`, etc.).
This is especially severe on OpenShift clusters, where many system DaemonSets run with `hostPID: true` by default (OVN-Kubernetes, machine-config-daemon, node-exporter, multus, tuned, etc.), creating a large collision pool under the host `init_pid_ns` inode (`4026531836`).
Root Cause
`PodContainerByPIDNs` (`pkg/kube/store.go`) looks up pod metadata using only the PID namespace inode:
```go
func (s *Store) PodContainerByPIDNs(pidns uint32) (*kube.CachedObjMeta, string) {
	if infos, ok := s.namespaces[pidns]; ok {
		for _, info := range infos {
			if om, ok := s.podsByContainer[info.ContainerID]; ok {
				// ...
				return om, containerName
			}
			break // "the namespace is the same for all pids in the container"
		}
	}
	return nil, ""
}
```
The comment "the namespace is the same for all pids in the container, we need to check one only" is correct for isolated PID namespaces, but incorrect when `hostPID: true` is set -- all such pods share the host's `init_pid_ns` inode. The function iterates a Go map (non-deterministic order), picks the first entry, and immediately `break`s. As a result, every span whose PID lives in the shared namespace is attributed to whichever pod Go's map iteration happens to return first.
Symptoms
- Spans from DaemonSet X show the `service.name` and `service.namespace` of Deployment Y
- Particularly acute on OpenShift due to the large number of system DaemonSets with `hostPID: true`
Reproduction
- Deploy OBI as a DaemonSet (with `hostPID: true`, as required)
- Deploy an application pod (Deployment, no `hostPID`)
- Deploy a second DaemonSet with `hostPID: true` that serves HTTP (e.g., a node-proxy, node-exporter with HTTP endpoints, or any hostPID workload)
- Configure OBI to instrument both workloads
- Observe that spans from one workload carry the other's k8s metadata
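For the second hostPID DaemonSet, a minimal manifest might look like the following (a sketch with hypothetical names; any HTTP-serving workload with `hostPID: true` reproduces the collision):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: hostpid-http        # hypothetical name
spec:
  selector:
    matchLabels: {app: hostpid-http}
  template:
    metadata:
      labels: {app: hostpid-http}
    spec:
      hostPID: true         # shares the host init_pid_ns with other hostPID pods
      containers:
      - name: web
        image: nginx        # any HTTP server suffices for generating spans
        ports:
        - containerPort: 80
```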
Environment
- Kubernetes/OpenShift with multiple `hostPID: true` DaemonSets
- OBI running as a DaemonSet with `hostPID: true`
- Any application workload instrumented by OBI