OCPBUGS-74970: Fix kubelet certificate wait loop in criometricsproxy.yaml#6125
aksjadha wants to merge 1 commit into
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Walkthrough
The PR inverts the kubelet server certificate wait condition across three node configuration templates and updates an arbiter volumeMount path. InitContainers now wait until /var/lib/kubelet/pki/kubelet-server-current.pem exists before proceeding.
Changes
Kubelet Certificate Readiness Wait
🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks failed. Please resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error)
✅ Passed checks (14 passed)
|
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: aksjadha
Needs approval from an approver in each of these files:
|
/jira refresh |
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is invalid:
|
/jira refresh |
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact:
… init container's mountPath /var/lib/kubelet
af45f24 to ee0dcdd
|
The fix is working as expected. Before the fix, pods restarted multiple times because the init container was not waiting for the file to exist. With the fix, the init container checks whether the file exists and uses the correct mount path, i.e. /var/lib/kubelet/, and there are no repeated restarts.
|
@aksjadha: all tests passed!
   - |
     echo -n "Waiting for kubelet key and certificate to be available"
-    while [ -n "$(test -e /var/lib/kubelet/pki/kubelet-server-current.pem)" ] ; do
+    while [ ! -e /var/lib/kubelet/pki/kubelet-server-current.pem ] ; do
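The difference between the two conditions in this hunk can be checked in isolation. The following is a minimal sketch, not taken from the PR: the path in `missing` is a hypothetical placeholder for a file that does not exist, and the `old_result` / `new_result` variables exist only for illustration.

```shell
#!/bin/sh
# Hypothetical path used only for illustration; it is assumed not to exist.
missing="/tmp/example-missing-file.$$"

# Old form: test -e prints nothing to stdout, so the command substitution
# is always the empty string and [ -n "" ] is false regardless of the file.
if [ -n "$(test -e "$missing")" ]; then
  old_result="loop body runs"
else
  old_result="loop skipped"
fi

# New form: the result of the -e check is negated directly,
# so the condition is true while the file is absent.
if [ ! -e "$missing" ]; then
  new_result="loop body runs"
else
  new_result="loop skipped"
fi

echo "old condition: $old_result"
echo "new condition: $new_result"
```

With the file absent, the old condition skips the loop and the new condition enters it, which matches the behavior described in this PR.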
The tries variable is incremented but never checked, so the loop could run forever if the certificate never appears.
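One way to address this would be to bound the loop. The sketch below is an assumption about how that could look, not code from the PR: the `wait_for_file` helper, its argument order, and the default limits are all hypothetical, and whether the init container should exit non-zero on timeout is a design decision for the maintainers.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): wait until a file exists,
# giving up after a maximum number of checks.
#   $1 = path to wait for
#   $2 = maximum number of checks (assumed default: 60)
#   $3 = seconds to sleep between checks (assumed default: 1)
wait_for_file() {
  path="$1"
  max_tries="${2:-60}"
  interval="${3:-1}"
  tries=0
  while [ ! -e "$path" ]; do
    tries=$((tries + 1))
    # Unlike the original loop, the counter is compared against a limit.
    if [ "$tries" -ge "$max_tries" ]; then
      echo "Timed out waiting for $path after $tries checks" >&2
      return 1
    fi
    sleep "$interval"
  done
  return 0
}
```

A call such as `wait_for_file /var/lib/kubelet/pki/kubelet-server-current.pem 300 1` would then return a failure status instead of looping forever if the certificate never appears.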
   volumeMounts:
   - name: var-lib-kubelet
-    mountPath: "/var"
+    mountPath: "/var/lib/kubelet"
Can you also add readOnly: true? As I understand it, this container only reads the cert.
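To see why the mountPath matters for the certificate check, the path translation can be sketched in shell. This is an illustration only: the `container_path` helper is hypothetical, and it assumes the volume's host directory is /var/lib/kubelet, as stated in the PR description.

```shell
#!/bin/sh
# Hypothetical helper (illustration only): map a host path to the path seen
# inside a container, given the mounted host directory and its mountPath.
#   $1 = host path, $2 = host directory being mounted, $3 = mountPath
container_path() {
  host_path="$1"
  mount_source="$2"
  mount_path="$3"
  case "$host_path" in
    "$mount_source"/*)
      # Strip the mounted directory prefix and re-root under mountPath.
      rel="${host_path#"$mount_source"}"
      printf '%s%s\n' "$mount_path" "$rel"
      ;;
    *)
      # The host path is not under the mounted directory.
      return 1
      ;;
  esac
}

cert=/var/lib/kubelet/pki/kubelet-server-current.pem
old_view="$(container_path "$cert" /var/lib/kubelet /var)"
new_view="$(container_path "$cert" /var/lib/kubelet /var/lib/kubelet)"

echo "mountPath /var:             $old_view"
echo "mountPath /var/lib/kubelet: $new_view"
```

With mountPath /var the certificate shows up at /var/pki/kubelet-server-current.pem, so a check against /var/lib/kubelet/pki/kubelet-server-current.pem can never succeed; with mountPath /var/lib/kubelet the two paths coincide.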
Related bug: Fix kubelet certificate wait loop and mount path in criometricsproxy.yaml
The previous condition [ -n "$(test -e ...)" ] always evaluated to false because test -e writes nothing to stdout; it communicates via exit code only. The command substitution was therefore always empty, so -n was always false and the loop exited immediately instead of waiting for the kubelet certificate to appear.

The init container mounts the host's /var/lib/kubelet at /var. Inside the init container, the host's /var/lib/kubelet/pki/kubelet-server-current.pem therefore appears at /var/pki/kubelet-server-current.pem, but the script checks /var/lib/kubelet/pki/kubelet-server-current.pem, which doesn't exist at that path inside the init container.

- What I did
Replaced the wait condition with [ ! -e /var/lib/kubelet/pki/kubelet-server-current.pem ], which properly loops until the kubelet certificate file exists.
Changed the init container's mountPath to /var/lib/kubelet from /var to match the main container, so the script's path /var/lib/kubelet/pki/kubelet-server-current.pem resolves correctly.

- How to verify it
Check the kube-rbac-proxy-crio-ip pod logs in namespace openshift-machine-config-operator to verify that the CRI-O metrics proxy init container correctly waits for the kubelet certificate before proceeding.

- Description for the changelog
Fixed the kubelet certificate wait loop in criometricsproxy.yaml across all node roles (arbiter, master, worker).
The previous condition [ -n "$(test -e /var/lib/kubelet/pki/kubelet-server-current.pem)" ] was incorrect. Replaced it with the correct condition [ ! -e /var/lib/kubelet/pki/kubelet-server-current.pem ], which properly loops until the kubelet certificate file exists.

Summary by CodeRabbit