[release-4.21] OCPBUGS-88295, OCPBUGS-88297, OCPBUGS-82146, OCPBUGS-78330, OCPBUGS-85550: Replace OLM-based Istio install with Sail Library #1442
|
Skipping CI for Draft Pull Request. |
|
@gcs278: This pull request references NE-2471 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.z" version, but no target version was set.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
c647f7b to
b47ed5e
Compare
|
No longer pursuing this |
|
@gcs278: Closed this PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Some new information makes this backport attractive again. |
|
@gcs278: Reopened this PR. |
b47ed5e to
af43e28
Compare
530e487 to
96ad9e5
Compare
|
/test ? |
|
I manually pulled in #1444 for now, because we need to bump to istio 1.28.5, and might as well bump the GWAPI CRDs
/test e2e-aws-operator-techpreview |
9b7956c to
9df9dba
Compare
|
PR lgtm:
/approve
still needs a pass from @Miciah and from the QE team |
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: rikatz |
|
@gcs278: This pull request references Jira Issue OCPBUGS-88295, which is valid. 7 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-88297, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-82146, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-78330, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. |
|
/jira refresh |
|
@gcs278: This pull request references Jira Issue OCPBUGS-88295, which is valid. 7 validation(s) were run on this bug
Requesting review from QA contact:
This pull request references Jira Issue OCPBUGS-88297, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-82146, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-78330, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request. |
|
/jira cherrypick OCPBUGS-85550 |
|
@gcs278: An error was encountered cloning bug for cherrypick for bug OCPBUGS-85550 on the Jira server at https://redhat.atlassian.net. No known errors were detected, please see the full error message for details. Full error message.
request failed. Please analyze the request body for more details. Status code: 400: {"errorMessages":[],"errors":{"customfield_10980":"Field does not support update 'customfield_10980'","customfield_10978":"Field does not support update 'customfield_10978'","customfield_10979":"Field does not support update 'customfield_10979'"}}
Please contact an administrator to resolve this issue, then request a bug refresh with |
|
/jira cherrypick OCPBUGS-85550 |
|
@gcs278: An error was encountered cloning bug for cherrypick for bug OCPBUGS-85550 on the Jira server at https://redhat.atlassian.net. No known errors were detected, please see the full error message for details. Full error message.
request failed. Please analyze the request body for more details. Status code: 400: {"errorMessages":[],"errors":{"customfield_10980":"Field does not support update 'customfield_10980'","customfield_10978":"Field does not support update 'customfield_10978'","customfield_10979":"Field does not support update 'customfield_10979'"}}
Please contact an administrator to resolve this issue, then request a bug refresh with |
|
@gcs278: This pull request references Jira Issue OCPBUGS-88295, which is valid. 7 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-88297, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-82146, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-78330, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.
This pull request references Jira Issue OCPBUGS-85550, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. |
|
/jira refresh |
|
@gcs278: This pull request references Jira Issue OCPBUGS-88295, which is valid. 7 validation(s) were run on this bug
Requesting review from QA contact:
This pull request references Jira Issue OCPBUGS-88297, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-82146, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-78330, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-85550, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug
Requesting review from QA contact: |
|
@gcs278: This pull request references Jira Issue OCPBUGS-88295, which is valid. 7 validation(s) were run on this bug
Requesting review from QA contact:
This pull request references Jira Issue OCPBUGS-88297, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-82146, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-78330, which is valid. 7 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (iamin@redhat.com), skipping review request.
This pull request references Jira Issue OCPBUGS-85550, which is valid. 7 validation(s) were run on this bug
Requesting review from QA contact: |
|
/test e2e-aws-ovn-hypershift-conformance |
|
/lgtm |
|
verified all the related bugs OCPBUGS-82146, OCPBUGS-78330 and NO-OLM check
OCPBUGS-88297 -> no verbose INFO logging found in CIO logs
OCPBUGS-85550 -> controllers are started in istiod pod logs
/verified by @rhamini3 |
|
@rhamini3: This PR has been marked as verified by |
|
/label backport-risk-assessed
Some risk is taken on with this backport, but the risk tradeoff is worth the gains of being able to patch OSSM CVEs and fix several customer bugs related to OLM.
|
|
the CI failure for the multi-pr job passed our tests, but failed on unrelated tests. And e2e-pre-release-ossm doesn't work anymore on the no-OLM path; that will always fail now. All other current failures are infra failures.
/retest |
|
/testwith openshift/cluster-ingress-operator/release-4.21/e2e-aws-operator-techpreview openshift/api#2873 |
|
@gcs278: The following test failed, say
|
|
@gcs278: Jira Issue OCPBUGS-88295: Some pull requests linked via external trackers have merged: The following pull request, linked via external tracker, has not merged: All associated pull requests must be merged or unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-88295 has not been moved to the MODIFIED state. This PR is marked as verified. If the remaining PRs listed above are marked as verified before merging, the issue will automatically be moved to VERIFIED after all of the changes from the PRs are available in an accepted nightly payload.
Jira Issue OCPBUGS-88297: Some pull requests linked via external trackers have merged: The following pull request, linked via external tracker, has not merged: All associated pull requests must be merged or unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-88297 has not been moved to the MODIFIED state. This PR is marked as verified. If the remaining PRs listed above are marked as verified before merging, the issue will automatically be moved to VERIFIED after all of the changes from the PRs are available in an accepted nightly payload.
Jira Issue OCPBUGS-82146: Some pull requests linked via external trackers have merged: The following pull request, linked via external tracker, has not merged: All associated pull requests must be merged or unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-82146 has not been moved to the MODIFIED state. This PR is marked as verified. If the remaining PRs listed above are marked as verified before merging, the issue will automatically be moved to VERIFIED after all of the changes from the PRs are available in an accepted nightly payload.
Jira Issue OCPBUGS-78330: Some pull requests linked via external trackers have merged: The following pull request, linked via external tracker, has not merged: All associated pull requests must be merged or unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-78330 has not been moved to the MODIFIED state. This PR is marked as verified. If the remaining PRs listed above are marked as verified before merging, the issue will automatically be moved to VERIFIED after all of the changes from the PRs are available in an accepted nightly payload.
Jira Issue OCPBUGS-85550: Some pull requests linked via external trackers have merged: The following pull request, linked via external tracker, has not merged: All associated pull requests must be merged or unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-85550 has not been moved to the MODIFIED state. This PR is marked as verified. If the remaining PRs listed above are marked as verified before merging, the issue will automatically be moved to VERIFIED after all of the changes from the PRs are available in an accepted nightly payload. |
Summary
Backport of the noOLM / Sail Library installation path (NE-2286, shipped in 4.22) to release-4.21. This resolves several fundamental OLM bugs that have no viable OLM-based workaround — most critically OCPBUGS-86778, which blocks all OSSM z-stream upgrades and prevents shipping CVE fixes.
This PR is intended to merge with the GatewayAPIWithoutOLM feature gate disabled, making it a no-op on merge. The goal is to subsequently enable the gate by default (via openshift/api) to activate the Sail Library path and resolve the OLM issues.
Cherry-picked PRs
istio_sail_installer.go, istio_olm.go refactor, migration.go, status.go, CRD manifests, Sail Library RBAC manifests
Note: #1393 (OCPBUGS-79667: Use feature-gate annotation for Sail Library RBAC) was also a dependency but is being skipped because CVO on this release does not support the release.openshift.io/feature-gate annotation (openshift/cluster-version-operator#1273 was not backported). As a result, the Sail Library RBAC manifests use the release.openshift.io/feature-set annotation, and a separate PR will be needed to remove this annotation before promoting the feature gate to GA.
Versioning
This backport does not bump the Gateway API CRDs (remain at v1.3.0) or the Istio version (remains at v1.27.3) for the noOLM code path. When the GatewayAPIWithoutOLM feature gate is enabled, the Sail Library will install Istio using the same v1.27.3 version that the OLM path currently uses. This works because the vendored Sail Library (OSSM 3.3.1) still supports Istio 1.27.3.
The GWAPI CRD bump to v1.4.1 and Istio version bump to v1.28.5 will follow separately via #1444, allowing us to validate the noOLM path independently from the version changes.
When noOLM shipped in 4.22, the OLM and noOLM versions were already aligned at 3.3.1, so version separation was not needed. On 4.21, the OLM path is on 3.2.0 — keeping both paths at the same Istio version avoids introducing conditional logic or separate deployment manifests in the backport.
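A minimal sketch of the behavior described above, assuming a single boolean gate check: the gate only switches the install path, while both paths resolve to the same Istio version on release-4.21. The names selectInstall, installViaOLM, and installViaSailLibrary are hypothetical and are not taken from the operator's code; only the gate name and the v1.27.3 version come from the description.

```go
package main

import "fmt"

// installMethod names the two Istio install paths discussed in this PR.
// The type and its constants are illustrative, not the operator's real identifiers.
type installMethod string

const (
	installViaOLM         installMethod = "OLM"
	installViaSailLibrary installMethod = "SailLibrary"
)

// istioVersion is the version both paths share on release-4.21,
// per the Versioning section (v1.27.3).
const istioVersion = "v1.27.3"

// selectInstall is a hypothetical helper: it picks the install path from the
// state of the GatewayAPIWithoutOLM feature gate. The version does not depend
// on the gate, which is what keeps the backport free of per-path version logic.
func selectInstall(gatewayAPIWithoutOLMEnabled bool) (installMethod, string) {
	if gatewayAPIWithoutOLMEnabled {
		return installViaSailLibrary, istioVersion
	}
	return installViaOLM, istioVersion
}

func main() {
	for _, enabled := range []bool{false, true} {
		method, version := selectInstall(enabled)
		fmt.Printf("GatewayAPIWithoutOLM=%t -> %s installs Istio %s\n", enabled, method, version)
	}
}
```

With the gate off, the sketch stays on the OLM path, which matches the stated intent that merging this PR is a no-op until the gate is enabled.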
Conflicts resolved
pkg/operator/operator.go: Added GatewayAPIWithoutOLM gate alongside existing 4.21 gates (GatewayAPI, GatewayAPIController, RouteExternalCertificate)
pkg/operator/controller/status/controller.go: Took incoming noOLM logic (useOLM/useSailLibrary, conditional subscription listing) but wrapped in existing 4.21 GatewayAPIEnabled guard
test/e2e/gateway_api_test.go: Kept 4.21 gatewayAPIControllerEnabled guard, added gatewayAPIWithoutOLMEnabled conditionals inside for Sail Library vs OLM test selection. Kept xcrdNames alongside new istioCRDNames. Removed references to testGatewayAPIInfrastructureAnnotations, testGatewayAPIInternalLoadBalancer, and testGatewayOpenshiftConditions, which were added in separate PRs not present on release-4.21.
go.mod / vendor: Added replace directives for openshift/api (fork with gate) and sail-operator (downstream fork with pkg/install)
pkg/operator/controller/canary/daemonset.go (OCPBUGS-79467: Change default log level from DEBUG to INFO #1402 commit 2): Skipped; references canary cert hash variables not present on 4.21
Rollout Plan
Phase 1 — Land code (gate OFF)
Phase 2 — TechPreview soak
Phase 3 — GA promotion
Follow-up
Go Dependency Updates
Transitive dependency changes
The sail-operator (OSSM 3.3.1) brings in new transitive dependencies for Helm chart rendering (helm.sh/helm/v3), Istio utility libraries (istio.io/istio/pkg/log, pkg/ptr, pkg/slices, pkg/util/sets), and their dependency chains. These are all indirect: vendored but not imported by CIO code directly. k8s modules received a patch bump (0.34.1 → 0.34.3) from go mod tidy. Both are low risk.
controller-runtime (pinned: v0.22.5 → v0.21.0)
The sail-operator requires controller-runtime v0.22.5, but we pin back to v0.21.0, the version CIO's own code was built and tested against on 4.21. CIO's core controller logic (client, cache, manager, controller wiring) is unchanged and continues to run against the same controller-runtime it shipped with. The sail library's install package only uses basic client.Client operations (New, Get, Create, Update) and pkg/log, all unchanged since controller-runtime v0.1. No other vendored dependency calls controller-runtime APIs.
On 4.21, this pin is not strictly required since 4.21 is already on k8s 0.34 and a patch bump poses no compatibility risk. However, on 4.20 and 4.19 the pin is essential because controller-runtime 0.22 would force a k8s minor version bump, causing incompatibilities with the frozen openshift ecosystem packages (client-go, library-go). Pinning here maintains a consistent approach across all three backport branches.
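To illustrate why a narrow API surface keeps the pin low risk, here is a rough sketch of a get-then-create-or-update flow that needs only Get, Create, and Update. Everything in it is an assumption for illustration: objectClient, memoryClient, and ensure are hypothetical stand-ins with simplified signatures, not the controller-runtime client.Client interface and not the sail library's code.

```go
package main

import (
	"errors"
	"fmt"
)

// objectClient is a hypothetical stand-in for the small subset of client
// operations named above. Its signatures are simplified and do NOT match
// controller-runtime's client.Client.
type objectClient interface {
	Get(name string) (string, error)
	Create(name, spec string) error
	Update(name, spec string) error
}

var errNotFound = errors.New("not found")

// memoryClient is an in-memory fake used only to make the sketch runnable.
type memoryClient struct{ objects map[string]string }

func (c *memoryClient) Get(name string) (string, error) {
	spec, ok := c.objects[name]
	if !ok {
		return "", errNotFound
	}
	return spec, nil
}

func (c *memoryClient) Create(name, spec string) error {
	if _, exists := c.objects[name]; exists {
		return fmt.Errorf("%s already exists", name)
	}
	c.objects[name] = spec
	return nil
}

func (c *memoryClient) Update(name, spec string) error {
	if _, exists := c.objects[name]; !exists {
		return errNotFound
	}
	c.objects[name] = spec
	return nil
}

// ensure creates the object if it is missing, updates it if it differs from
// the desired spec, and otherwise leaves it alone. It reports which action it took.
func ensure(c objectClient, name, desired string) (string, error) {
	current, err := c.Get(name)
	switch {
	case errors.Is(err, errNotFound):
		return "created", c.Create(name, desired)
	case err != nil:
		return "", err
	case current == desired:
		return "unchanged", nil
	default:
		return "updated", c.Update(name, desired)
	}
}

func main() {
	c := &memoryClient{objects: map[string]string{}}
	for _, spec := range []string{"v1", "v1", "v2"} {
		action, err := ensure(c, "istiod", spec)
		fmt.Println(action, err)
	}
}
```

Code that depends only on a handful of long-stable calls like these is less exposed to differences between controller-runtime releases than code that touches caches, managers, or controller wiring.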
gateway-api (pinned: v1.4.1 → v1.3.0)
The sail-operator pulls in gateway-api v1.4.1, but we pin back to v1.3.0 (the original 4.21 version). The CRD manifests shipped in this release are v1.3.0, and the Go types are forward-compatible. Pinning keeps the vendored types aligned with the CRDs installed on the cluster.
Test Coverage
Early CI test results (OLM, OLM-to-OLM upgrade, noOLM, and OLM-to-noOLM migration) are documented in the origin test backport PR: openshift/origin#31232
Verification
go build ./pkg/operator/controller/gatewayclass/... compiles
go test ./pkg/operator/controller/gatewayclass/... passes
🤖 Generated with Claude Code