[release-4.20] NE-2286: Backport noOLM / Sail Library to release-4.20 #1459

Open
gcs278 wants to merge 11 commits into openshift:release-4.20 from gcs278:backport-noOLM-4.20

@gcs278

@gcs278 gcs278 commented Jun 2, 2026

Contributor

Summary

Backport of the noOLM / Sail Library installation path (NE-2286, shipped in 4.22) to release-4.20. This resolves several fundamental OLM bugs that have no viable OLM-based workaround — most critically OCPBUGS-86778, which blocks all OSSM z-stream upgrades and prevents shipping CVE fixes.

This PR is intended to merge with the GatewayAPIWithoutOLM feature gate disabled, making it a no-op on merge. The goal is to subsequently enable the gate by default (via openshift/api) to activate the Sail Library path and resolve the OLM issues.

Cherry-picked PRs

| PR | Title | Why |
| --- | --- | --- |
| #1354 | NE-2471: Replace OLM-based Istio install with Sail Library | Core change: adds istio_sail_installer.go, istio_olm.go refactor, migration.go, status.go, CRD manifests, Sail Library RBAC manifests |
| #1402 | OCPBUGS-79467: Change default log level from DEBUG to INFO | Sail Library generates ~2,000 debug logs/hour; without this fix, enabling noOLM floods the logs. Only the log level change (commit 1) is cherry-picked; commit 2 references code not present on 4.20. |
| #1404 | NE-2519: Move Sail Library to official release branch | Moves from dev Sail Library branch to official OSSM 3.3.1 release |

Note: #1393 (OCPBUGS-79667: Use feature-gate annotation for Sail Library RBAC) was also a dependency but is being skipped because CVO on this release does not support the release.openshift.io/feature-gate annotation (openshift/cluster-version-operator#1273 was not backported). As a result, the Sail Library RBAC manifests use the release.openshift.io/feature-set annotation and a separate PR will be needed to remove this annotation before promoting the feature gate to GA.
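The follow-up mentioned in the note (removing the release.openshift.io/feature-set annotation before GA) could be checked with something like the sketch below. It is only an illustration: the manifest file names and the `TechPreviewNoUpgrade` value are assumptions, not taken from this PR, and a real check would read the annotations from the YAML manifests rather than from an in-memory map.

```go
package main

import (
	"fmt"
	"sort"
)

// featureSetAnnotation is the annotation key named in the PR description.
const featureSetAnnotation = "release.openshift.io/feature-set"

// manifestsWithFeatureSet is a hypothetical helper. It returns, in sorted
// order, the names of manifests whose annotations still carry the
// feature-set key, regardless of the annotation's value.
func manifestsWithFeatureSet(manifests map[string]map[string]string) []string {
	var names []string
	for name, annotations := range manifests {
		if _, ok := annotations[featureSetAnnotation]; ok {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// File names and the annotation value below are made up for illustration.
	manifests := map[string]map[string]string{
		"sail-library-clusterrole.yaml":    {featureSetAnnotation: "TechPreviewNoUpgrade"},
		"ingress-operator-deployment.yaml": {},
	}
	for _, name := range manifestsWithFeatureSet(manifests) {
		fmt.Println("still gated by feature set:", name)
	}
}
```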

Versioning

This backport does not bump the Gateway API CRDs (remain at v1.3.0) or the Istio version (remains at v1.26.2) for the noOLM code path. When the GatewayAPIWithoutOLM feature gate is enabled, the Sail Library will install Istio using the same v1.26.2 version that the OLM path currently uses. This works because the vendored Sail Library (OSSM 3.3.1) still supports Istio 1.26.2.

Dependency Pinning Approach

Unlike the 4.21 backport, which bumped k8s and controller-runtime, this backport keeps all dependencies at their original 4.20 versions. The sail-operator (OSSM 3.3.1) requires k8s 0.34 and controller-runtime 0.22, but its pkg/install package only uses basic CRUD operations (client.New, client.Get, client.Create, client.Update) and stable types (metav1, corev1, runtime, rest.Config) that exist unchanged in the 4.20 versions.

To prevent go mod tidy from bumping dependencies transitively, the following replace directives pin modules to their 4.20 versions:

| Module | Pinned Version | 4.20 Original |
| --- | --- | --- |
| k8s.io/api | v0.33.2 | v0.33.2 |
| k8s.io/apimachinery | v0.33.2 | v0.33.2 |
| k8s.io/client-go | v0.32.1 | v0.32.1 |
| k8s.io/apiextensions-apiserver | v0.33.0 | v0.33.0 |
| k8s.io/apiserver | v0.33.0 | v0.33.0 |
| k8s.io/component-base | v0.33.0 | v0.33.0 |
| k8s.io/kube-openapi | v0.0.0-20250318... | v0.0.0-20250318... |
| sigs.k8s.io/controller-runtime | v0.20.4 | v0.20.4 |
| sigs.k8s.io/gateway-api | v1.2.1 | v1.2.1 |
| github.com/google/gnostic-models | v0.6.9 | v0.6.9 |
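One way to guard against these pins drifting is a small check that reads the replace directives from go.mod and compares them with the table. The sketch below is an assumption about how such a check might look, not something from this PR: `parseReplacePins` and `mismatches` are hypothetical helpers, the parser is deliberately simplified (it only understands `old => new version` lines), and the sample go.mod text is invented. A production check would more likely use the golang.org/x/mod/modfile package.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseReplacePins extracts "old => new version" replace targets from go.mod
// text, handling both the single-line and the block form. Replacements that
// point at a local path (no version) are ignored.
func parseReplacePins(goMod string) map[string]string {
	pins := map[string]string{}
	inBlock := false
	for _, line := range strings.Split(goMod, "\n") {
		// Drop trailing comments and surrounding whitespace.
		if i := strings.Index(line, "//"); i >= 0 {
			line = line[:i]
		}
		line = strings.TrimSpace(line)
		switch {
		case line == "replace (":
			inBlock = true
			continue
		case inBlock && line == ")":
			inBlock = false
			continue
		case strings.HasPrefix(line, "replace "):
			line = strings.TrimPrefix(line, "replace ")
		case !inBlock:
			continue
		}
		fields := strings.Fields(line)
		// Expected shape: old [oldVersion] => new newVersion
		for i, f := range fields {
			if f == "=>" && i >= 1 && len(fields) == i+3 {
				pins[fields[0]] = fields[i+2]
			}
		}
	}
	return pins
}

// mismatches lists every expected pin that is missing or has another version.
func mismatches(got, want map[string]string) []string {
	var out []string
	for mod, v := range want {
		if got[mod] != v {
			out = append(out, fmt.Sprintf("%s: want %s, got %q", mod, v, got[mod]))
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Invented go.mod excerpt; versions are copied from the table above.
	goMod := `module example.com/operator

replace (
	k8s.io/api => k8s.io/api v0.33.2
	k8s.io/client-go => k8s.io/client-go v0.32.1 // pinned to 4.20
)

replace sigs.k8s.io/controller-runtime => sigs.k8s.io/controller-runtime v0.20.4
`
	want := map[string]string{
		"k8s.io/api":                     "v0.33.2",
		"k8s.io/client-go":               "v0.32.1",
		"sigs.k8s.io/controller-runtime": "v0.20.4",
	}
	for _, m := range mismatches(parseReplacePins(goMod), want) {
		fmt.Println("pin mismatch:", m)
	}
	fmt.Println("checked", len(want), "pins")
}
```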

Risk assessment: The sail-operator install package uses only stable controller-runtime interfaces (client.Client CRUD operations, pkg/log, pkg/scheme). No APIs introduced in controller-runtime 0.21+ or k8s 0.34+ are used. The structured-merge-diff/v4 vs v6 incompatibility that would arise from bumping k8s is avoided entirely. This approach was validated by building successfully and by auditing every import in the sail-operator's pkg/install, api/v1, and resources packages.
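The import audit mentioned above can be partly automated. The following is a minimal sketch, assuming the audit amounts to listing each file's import paths and reviewing them by hand; `listImports` is a hypothetical helper and the sample source is illustrative, not the actual contents of the sail-operator's pkg/install. It uses only the standard library's go/parser in `ImportsOnly` mode, so the imported packages do not need to be resolvable.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// listImports parses only the package clause and import declarations of a Go
// source file and returns the import paths in source order.
func listImports(filename, src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		// Import paths are stored as quoted string literals.
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

func main() {
	// Illustrative source only; not copied from the sail-operator.
	const sample = `package install

import (
	"k8s.io/apimachinery/pkg/runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)
`
	paths, err := listImports("sample.go", sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

Running this over every file in the audited packages and de-duplicating the output would give the list of packages to compare against the pinned module versions.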

Conflicts resolved

  • pkg/operator/operator.go: Added GatewayAPIWithoutOLM gate alongside existing 4.20 gates (GatewayAPI, GatewayAPIController, RouteExternalCertificate, IngressControllerLBSubnetsAWS, SetEIPForNLBIngressController)
  • pkg/operator/controller/status/controller.go: Took incoming noOLM logic (useOLM/useSailLibrary, conditional subscription listing) but wrapped in existing 4.20 GatewayAPIEnabled guard
  • test/e2e/gateway_api_test.go: Kept 4.20 gatewayAPIControllerEnabled guard, added gatewayAPIWithoutOLMEnabled conditionals inside for Sail Library vs OLM test selection. Kept xcrdNames alongside new istioCRDNames. Removed references to testGatewayAPIInfrastructureAnnotations, testGatewayAPIInternalLoadBalancer, and testGatewayOpenshiftConditions which were added in separate PRs not present on release-4.20.
  • go.mod / vendor/: Added replace directives for openshift/api (fork with gate), sail-operator (official OSSM 3.3.1), and dependency pins (see Dependency Pinning Approach above). Re-vendored from scratch.
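The guard structure described for pkg/operator/controller/status/controller.go can be sketched roughly as follows. This is an assumption about the shape of the logic, not the actual code: the `gates` struct, `installMode`, and `shouldListSubscriptions` are hypothetical names, and the real controller reads its gates from the operator configuration.

```go
package main

import "fmt"

// gates is a hypothetical container for the feature gates named in the PR.
type gates struct {
	GatewayAPIEnabled           bool
	GatewayAPIControllerEnabled bool
	GatewayAPIWithoutOLMEnabled bool
}

// installMode reports which Istio install path applies. Both results are
// false when the pre-existing Gateway API guards are off, so the noOLM gate
// has no effect unless Gateway API support is enabled in the first place.
func installMode(g gates) (useOLM, useSailLibrary bool) {
	if !g.GatewayAPIEnabled || !g.GatewayAPIControllerEnabled {
		return false, false
	}
	if g.GatewayAPIWithoutOLMEnabled {
		return false, true
	}
	return true, false
}

// shouldListSubscriptions mirrors the "conditional subscription listing":
// OLM Subscriptions are only relevant on the OLM path.
func shouldListSubscriptions(g gates) bool {
	useOLM, _ := installMode(g)
	return useOLM
}

func main() {
	onMerge := gates{GatewayAPIEnabled: true, GatewayAPIControllerEnabled: true}
	afterPromotion := onMerge
	afterPromotion.GatewayAPIWithoutOLMEnabled = true

	fmt.Println(shouldListSubscriptions(onMerge))        // OLM path, gate off
	fmt.Println(shouldListSubscriptions(afterPromotion)) // Sail Library path
}
```

With the gate off the sketch stays on the OLM path, which matches the stated intent that merging this PR is a no-op until the gate is promoted.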

Merge Order

  1. Merge the openshift/api PR that adds the feature gate as disabled, which allows CI to start
  2. TODO: Backport noOLM E2E tests to origin release-4.20
  3. Merge this PR: Sail Library code lands, gate still OFF
  4. Merge api#2874 ([release-4.20] NE-2480: Promote GatewayAPIWithoutOLM feature gate to TechPreview): feature gate promotion to TechPreview, allows CI soak
  5. Verify CI is green
  6. TODO: Merge CIO PR to remove release.openshift.io/feature-set annotation from Sail Library RBAC manifests
  7. Merge the openshift/api PR that promotes the feature gate to Default GA, which activates noOLM
  8. Verify CI is green

Verification

  • make builds successfully
  • No unresolved merge conflict markers in any commit
  • Full CI (blocked on openshift/api dependency)

🤖 Generated with Claude Code

gcs278 added 4 commits June 2, 2026 16:51
Vendor the GatewayAPIWithoutOLM feature gate from the openshift/api repo to support
backporting the noOLM logic into the release-4.20 branch.

PR: openshift#1354
Cherry-picked from: 8a40966
openshift#1354

Conflicts resolved:
- pkg/operator/operator.go: Added GatewayAPIWithoutOLM gate alongside
  existing 4.20 gates (GatewayAPI, GatewayAPIController, RouteExternalCertificate)
- Note: go mod tidy will fail in this commit because Aslak's development branch pulls in
  Kubernetes dependencies that are too new. A later commit in this backport
  will vendor the official release and resolve go.mod.
openshift-ci-robot added the jira/valid-reference label Jun 2, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 2, 2026

Contributor

@gcs278: This pull request references NE-2286 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target either version "4.20." or "openshift-4.20.", but it targets "openshift-4.22" instead.


@coderabbitai

coderabbitai Bot commented Jun 2, 2026


Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


openshift-ci Bot added the do-not-merge/work-in-progress label Jun 2, 2026
@openshift-ci

openshift-ci Bot commented Jun 2, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci

openshift-ci Bot commented Jun 2, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign thealisyed for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gcs278 changed the title from "NE-2286: Backport noOLM / Sail Library to release-4.20" to "[release-4.20] NE-2286: Backport noOLM / Sail Library to release-4.20" Jun 2, 2026
@gcs278

gcs278 commented Jun 2, 2026

Contributor Author

/testwith openshift/cluster-ingress-operator/release-4.20/e2e-aws-operator openshift/api#2869

gcs278 and others added 7 commits June 3, 2026 16:56
Cherry-picked from: ed2eb36
openshift#1354

Conflicts resolved:
- pkg/operator/controller/status/controller.go: Took incoming noOLM logic
  (useOLM/useSailLibrary, conditional subscription listing) but wrapped in
  existing 4.20 GatewayAPIEnabled guard. Restored GatewayAPIControllerEnabled
  guard that was present in the original condition but dropped during
  cherry-pick.

Cherry-picked from: 9c4d792
openshift#1354

Conflicts resolved:
- test/e2e/gateway_api_test.go: Kept 4.20 gatewayAPIControllerEnabled guard,
  added gatewayAPIWithoutOLMEnabled conditionals inside it. Kept xcrdNames
  alongside new istioCRDNames.
- Removed references to testGatewayAPIInfrastructureAnnotations,
  testGatewayAPIInternalLoadBalancer, and testGatewayOpenshiftConditions
  which were added in separate PRs not present on release-4.20.

Cherry-picked from: 43c978a
openshift#1404

Conflicts resolved:
- go.mod: Switched sail-operator replace from aslakknutsen's development
  fork to the official openshift-service-mesh/sail-operator v0.0.0-20260327145107
  (OSSM 3.3.1). Added replace directives to pin k8s.io/api, apimachinery,
  apiextensions-apiserver, apiserver, client-go, component-base,
  kube-openapi, controller-runtime, gateway-api, and gnostic-models to
  their original 4.20 versions, preventing the sail-operator's transitive
  dependencies from bumping them. This avoids the structured-merge-diff
  v4/v6 incompatibility and preserves compatibility with the 4.20
  openshift/client-go and openshift/library-go.
- vendor/: Re-vendored from scratch with pinned dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gcs278

gcs278 commented Jun 5, 2026

Contributor Author

/testwith openshift/cluster-ingress-operator/release-4.20/e2e-aws-operator openshift/api#2869

@gcs278 gcs278 marked this pull request as ready for review June 18, 2026 18:19
openshift-ci Bot removed the do-not-merge/work-in-progress label Jun 18, 2026
@gcs278

gcs278 commented Jun 18, 2026

Contributor Author

Ready for early review, but blocked on setting up some Jira tickets and on the 4.21 noOLM backport being promoted to GA (openshift/api#2865).
/hold

openshift-ci Bot added the do-not-merge/hold label Jun 18, 2026
@openshift-ci openshift-ci Bot requested a review from grzpiotrowski June 18, 2026 18:20
@openshift-ci openshift-ci Bot requested a review from rikatz June 18, 2026 18:20
@openshift-ci

openshift-ci Bot commented Jun 18, 2026

Contributor

@gcs278: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-operator | 9812b49 | link | true | /test e2e-aws-operator |
| ci/prow/verify-deps | 9812b49 | link | true | /test verify-deps |
| ci/prow/hypershift-e2e-aks | 9812b49 | link | true | /test hypershift-e2e-aks |
| ci/prow/e2e-aws-ovn-hypershift-conformance | 9812b49 | link | true | /test e2e-aws-ovn-hypershift-conformance |
| ci/prow/e2e-azure-ovn | 9812b49 | link | false | /test e2e-azure-ovn |


Labels

  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
