refactor(server): split hotspot modules and state boundaries #465
Conversation
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough

Refactors and splits large server modules into focused submodules (auth, clusters, kubernetes, events, lifecycle), adds kubeconfig encryption/migration, Kubernetes client factory/pool/timeouts, per-cluster SSE workers and poller, a server-side cluster-selection API, guarded async client-side cluster switching, many Flux resource templates, and thin Flux use-case adapters.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Browser
    participant UI as ClusterSwitcher
    participant Store as Cluster Store
    participant API as /api/v1/user/cluster
    participant Server
    participant EBus as Event Bus
    Browser->>UI: click cluster
    UI->>Store: selectCluster(id)
    Store->>Store: set selectingClusterId (guard)
    Store->>EBus: disconnect()
    Store->>API: PUT /api/v1/user/cluster {clusterId}
    API->>Server: validateSelectableClusterId & set cookie
    Server-->>API: 200 + selection payload
    API-->>Store: payload
    Store->>Store: update current/available, invalidate layout
    Store->>EBus: connect()
    EBus->>EBus: start worker for selected cluster
```

```mermaid
sequenceDiagram
    participant Poller
    participant K8s as Flux API
    participant State as Worker State
    participant Bus as Event Bus
    Poller->>K8s: listFluxResources(type)
    K8s-->>Poller: items
    loop per resource
        Poller->>State: compute key, currentState, notificationState
        alt new & settled
            Poller->>Bus: broadcast(ADDED)
            State->>State: store lastState/notification
        else changed & settled
            Poller->>Bus: broadcast(MODIFIED)
            State->>State: update lastState/notification
        end
    end
    Poller->>State: detect deletions
    Poller->>Bus: broadcast(DELETED) for removed
    Poller->>Poller: schedule next poll (if active)
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Actionable comments posted: 18
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
🟡 Minor comments (7)
src/lib/templates/image-repository.ts (1)

112-116: ⚠️ Potential issue | 🟡 Minor — Duration regex matches empty string.

`^([0-9]+(\\.[0-9]+)?(s|m|h))*$` uses `*`, so an empty value passes validation despite `required: true`. Use `+` so the regex enforces at least one duration segment (the required/default already handle presence, but the pattern should align):

🛠️ Proposed fix

```diff
- pattern: '^([0-9]+(\\.[0-9]+)?(s|m|h))*$',
+ pattern: '^([0-9]+(\\.[0-9]+)?(s|m|h))+$',
```

Note: the same regex appears in other template files added in this PR (e.g., `helm-release.ts`, `helm-chart.ts`, etc.) — worth fixing consistently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/image-repository.ts` around lines 112 - 116: the validation regex in the template's validation object (the pattern property used for duration validation) currently uses "*" so it allows an empty string; update the pattern from '^([0-9]+(\\.[0-9]+)?(s|m|h))*$' to '^([0-9]+(\\.[0-9]+)?(s|m|h))+$' so at least one duration segment is required, and apply the same fix to the identical pattern occurrences in the other templates (e.g., in helm-release.ts and helm-chart.ts) to keep behavior consistent.

src/lib/templates/types.ts (1)
55-59: ⚠️ Potential issue | 🟡 Minor — CEL_VALIDATION regex permits quotes but not common CEL constructs.

The regex allows `"`, `'`, and spaces but omits characters frequently used in CEL expressions such as `{`, `}`, `?`, `@`, `;`, `\n`, and non-ASCII letters/digits. Users writing realistic CEL (e.g. map/list literals, ternaries, or multi-line expressions via `textarea`) will hit false-positive validation failures. Consider relaxing to a length check plus a blocklist of dangerous chars, or aligning the whitelist with the CEL grammar.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/types.ts` around lines 55 - 59: the CEL_VALIDATION whitelist is too restrictive; replace the strict pattern in CEL_VALIDATION (pattern and message in types.ts) with a safer validation that allows any characters up to 500 chars (e.g. use a length-only pattern such as '^([\\s\\S]{1,500})$' or remove the pattern and validate length separately) and update the message to reflect "Max 500 characters" rather than listing allowed characters; ensure CEL_VALIDATION.pattern is changed and CEL_VALIDATION.message updated accordingly.

src/lib/templates/helm-release.ts (1)
239-372: ⚠️ Potential issue | 🟡 Minor — `spec.values` textarea and `spec.values.resources.*` fields target overlapping paths — document the interaction or enforce mutual exclusion.

The `values` field (line 241) targets the entire `spec.values` object as free-form YAML, while `resourceLimitsCpu`, `resourceLimitsMemory`, `resourceRequestsCpu`, and `resourceRequestsMemory` (lines 332–366) write into nested `spec.values.resources.limits.*` and `spec.values.resources.requests.*`. Both are processed sequentially during form submission via `doc.setIn()` calls with no visible merge strategy. If a user sets resource values in the textarea and also populates the structured fields, the outcome depends on YAML library behavior and field ordering — either the textarea will be treated as a string overwriting nested paths, or the nested `setIn()` calls will fail/conflict.

The helpText on lines 340–341 acknowledges chart-schema variability but does not address this interaction. Consider one of:

- Adding explicit helpText or a warning noting the tension between the free-form textarea and structured fields (e.g., "These override values in the textarea if both are set").
- Validating form submission to prevent both being populated, or disabling the structured fields if `values` contains a `resources:` key.
- Documenting the merge/override precedence clearly in the descriptions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/helm-release.ts` around lines 239 - 372: the spec.values textarea (path 'spec.values', field name 'values') conflicts with the structured resource fields (resourceLimitsCpu/resourceLimitsMemory/resourceRequestsCpu/resourceRequestsMemory, which set 'spec.values.resources.limits.*' and 'spec.values.resources.requests.*' via doc.setIn), so add a clear mutual-exclusion rule: in the form component, validate before submit that if the 'spec.values' YAML contains a top-level "resources" key then either block submission of the resourceLimits*/resourceRequests* fields (disable them) or reject the submit with a validation error instructing the user to remove the resources key from the textarea; also add short helpText to 'values' and each resource field documenting the precedence and that structured fields override or are blocked when resources is present.

src/lib/templates/oci-repository.ts (1)
142-156: ⚠️ Potential issue | 🟡 Minor — Fix semver validation so `>=1.0.0` is accepted.

The current pattern allows only one operator character before the version, so the placeholder/example `>=1.0.0` fails validation.

Proposed fix

```diff
- pattern: '^[><=~^*]?[0-9]+\\.[0-9]+(\\.[0-9]+)?',
+ pattern: '^(?:[<>]=?|=|~|\\^)?\\s*[0-9]+\\.[0-9]+(?:\\.[0-9]+)?(?:[-+][0-9A-Za-z.-]+)?$',
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/oci-repository.ts` around lines 142 - 156: the semver field's validation (the validation.pattern on the 'semver' option under spec.ref.semver, shown when refType is 'semver') currently only allows a single operator character, so examples like ">=1.0.0" fail; update that regex to accept one- or two-character operators (allowing >=, <=, >, <, =, ~, ^, * as prefixes) followed by a major.minor(.patch) numeric version so the placeholder ">=1.0.0" and other valid semver constraints pass validation.

src/lib/templates/git-repository.ts (1)
167-182: ⚠️ Potential issue | 🟡 Minor — Fix semver validation so documented constraints pass.

The pattern allows only one leading operator, so the placeholder/example `>=1.0.0` fails because `=` is parsed where a digit is expected.

Proposed fix

```diff
- pattern: '^[><=~^*]?[0-9]+\\.[0-9]+(\\.[0-9]+)?',
+ pattern: '^(?:[<>]=?|=|~|\\^)?\\s*[0-9]+\\.[0-9]+(?:\\.[0-9]+)?(?:[-+][0-9A-Za-z.-]+)?$',
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/git-repository.ts` around lines 167 - 182: the semver field's validation.pattern (for name: 'semver', path: 'spec.ref.semver') only allows a single leading operator, so examples like ">=1.0.0" fail; update the regex to accept common semver operators including two-character variants (>=, <=), single-character operators (~, ^, >, <, =) and the wildcard "*" as an optional prefix before the numeric version, ensure it still validates the numeric portions (major.minor[.patch]), and preserve the validation.message.

src/lib/templates/oci-repository.ts (1)
9-12: ⚠️ Potential issue | 🟡 Minor — Update the OCIRepository template to use current Flux API version v1.

The template currently uses `v1beta2`, which is deprecated. Current Flux source-controller (v1.8.2+) serves only `v1` for OCIRepository. Clusters without `v1beta2` support will fail manifest creation/apply.

Required fix

```diff
- version: 'v1beta2',
+ version: 'v1',
  ...
- yaml: `apiVersion: source.toolkit.fluxcd.io/v1beta2
+ yaml: `apiVersion: source.toolkit.fluxcd.io/v1
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/oci-repository.ts` around lines 9 - 12: the template object currently sets version: 'v1beta2' and embeds apiVersion: source.toolkit.fluxcd.io/v1beta2 in the YAML; update both to the current Flux API version v1 by changing version to 'v1' and replacing the embedded apiVersion line with source.toolkit.fluxcd.io/v1 so the OCIRepository template (the version field and yaml string) targets the supported v1 API.

src/lib/server/kubernetes/flux/listing.ts (1)
133-170: ⚠️ Potential issue | 🟡 Minor — Inconsistent semantics for the returned `limit` between the two paging paths.

Native paging returns `limit: options!.limit!` (the requested page size), while the full-fetch path returns `limit: effectiveLimit` (i.e., `paginatedItems.length`, which differs from the requested limit on the last page or when `offset >= total`). Clients using `PaginatedFluxResourceList.limit` to compute page math (e.g., `offset + limit` for the next page) will get different answers depending on which branch ran. Pick one convention (preferably: always echo the requested limit) and document it.

♻️ Suggested fix

```diff
  const total = sorted.length;
  const offset = options?.offset ?? 0;
  const paginatedItems =
    options?.limit !== undefined
      ? sorted.slice(offset, offset + options.limit)
      : sorted.slice(offset);
- const effectiveLimit = paginatedItems.length;

  return {
    items: paginatedItems,
    total,
    hasMore: options?.limit !== undefined ? offset + options.limit < total : false,
    offset,
-   limit: effectiveLimit,
+   // Echo the requested limit (or total when unbounded) so clients can
+   // compute next-page offsets consistently with the native-paging path.
+   limit: options?.limit ?? total,
    metadata: {
      resourceVersion: list.metadata?.resourceVersion
    }
  };
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/kubernetes/flux/listing.ts` around lines 133 - 170: the two paging branches return different semantics for limit; make them consistent by always echoing the requested page size when provided: in the full-fetch path (where sorted/offset/paginatedItems/effectiveLimit are computed) change the returned limit to options?.limit ?? effectiveLimit (so when a client requested a limit you return that requested value, otherwise fall back to the actual length), and keep the native paging branch as-is (it already returns options!.limit!). Update the return object in the full-fetch branch (and any uses of effectiveLimit) to use this unified limit value.
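To illustrate the suggested convention (the helper below is hypothetical, not the PR's actual implementation): always echoing the requested limit keeps `offset + limit` page math stable even on a short last page.

```typescript
// Hypothetical sketch of the "echo the requested limit" convention.
interface Page<T> {
  items: T[];
  total: number;
  offset: number;
  limit: number;
  hasMore: boolean;
}

function paginate<T>(sorted: T[], offset = 0, limit?: number): Page<T> {
  const items =
    limit !== undefined ? sorted.slice(offset, offset + limit) : sorted.slice(offset);
  return {
    items,
    total: sorted.length,
    offset,
    // Echo the requested limit (or total when unbounded) so clients can
    // always compute the next page as offset + limit.
    limit: limit ?? sorted.length,
    hasMore: limit !== undefined ? offset + limit < sorted.length : false
  };
}
```

On the last page of a five-item list with `limit: 2` this returns one item but still reports `limit: 2`, matching what the native-paging branch would report.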
🧹 Nitpick comments (17)
src/lib/server/clusters/kubeconfig.ts (1)
19-22: Consider logging the error class/message (without the kubeconfig contents) to aid debugging.

The current log string "parse error (sanitized)" gives no signal at all on what went wrong. Capturing `err.name`/`err.message` from the caught exception is safe (these come from `js-yaml`/`@kubernetes/client-node` and do not echo the kubeconfig payload) and makes failures actionable in production.

Proposed tweak

```diff
- } catch {
-   logger.error('Failed to parse kubeconfig: parse error (sanitized)');
+ } catch (err) {
+   const msg = err instanceof Error ? `${err.name}: ${err.message}` : 'unknown error';
+   logger.error(`Failed to parse kubeconfig: ${msg}`);
    return { contexts: [], currentContext: null };
  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/clusters/kubeconfig.ts` around lines 19 - 22: the catch block that currently swallows errors (catch { logger.error('Failed to parse kubeconfig: parse error (sanitized)'); return { contexts: [], currentContext: null }; }) should capture the thrown exception (e.g., catch (err)) and log the error class/name and message using the existing logger without including kubeconfig contents; update the catch in the kubeconfig parsing function so it logs err.name and err.message (or equivalent) alongside a short context string, then return the same fallback { contexts: [], currentContext: null } as before.

src/lib/stores/events.svelte.ts (1)
6-68: Consolidate duplicated type exports from the events/types.js barrel.

The file declares `export *` from `events/types.js` on line 6, but then locally re-exports `ConnectionStatus`, `NotificationMessage`, and `ResourceEvent`. This creates multiple sources of truth for the same types. Move these three type declarations to `events/types.ts` and import them here instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/stores/events.svelte.ts` around lines 6 - 68: the local type declarations ConnectionStatus, ResourceEvent, and NotificationMessage duplicate the exported types from events/types.js; move their full definitions into events/types.ts (or the existing types barrel), delete the local declarations in this file, and replace them with imports (e.g., import { ConnectionStatus, ResourceEvent, NotificationMessage } from './events/types.js'); ensure the file still re-exports via export * if desired and update any local references to use the imported symbols.

src/lib/stores/events/connection.ts (1)
7-18: Optional: add jitter and freeze the defaults object.

Two nice-to-haves for the reconnect layer:

- Pure exponential backoff without jitter can produce synchronized reconnect storms if many clients disconnect simultaneously (e.g., a server restart). Consider adding a small random jitter.
- `DEFAULT_CONNECTION_OPTIONS` is exported as a mutable object; any consumer can inadvertently mutate it. `Object.freeze` (or `as const`) makes it safer and communicates intent.

♻️ Proposed refactor

```diff
-export const DEFAULT_CONNECTION_OPTIONS = {
-  maxReconnectAttempts: MAX_RECONNECT_ATTEMPTS,
-  reconnectDelay: RECONNECT_DELAY_MS,
-  maxReconnectDelay: MAX_RECONNECT_DELAY_MS
-};
-
-export function getReconnectDelay(attempt: number): number {
-  return Math.min(
-    DEFAULT_CONNECTION_OPTIONS.reconnectDelay * Math.pow(2, attempt),
-    DEFAULT_CONNECTION_OPTIONS.maxReconnectDelay
-  );
-}
+export const DEFAULT_CONNECTION_OPTIONS = Object.freeze({
+  maxReconnectAttempts: MAX_RECONNECT_ATTEMPTS,
+  reconnectDelay: RECONNECT_DELAY_MS,
+  maxReconnectDelay: MAX_RECONNECT_DELAY_MS
+});
+
+export function getReconnectDelay(attempt: number): number {
+  const base = Math.min(
+    DEFAULT_CONNECTION_OPTIONS.reconnectDelay * Math.pow(2, Math.max(0, attempt)),
+    DEFAULT_CONNECTION_OPTIONS.maxReconnectDelay
+  );
+  // Full jitter: random in [base/2, base]
+  return base / 2 + Math.random() * (base / 2);
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/stores/events/connection.ts` around lines 7 - 18: DEFAULT_CONNECTION_OPTIONS is exported mutable and getReconnectDelay uses pure exponential backoff; make the defaults immutable and add a small random jitter to the delay calculation to avoid synchronized reconnect storms. Change DEFAULT_CONNECTION_OPTIONS to an immutable export (use Object.freeze(...) or TypeScript "as const") and modify getReconnectDelay(attempt) to compute baseDelay = Math.min(reconnectDelay * 2^attempt, maxReconnectDelay), then apply a small jitter (e.g., multiply by 1 ± random * jitterFactor, or add/subtract a random number of ms up to a configured jitter) before returning; keep the references to DEFAULT_CONNECTION_OPTIONS.reconnectDelay and .maxReconnectDelay and ensure the jitter magnitude is bounded so it never exceeds maxReconnectDelay.

src/lib/templates/image-update-automation.ts (1)
98-106: `interval` placed in the `basic` section is inconsistent with peer templates.

Other new templates (e.g., `image-repository.ts`) group the scan/sync `interval` under a domain-specific section (`repository`), while here `interval` sits in `basic` even though a more fitting `update` or dedicated scheduling context exists. Minor UX/consistency nit — consider moving it to `update` (or the relevant operational section) to keep `basic` focused on identity fields per the section description ("Resource identification").

♻️ Optional fix

```diff
  {
    name: 'interval',
    label: 'Sync Interval',
    path: 'spec.interval',
    type: 'duration',
    required: true,
-   section: 'basic',
+   section: 'update',
    default: '1h'
  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/image-update-automation.ts` around lines 98 - 106: the 'interval' field (spec.interval) is currently assigned section: 'basic' but should live under the operation-specific section to match peer templates; update the field definition in image-update-automation.ts (the object with name: 'interval', path: 'spec.interval') to set section: 'update' (or another domain-specific section like 'repository') so that basic remains for identity fields and scheduling settings follow the same structure as image-repository.ts.

src/lib/server/events/state.ts (1)
8-12: Mutable `let` export relies on ESM live bindings.

`isShuttingDown` is exported as a re-assignable binding and mutated via `setEventBusShuttingDown()`. This only propagates to consumers under true ES modules; if any consumer (or a test/bundler target) resolves this file under CommonJS interop, they'll observe a stale `false`. Given the target is SvelteKit/Vite/bun this is typically fine, but consider guarding against regressions by exposing a getter instead:

♻️ Optional refactor

```diff
-export let isShuttingDown = false;
-
-export function setEventBusShuttingDown(): void {
-  isShuttingDown = true;
-}
+let shuttingDown = false;
+
+export function isEventBusShuttingDown(): boolean {
+  return shuttingDown;
+}
+
+export function setEventBusShuttingDown(): void {
+  shuttingDown = true;
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/events/state.ts` around lines 8 - 12: the exported mutable binding isShuttingDown relies on ESM live bindings and can appear stale under CommonJS interop; replace the exported reassignable variable with an internal module-scoped boolean (e.g. _isShuttingDown) and export a getter function (e.g. isEventBusShuttingDown(): boolean) while keeping setEventBusShuttingDown(): void to flip the internal flag; update any call sites that read isShuttingDown to call the new isEventBusShuttingDown() function and leave setEventBusShuttingDown referenced as-is.

src/tests/cluster-selection-route.test.ts (1)
112-126: Assertion doesn't distinguish "inactive" from any other 404 cause.

The `PUT` handler wraps `validateSelectableClusterId` in a broad `try/catch` that rethrows `error(404, ...)` for any thrown error (see `src/routes/api/v1/user/cluster/+server.ts` lines 39–47). This test would pass even if the 404 originated from an unrelated bug (e.g., `setClusterSelectionCookie` throwing). Consider also asserting on the error `body.message`/`code` to ensure the failure path is the intended "not selectable" one, and/or adding a separate test for a completely unknown `clusterId`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tests/cluster-selection-route.test.ts` around lines 112 - 126: update the test to assert the specific 404 reason instead of any 404: call the PUT handler with an inactive cluster (ensure clusterRecord.isActive=false) and assert the thrown error from error(404, ...) contains the expected body.message or code that validateSelectableClusterId returns for "not selectable"; additionally add a separate test case that supplies a completely unknown clusterId and asserts a distinct 404 response so failures from other code like setClusterSelectionCookie are not conflated with the "inactive/unselectable" path.

src/routes/api/v1/user/cluster/+server.ts (1)
11-17: `GET` does not re-validate `locals.cluster` against current selectability.

If an uploaded cluster stored in `locals.cluster` was later deactivated or removed, `GET` will still return it as the current selection. Consider validating (or at least defensively falling back to `IN_CLUSTER_ID`) when the stored cluster is no longer selectable so the UI stays consistent with `PUT`'s stricter contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routes/api/v1/user/cluster/`+server.ts around lines 11 - 17: the GET handler currently trusts locals.cluster and can return a deactivated/removed cluster; update GET to validate locals.cluster against the selectable clusters before calling getClusterSelectionPayload and fall back to IN_CLUSTER_ID when the stored cluster is not selectable. Specifically, inside the exported GET RequestHandler, fetch the current selectable cluster list (or reuse the existing selection-check logic used by PUT), check locals.cluster for membership/validity, set clusterId = locals.cluster if valid otherwise clusterId = IN_CLUSTER_ID, then return json(await getClusterSelectionPayload(clusterId)). Ensure you reference GET, locals.cluster, getClusterSelectionPayload, and IN_CLUSTER_ID when making the change.

src/lib/server/lifecycle/security-validation.ts (2)
14-14: Embedding `\n` and emoji in structured log messages.

`logger.info('\n🔐 Validating encryption...')` (and the similar calls below) embed a literal newline into the log message. With structured loggers (pino, etc.) this breaks JSON log parsing and line-based ingestion in log pipelines. Prefer a plain message and let the logger handle formatting; reserve decorative separators for dev-only console output.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/lifecycle/security-validation.ts` at line 14: replace decorative log strings that include literal newlines and emojis with plain structured messages: update the logger.info calls in security-validation.ts (e.g., the logger.info('\n🔐 Validating encryption...') call and the similar logger.* calls in this module) to remove the '\n' and emoji so the message is a simple, plain string like "Validating encryption..." and reserve any decorative newline/emoji output for dev-only console output or a non-structured debug path.
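A quick illustration of the failure mode (the `level=... msg=...` record format below is a hypothetical line-oriented example, not this project's actual logger output): an ingester that splits the stream on `\n` turns one logical record with an embedded newline into two fragments, and the second has no parseable prefix.

```typescript
// Hypothetical line-oriented ingestion: one record per '\n'-separated line.
function ingest(stream: string): string[] {
  return stream.split('\n');
}

// A log message with an embedded leading newline...
const message = '\n🔐 Validating encryption...';
const record = `level=info msg=${message}`;

// ...arrives downstream as two "records"; the second lacks the level= prefix
// a parser expects, and the first has an empty message.
const records = ingest(record);
```

The same effect breaks JSON-lines pipelines when a pretty/dev transport prints the decorated message verbatim.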
39-42: Rethrow preserves the error, but the outer caller gets little context about which check failed.

Both sub-checks (auth vs. cluster encryption, dev-key vs. test-failure) funnel into the same catch with the same log line. Consider narrowing the log by check, or structuring the thrown error so the operator can see immediately whether it was AUTH or CLUSTER that failed without grepping earlier log lines (which may not exist if the test itself threw before logging).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/lifecycle/security-validation.ts` around lines 39 - 42: the catch block that currently does logger.error(error, ' ✗ Encryption validation failed') then rethrows conflates multiple sub-checks; update the code so each specific check (AUTH vs CLUSTER and dev-key vs test-failure) either logs a distinct, descriptive message or augments the thrown error with a contextual property/string before rethrowing. Locate the catch in security-validation (the block using logger.error(... 'Encryption validation failed')) and: 1) add context identifying which check failed (e.g., "AUTH encryption check failed", "CLUSTER encryption check failed", or "DEV-KEY mismatch") when logging, or 2) wrap the original error in a new Error/CustomError that includes a short code/string describing the failing sub-check, then logger.error that enriched error and throw the enriched error so callers can immediately see which check failed without grepping earlier logs.

src/lib/templates/provider.ts (1)
160-173: Inconsistent timeout default vs. placeholder.

The `timeout` field has `default: '10m'` but `placeholder: '30s'`. Minor UX inconsistency — consider aligning the placeholder with the default (or vice versa) so users aren't misled about the expected value.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/provider.ts` around lines 160 - 173: the timeout field object (name: 'timeout', path: 'spec.timeout') has a UX inconsistency: default is '10m' but placeholder is '30s'; align them by updating either the default or the placeholder so both match (e.g., set placeholder to '10m' to match the default, or change the default to '30s' if that is intended) and adjust any related description text if needed to reflect the chosen value.

src/lib/templates/image-policy.ts (1)
38-56: Missing validation on `metadata.name`/`metadata.namespace`.

Other templates in this PR (bucket, helm-repository, provider) apply the RFC 1123-style regex validation to `metadata.name` and `metadata.namespace`. The same constraints apply to `ImagePolicy`; consider adding the same `validation` block here for consistency and to prevent invalid manifests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/image-policy.ts` around lines 38 - 56: add RFC 1123-style validation blocks to the metadata fields in the ImagePolicy template: for the entry with name: 'name' (path: 'metadata.name') and the entry with name: 'namespace' (path: 'metadata.namespace'), add a validation property using the same regex used by the other templates (bucket, helm-repository, provider) to enforce RFC 1123 DNS-1123 subdomain/name rules and keep behavior consistent across templates.

src/lib/templates/helm-chart.ts (1)
165-173: Inconsistent modeling of `valuesFiles` vs. `helm-release.ts`.

`spec.valuesFiles` is a string array in the Flux API, but here it is a free-form `textarea` that expects users to hand-write a YAML array (`- values.yaml`). In `src/lib/templates/helm-release.ts` the same field is modeled as `type: 'array'` with `arrayItemType: 'string'`, which is both less error-prone and consistent with how arrays are handled across other templates. Consider aligning so users don't have to author YAML list syntax by hand (and so the form→YAML serializer doesn't have to parse a textarea into an array).

♻️ Proposed alignment

```diff
- {
-   name: 'valuesFiles',
-   label: 'Values Files',
-   path: 'spec.valuesFiles',
-   type: 'textarea',
-   section: 'advanced',
-   placeholder: '- values.yaml\n- values-prod.yaml',
-   description: 'List of values files to merge (YAML array format)'
- },
+ {
+   name: 'valuesFiles',
+   label: 'Values Files',
+   path: 'spec.valuesFiles',
+   type: 'array',
+   section: 'advanced',
+   arrayItemType: 'string',
+   placeholder: 'values.yaml',
+   description: 'List of values files to merge'
+ },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/helm-chart.ts` around lines 165 - 173: the valuesFiles field currently modeled as a free-form textarea should be changed to an explicit array field to match helm-release.ts: replace the textarea field for 'valuesFiles' (path 'spec.valuesFiles') with type 'array' and arrayItemType 'string' (keep label 'Values Files' and section 'advanced'), update the placeholder/example to show individual items instead of YAML list syntax, and ensure any form→YAML serializer expects an array for spec.valuesFiles.

src/lib/server/auth.ts (1)
1-7: No symbol collisions detected across the seven wildcard re-exports.

All 36 exported symbols across `constants`, `passwords`, `credentials`, `users`, `sessions`, `bootstrap-admin`, and `in-cluster-admin` are unique — no name conflicts exist. The current barrel export is safe from ambiguity errors.

That said, converting to explicit named re-exports would improve API stability: if future changes inadvertently introduce duplicate symbols across modules, wildcard re-exports will either surface ambiguous-re-export errors or silently mask one symbol. Explicit re-exports lock the public API and catch such conflicts earlier.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/auth.ts` around lines 1 - 7: the barrel export currently uses seven wildcard re-exports (from './auth/constants.js', './auth/passwords.js', './auth/credentials.js', './auth/users.js', './auth/sessions.js', './auth/bootstrap-admin.js', './auth/in-cluster-admin.js'); replace each "export * from '...'" with explicit named re-exports that enumerate the current public symbols from that module (list each exported identifier from constants, passwords, credentials, users, sessions, bootstrap-admin, in-cluster-admin) so the public API is locked and future collisions are detected; update the top-level file to only export those names and run the type/build/tests to verify no missing or duplicate exports.

src/lib/server/clusters/selection.ts (1)
7-13: Consider `httpOnly: true` now that the server owns cluster selection.

Per the PR description, client-side `gyre_cluster` cookie writes were removed and selection is now managed server-side via /api/v1/user/cluster. At this point there is no reason for the cookie to be readable by page scripts, and keeping `httpOnly: false` unnecessarily widens the XSS blast radius for the cluster-scoping signal that gates subsequent k8s requests.

🔒 Suggested change

```diff
  const COOKIE_OPTIONS = {
    path: '/',
-   httpOnly: false,
+   httpOnly: true,
    sameSite: 'lax' as const,
    secure: process.env.NODE_ENV === 'production',
    maxAge: 60 * 60 * 24 * 30
  };
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/clusters/selection.ts` around lines 7 - 13: update the COOKIE_OPTIONS constant in selection.ts to set httpOnly: true since cluster selection is now server-managed; locate the COOKIE_OPTIONS object and change the httpOnly property from false to true (ensure no code relies on reading the gyre_cluster cookie from client-side JS after this change, and update any remaining client-side access paths to use the /api/v1/user/cluster endpoints instead).

src/lib/server/lifecycle/startup.ts (1)
18-29: Duplicate `getCurrentNamespace` helper. This helper and the `IN_CLUSTER_NAMESPACE_PATH` constant are byte-for-byte duplicates of the ones in `src/lib/server/auth/in-cluster-admin.ts`. Since both modules already live under `src/lib/server`, consider extracting to a shared location (e.g. `src/lib/server/kubernetes/namespace.ts`) and importing from both, so future changes to the namespace-resolution behaviour don't need to be applied in two places.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/lifecycle/startup.ts` around lines 18 - 29, the duplicate IN_CLUSTER_NAMESPACE_PATH constant and getCurrentNamespace function should be extracted into a shared module (e.g., kubernetes/namespace.ts); create a new file that exports IN_CLUSTER_NAMESPACE_PATH and getCurrentNamespace (keeping the same behavior: readFileSync(...).trim() with fallback to 'default') and then replace the local definitions in both startup.ts and in-cluster-admin.ts to import those symbols instead of redefining them so future changes apply in one place.

src/lib/server/kubernetes/flux/crud.ts (1)
187-231: Optional: replace the hand-rolled semaphore with `p-limit`/`p-map`. The semaphore is correct, but 30+ lines of bookkeeping (queue of resolvers, manual `active++`/`--`, shift on completion, defensive `results.filter`) is doing what `p-limit` or `p-map` expresses in one call, and those libraries already handle edge cases like synchronous throws inside the worker and early-abort composition. Not worth changing if you want to avoid a new runtime dep, but worth considering if either is already in the tree.

```ts
// with p-map, for example:
import pMap from 'p-map';

return pMap(
  items,
  async (item) => {
    try {
      await deleteFluxResource(item.resourceType, item.namespace, item.name, context);
      return { item, success: true } as DeleteResult;
    } catch (error) {
      return {
        item,
        success: false,
        error: error instanceof Error ? error : new Error(String(error))
      };
    }
  },
  { concurrency, stopOnError: false }
);
```

See the separate comment on `src/lib/server/auth/in-cluster-admin.ts` about the `.code` vs `.response?.statusCode` mismatch that also applies to line 161 of this file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/kubernetes/flux/crud.ts` around lines 187 - 231, the current deleteFluxResourcesBatch implements a manual semaphore and result bookkeeping; replace it with a p-map (or p-limit) based implementation to simplify concurrency and edge-case handling: in deleteFluxResourcesBatch use p-map/p-limit to iterate items and call deleteFluxResource(item.resourceType, item.namespace, item.name, context) inside the mapper, returning DeleteResult objects on success/failure (preserve ordering by mapping the items array directly), pass { concurrency, stopOnError: false } to p-map, remove the semaphore object, executeWithConcurrency helper, manual results array and results.filter, and add the appropriate p-map import at top of the file. Ensure the error wrapping logic (error instanceof Error ? error : new Error(String(error))) is preserved in the mapper.

src/lib/server/clusters/repository.ts (1)
168-176: `deleteCluster` isn't transactional; a partial failure can leave orphans. The two `db.delete(...)` calls run independently. If the `clusterContexts` delete succeeds but the `clusters` delete fails (transient error, FK trigger, etc.), you're left with a cluster row whose contexts have already been wiped, and a retry will silently skip the (already empty) context delete without any consistency guarantee. Wrapping both in a transaction makes the cleanup atomic and also lets the DB-level cascade (if configured) be the only implementation of that relationship:

♻️ Suggested change
```diff
-export async function deleteCluster(id: string): Promise<void> {
-  const db = await getDb();
-
-  // Delete contexts first (cascade should handle this but being explicit)
-  await db.delete(clusterContexts).where(eq(clusterContexts.clusterId, id));
-
-  // Delete cluster
-  await db.delete(clusters).where(eq(clusters.id, id));
-}
+export async function deleteCluster(id: string): Promise<void> {
+  const db = await getDb();
+  db.transaction((tx) => {
+    // Delete contexts first (cascade should handle this but being explicit)
+    tx.delete(clusterContexts).where(eq(clusterContexts.clusterId, id)).run();
+    tx.delete(clusters).where(eq(clusters.id, id)).run();
+  });
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/clusters/repository.ts` around lines 168 - 176, The two independent deletes in deleteCluster can leave the DB in an inconsistent state on partial failure; wrap both operations in a single transaction by obtaining the DB via getDb() and using its transaction API (e.g., db.transaction(async (tx) => { await tx.delete(clusterContexts).where(eq(clusterContexts.clusterId, id)); await tx.delete(clusters).where(eq(clusters.id, id)); })) so both deletes succeed or both roll back; ensure you call the transactional delete methods on the transaction object (tx) and rethrow any error so callers see failures.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 63096d9a-763b-403b-9668-36a37d9deaca
📒 Files selected for processing (74)
src/lib/components/layout/ClusterSwitcher.svelte, src/lib/server/auth.ts, src/lib/server/auth/bootstrap-admin.ts, src/lib/server/auth/constants.ts, src/lib/server/auth/credentials.ts, src/lib/server/auth/in-cluster-admin.ts, src/lib/server/auth/passwords.ts, src/lib/server/auth/sessions.ts, src/lib/server/auth/users.ts, src/lib/server/clusters.ts, src/lib/server/clusters/encryption.ts, src/lib/server/clusters/health.ts, src/lib/server/clusters/kubeconfig.ts, src/lib/server/clusters/migration.ts, src/lib/server/clusters/repository.ts, src/lib/server/clusters/selection.ts, src/lib/server/events.ts, src/lib/server/events/bus.ts, src/lib/server/events/poller.ts, src/lib/server/events/shutdown.ts, src/lib/server/events/state.ts, src/lib/server/events/types.ts, src/lib/server/flux/use-cases/batch-operation.ts, src/lib/server/flux/use-cases/create-resource.ts, src/lib/server/flux/use-cases/delete-resource.ts, src/lib/server/flux/use-cases/diff-resource.ts, src/lib/server/flux/use-cases/history-resource.ts, src/lib/server/flux/use-cases/list-resource.ts, src/lib/server/flux/use-cases/resolve-resource-type.ts, src/lib/server/flux/use-cases/resource-events.ts, src/lib/server/flux/use-cases/rollback-resource.ts, src/lib/server/flux/use-cases/update-resource.ts, src/lib/server/initialize.ts, src/lib/server/kubernetes/client-factory.ts, src/lib/server/kubernetes/client-pool.ts, src/lib/server/kubernetes/client.ts, src/lib/server/kubernetes/error-handler.ts, src/lib/server/kubernetes/flux/crud.ts, src/lib/server/kubernetes/flux/listing.ts, src/lib/server/kubernetes/flux/logs.ts, src/lib/server/kubernetes/kubeconfig-provider.ts, src/lib/server/kubernetes/secret-access.ts, src/lib/server/kubernetes/timeouts.ts, src/lib/server/lifecycle/security-validation.ts, src/lib/server/lifecycle/shutdown.ts, src/lib/server/lifecycle/startup.ts, src/lib/server/request/access.ts, src/lib/stores/cluster.svelte.ts, src/lib/stores/events.svelte.ts, src/lib/stores/events/connection.ts, src/lib/stores/events/notifications.ts, src/lib/stores/events/storage.ts, src/lib/stores/events/types.ts, src/lib/templates/alert.ts, src/lib/templates/bucket.ts, src/lib/templates/git-repository.ts, src/lib/templates/helm-chart.ts, src/lib/templates/helm-release.ts, src/lib/templates/helm-repository.ts, src/lib/templates/image-policy.ts, src/lib/templates/image-repository.ts, src/lib/templates/image-update-automation.ts, src/lib/templates/index.ts, src/lib/templates/kustomization.ts, src/lib/templates/oci-repository.ts, src/lib/templates/provider.ts, src/lib/templates/receiver.ts, src/lib/templates/types.ts, src/routes/+layout.svelte, src/routes/api/v1/user/cluster/+server.ts, src/tests/auth-cookie-regression.test.ts, src/tests/cluster-selection-route.test.ts, src/tests/events.test.ts, src/tests/request-pipeline.test.ts
```ts
try {
  const currentClusterId = await validateSelectableClusterId(
    (rawBody as { clusterId: string }).clusterId
  );
  setClusterSelectionCookie(cookies, currentClusterId);
  return json(await getClusterSelectionPayload(currentClusterId));
} catch {
  throw error(404, { message: 'Cluster is not selectable', code: 'NotFound' });
}
```
Overly broad catch masks real errors as 404.
Any exception raised inside the try block — including bugs in setClusterSelectionCookie, getClusterSelectionPayload, cookie serialization failures, or downstream I/O — is swallowed and rethrown as 404 "Cluster is not selectable". This both misleads clients (404 for what may actually be a 500) and hides real failures from observability. It also drops the original error entirely (no logging).
Narrow the catch to the known "not selectable" signal from validateSelectableClusterId, log unexpected errors, and let the rest propagate as 500.
🛠️ Suggested refactor

```diff
-  try {
-    const currentClusterId = await validateSelectableClusterId(
-      (rawBody as { clusterId: string }).clusterId
-    );
-    setClusterSelectionCookie(cookies, currentClusterId);
-    return json(await getClusterSelectionPayload(currentClusterId));
-  } catch {
-    throw error(404, { message: 'Cluster is not selectable', code: 'NotFound' });
-  }
+  let currentClusterId: string;
+  try {
+    currentClusterId = await validateSelectableClusterId(
+      (rawBody as { clusterId: string }).clusterId
+    );
+  } catch {
+    throw error(404, { message: 'Cluster is not selectable', code: 'NotFound' });
+  }
+
+  setClusterSelectionCookie(cookies, currentClusterId);
+  return json(await getClusterSelectionPayload(currentClusterId));
```
Verify each finding against the current code and only fix it if needed.
In `@src/routes/api/v1/user/cluster/+server.ts` around lines 39 - 47, narrow the
catch so only the "cluster not selectable" outcome from
validateSelectableClusterId is converted to a 404: call
validateSelectableClusterId first inside a try/catch that only handles the
specific error/marker it throws (or check its return value) and rethrow
error(404, ...) there; after a successful validate, call
setClusterSelectionCookie and getClusterSelectionPayload without swallowing
exceptions so any cookie serialization or payload errors propagate as 500 and
are not converted to 404. Also log unexpected errors (console.error or your app
logger) when catching non-selectable vs other unexpected exceptions so the
original error is preserved for observability; reference
validateSelectableClusterId, setClusterSelectionCookie,
getClusterSelectionPayload, and the existing error(404, ...) usage.
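The narrowed flow described in the prompt can be sketched as a small standalone function. `NotSelectableError` and the callback signatures below are hypothetical stand-ins for the route's real dependencies, and the `{ status: 404 }` object stands in for SvelteKit's `error(404, ...)`:

```typescript
// Illustrative sketch: only the known "not selectable" signal becomes a 404;
// everything after validation propagates as an unexpected (500-level) failure.
class NotSelectableError extends Error {}

async function selectCluster(
  validate: (id: string) => Promise<string>,
  applySelection: (id: string) => string,
  clusterId: string
): Promise<string> {
  let currentClusterId: string;
  try {
    currentClusterId = await validate(clusterId);
  } catch (err) {
    if (err instanceof NotSelectableError) {
      // the one failure mode clients should see as 404
      throw Object.assign(new Error('Cluster is not selectable'), { status: 404 });
    }
    throw err; // unexpected validation failures are no longer masked
  }
  // cookie and payload errors are outside the try, so they propagate
  return applySelection(currentClusterId);
}
```

The key design point is that the `try` wraps only the call that can legitimately produce the 404, so a bug in cookie serialization or payload assembly surfaces as a genuine server error.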
Docker Build Completed. Status: Success. Security Scan: Trivy vulnerability scan completed. Check the workflow logs for details.
- Move RBAC and user updates into transactions
- Harden event bus shutdown, polling, and reconnect behavior
- Tighten template validation and Helm release resource handling
Actionable comments posted: 13
♻️ Duplicate comments (1)
src/lib/server/auth/passwords.ts (1)
2-2: ⚠️ Potential issue | 🟠 Major: Replace modulo-based CSPRNG selection with `randomInt()`. Both character selection and shuffling reduce random bytes with `%`, which CodeQL already flagged as biased. Use `node:crypto`'s bounded `randomInt()` instead.

🔐 Proposed fix
```diff
-import { randomBytes } from 'node:crypto';
+import { randomBytes, randomInt } from 'node:crypto';
@@
- const pick = (chars: string) => chars[randomBytes(1)[0] % chars.length];
+ const pick = (chars: string) => chars[randomInt(chars.length)];
@@
- const random = randomBytes(passwordChars.length);
  for (let i = passwordChars.length - 1; i > 0; i--) {
-   const j = random[i] % (i + 1);
+   const j = randomInt(i + 1);
    [passwordChars[i], passwordChars[j]] = [passwordChars[j], passwordChars[i]];
  }
```

Verify no modulo-based CSPRNG selection remains:
```bash
#!/bin/bash
rg -nP 'randomBytes\([^)]*\)(?:\[[^\]]+\])?\s*%' --type=ts
rg -nP 'randomInt\s*\(' src/lib/server/auth/passwords.ts -C2
```

Also applies to: 36-47
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/auth/passwords.ts` at line 2, Replace all modulo-based CSPRNG selection that uses randomBytes(...)[...] % ... with node:crypto's bounded randomInt to remove bias; locate the functions in this file that pick characters from a charset (e.g., the password/character selection logic) and the in-place shuffle/Fisher–Yates implementation, and change each randomBytes-based index generation to await/ call randomInt(0, bound) (or synchronous randomInt where appropriate) for both selecting characters and swapping indices during the shuffle; ensure imports include randomInt from 'node:crypto' and remove any remaining uses of '%' with randomBytes to satisfy the CodeQL check.
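The bias-free pattern the fix relies on can be shown in isolation. `securePick` and `secureShuffle` are illustrative names (not the module's real exports); only `randomInt` from `node:crypto` is the real API:

```typescript
import { randomInt } from 'node:crypto';

// Uniform draw in [0, chars.length), with no modulo bias.
export function securePick(chars: string): string {
  return chars[randomInt(chars.length)];
}

// Fisher–Yates shuffle using a bounded CSPRNG draw instead of randomBytes % n.
export function secureShuffle<T>(items: readonly T[]): T[] {
  const out = [...items];
  for (let i = out.length - 1; i > 0; i--) {
    const j = randomInt(i + 1); // uniform in [0, i]
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}
```

`randomBytes(1)[0] % chars.length` is biased whenever 256 is not a multiple of the charset size; `randomInt` performs rejection sampling internally, so every index is equally likely.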
🧹 Nitpick comments (3)
src/lib/templates/kustomization.ts (3)
12-23: Minor: yaml example and field defaults drift.
- L19 uses `path: ./deploy` in the starter YAML, while the `path` field (L148) has `default: './'` (placeholder is `./deploy`). Users who generate from the form get `./` but those who start from the YAML snippet get `./deploy`.
- `timeout` (L302-314) has `default: '10m'` but `placeholder: '5m'`; harmless, just a confusing hint.

Aligning these avoids subtle surprises when switching between YAML and form modes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/kustomization.ts` around lines 12 - 23, The YAML example in the kustomization template uses path: ./deploy while the form field definition for path (field name/path placeholder/default in this file) has default './'—and the timeout field has default '10m' but placeholder '5m'; update the template so these three places are consistent by choosing one canonical value (e.g., set the YAML snippet's path to './' to match the path field default, and make the timeout placeholder match the timeout field default '10m'), or alternatively update the path field default and timeout default to match the YAML/placeholder you prefer; modify the YAML literal (yaml: `...`) and the corresponding field definitions for path and timeout in this file to keep defaults/placeholders aligned.
164-168: Extract the duration validation to a shared constant (DRY). The same duration regex `^([0-9]+(\\.[0-9]+)?(s|m|h))+$` is repeated three times (interval L164-168, timeout L310-313, retryInterval L352-355) with subtly different user-facing messages. This mirrors the `CEL_VALIDATION` pattern already used in this file and applies equally to the other templates (helm-release, git-repository, etc.). Consider defining a single `DURATION_VALIDATION` in `./types.ts` and reusing it, so the regex and message stay consistent everywhere Flux durations are accepted.

♻️ Proposed refactor
In `src/lib/templates/types.ts`:

```diff
 export const CEL_VALIDATION = {
   pattern: '^[\\s\\S]{1,500}$',
   message: 'CEL expression must be 500 characters or fewer.'
 };
+
+export const DURATION_VALIDATION = {
+  pattern: '^([0-9]+(\\.[0-9]+)?(s|m|h))+$',
+  message:
+    'Duration must use time units like: 1m (minutes), 30s (seconds), 1h (hours), or combined like 1h30m'
+};
```

In this file:

```diff
-import { CEL_VALIDATION, type ResourceTemplate } from './types.js';
+import { CEL_VALIDATION, DURATION_VALIDATION, type ResourceTemplate } from './types.js';
@@
-  validation: {
-    pattern: '^([0-9]+(\\.[0-9]+)?(s|m|h))+$',
-    message:
-      'Duration must use time units like: 1m (minutes), 30s (seconds), 1h (hours), or combined like 1h30m'
-  }
+  validation: DURATION_VALIDATION
@@
-  validation: {
-    pattern: '^([0-9]+(\\.[0-9]+)?(s|m|h))+$',
-    message: 'Duration must be in Flux format (e.g., 60s, 1m30s, 5m)'
-  }
+  validation: DURATION_VALIDATION
@@
-  validation: {
-    pattern: '^([0-9]+(\\.[0-9]+)?(s|m|h))+$',
-    message: 'Duration must be in Flux format (e.g., 60s, 1m30s, 5m)'
-  }
+  validation: DURATION_VALIDATION
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/kustomization.ts` around lines 164 - 168, Extract the repeated duration regex and message into a shared constant named DURATION_VALIDATION in src/lib/templates/types.ts and replace the inline validation objects used for interval, timeout, and retryInterval in kustomization.ts (and the analogous validations in helm-release, git-repository templates) with that constant; ensure DURATION_VALIDATION contains the regex '^([0-9]+(\\.[0-9]+)?(s|m|h))+$' and a clear user-facing message, and update any imports/usages to import { DURATION_VALIDATION } alongside the existing CEL_VALIDATION so all Flux duration validations are DRY and consistent.
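As a standalone sketch of the proposed constant (`DURATION_VALIDATION` is the reviewer's suggested name and does not exist in the codebase yet), the regex can be exercised directly:

```typescript
// Sketch mirroring the suggested shared shape; the pattern is taken
// verbatim from the review, the helper function is illustrative.
export const DURATION_VALIDATION = {
  pattern: '^([0-9]+(\\.[0-9]+)?(s|m|h))+$',
  message:
    'Duration must use time units like: 1m (minutes), 30s (seconds), 1h (hours), or combined like 1h30m'
};

export function isValidFluxDuration(value: string): boolean {
  return new RegExp(DURATION_VALIDATION.pattern).test(value);
}
```

Each repeated group is a number (optionally decimal) plus one of `s`/`m`/`h`, so combined forms like `1h30m` match while a bare number or an unknown unit does not.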
457-521: No UI grouping issue: the renderer groups by `section` ID, not array order. The form renderer (ResourceWizard.svelte) explicitly groups fields by their `section` property using `.filter()` before rendering, so the position of `images` and `patches` in the `fields` array does not affect their visual placement. However, moving them to sit with other `section: 'advanced'` fields improves code organization and maintainability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/kustomization.ts` around lines 457 - 521, The images and patches field definitions (objects with name: 'images' and name: 'patches', paths 'spec.images' and 'spec.patches') are not visually grouped because grouping is done by the section property; move these two objects so they appear alongside the other fields that have section: 'advanced' (i.e., reorder the fields array so the images and patches entries are within the same contiguous advanced-section block) to improve code organization and maintainability while keeping their section: 'advanced' intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/components/wizards/ResourceWizard.svelte`:
- Around line 466-500: validateHelmReleaseResourceValues currently swallows YAML
parse errors when parsing formValues.values; modify the catch block in
validateHelmReleaseResourceValues to call logger.warn (including the exception
and a short context message) before returning null so malformed YAML is
observable in logs while preserving the existing return behavior, and keep the
existing 'resources' in parsedValues check as-is.
In `@src/lib/server/auth/in-cluster-admin.ts`:
- Around line 53-70: markSecretConsumed currently catches and swallows patch
errors which allows callers (e.g., the logic that sets inClusterFirstLoginDone)
to proceed even if the secret label wasn't written; remove the local try/catch
(or rethrow the caught error) in markSecretConsumed so that failures propagate
to the caller's existing catch, and make the same change for the duplicate
implementation/block referenced around the other occurrence so the caller can
decide how to handle the failure.
- Around line 142-145: The catch block in the in-cluster admin provisioning code
currently logs the error with logger.error(..., 'Failed to setup in-cluster
admin:') and returns null, which hides startup failures; change the catch to
re-throw the caught error after logging (i.e., remove the return null and throw
the caught error) so that the provisioning function in in-cluster-admin.ts
surfaces Kubernetes Secret read/create failures to the startup path.
In `@src/lib/server/clusters/health.ts`:
- Around line 230-239: The catch block for checkApiReachability currently always
calls fail(authResult.error) even when authResult.error is undefined; change it
to only call fail when there is an actual error: call checkAuthAndVersion(kc),
push authResult.checks into checks as you already do, then if authResult.error
is truthy return await fail(authResult.error); otherwise continue normal
execution (do not call fail and just return/continue so the health check remains
successful).
In `@src/lib/server/clusters/repository.ts`:
- Around line 28-40: The code currently accepts kubeconfigs that parse to zero
contexts (parseKubeconfig returns empty contexts) and persists an active cluster
with contextCount: 0; before creating NewCluster or entering any transaction,
validate the result of parseKubeconfig(params.kubeconfig) by checking contexts
(or uniqueContexts) length and throw or return an error if it is 0 so the
invalid kubeconfig is rejected; update the flow around parseKubeconfig,
contexts, uniqueContexts and NewCluster creation to perform this guard and avoid
persisting clusters with no contexts.
- Around line 88-106: The uploadedClusters list should exclude any in-cluster DB
rows so only true uploaded clusters are selectable; modify the getAllClusters()
filter used for uploadedClusters to check both cluster.isActive and
cluster.isLocal (e.g., filter clusters where isActive && isLocal) so the
synthetic IN_CLUSTER_ID option remains the sole in-cluster entry; update the
filtering logic around uploadedClusters in repository.ts (references:
uploadedClusters, getAllClusters, IN_CLUSTER_ID).
In `@src/lib/server/events/bus.ts`:
- Around line 2-25: The subscribe function uses the raw clusterId when indexing
activeWorkers, which causes duplicate contexts for equivalent IDs; normalize the
id first (e.g., derive a canonicalClusterId from the incoming clusterId or
IN_CLUSTER_ID) and use that canonicalClusterId everywhere in subscribe: when
reading/writing activeWorkers, when creating the new context object, for
labels/metrics (activeWorkersGauge, sseSubscribersGauge,
fluxResourceStatusGauge), in logger messages, in the initial CONNECTED SSE
event, and when performing the unsubscribe lookup, so all subscribers for the
same canonical cluster share a single context.
In `@src/lib/server/events/poller.ts`:
- Around line 212-237: The cleanup only iterates context.lastStates so transient
resources that lived only in context.resourceFirstSeen are left behind; update
the deletion logic to also scan context.resourceFirstSeen.keys() for keys that
start with `${resourceType}/` and are not in currentMessageKeys (and not present
in context.lastStates), then perform the same cleanup steps: remove
fluxResourceStatusGauge entry, increment resourceUpdatesTotal with 'deleted',
broadcast a DELETED message (same shape as used for lastStates deletions), and
delete the key from context.resourceFirstSeen and
context.lastNotificationStates; reference the existing symbols
context.resourceFirstSeen, context.lastStates, context.lastNotificationStates,
fluxResourceStatusGauge, resourceUpdatesTotal and broadcast to mirror the
current deletion branch.
- Around line 147-205: The code updates context.lastNotificationStates.set(key,
notificationState) even for transient states (when currState.readyStatus is
missing or 'Unknown'), which can overwrite a prior 'False' and prevent a later
recovery notification; modify the logic in the poller (around the
previousNotificationState/notificationState handling in poller.ts) so you only
update context.lastNotificationStates for non-transient states (i.e., when
!isTransientState) — either move the context.lastNotificationStates.set(key,
notificationState) inside the existing if (shouldNotify && !isTransientState)
block or wrap it with if (!isTransientState) before setting; keep
captureReconciliation(...) and broadcast(...) behavior unchanged.
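The guarded update described above can be modeled in a few lines. The types and function name here are illustrative, not poller.ts's actual code:

```typescript
// Minimal model: transient states must not overwrite the stored
// notification baseline, or a later recovery notification is suppressed.
type ReadyStatus = 'True' | 'False' | 'Unknown';

interface WorkerContext {
  lastNotificationStates: Map<string, ReadyStatus>;
}

function recordNotificationState(
  ctx: WorkerContext,
  key: string,
  state: ReadyStatus
): void {
  const isTransientState = state === 'Unknown';
  if (!isTransientState) {
    // only settled states update the baseline, so a stored 'False'
    // survives an 'Unknown' blip and the eventual 'True' still notifies
    ctx.lastNotificationStates.set(key, state);
  }
}
```

Without the guard, the sequence `False` -> `Unknown` -> `True` would store `Unknown` in between, and the recovery comparison against the baseline would no longer see the prior failure.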
In `@src/lib/server/kubernetes/namespace.ts`:
- Around line 5-9: getCurrentNamespace currently falls back to 'default' only on
read errors but returns an empty string when the namespace file exists but is
empty; update getCurrentNamespace to read and trim the file (using
IN_CLUSTER_NAMESPACE_PATH) and if the trimmed content is falsy/empty return
'default' instead of '', otherwise return the trimmed namespace. Ensure the
change is implemented within the getCurrentNamespace function so downstream
Kubernetes calls never receive an empty namespace.
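The suggested behavior can be sketched as follows. The service-account mount path is the conventional in-cluster location and is assumed here for illustration:

```typescript
import { readFileSync } from 'node:fs';

// Conventional in-cluster mount; an assumption for this sketch.
const IN_CLUSTER_NAMESPACE_PATH =
  '/var/run/secrets/kubernetes.io/serviceaccount/namespace';

export function getCurrentNamespace(
  path: string = IN_CLUSTER_NAMESPACE_PATH
): string {
  try {
    const ns = readFileSync(path, 'utf-8').trim();
    return ns || 'default'; // an empty or whitespace-only file no longer yields ''
  } catch {
    return 'default'; // unreadable file keeps the existing fallback
  }
}
```

The only change from the behavior flagged above is the `ns || 'default'` guard, so downstream Kubernetes calls never receive an empty namespace string.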
In `@src/lib/server/lifecycle/startup.ts`:
- Around line 72-99: The bootstrap flow currently calls
createDefaultAdminIfNeeded() and then writes the setupToken to disk with
writeFileSync(tokenFile,...), which can lose the token if the file write fails;
update startup.ts so token persistence is made durable before committing the new
admin or rollback on persistence failure: either (A) change
createDefaultAdminIfNeeded() (or its call site) to write the token to the temp
file (tokenFile) with atomic/locked write permissions before saving the admin to
the database, or (B) if you must keep the current ordering, catch write errors
and call the inverse operation in the admin creation flow to
delete/mark-consumed the newly created admin (use the same creation rollback
function or user deletion routine) so the next start can regenerate the token;
reference createDefaultAdminIfNeeded, writeFileSync, tokenFile and ensure errors
are logged and thrown after rollback if rollback fails.
In `@src/lib/templates/git-repository.ts`:
- Around line 250-264: The verifyMode select currently stores an empty string
for "Disabled" which causes spec.verify.mode: "" to be written; update the
form/template so verifyMode does not serialize an empty string: mark the field
virtual (add virtual: true) and emit spec.verify.mode only when verifyMode is
non-empty (or use a dependent sub-field with showIf to store a real value),
and/or modify updateYamlFromForm serializer logic to short-circuit and skip
setting spec.verify.mode when the selected value is '' (ensure you reference the
verifyMode field, spec.verify.mode key path, and updateYamlFromForm serializer
when making the change).
In `@src/lib/templates/image-policy.ts`:
- Around line 78-99: Mark the UI-only selector "policyType" as virtual by adding
virtual: true to its field definition (the selector that currently has name:
'policyType' and path: 'spec.policy.type'), add two new emittable fields
"numericalOrder" and "alphabeticalOrder" (under spec.policy) that expose an
order option with values 'asc' and 'desc' so that selecting numerical or
alphabetical emits a non-empty spec.policy object, and update the resource
template version from v1beta2 to v1 (ensure any apiVersion or template version
constants in this file are bumped to v1); keep semverRange as-is but ensure
showIf continues to check policyType === 'semver'.
---
Duplicate comments:
In `@src/lib/server/auth/passwords.ts`:
- Line 2: Replace all modulo-based CSPRNG selection that uses
randomBytes(...)[...] % ... with node:crypto's bounded randomInt to remove bias;
locate the functions in this file that pick characters from a charset (e.g., the
password/character selection logic) and the in-place shuffle/Fisher–Yates
implementation, and change each randomBytes-based index generation to await/
call randomInt(0, bound) (or synchronous randomInt where appropriate) for both
selecting characters and swapping indices during the shuffle; ensure imports
include randomInt from 'node:crypto' and remove any remaining uses of '%' with
randomBytes to satisfy the CodeQL check.
---
Nitpick comments:
In `@src/lib/templates/kustomization.ts`:
- Around line 12-23: The YAML example in the kustomization template uses path:
./deploy while the form field definition for path (field name/path
placeholder/default in this file) has default './'—and the timeout field has
default '10m' but placeholder '5m'; update the template so these three places
are consistent by choosing one canonical value (e.g., set the YAML snippet's
path to './' to match the path field default, and make the timeout placeholder
match the timeout field default '10m'), or alternatively update the path field
default and timeout default to match the YAML/placeholder you prefer; modify the
YAML literal (yaml: `...`) and the corresponding field definitions for path and
timeout in this file to keep defaults/placeholders aligned.
- Around line 164-168: Extract the repeated duration regex and message into a
shared constant named DURATION_VALIDATION in src/lib/templates/types.ts and
replace the inline validation objects used for interval, timeout, and
retryInterval in kustomization.ts (and the analogous validations in
helm-release, git-repository templates) with that constant; ensure
DURATION_VALIDATION contains the regex '^([0-9]+(\\.[0-9]+)?(s|m|h))+$' and a
clear user-facing message, and update any imports/usages to import {
DURATION_VALIDATION } alongside the existing CEL_VALIDATION so all Flux duration
validations are DRY and consistent.
- Around line 457-521: The images and patches field definitions (objects with
name: 'images' and name: 'patches', paths 'spec.images' and 'spec.patches') are
not visually grouped because grouping is done by the section property; move
these two objects so they appear alongside the other fields that have section:
'advanced' (i.e., reorder the fields array so the images and patches entries are
within the same contiguous advanced-section block) to improve code organization
and maintainability while keeping their section: 'advanced' intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c7dc1738-77c1-49ce-8b03-63fae8eb4171
📒 Files selected for processing (37)
src/lib/components/layout/ClusterSwitcher.svelte, src/lib/components/wizards/ResourceWizard.svelte, src/lib/server/auth/credentials.ts, src/lib/server/auth/in-cluster-admin.ts, src/lib/server/auth/passwords.ts, src/lib/server/auth/users.ts, src/lib/server/clusters/health.ts, src/lib/server/clusters/kubeconfig.ts, src/lib/server/clusters/migration.ts, src/lib/server/clusters/repository.ts, src/lib/server/clusters/selection.ts, src/lib/server/events/bus.ts, src/lib/server/events/poller.ts, src/lib/server/events/state.ts, src/lib/server/kubernetes/client-factory.ts, src/lib/server/kubernetes/flux/logs.ts, src/lib/server/kubernetes/namespace.ts, src/lib/server/kubernetes/timeouts.ts, src/lib/server/lifecycle/security-validation.ts, src/lib/server/lifecycle/startup.ts, src/lib/server/rbac-defaults.ts, src/lib/stores/events.svelte.ts, src/lib/stores/events/connection.ts, src/lib/templates/bucket.ts, src/lib/templates/git-repository.ts, src/lib/templates/helm-chart.ts, src/lib/templates/helm-release.ts, src/lib/templates/helm-repository.ts, src/lib/templates/image-policy.ts, src/lib/templates/image-repository.ts, src/lib/templates/image-update-automation.ts, src/lib/templates/kustomization.ts, src/lib/templates/oci-repository.ts, src/lib/templates/provider.ts, src/lib/templates/receiver.ts, src/lib/templates/types.ts, src/routes/api/v1/user/cluster/+server.ts
✅ Files skipped from review due to trivial changes (3)
- src/lib/templates/types.ts
- src/lib/templates/image-update-automation.ts
- src/lib/templates/helm-release.ts
🚧 Files skipped from review as they are similar to previous changes (13)
- src/lib/server/clusters/kubeconfig.ts
- src/lib/components/layout/ClusterSwitcher.svelte
- src/lib/stores/events/connection.ts
- src/lib/templates/image-repository.ts
- src/lib/templates/provider.ts
- src/lib/templates/bucket.ts
- src/lib/templates/helm-chart.ts
- src/lib/templates/oci-repository.ts
- src/lib/server/kubernetes/client-factory.ts
- src/lib/server/kubernetes/flux/logs.ts
- src/lib/server/clusters/migration.ts
- src/routes/api/v1/user/cluster/+server.ts
- src/lib/server/clusters/selection.ts
- Centralize template duration validation and add missing virtual defaults
- Harden admin setup, cluster handling, and event bus state normalization
- Update resource templates for newer Flux fields and defaults
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/lib/components/wizards/ResourceWizard.svelte (1)
149-157: ⚠️ Potential issue | 🟠 Major: Parse YAML-fragment textareas before writing them into the manifest.

Textarea fields that represent structured YAML data are written directly as strings without parsing. The `values` field is documented as JSON/object per the Flux API, and `commonMetadata.labels`/`annotations` should be maps. The `postRenderers[].kustomize` field is similarly expected as an object. Currently, `coerceFieldValue()` (line 202) only handles numeric coercion and returns all other field types unchanged, so textarea content reaches `doc.setIn(path, value)` at line 157 as raw strings. While `validateHelmReleaseResourceValues()` does parse the `values` field, this is only for conflict detection, not for preparing values for write.

Add YAML parsing for textarea fields before calling `doc.setIn()`, or model these fields as structured `object`/`array` fields with nested form controls.

Also applies to: 488-503, and to the `postRenderers[].kustomize` subfield (helm-release.ts:524-528).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/components/wizards/ResourceWizard.svelte` around lines 149 - 157, The textarea fields (e.g., values, commonMetadata.labels, commonMetadata.annotations, and postRenderers[].kustomize) are being written as raw strings because coerceFieldValue only coerces numbers; before calling doc.setIn(path, value) update coerceFieldValue (or add a pre-set step in the code that leads to doc.setIn) to detect textarea/structured fields and parse their content as YAML/JSON (falling back to null or empty object on parse failure) so the manifest stores objects/arrays/maps rather than plain strings; use the same parsing logic used by validateHelmReleaseResourceValues for the values field and ensure parsed results are passed to doc.setIn instead of the raw textarea string.
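As a rough illustration of the fix this prompt describes, the coercion step could parse structured textarea content before it ever reaches `doc.setIn()`. This is a minimal sketch, not the project's code: the `Field` shape and `structured` flag are assumptions, and `JSON.parse` stands in for the `yaml` parser the wizard actually uses.

```typescript
// Hypothetical coercion step: structured textarea fields are parsed into
// objects/arrays before being written into the manifest, instead of being
// stored as raw string blobs. Field names here mirror the template schema
// but are assumptions.
type Field = { name: string; type: 'text' | 'number' | 'textarea'; structured?: boolean };

function coerceFieldValue(field: Field, raw: string): unknown {
	if (field.type === 'number') {
		const n = Number(raw);
		return Number.isNaN(n) ? raw : n;
	}
	if (field.type === 'textarea' && field.structured) {
		try {
			// Parse the content so doc.setIn() receives structured data.
			return JSON.parse(raw);
		} catch {
			// Fall back to an empty object on parse failure, per the suggestion.
			return {};
		}
	}
	return raw;
}

const valuesField: Field = { name: 'values', type: 'textarea', structured: true };
console.log(coerceFieldValue(valuesField, '{"replicaCount": 3}'));
```

The same pre-parse step would apply to `commonMetadata.labels`/`annotations` and `postRenderers[].kustomize` before the manifest write.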
♻️ Duplicate comments (2)
src/lib/server/clusters/health.ts (1)
230-249: ⚠️ Potential issue | 🟡 Minor: Fallback path reports success without a corresponding `API Server Reachability` check.

When `checkApiReachability` throws (auth/cert/authz path) and `checkAuthAndVersion` then succeeds, the final response has `connected: true` but the `checks` array contains only `Kubeconfig Parse` plus the auth/version checks, with no `API Server Reachability` entry. Downstream UI that iterates `checks` will silently lose that stage. Consider pushing a synthetic reachability check before delegating:

Proposed fix

```diff
 } catch {
 	// Server responded with an auth/cert/authz error — it is reachable, so let
 	// checkAuthAndVersion produce the proper diagnostic instead of a network failure
+	checks.push({
+		name: 'API Server Reachability',
+		passed: true,
+		message: 'Kubernetes API server responded',
+		duration: 0
+	});
 	const authResult = await checkAuthAndVersion(kc);
 	checks.push(...authResult.checks);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/clusters/health.ts` around lines 230 - 249, When checkApiReachability(kc) throws but you then call checkAuthAndVersion(kc), add a synthetic "API Server Reachability" HealthCheckResult into the checks array before delegating to checkAuthAndVersion so the UI still sees that stage; create a reachability check object (matching the HealthCheckResult shape used elsewhere, e.g. name "API Server Reachability", success=true, timestamp=new Date(), and optional details/error info from the caught exception) and push it onto checks prior to calling checkAuthAndVersion(kc), then proceed with the existing updateCluster(id, ...), fail(...), and return logic.

src/lib/templates/git-repository.ts (1)
246-275: ⚠️ Potential issue | 🟠 Major: Emit `spec.verify.mode` when verification is enabled.

`verifyMode` is now virtual, so selecting `HEAD`, `Tag`, or `TagAndHEAD` never writes `spec.verify.mode`; only `secretRef` is emitted. Flux examples require both `verify.mode` and `verify.secretRef.name` for commit signature verification: https://fluxcd.io/flux/components/source/gitrepositories/#verification.

Proposed direction

```diff
 	name: 'verifyMode',
 	label: 'Verification Mode',
 	path: 'spec.verify.mode',
 	type: 'select',
 	section: 'verification',
 	default: '',
-	virtual: true,
 	options: [
```

Then update `ResourceWizard.updateYamlFromForm()` to delete optional empty-string values instead of writing `spec.verify.mode: ""`.

Verification commands:

```bash
#!/bin/bash
# Verify that no non-virtual field currently emits spec.verify.mode.
rg -n -C4 "name: 'verifyMode'|path: 'spec.verify.mode'|virtual: true|doc\.setIn\(path, value\)" \
  src/lib/templates/git-repository.ts src/lib/components/wizards/ResourceWizard.svelte
```

Verification query: Flux GitRepository v1 spec.verify.mode secretRef example

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/templates/git-repository.ts` around lines 246 - 275, The verifyMode field is marked virtual so choosing HEAD/Tag/TagAndHEAD never writes spec.verify.mode; remove virtual: true (or otherwise make the field non-virtual) on the verifyMode field in git-repository.ts so the selected value is emitted to spec.verify.mode, and also update ResourceWizard.updateYamlFromForm() to treat optional empty-string values as absent (delete keys with "" rather than writing spec.verify.mode: "") so optional fields aren't persisted as empty strings; reference the verifyMode field name, path 'spec.verify.mode', and the ResourceWizard.updateYamlFromForm() method when making these changes.
🧹 Nitpick comments (2)
src/lib/server/lifecycle/startup.ts (1)
72-121: Bootstrap ordering is safe now — writeFileSync throws before DB commit.

The `persistSetupToken` callback is invoked by `createDefaultAdminIfNeeded` before the `db.transaction(...)` that inserts the admin user (see `src/lib/server/auth/bootstrap-admin.ts` lines 116-141), so a failed token write correctly aborts admin creation. The `wx` flag + `0o600` mode + `tmpdir()` placement is reasonable.

One small hardening suggestion (optional): the filename contains only `Date.now()`, which is predictable. On shared hosts another local user could pre-create files with guessed names to DoS startup (though `wx` would make the failure loud rather than silent). Consider adding a random component:

♻️ Optional hardening

```diff
- tokenFile = join(tmpdir(), `gyre-setup-token-${Date.now()}.txt`);
+ tokenFile = join(
+ 	tmpdir(),
+ 	`gyre-setup-token-${Date.now()}-${randomBytes(8).toString('hex')}.txt`
+ );
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/lifecycle/startup.ts` around lines 72 - 121, The token file name uses only Date.now() making it predictable; update the token filename generation inside the persistSetupToken callback (where tokenFile is set and writeFileSync is called during createDefaultAdminIfNeeded) to include a cryptographically-random component (e.g., from crypto.randomBytes or uuid) before writing with mode 0o600 and flag 'wx', so callers like setSetupTokenFile receive a non-guessable path and pre-creation DoS is mitigated.

src/lib/server/auth/bootstrap-admin.ts (1)
48-60: `hasUsers()` swallows all DB errors — consider narrowing to "table missing".

The catch-all treats any failure (connection error, corrupted DB, transient lock) as "no users", which then drives `createDefaultAdminIfNeeded` to attempt inserting an `admin` row. In practice the unique-username constraint will block duplicate admins, but on a genuinely broken DB the subsequent insert will throw a confusing error rather than the original DB failure. Consider only swallowing the "no such table" case and letting other errors propagate.

♻️ Suggested narrowing

```diff
 } catch (err) {
-	// Table doesn't exist yet - no users
-	return false;
+	if (err instanceof Error && /no such table/i.test(err.message)) return false;
+	throw err;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/server/auth/bootstrap-admin.ts` around lines 48 - 60, The hasUsers function currently catches all errors; change it to only treat a missing-table error as "no users" by detecting the DB error that indicates the table doesn't exist (e.g., check the thrown error's code/message for "no such table" or the specific SQLite/Postgres error identifier) and return false in that case, but rethrow any other errors so connection/lock/corruption issues propagate; update the try/catch in hasUsers (which calls getDbSync and queries users) to inspect the caught error and only swallow the missing-table case, otherwise throw the error.
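The suggested narrowing can be sketched as a small predicate plus the rewritten catch. This is illustrative only: the "no such table" message heuristic matches better-sqlite3's error text, and the `queryCount` callback stands in for the real Drizzle query.

```typescript
// Treat only a missing-table error as "no users"; rethrow everything else so
// connection/lock/corruption failures surface at the call site.
function isMissingTableError(err: unknown): boolean {
	return err instanceof Error && /no such table/i.test(err.message);
}

function hasUsersOrThrow(queryCount: () => number): boolean {
	try {
		return queryCount() > 0;
	} catch (err) {
		if (isMissingTableError(err)) return false; // schema not migrated yet
		throw err; // broken-DB errors propagate instead of masquerading as "no users"
	}
}

console.log(hasUsersOrThrow(() => { throw new Error('no such table: users'); }));
```

With this shape, a genuinely broken database fails loudly at `hasUsers()` rather than later as a confusing duplicate-admin insert error.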
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/components/wizards/ResourceWizard.svelte`:
- Around line 173-179: The virtual-field defaulting code needs to first infer
virtual selector values from the manifest YAML before applying defaults:
implement or call inferVirtualValueFromVisibleDependentFields prior to the loop
that handles template.fields and sets values[field.name] = field.default so that
for each virtual field (field.virtual) you inspect non-virtual fields where
showIf.field === field.name, check whether that dependent field's YAML path
exists and if so set the virtual field to the corresponding showIf.value; only
fall back to field.default if no inferred value is found and do not overwrite an
existing user-provided values[field.name].
In `@src/lib/server/auth/bootstrap-admin.ts`:
- Around line 151-172: Replace the duplicated admin-in-cluster condition in
authenticateUser with the helper isInClusterAdmin(user): when authenticating, if
isInClusterAdmin(user) is true dispatch to validateInClusterAdmin(password)
instead of repeating normalizeUsername(...) && isInClusterMode(); ensure
authenticateUser still increments failure/success metrics and returns null/user
exactly as before, and keep calling cleanupSetupTokenFile() on first successful
login.
In `@src/lib/server/auth/in-cluster-admin.ts`:
- Around line 81-134: The code in loadOrCreateInClusterAdmin currently falls
through to creating ADMIN_SECRET_NAME when the secret exists but lacks
data.password, causing a 409; change the logic so if readNamespacedSecret
returns a secret but passwordBase64 is falsy you generate a password (using
process.env.ADMIN_PASSWORD || generateStrongPassword()), warnIfWeakAdminPassword
if needed, set inClusterAdminPasswordHash = await hashPassword(password), set
inClusterFirstLoginDone = await isSecretConsumed(api, namespace), and then PATCH
the existing secret (use api.patchNamespacedSecret or equivalent) to add
stringData.password (and labels if desired) instead of calling
createNamespacedSecret; return the new password. Also ensure you catch and
rethrow non-404 errors as before and include the same unique symbols:
ADMIN_SECRET_NAME, loadOrCreateInClusterAdmin, inClusterAdminPasswordHash,
isSecretConsumed.
In `@src/lib/server/clusters/health.ts`:
- Around line 210-222: The code currently only treats a null return from
getClusterKubeconfig as a kubeconfig error, but exceptions thrown during
decryption escape and become a generic outer error; update the try block around
the kubeconfig retrieval so any exception from getClusterKubeconfig (including
decrypt failures) is caught locally and returns await fail(...) with the same
"Kubeconfig Access" checks entry. Specifically, wrap the call to
getClusterKubeconfig(id) in its own try/catch (or catch errors from
decryptKubeconfig if that helper is used) and on error build the same checks
array used for the null case and return fail(errorMessage, checks) so decryption
failures produce the intended per-stage diagnostic instead of the outer
unexpected error.
In `@src/lib/server/clusters/repository.ts`:
- Around line 188-195: getClusterKubeconfig can throw because decryptKubeconfig
throws on malformed/unsupported ciphertext; update getClusterKubeconfig to catch
decryptKubeconfig errors and return null (preserving the Promise<string | null>
contract) so callers like testClusterConnection in health.ts get a null
diagnostic instead of an unexpected rejection. Wrap the
decryptKubeconfig(cluster.kubeconfigEncrypted) call in a try/catch, log a
contextual message (including cluster id and the error) in the catch, and return
null; alternatively, if you prefer to propagate errors, add JSDoc and change
callers to handle throws — but do one of these two options so the behavior is
explicit.
- Around line 55-68: The createCluster function currently calls
db.transaction((tx) => { tx.insert(clusters).values(newCluster).run(); if
(contextRecords.length > 0) {
tx.insert(clusterContexts).values(contextRecords).run(); } }); without awaiting
it; update createCluster to await the db.transaction call (same pattern as
deleteCluster) so the transaction's Promise is resolved/rejected before
proceeding, ensuring any driver errors surface and the subsequent
db.query.clusters.findFirst (eq(clusters.id, id)) observes the inserted rows;
locate the db.transaction usage and the tx.insert(...).run() calls inside
createCluster to apply the change.
In `@src/lib/server/events/bus.ts`:
- Around line 115-121: When stopping the worker we currently only remove gauges
for keys in context.lastStates before clearing maps, which misses resources that
were only in context.resourceFirstSeen (set by poller.ts during the settling
window) and leaves stale metrics; update the shutdown/cleanup sequence to
additionally iterate over context.resourceFirstSeen (for any key not present in
context.lastStates) and call fluxResourceStatusGauge.remove(context.clusterId,
type, namespace, name, 'Ready') for each parsed key (split by '/'), then clear
context.lastStates, context.lastNotificationStates and context.resourceFirstSeen
as before so all gauge labels are removed.
In `@src/lib/templates/helm-release.ts`:
- Around line 257-295: The UI schema for the HelmRelease field valuesFrom is
emitting an unsupported namespace property and misleading help text; remove the
'namespace' arrayItemField from the valuesFrom definition (the object under
name: 'valuesFrom') and update its description/helpText to state that
ValuesReference does not support cross-namespace references and the referenced
Secret/ConfigMap must live in the same namespace as the HelmRelease (refer to
ValuesReference in helm-controller/v2beta1). Keep other fields (kind, name,
valuesKey, targetPath, optional) intact and ensure no UI form control or
placeholder for 'namespace' remains.
---
Outside diff comments:
In `@src/lib/components/wizards/ResourceWizard.svelte`:
- Around line 149-157: The textarea fields (e.g., values, commonMetadata.labels,
commonMetadata.annotations, and postRenderers[].kustomize) are being written as
raw strings because coerceFieldValue only coerces numbers; before calling
doc.setIn(path, value) update coerceFieldValue (or add a pre-set step in the
code that leads to doc.setIn) to detect textarea/structured fields and parse
their content as YAML/JSON (falling back to null or empty object on parse
failure) so the manifest stores objects/arrays/maps rather than plain strings;
use the same parsing logic used by validateHelmReleaseResourceValues for the
values field and ensure parsed results are passed to doc.setIn instead of the
raw textarea string.
---
Duplicate comments:
In `@src/lib/server/clusters/health.ts`:
- Around line 230-249: When checkApiReachability(kc) throws but you then call
checkAuthAndVersion(kc), add a synthetic "API Server Reachability"
HealthCheckResult into the checks array before delegating to checkAuthAndVersion
so the UI still sees that stage; create a reachability check object (matching
the HealthCheckResult shape used elsewhere, e.g. name "API Server Reachability",
success=true, timestamp=new Date(), and optional details/error info from the
caught exception) and push it onto checks prior to calling
checkAuthAndVersion(kc), then proceed with the existing updateCluster(id, ...),
fail(...), and return logic.
In `@src/lib/templates/git-repository.ts`:
- Around line 246-275: The verifyMode field is marked virtual so choosing
HEAD/Tag/TagAndHEAD never writes spec.verify.mode; remove virtual: true (or
otherwise make the field non-virtual) on the verifyMode field in
git-repository.ts so the selected value is emitted to spec.verify.mode, and also
update ResourceWizard.updateYamlFromForm() to treat optional empty-string values
as absent (delete keys with "" rather than writing spec.verify.mode: "") so
optional fields aren't persisted as empty strings; reference the verifyMode
field name, path 'spec.verify.mode', and the ResourceWizard.updateYamlFromForm()
method when making these changes.
---
Nitpick comments:
In `@src/lib/server/auth/bootstrap-admin.ts`:
- Around line 48-60: The hasUsers function currently catches all errors; change
it to only treat a missing-table error as "no users" by detecting the DB error
that indicates the table doesn't exist (e.g., check the thrown error's
code/message for "no such table" or the specific SQLite/Postgres error
identifier) and return false in that case, but rethrow any other errors so
connection/lock/corruption issues propagate; update the try/catch in hasUsers
(which calls getDbSync and queries users) to inspect the caught error and only
swallow the missing-table case, otherwise throw the error.
In `@src/lib/server/lifecycle/startup.ts`:
- Around line 72-121: The token file name uses only Date.now() making it
predictable; update the token filename generation inside the persistSetupToken
callback (where tokenFile is set and writeFileSync is called during
createDefaultAdminIfNeeded) to include a cryptographically-random component
(e.g., from crypto.randomBytes or uuid) before writing with mode 0o600 and flag
'wx', so callers like setSetupTokenFile receive a non-guessable path and
pre-creation DoS is mitigated.
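The startup.ts filename hardening described above amounts to a one-line helper. A minimal sketch using Node's stdlib, where the `gyre-setup-token` prefix follows the review's proposed diff and the helper name is an assumption:

```typescript
import { randomBytes } from 'node:crypto';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Combine a timestamp with 8 random bytes (16 hex chars) so the setup-token
// path cannot be guessed and pre-created by another local user.
function setupTokenPath(): string {
	return join(tmpdir(), `gyre-setup-token-${Date.now()}-${randomBytes(8).toString('hex')}.txt`);
}

console.log(setupTokenPath());
```

The file itself would still be written with flag `'wx'` and mode `0o600`, so a colliding pre-created file fails loudly rather than being silently reused.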
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c5eae1b5-2a46-4309-952e-78bcc4f9841d
📒 Files selected for processing (15)
- src/lib/components/wizards/ResourceWizard.svelte
- src/lib/server/auth/bootstrap-admin.ts
- src/lib/server/auth/in-cluster-admin.ts
- src/lib/server/auth/passwords.ts
- src/lib/server/clusters/health.ts
- src/lib/server/clusters/repository.ts
- src/lib/server/events/bus.ts
- src/lib/server/events/poller.ts
- src/lib/server/kubernetes/namespace.ts
- src/lib/server/lifecycle/startup.ts
- src/lib/templates/git-repository.ts
- src/lib/templates/helm-release.ts
- src/lib/templates/image-policy.ts
- src/lib/templates/kustomization.ts
- src/lib/templates/types.ts
✅ Files skipped from review due to trivial changes (2)
- src/lib/server/kubernetes/namespace.ts
- src/lib/templates/types.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/lib/templates/kustomization.ts
```ts
export async function authenticateUser(username: string, password: string): Promise<User | null> {
	const user = await getUserByUsername(username);

	if (!user || !user.active) {
		loginAttemptsTotal.labels('failure').inc();
		return null;
	}

	const isValid = await verifyManagedUserPassword(user, password);

	if (!isValid) {
		loginAttemptsTotal.labels('failure').inc();
		return null;
	}

	loginAttemptsTotal.labels('success').inc();

	// First successful login: delete the setup token file so credentials do not
	// persist on disk beyond first use.
	cleanupSetupTokenFile();

	return user;
}
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
rg -nP -C5 '\bverifyManagedUserPassword\b' src/lib/server/auth/credentials.ts
Repository: EnTRoPY0120/gyre
Length of output: 507
🏁 Script executed:
rg -nP 'isInClusterAdmin|isInClusterMode' src/lib/server/auth/ -A 3
Repository: EnTRoPY0120/gyre
Length of output: 2778
Use isInClusterAdmin(user) to dispatch to the in-cluster admin validator instead of duplicating the condition.
The code at lines 79-80 correctly routes to validateInClusterAdmin(password) for the in-cluster admin, but it duplicates the condition logic (normalizeUsername(user.username) === 'admin' && isInClusterMode()) instead of using the existing isInClusterAdmin(user) helper function defined in the same module. Replace the inline condition with isInClusterAdmin(user) for consistency and maintainability.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/server/auth/bootstrap-admin.ts` around lines 151 - 172, Replace the
duplicated admin-in-cluster condition in authenticateUser with the helper
isInClusterAdmin(user): when authenticating, if isInClusterAdmin(user) is true
dispatch to validateInClusterAdmin(password) instead of repeating
normalizeUsername(...) && isInClusterMode(); ensure authenticateUser still
increments failure/success metrics and returns null/user exactly as before, and
keep calling cleanupSetupTokenFile() on first successful login.
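The refactor this prompt asks for can be sketched in isolation: one `isInClusterAdmin(user)` predicate replaces the duplicated inline condition. The helper names mirror the review text, but their bodies here are stand-ins, not the project's implementations.

```typescript
type User = { username: string };

const normalizeUsername = (u: string) => u.trim().toLowerCase();

let inClusterMode = false; // stand-in for the real isInClusterMode() state
const isInClusterMode = () => inClusterMode;

// Single source of truth for "this user is the in-cluster admin"; the
// authenticateUser branch then reads:
//   if (isInClusterAdmin(user)) return validateInClusterAdmin(password) ? user : null;
function isInClusterAdmin(user: User): boolean {
	return normalizeUsername(user.username) === 'admin' && isInClusterMode();
}

inClusterMode = true;
console.log(isInClusterAdmin({ username: ' Admin ' }));
```

Centralizing the predicate keeps the metrics and `cleanupSetupTokenFile()` behavior of `authenticateUser` unchanged while removing the duplicated condition.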
```ts
try {
	const result = await api.readNamespacedSecret({
		name: ADMIN_SECRET_NAME,
		namespace
	});
	const passwordBase64 = result.data?.['password'];

	if (passwordBase64) {
		const password = Buffer.from(passwordBase64, 'base64').toString('utf-8');
		// Hash and store for authentication
		inClusterAdminPasswordHash = await hashPassword(password);

		// Check if already consumed
		inClusterFirstLoginDone = await isSecretConsumed(api, namespace);

		return password;
	}
} catch (error: unknown) {
	// Secret doesn't exist, will create it below
	const k8sError = error as Error & { code?: number };
	if (k8sError.code !== 404) {
		throw error;
	}
}

// Generate new password
// Use ADMIN_PASSWORD from env if provided, otherwise generate a strong one
const password = process.env.ADMIN_PASSWORD || generateStrongPassword();
if (process.env.ADMIN_PASSWORD) warnIfWeakAdminPassword(process.env.ADMIN_PASSWORD);

// Hash for storage
inClusterAdminPasswordHash = await hashPassword(password);

// Create the secret
const secret: k8s.V1Secret = {
	apiVersion: 'v1',
	kind: 'Secret',
	metadata: {
		name: ADMIN_SECRET_NAME,
		namespace,
		labels: {
			'app.kubernetes.io/managed-by': 'gyre',
			'gyre.io/secret-type': 'initial-admin-password'
		}
	},
	stringData: {
		password: password
	}
};

await api.createNamespacedSecret({
	namespace,
	body: secret
});
```
Edge case: existing secret without a password data field will fail with 409 on create.
If ADMIN_SECRET_NAME already exists in the namespace but is missing data.password (e.g., manually created, or written by an older version with a different key), the if (passwordBase64) branch falls through to the create path at Line 131, which will reject with 409 AlreadyExists. The resulting error is then rethrown from loadOrCreateInClusterAdmin and fails startup with a message that doesn't make the misconfiguration obvious.
Consider either (a) logging a clear diagnostic when the secret exists but lacks data.password, or (b) patching the existing secret with a freshly generated password instead of trying to create a duplicate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/server/auth/in-cluster-admin.ts` around lines 81 - 134, The code in
loadOrCreateInClusterAdmin currently falls through to creating ADMIN_SECRET_NAME
when the secret exists but lacks data.password, causing a 409; change the logic
so if readNamespacedSecret returns a secret but passwordBase64 is falsy you
generate a password (using process.env.ADMIN_PASSWORD ||
generateStrongPassword()), warnIfWeakAdminPassword if needed, set
inClusterAdminPasswordHash = await hashPassword(password), set
inClusterFirstLoginDone = await isSecretConsumed(api, namespace), and then PATCH
the existing secret (use api.patchNamespacedSecret or equivalent) to add
stringData.password (and labels if desired) instead of calling
createNamespacedSecret; return the new password. Also ensure you catch and
rethrow non-404 errors as before and include the same unique symbols:
ADMIN_SECRET_NAME, loadOrCreateInClusterAdmin, inClusterAdminPasswordHash,
isSecretConsumed.
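The branch logic the prompt asks for can be modeled as a pure decision function, which makes the fix for the 409 case explicit: reuse the password when the secret has one, patch an existing-but-passwordless secret, and only create when nothing exists. The action names below are illustrative, not the Kubernetes client API.

```typescript
type SecretState = { exists: boolean; passwordBase64?: string };

// Decide how loadOrCreateInClusterAdmin should handle the admin secret.
function adminSecretAction(state: SecretState): 'use-existing' | 'patch' | 'create' {
	if (state.exists && state.passwordBase64) return 'use-existing';
	if (state.exists) return 'patch'; // avoids 409 AlreadyExists on create
	return 'create';
}

console.log(adminSecretAction({ exists: true }));
```

In the real code, the `'patch'` branch would call the client's secret-patch method to add `stringData.password` to the existing secret instead of `createNamespacedSecret`.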
```ts
try {
	const kubeconfig = await getClusterKubeconfig(id);
	if (!kubeconfig) {
		const error = 'Kubeconfig not found or failed to decrypt';
		return await fail(error, [
			{
				name: 'Kubeconfig Access',
				passed: false,
				message: 'Failed to retrieve kubeconfig',
				details: error
			}
		]);
	}
```
Decryption failures bypass the Kubeconfig Access diagnostic.
getClusterKubeconfig returns null only when the row/ciphertext is missing; when decryptKubeconfig throws (unsupported format, bad auth tag, wrong key), the exception escapes this block and is only caught by the outer handler on line 269, which reports it as a generic “unexpected error” with no per-stage checks entry. The user-visible health report for a decryption failure therefore loses the intended Kubeconfig Access check and only shows error.
Proposed fix

```diff
-	const kubeconfig = await getClusterKubeconfig(id);
-	if (!kubeconfig) {
-		const error = 'Kubeconfig not found or failed to decrypt';
-		return await fail(error, [
+	let kubeconfig: string | null;
+	try {
+		kubeconfig = await getClusterKubeconfig(id);
+	} catch (err) {
+		const details = sanitizeK8sErrorMessage(
+			err instanceof Error ? err.message : 'Failed to decrypt kubeconfig'
+		);
+		return await fail(details, [
+			{
+				name: 'Kubeconfig Access',
+				passed: false,
+				message: 'Failed to decrypt kubeconfig',
+				details
+			}
+		]);
+	}
+	if (!kubeconfig) {
+		const error = 'Kubeconfig not found';
+		return await fail(error, [
 			{
 				name: 'Kubeconfig Access',
 				passed: false,
 				message: 'Failed to retrieve kubeconfig',
 				details: error
 			}
 		]);
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/server/clusters/health.ts` around lines 210 - 222, The code currently
only treats a null return from getClusterKubeconfig as a kubeconfig error, but
exceptions thrown during decryption escape and become a generic outer error;
update the try block around the kubeconfig retrieval so any exception from
getClusterKubeconfig (including decrypt failures) is caught locally and returns
await fail(...) with the same "Kubeconfig Access" checks entry. Specifically,
wrap the call to getClusterKubeconfig(id) in its own try/catch (or catch errors
from decryptKubeconfig if that helper is used) and on error build the same
checks array used for the null case and return fail(errorMessage, checks) so
decryption failures produce the intended per-stage diagnostic instead of the
outer unexpected error.
```ts
db.transaction((tx) => {
	tx.insert(clusters).values(newCluster).run();
	if (contextRecords.length > 0) {
		tx.insert(clusterContexts).values(contextRecords).run();
	}
});

const cluster = await db.query.clusters.findFirst({
	where: eq(clusters.id, id)
});

if (!cluster) {
	throw new Error('Failed to create cluster');
}
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm which drizzle driver is in use and whether db.transaction returns a Promise here.
fd -t f 'drizzle' -e ts -e js | head -40
rg -nP '\b(drizzle-orm/[a-z0-9-]+|createClient|Database)\b' --type=ts -C2 -g 'src/lib/server/db/**'
rg -nP '\bdb\.transaction\s*\(' --type=ts -C2
Repository: EnTRoPY0120/gyre
Length of output: 12125
🏁 Script executed:
#!/bin/bash
# Check the db.transaction() signature and return type
cat -n src/lib/server/db/index.ts | head -100
# Look at the createCluster function in detail
cat -n src/lib/server/clusters/repository.ts | head -80
Repository: EnTRoPY0120/gyre
Length of output: 6649
🏁 Script executed:
#!/bin/bash
# Search for transaction return types or any type hints
rg -A3 'type.*transaction|\.transaction.*:' --type=ts src/lib/server/db/
rg -B5 -A5 'getDb\(\)' src/lib/server/clusters/repository.ts
Repository: EnTRoPY0120/gyre
Length of output: 1527
🏁 Script executed:
#!/bin/bash
# Check package.json to see drizzle-orm version
cat package.json | grep -A2 -B2 drizzle
# Look for any TypeScript definitions or type hints
fd -t f 'tsconfig\|types\|declaration' -e json -e d.ts | head -20
Repository: EnTRoPY0120/gyre
Length of output: 302
🏁 Script executed:
#!/bin/bash
# Search for how transaction is actually being called/returned elsewhere
rg 'return\s+db\.transaction|db\.transaction.*return' --type=ts -A2 -B2
rg 'const.*transaction|const.*=\s*await\s+db\.transaction' --type=ts
Repository: EnTRoPY0120/gyre
Length of output: 807
🏁 Script executed:
#!/bin/bash
# Look at the rate-limiter.ts transaction return pattern
sed -n '70,110p' src/lib/server/rate-limiter.ts
sed -n '280,310p' src/lib/server/rate-limiter.ts
# Look at rbac.ts transaction return pattern
sed -n '395,420p' src/lib/server/rbac.ts
Repository: EnTRoPY0120/gyre
Length of output: 3873
🏁 Script executed:
#!/bin/bash
# Check what happens with the returned values - check function signatures
rg -B5 'return db\.transaction' --type=ts src/lib/server/rate-limiter.ts | head -30
rg -B3 'checkRateLimit' --type=ts src/lib/server/rate-limiter.ts | head -20
Repository: EnTRoPY0120/gyre
Length of output: 443
🏁 Script executed:
#!/bin/bash
# Let's understand the actual return type by checking how these return values are used
# Check the function that calls checkRateLimit
rg -B5 -A10 'checkRateLimit\(' --type=ts | head -50
# Check what type is expected from the transaction returns
rg 'function checkRateLimit' --type=ts -A15 src/lib/server/rate-limiter.ts | head -30
Repository: EnTRoPY0120/gyre
Length of output: 4104
🏁 Script executed:
#!/bin/bash
# Check the actual implementation to see if there's any await missing elsewhere
cat -n src/lib/server/auth/users.ts | sed -n '25,40p'
# Verify if this is about Promise handling with better-sqlite3
rg 'better-sqlite3|sqlite.*sync|sync.*sqlite' --type=ts src/lib/server/db/
Repository: EnTRoPY0120/gyre
Length of output: 636
🏁 Script executed:
#!/bin/bash
# Check the actual type signature returned by better-sqlite3's transaction method
# by examining how the return value is actually used
rg 'return db.transaction' --type=ts -A5 src/lib/server/rate-limiter.ts | head -20
# Check if db.transaction is truly synchronous by looking at the limiter.check function signature
sed -n '60,80p' src/lib/server/rate-limiter.ts
# Check the function signature of the rate limiter method that uses the transaction return
cat -n src/lib/server/rate-limiter.ts | sed -n '1,70p' | tail -30
Repository: EnTRoPY0120/gyre
Length of output: 2072
🏁 Script executed:
#!/bin/bash
# Let's verify what the transaction actually returns and if there are Promise types
rg 'Promise.*transaction|transaction.*Promise' --type=ts src/lib/server/
# Check if there's any explicit error handling or Promise handling for transactions
rg 'catch|\.catch|try.*transaction' --type=ts | grep transaction
Repository: EnTRoPY0120/gyre
Length of output: 42
🏁 Script executed:
#!/bin/bash
# Verify the pattern - check if all functions using await getDb() also await their transactions
rg -B3 'const db = await getDb\(\)' --type=ts src/lib/server/auth/users.ts | head -15
rg -B3 'const db = await getDb\(\)' --type=ts src/lib/server/clusters/repository.ts | head -20
Repository: EnTRoPY0120/gyre
Length of output: 977
🏁 Script executed:
#!/bin/bash
# Check if there's any documentation or type hint about transaction return type with drizzle
# Let's examine the actual behavior by checking another await pattern
cat -n src/lib/server/auth/users.ts | sed -n '25,45p'
cat -n src/lib/server/clusters/repository.ts | sed -n '170,185p'
Length of output: 1185
🏁 Script executed:
#!/bin/bash
# Pattern confirmation: Check all async functions using await getDb() to see if they await their transactions
rg 'const db = await getDb\(\)' --type=ts -A20 | grep -B5 'db\.transaction' | head -60
Length of output: 2856
🏁 Script executed:
#!/bin/bash
# Final verification: Count the pattern to be absolutely sure
echo "=== Functions using await getDb() WITHOUT await on transaction ==="
rg 'const db = await getDb\(\)' --type=ts -A20 | grep -B15 'db\.transaction' | grep -B15 '^[^:]*:\s*db\.transaction\s*\(' | grep -v 'await' | head -20
echo ""
echo "=== Functions using await getDb() WITH await on transaction ==="
rg 'const db = await getDb\(\)' --type=ts -A20 | grep -B15 'await db\.transaction' | head -30
Length of output: 2294
🏁 Script executed:
#!/bin/bash
# Final check - look at seed-providers to confirm it uses getDbSync
cat -n src/lib/server/auth/seed-providers.ts | sed -n '215,230p'
Length of output: 501
🏁 Script executed:
#!/bin/bash
# Confirm seed-providers uses getDbSync not await getDb
cat -n src/lib/server/auth/seed-providers.ts | sed -n '215,225p'
Length of output: 366
🏁 Script executed:
#!/bin/bash
# Final comprehensive check: find ALL instances of await getDb() followed by non-awaited db.transaction
echo "=== All instances of 'const db = await getDb()' ==="
rg 'const db = await getDb\(\)' --type=ts -n
echo ""
echo "=== Check which ones DO NOT await the transaction ==="
# Get all files with await getDb() and check their transaction usage
rg 'const db = await getDb\(\)' --type=ts -l | while read file; do
echo "File: $file"
grep -n 'db\.transaction' "$file" || echo " (no transaction found)"
done
Length of output: 3984
createCluster does not await its transaction, unlike deleteCluster.
`db.transaction((tx) => { ... })` on line 55 returns a Promise (note the `await` on the same call in `deleteCluster` at line 176, and on all other async transaction usage in users.ts, credentials.ts, and sso.ts). Since the transaction callback itself is synchronous and only issues `.run()` calls, this may coincidentally work, but:
- Any rejection from the underlying driver goes unhandled (a silent, unhandled promise rejection) instead of surfacing as a failed `createCluster`.
- The subsequent `findFirst` on line 62 is not guaranteed to observe the inserts.
Mirror deleteCluster and all other async functions for correctness:
Proposed fix
- db.transaction((tx) => {
+ await db.transaction((tx) => {
tx.insert(clusters).values(newCluster).run();
if (contextRecords.length > 0) {
tx.insert(clusterContexts).values(contextRecords).run();
}
  });
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
await db.transaction((tx) => {
	tx.insert(clusters).values(newCluster).run();
	if (contextRecords.length > 0) {
		tx.insert(clusterContexts).values(contextRecords).run();
	}
});
const cluster = await db.query.clusters.findFirst({
	where: eq(clusters.id, id)
});
if (!cluster) {
	throw new Error('Failed to create cluster');
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/server/clusters/repository.ts` around lines 55 - 68, The
createCluster function currently calls db.transaction((tx) => {
tx.insert(clusters).values(newCluster).run(); if (contextRecords.length > 0) {
tx.insert(clusterContexts).values(contextRecords).run(); } }); without awaiting
it; update createCluster to await the db.transaction call (same pattern as
deleteCluster) so the transaction's Promise is resolved/rejected before
proceeding, ensuring any driver errors surface and the subsequent
db.query.clusters.findFirst (eq(clusters.id, id)) observes the inserted rows;
locate the db.transaction usage and the tx.insert(...).run() calls inside
createCluster to apply the change.
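The difference the `await` makes can be sketched in isolation. This is a minimal illustration with a hypothetical promise-returning `transaction` helper standing in for the real driver, not drizzle's actual API:

```typescript
// Hypothetical stand-in for a promise-returning db.transaction: a throw in
// the callback rejects the returned promise.
async function transaction(fn: () => void): Promise<void> {
  fn();
}

async function createCluster(failDriver: boolean): Promise<string> {
  // Awaiting the transaction means a driver error surfaces as a rejection of
  // createCluster instead of escaping as an unhandled promise rejection.
  await transaction(() => {
    if (failDriver) throw new Error('driver failure');
  });
  // Only reached after the transaction promise settles successfully, so a
  // follow-up read can observe the writes.
  return 'created';
}
```

Without the `await`, the `'created'` result would be returned even when the driver fails, which is exactly the silent-failure mode the review flags.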
export async function getClusterKubeconfig(id: string): Promise<string | null> {
	const cluster = await getClusterById(id);
	if (!cluster || !cluster.kubeconfigEncrypted) {
		return null;
	}

	return decryptKubeconfig(cluster.kubeconfigEncrypted);
}
Document that getClusterKubeconfig can reject (not only return null).
decryptKubeconfig throws on unsupported/malformed ciphertext (see src/lib/server/clusters/encryption.ts lines 111 and 127-130). The return type Promise<string | null> suggests only a null-vs-value contract, so callers (e.g. testClusterConnection in health.ts) must also handle rejections — which currently land in the generic outer catch and surface as “Unexpected error” rather than a Kubeconfig Access diagnostic.
Consider either catching-and-returning-null here (and logging), or updating the JSDoc to explicitly note the throw case so callers handle it consciously.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/server/clusters/repository.ts` around lines 188 - 195,
getClusterKubeconfig can throw because decryptKubeconfig throws on
malformed/unsupported ciphertext; update getClusterKubeconfig to catch
decryptKubeconfig errors and return null (preserving the Promise<string | null>
contract) so callers like testClusterConnection in health.ts get a null
diagnostic instead of an unexpected rejection. Wrap the
decryptKubeconfig(cluster.kubeconfigEncrypted) call in a try/catch, log a
contextual message (including cluster id and the error) in the catch, and return
null; alternatively, if you prefer to propagate errors, add JSDoc and change
callers to handle throws — but do one of these two options so the behavior is
explicit.
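The catch-and-return-null option can be sketched as follows. `decryptKubeconfig` is stubbed here purely for illustration; the real implementation lives in clusters/encryption.ts and throws on unsupported or malformed ciphertext:

```typescript
type Cluster = { id: string; kubeconfigEncrypted: string | null };

// Stub: stands in for the real decryptKubeconfig, which throws on
// unsupported/malformed ciphertext.
function decryptKubeconfig(ciphertext: string): string {
  if (!ciphertext.startsWith('v1:')) {
    throw new Error('unsupported ciphertext format');
  }
  return ciphertext.slice(3);
}

// Preserves the Promise<string | null> contract: decryption failures are
// logged with cluster context and reported as null instead of rejecting.
async function getClusterKubeconfig(cluster: Cluster | null): Promise<string | null> {
  if (!cluster || !cluster.kubeconfigEncrypted) {
    return null;
  }
  try {
    return decryptKubeconfig(cluster.kubeconfigEncrypted);
  } catch (err) {
    console.error(`Failed to decrypt kubeconfig for cluster ${cluster.id}`, err);
    return null;
  }
}
```

With this shape, callers such as testClusterConnection only ever see null-vs-value and the decryption failure still leaves a trail in the logs.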
for (const key of context.lastStates.keys()) {
	const [type, namespace, name] = key.split('/');
	fluxResourceStatusGauge.remove(context.clusterId, type, namespace, name, 'Ready');
}
context.lastStates.clear();
context.lastNotificationStates.clear();
context.resourceFirstSeen.clear();
Remove gauges for unsettled resources before clearing resourceFirstSeen.
poller.ts sets fluxResourceStatusGauge before a resource enters lastStates; during the settling window it may exist only in resourceFirstSeen. Stopping the worker currently clears those keys without removing their gauge labels, leaving stale metrics behind.
🧹 Proposed fix
- for (const key of context.lastStates.keys()) {
+ const trackedKeys = new Set([...context.lastStates.keys(), ...context.resourceFirstSeen.keys()]);
+ for (const key of trackedKeys) {
const [type, namespace, name] = key.split('/');
fluxResourceStatusGauge.remove(context.clusterId, type, namespace, name, 'Ready');
}
context.lastStates.clear();
  context.lastNotificationStates.clear();
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
const trackedKeys = new Set([...context.lastStates.keys(), ...context.resourceFirstSeen.keys()]);
for (const key of trackedKeys) {
	const [type, namespace, name] = key.split('/');
	fluxResourceStatusGauge.remove(context.clusterId, type, namespace, name, 'Ready');
}
context.lastStates.clear();
context.lastNotificationStates.clear();
context.resourceFirstSeen.clear();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/server/events/bus.ts` around lines 115 - 121, When stopping the
worker we currently only remove gauges for keys in context.lastStates before
clearing maps, which misses resources that were only in
context.resourceFirstSeen (set by poller.ts during the settling window) and
leaves stale metrics; update the shutdown/cleanup sequence to additionally
iterate over context.resourceFirstSeen (for any key not present in
context.lastStates) and call fluxResourceStatusGauge.remove(context.clusterId,
type, namespace, name, 'Ready') for each parsed key (split by '/'), then clear
context.lastStates, context.lastNotificationStates and context.resourceFirstSeen
as before so all gauge labels are removed.
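The key-collection step can be sketched on its own. Map value types here are placeholders for the worker's real state shapes:

```typescript
// Collect every key that may have a gauge label: settled resources live in
// lastStates, while resources still inside their settling window may exist
// only in resourceFirstSeen. The Set deduplicates keys present in both maps.
function collectTrackedKeys(
  lastStates: Map<string, string>,
  resourceFirstSeen: Map<string, number>
): Set<string> {
  return new Set([...lastStates.keys(), ...resourceFirstSeen.keys()]);
}
```

Iterating over this union before clearing the maps ensures no gauge label survives a worker shutdown.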
name: 'valuesFrom',
label: 'Values From',
path: 'spec.valuesFrom',
type: 'array',
section: 'release',
arrayItemType: 'object',
arrayItemFields: [
	{
		name: 'kind',
		label: 'Kind',
		path: 'kind',
		type: 'select',
		options: [
			{ label: 'ConfigMap', value: 'ConfigMap' },
			{ label: 'Secret', value: 'Secret' }
		]
	},
	{
		name: 'name',
		label: 'Name',
		path: 'name',
		type: 'string'
	},
	{
		name: 'namespace',
		label: 'Namespace',
		path: 'namespace',
		type: 'string',
		placeholder: 'flux-system',
		description:
			'Namespace of the referenced resource. Leave blank to use the HelmRelease namespace.'
	},
	{ name: 'valuesKey', label: 'Values Key', path: 'valuesKey', type: 'string' },
	{ name: 'targetPath', label: 'Target Path', path: 'targetPath', type: 'string' },
	{ name: 'optional', label: 'Optional', path: 'optional', type: 'boolean' }
],
description: 'References to ConfigMaps or Secrets containing Helm values.',
helpText:
	"Security: valuesFrom can reference resources from any namespace if the controller's RBAC permits it. Prefer referencing Secrets and ConfigMaps in the same namespace as the HelmRelease to limit exposure."
🧩 Analysis chain
🌐 Web query:
Flux HelmRelease v2 ValuesReference fields namespace support
💡 Result:
In Flux HelmRelease v2 (apiVersion: helm.toolkit.fluxcd.io/v2), the ValuesReference struct in spec.valuesFrom does not include a 'namespace' field. Official API documentation and source code confirm ValuesReference only has fields: kind (ConfigMap/Secret), name, valuesKey, targetPath, and optional. The comment explicitly states: "Name of the values referent. Should reside in the same namespace as the referring resource." By default, ValuesReference targets ConfigMaps/Secrets in the same namespace as the HelmRelease. Cross-namespace references for valuesFrom are not natively supported by the API schema. A 2026 blog post shows YAML examples with 'namespace' in ValuesReference and claims it works when enabling --no-cross-namespace-refs=false on helm-controller. However, this contradicts official sources (API refs, Go source code from helm-controller repo) which lack the field. The blog likely contains inaccurate information or refers to undocumented behavior/changes post-v2 GA (May 2024). The --no-cross-namespace-refs flag primarily controls source references (HelmRepository, etc.), not valuesFrom. ValuesFrom are fetched via Kubernetes client-go in the HelmRelease's namespace context. To share values cross-namespace, use alternatives like: - Copy ConfigMap/Secret to HelmRelease namespace via Kustomization or another controller. - Inline values in spec.values. - External secrets operators (e.g., ExternalSecrets) to sync cross-namespace. Official Flux docs emphasize same-namespace for ValuesReference.
Citations:
- 1: https://oneuptime.com/blog/post/2026-03-05-helmrelease-values-cross-namespace-flux/view
- 2: https://v2-0.docs.fluxcd.io/flux/components/helm/helmreleases/
- 3: https://fluxcd.io/flux/components/helm/api/v2/
- 4: https://github.com/fluxcd/helm-controller/blob/main/api/v2beta1/reference_types.go
- 5: https://v2-0.docs.fluxcd.io/flux/components/helm/api/
- 6: https://fluxcd.io/flux/components/helm/api/v2
- 7: https://github.com/fluxcd/helm-controller/blob/main/api/v2beta1/helmrelease_types.go
- 8: Update HelmRelease API to v2 (GA) fluxcd/flux2#4775
- 9: https://github.com/fluxcd/helm-controller/blob/main/main.go
Remove unsupported namespace from valuesFrom.
Flux ValuesReference does not include a namespace field; the referenced Secret/ConfigMap must reside in the same namespace as the HelmRelease. This UI currently emits an unsupported field and the help text incorrectly advertises cross-namespace access. See the Flux API documentation and helm-controller source code for ValuesReference: https://fluxcd.io/flux/components/helm/api/v2/ and https://github.com/fluxcd/helm-controller/blob/main/api/v2beta1/reference_types.go.
Proposed fix
{
- name: 'namespace',
- label: 'Namespace',
- path: 'namespace',
- type: 'string',
- placeholder: 'flux-system',
- description:
- 'Namespace of the referenced resource. Leave blank to use the HelmRelease namespace.'
- },
- { name: 'valuesKey', label: 'Values Key', path: 'valuesKey', type: 'string' },
+ name: 'valuesKey',
+ label: 'Values Key',
+ path: 'valuesKey',
+ type: 'string'
+ },
{ name: 'targetPath', label: 'Target Path', path: 'targetPath', type: 'string' },
{ name: 'optional', label: 'Optional', path: 'optional', type: 'boolean' }
],
description: 'References to ConfigMaps or Secrets containing Helm values.',
helpText:
- "Security: valuesFrom can reference resources from any namespace if the controller's RBAC permits it. Prefer referencing Secrets and ConfigMaps in the same namespace as the HelmRelease to limit exposure."
+ 'ValuesFrom references must point to Secrets or ConfigMaps in the same namespace as the HelmRelease.'
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
name: 'valuesFrom',
label: 'Values From',
path: 'spec.valuesFrom',
type: 'array',
section: 'release',
arrayItemType: 'object',
arrayItemFields: [
	{
		name: 'kind',
		label: 'Kind',
		path: 'kind',
		type: 'select',
		options: [
			{ label: 'ConfigMap', value: 'ConfigMap' },
			{ label: 'Secret', value: 'Secret' }
		]
	},
	{
		name: 'name',
		label: 'Name',
		path: 'name',
		type: 'string'
	},
	{
		name: 'valuesKey',
		label: 'Values Key',
		path: 'valuesKey',
		type: 'string'
	},
	{ name: 'targetPath', label: 'Target Path', path: 'targetPath', type: 'string' },
	{ name: 'optional', label: 'Optional', path: 'optional', type: 'boolean' }
],
description: 'References to ConfigMaps or Secrets containing Helm values.',
helpText:
	'ValuesFrom references must point to Secrets or ConfigMaps in the same namespace as the HelmRelease.'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/lib/templates/helm-release.ts` around lines 257 - 295, The UI schema for
the HelmRelease field valuesFrom is emitting an unsupported namespace property
and misleading help text; remove the 'namespace' arrayItemField from the
valuesFrom definition (the object under name: 'valuesFrom') and update its
description/helpText to state that ValuesReference does not support
cross-namespace references and the referenced Secret/ConfigMap must live in the
same namespace as the HelmRelease (refer to ValuesReference in
helm-controller/v2beta1). Keep other fields (kind, name, valuesKey, targetPath,
optional) intact and ensure no UI form control or placeholder for 'namespace'
remains.
- Keep `verifyMode` writable instead of virtual
- Clear the field from the document when left disabled
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/lib/components/wizards/ResourceWizard.svelte (1)
106-108: ⚠️ Potential issue | 🟠 Major: Infer virtual selector values before applying defaults.
This still defaults virtual fields without reading the manifest, so YAML with `spec.ref.tag`, `spec.ref.semver`, or `spec.ref.commit` can reopen as the default `refType`, and the next Form→YAML sync may delete the user's selected ref. The same previously flagged issue still applies here.
Also applies to: 179-183
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/components/wizards/ResourceWizard.svelte` around lines 106 - 108, The defaulting logic is setting virtual fields (field.virtual) before reading the manifest-derived selector values, which causes refType to be overwritten; update the logic that assigns values[field.name] = field.default so that for virtual fields you first infer and populate values[field.name] from the manifest/parsed YAML selector (e.g., check for spec.ref.tag, spec.ref.semver, spec.ref.commit or any selector keys) and only fall back to field.default if no selector-derived value exists; apply this same change to both places where this pattern occurs (the block using field.virtual and values[field.name] === undefined and the similar block around the other occurrence).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/components/wizards/ResourceWizard.svelte`:
- Around line 152-155: When handling the case where the 'verifyMode' field
(field.name === 'verifyMode') is set to the empty string, remove the entire
spec.verify object instead of only deleting spec.verify.mode to avoid leaving
spec.verify.secretRef.name orphaned; update the logic in the block that
currently calls doc.deleteIn(path) (for path === 'spec.verify.mode') to instead
delete the parent object (e.g., call doc.deleteIn('spec.verify') or compute the
parent path from path) so both mode and secretRef are removed when verifyMode
=== ''.
In `@src/lib/templates/git-repository.ts`:
- Around line 180-184: The current validation.pattern for the SemVer constraint
is overly restrictive and blocks valid Flux/Masterminds semver ranges; remove or
replace the pattern entry on the validation object so combined ranges, ORs,
hyphen ranges and wildcards are allowed—either delete the entire
validation.pattern (and adjust validation.message) or replace client-side regex
validation with a proper semver-range check using a semver parser (e.g.,
Masterminds/semver or an equivalent JS library) invoked where the field is
validated; target the validation object (the validation.pattern and
validation.message entries) in git-repository.ts when making this change.
---
Duplicate comments:
In `@src/lib/components/wizards/ResourceWizard.svelte`:
- Around line 106-108: The defaulting logic is setting virtual fields
(field.virtual) before reading the manifest-derived selector values, which
causes refType to be overwritten; update the logic that assigns
values[field.name] = field.default so that for virtual fields you first infer
and populate values[field.name] from the manifest/parsed YAML selector (e.g.,
check for spec.ref.tag, spec.ref.semver, spec.ref.commit or any selector keys)
and only fall back to field.default if no selector-derived value exists; apply
this same change to both places where this pattern occurs (the block using
field.virtual and values[field.name] === undefined and the similar block around
the other occurrence).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c14b5631-a28b-40b4-985b-80b292707e9a
📒 Files selected for processing (2)
- src/lib/components/wizards/ResourceWizard.svelte
- src/lib/templates/git-repository.ts
Docker Build Completed: Success
Security Scan: Trivy vulnerability scan completed. Check the workflow logs for details.
|
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/lib/components/wizards/ResourceWizard.svelte (1)
215-227: Optional: consolidate the three path-walking variants.
`getValueAtPath` duplicates the dotted-path traversal already inlined in the initial-values effect (Lines 89-98) and in `updateFormFromYaml` (Lines 194-203). The three copies have subtly different guard clauses (`if (!current) break;` vs `if (!current || typeof current !== 'object')`), which is the kind of drift that tends to cause edge-case bugs later when someone fixes one site and forgets the others. Consider routing all three through `getValueAtPath`.
♻️ Suggested consolidation
- template.fields.forEach((field) => {
-   const path = field.path.split('.');
-   let current = parsed;
-   for (let i = 0; i < path.length; i++) {
-     if (!current) break;
-     if (i === path.length - 1) {
-       values[field.name] = coerceFieldValue(field, current[path[i]]);
-     } else {
-       current = current[path[i]] as Record<string, unknown>;
-     }
-   }
+ template.fields.forEach((field) => {
+   values[field.name] = coerceFieldValue(field, getValueAtPath(parsed, field.path));
And similarly in `updateFormFromYaml` at Lines 194-203.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/components/wizards/ResourceWizard.svelte` around lines 215 - 227, The dotted-path traversal logic is duplicated and inconsistent; replace the inline traversals in the initial-values effect (the block around Lines 89-98) and in updateFormFromYaml with a single call to getValueAtPath so all callers share the same guard behavior; specifically, change the inline loops to call getValueAtPath(source, path) and ensure getValueAtPath's guard uses the stricter check (if (!current || typeof current !== 'object')) so all callers adopt consistent handling of non-object/current falsy values.
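A shared helper with the stricter guard might look like the sketch below (illustrative, not the component's exact code):

```typescript
// Walks a dotted path through a nested object, returning undefined as soon
// as the current value is missing or not an object. This is the stricter of
// the two guard variants, so it is safe for all three call sites.
function getValueAtPath(source: unknown, path: string): unknown {
  let current: unknown = source;
  for (const segment of path.split('.')) {
    if (!current || typeof current !== 'object') {
      return undefined;
    }
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}
```

The stricter check matters when a path crosses a primitive (e.g. `spec` is a string): the loose `if (!current) break;` variant would attempt property access on it, while this version bails out with `undefined`.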
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/components/wizards/ResourceWizard.svelte`:
- Around line 229-242: The presence check in inferVirtualFieldValue is too
permissive: getValueAtPath(source, candidate.path) can return null or '' (or an
empty array) and still cause a virtual field like refType to be inferred; update
the condition in inferVirtualFieldValue to treat values as absent when
getValueAtPath(...) is undefined OR null OR an empty string OR an empty array
(i.e., reject value === null, value === '' and Array.isArray(value) &&
value.length === 0) before using candidate.showIf.value from
template.fields/candidate.path so only truly populated paths drive virtual-field
inference.
---
Nitpick comments:
In `@src/lib/components/wizards/ResourceWizard.svelte`:
- Around line 215-227: The dotted-path traversal logic is duplicated and
inconsistent; replace the inline traversals in the initial-values effect (the
block around Lines 89-98) and in updateFormFromYaml with a single call to
getValueAtPath so all callers share the same guard behavior; specifically,
change the inline loops to call getValueAtPath(source, path) and ensure
getValueAtPath's guard uses the stricter check (if (!current || typeof current
!== 'object')) so all callers adopt consistent handling of non-object/current
falsy values.
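The stricter absence check described for `inferVirtualFieldValue` can be sketched as a small predicate (the name is illustrative, not the component's actual code):

```typescript
// Treats undefined, null, the empty string, and an empty array as "absent"
// so such values cannot drive virtual-field inference (e.g. refType).
// Deliberately keeps 0 and false as "present", since those are meaningful
// field values.
function isAbsent(value: unknown): boolean {
  return (
    value === undefined ||
    value === null ||
    value === '' ||
    (Array.isArray(value) && value.length === 0)
  );
}
```

`inferVirtualFieldValue` would then only consider a candidate path populated when `!isAbsent(getValueAtPath(source, candidate.path))` holds.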
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b972a666-f694-4788-9e0f-84c94116d7b6
📒 Files selected for processing (2)
- src/lib/components/wizards/ResourceWizard.svelte
- src/lib/templates/git-repository.ts
Docker Build Completed: Success
Security Scan: Trivy vulnerability scan completed. Check the workflow logs for details.
|
- Replace duplicated nested path traversal with `getValueAtPath` - Treat empty values as unset when inferring virtual field defaults
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/templates/git-repository.ts`:
- Around line 109-120: The serviceAccountName field in the GitRepository
template (the property controlling spec.serviceAccountName) must be shown only
when the provider is Azure; update the field configuration for
serviceAccountName to add a conditional guard that checks the provider/select
field (spec.provider or the 'provider' form field) equals 'azure' (use the same
conditional mechanism your form system uses, e.g., dependsOn/condition) and
change its description to explicitly state "Azure-only: set
spec.serviceAccountName when using Azure DevOps with Workload Identity" so users
understand it is only valid for the azure provider.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9c6356ae-84e9-477c-9559-4d97e2c4a4cf
📒 Files selected for processing (2)
- src/lib/components/wizards/ResourceWizard.svelte
- src/lib/templates/git-repository.ts
✅ Files skipped from review due to trivial changes (1)
- src/lib/components/wizards/ResourceWizard.svelte
Docker Build Completed: Success
Security Scan: Trivy vulnerability scan completed. Check the workflow logs for details.
|
Docker Build Completed: Success
Security Scan: Trivy vulnerability scan completed. Check the workflow logs for details.
|
Summary
`/api/v1/user/cluster` and removed client-side `gyre_cluster` cookie writes/full reloads.
Verification
- bun run format
- bun run lint
- bun run check
- bun test src/tests/request-pipeline.test.ts src/tests/clusters-encryption.test.ts src/tests/events.test.ts src/tests/k8s-client-reliability.test.ts src/tests/flux-route-adapters.test.ts src/tests/flux-action-routes.test.ts src/tests/flux-server-loads.test.ts src/tests/auth.test.ts src/tests/session-cleanup.test.ts src/tests/cluster-selection-route.test.ts
- bun run verify:ci
- bun run helm:check
- bun run scripts:check
- bun run docs:check

Notes: build/docs emitted existing dependency/build warnings, but all commands exited successfully.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor