Skip to content

fix(cloud-agent-next): diagnose kilo import failures#3632

Open
eshurakov wants to merge 1 commit into
mainfrom
inky-toque
Open

fix(cloud-agent-next): diagnose kilo import failures#3632
eshurakov wants to merge 1 commit into
mainfrom
inky-toque

Conversation

@eshurakov
Copy link
Copy Markdown
Contributor

Summary

  • Preserve actionable wrapper diagnostics when kilo import fails during Cloud Agent session restore without exposing arbitrary CLI output through worker or UI error contracts.
  • Drain import stdout and stderr concurrently so verbose subprocess output cannot deadlock restore, while keeping complete previews bounded and replacing oversized previews with a fail-closed [REDACTED] marker.
  • Harden the wrapper boundary around import failures: sanitize credentials and terminal controls, cap aggregate stdout/stderr processing at 1 MiB, terminate imports after five minutes, and kill the detached process group so pipe-owning descendants cannot keep restore pending.
  • Document the new restore-session: kilo import diagnostics triage landmark for sandbox wrapper logs and uploaded archives.

Verification

  • No manual sandbox restore reproduction was performed in this session; this wrapper-only diagnostic path was exercised through the local fake-kilo harness rather than a live Cloud Agent sandbox.
  • Add any additional manual verification details.

Visual Changes

N/A

Reviewer Notes

  • Failure diagnostics remain local to the wrapper log surface. RestoreResult.error stays generic, so CLI output does not propagate into worker state, WebSocket events, or user-visible errors.
  • Complete stream previews are retained only up to 4096 bytes. Truncated streams discard raw preview bytes and log [REDACTED] with byte counts.
  • Import subprocesses now run in a detached process group and are terminated after five minutes or when combined stdout/stderr exceeds 1 MiB.


function sanitizeImportDiagnostic(diagnostic: string, truncated: boolean): string {
const secrets = collectImportDiagnosticSecrets();
let sanitized = stripUnsafeControlCharacters(diagnostic.replace(ANSI_ESCAPE_SEQUENCE, ''));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: When truncated=true, diagnostic is already "[REDACTED]" (set by readBoundedOutputTail), so the ANSI-strip and control-char-strip on this line are always a no-op in that branch. The truncated guard on line 180 then immediately overwrites sanitized with REDACTED anyway.

Consider moving the truncated early-exit before line 178 to make the flow explicit and avoid dead work:

function sanitizeImportDiagnostic(diagnostic: string, truncated: boolean): string {
  if (truncated) return REDACTED;
  const secrets = collectImportDiagnosticSecrets();
  let sanitized = stripUnsafeControlCharacters(diagnostic.replace(ANSI_ESCAPE_SEQUENCE, ''));
  // ... rest of redaction chain
}

This is a clarity/correctness nit — the current code produces the correct output because retainSanitizedImportDiagnosticTail("[REDACTED]") returns "[REDACTED]" unchanged.

}

function sanitizeImportDiagnostic(diagnostic: string, truncated: boolean): string {
const secrets = collectImportDiagnosticSecrets();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SUGGESTION: collectImportDiagnosticSecrets() is called once per non-empty stream in logImportDiagnostics — potentially twice (once for stdout, once for stderr). Each call iterates process.env and may read KILOCODE_TOKEN_FILE from disk.

The secrets list is constant for the lifetime of a single import, so a single call shared between both sanitizeImportDiagnostic invocations would be cleaner. Either collect secrets outside sanitizeImportDiagnostic and pass them in, or call sanitizeImportDiagnostic once and cache.

totalBytes += value.byteLength;
options?.observeBytes(value.byteLength);
if (truncated) continue;
if (totalBytes > maxBytes) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: The function name readBoundedOutputTail implies keeping the tail (last N bytes) of the stream, but the actual behaviour when totalBytes > maxBytes is fail-closed: all previously retained bytes are discarded and truncated=true/REDACTED is returned. The actual tail-retention logic lives in the separate post-sanitisation step (retainSanitizedImportDiagnosticTail).

This is intentionally conservative and the tests confirm the design, but the name creates a misleading contract. Consider renaming to drainBoundedOutputStream or readBoundedOutput to reflect that the function drains the stream while enforcing a byte budget, with truncation on overflow. At minimum a doc comment would help.

@kilo-code-bot
Copy link
Copy Markdown
Contributor

kilo-code-bot Bot commented Jun 1, 2026

Code Review Summary

Status: 3 Issues Found | Recommendation: Address before merge

Executive Summary

The sanitizeImportDiagnostic function has dead work in the truncated=true path, and readBoundedOutputTail has a misleading name that implies tail-retention semantics it doesn't implement.

Overview

Severity Count
CRITICAL 0
WARNING 2
SUGGESTION 1
Issue Details (click to expand)

WARNING

File Line Issue
services/cloud-agent-next/wrapper/src/restore-session.ts 178 Dead work in truncated=true branch — ANSI/control-char stripping runs on an already-REDACTED string before being overwritten
services/cloud-agent-next/wrapper/src/restore-session.ts 93 readBoundedOutputTail name implies tail-preservation but actually discards all retained data on overflow (fail-closed design); misleading contract

SUGGESTION

File Line Issue
services/cloud-agent-next/wrapper/src/restore-session.ts 177 collectImportDiagnosticSecrets() called once per non-empty stream (up to twice); secrets are constant for a single import run — could be computed once and shared
Files Reviewed (3 files)
  • services/cloud-agent-next/wrapper/src/restore-session.ts — 3 issues
  • services/cloud-agent-next/wrapper/src/restore-session.test.ts — no issues
  • services/cloud-agent-next/DEBUG.md — no issues

Fix these issues in Kilo Cloud


Reviewed by claude-4.6-sonnet-20260217 · 1,057,842 tokens

Review guidance: REVIEW.md from base branch main

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant