Changes from 17 commits
18 commits
5108b22
feat(preview): experimental preview commands for isolated full-stack …
CarmenDou Jun 12, 2026
16f06f1
fix(preview): address review — literal env values, early manifest, sa…
CarmenDou Jun 12, 2026
f64f000
fix(preview): cleanup branch on poll failure, overwrite all duplicate…
CarmenDou Jun 12, 2026
aa43a5b
fix(preview): validate name before provisioning, guard duplicate crea…
CarmenDou Jun 13, 2026
3e1323f
fix(preview): gate teardown output behind --json, wrap teardown name …
CarmenDou Jun 13, 2026
21c8689
feat(link): --with-test-agents installs Playwright Test Agents for in…
CarmenDou Jun 16, 2026
48d0d38
Merge remote-tracking branch 'origin/main' into worktree-agent-e2e-pr…
CarmenDou Jun 16, 2026
3dd19ac
docs(preview): add README covering verify-loop context, flow, design …
CarmenDou Jun 16, 2026
0c743ab
fix(deployments): exclude Playwright test artifacts (test-results, pl…
CarmenDou Jun 16, 2026
517a2a9
feat(verify): light-mode verify probes, default browser MCP, drop pre…
CarmenDou Jun 18, 2026
26d56bf
chore(verify): remove dead overwriteEnvFile, simplify skills-only ins…
CarmenDou Jun 18, 2026
29acf6c
fix(verify): guard read-only SQL, assert fetch ok, harden MCP merge, …
CarmenDou Jun 18, 2026
1452c86
fix(verify): block destructive DML hidden inside a CTE in isReadOnlyQ…
CarmenDou Jun 18, 2026
ee76be5
fix(verify): validate identifiers/emails, block MERGE CTE, exclusive …
CarmenDou Jun 18, 2026
008fdd5
fix(verify): block SELECT INTO, sanitize finding endpoint/message PII…
CarmenDou Jun 18, 2026
a309bd5
fix(verify): rename event to cli_verify_finding with standard dims, t…
CarmenDou Jun 18, 2026
63b3147
fix(verify): owner-scope anon RLS probe, drop free-text telemetry, ho…
CarmenDou Jun 19, 2026
0bdb097
fix(verify): validate --expect-count before running the query, assert…
CarmenDou Jun 19, 2026
3 changes: 3 additions & 0 deletions src/commands/deployments/deploy.ts
@@ -53,6 +53,9 @@ const EXCLUDE_PATTERNS = [
  '.cache',
  'skills',
  'coverage',
  'test-results',
  'playwright-report',
  '.playwright-mcp',
  IGNORE_FILE_NAME,
];

48 changes: 48 additions & 0 deletions src/commands/verify/finding.ts
@@ -0,0 +1,48 @@
import type { Command } from 'commander';
import { CLIError, getRootOpts, handleError } from '../../lib/errors.js';
import { outputJson, outputInfo } from '../../lib/output.js';
import { shutdownAnalytics, trackVerifyFinding } from '../../lib/analytics.js';
import { getProjectConfig } from '../../lib/config.js';

// Record a "loud" error the browser surfaced during the drive — a 4xx/5xx, a
// `column does not exist`, a console exception — that the agent saw via
// `browser_console_messages` / `browser_network_requests`. The rls/truth probes
// only cover the *silent* findings; this is how the loud ones reach PostHog too.
export function registerVerifyFindingCommand(verify: Command): void {
  verify
    .command('finding')
    .description('Record a loud error surfaced during the drive (4xx/5xx, column-not-found, console) as a finding (experimental)')
    .requiredOption('--kind <kind>', 'short error kind, e.g. pgrst_column_not_found, http_500, console_error')
    .option('--type <type>', 'finding type', 'error')
    .option('--status <n>', 'HTTP status, if any', (v) => parseInt(v, 10))
    .option('--endpoint <path>', 'the endpoint/URL that errored')
    .option('--message <text>', 'the error message the page showed')
    .option('--table <name>', 'related table, if known')
    .action(async (opts, cmd) => {
      const { json } = getRootOpts(cmd);
      try {
        const config = getProjectConfig();
        if (!config) throw new CLIError('No linked project found — run `insforge link` first.');
        const finding = {
          type: opts.type as string,
          kind: opts.kind as string,
          status: Number.isNaN(opts.status) ? undefined : (opts.status as number | undefined),
          endpoint: opts.endpoint as string | undefined,
          message: opts.message as string | undefined,
          table: opts.table as string | undefined,
        };
        trackVerifyFinding(finding, config);
        await shutdownAnalytics(); // flush the PostHog event before exit
Comment on lines +30 to +35

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitize --message and --endpoint before tracking analytics.

Line 30 through Line 35 forwards raw free-form text to trackVerifyFinding(...). These values can include emails, tokens, or URL query params from runtime errors, which risks telemetry PII/secrets leakage. Redact sensitive patterns and strip query strings before emitting.

Suggested direction
-          endpoint: opts.endpoint as string | undefined,
-          message: opts.message as string | undefined,
+          endpoint: sanitizeEndpoint(opts.endpoint as string | undefined),
+          message: sanitizeMessage(opts.message as string | undefined),
// Add small helpers in this file:
function sanitizeEndpoint(v?: string): string | undefined {
  if (!v) return undefined;
  return v.split('?')[0]; // drop query params
}

function sanitizeMessage(v?: string): string | undefined {
  if (!v) return undefined;
  return v
    .replace(/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/gi, '[redacted-email]')
    .replace(/\b(?:Bearer\s+)?[A-Za-z0-9._-]{20,}\b/g, '[redacted-token]')
    .slice(0, 500);
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/commands/verify/finding.ts` around lines 30 - 35, The endpoint and
message values passed to trackVerifyFinding are raw user input that may contain
sensitive data like emails, tokens, or URL query parameters, creating a
PII/secrets leakage risk in analytics. Create two helper functions in this file
(sanitizeEndpoint and sanitizeMessage) that remove query strings from endpoints
and redact email addresses and bearer tokens from messages using regex patterns,
then apply these functions to opts.endpoint and opts.message before passing them
to the trackVerifyFinding call.
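
For readers following this thread, here is a minimal sketch of the two helpers the suggestion describes. It is an illustration under assumptions, not the code that was eventually merged: the names `sanitizeEndpoint` and `sanitizeMessage` come from the review suggestion, while the fragment stripping, the separate `Bearer` pattern, and the `maxLength` parameter are additions made here for the example.

```typescript
// Hypothetical helpers sketched from the review suggestion above.
// Assumption: dropping everything from the first '?' or '#' is acceptable for telemetry.
function sanitizeEndpoint(value?: string): string | undefined {
  if (!value) return undefined;
  const cut = value.search(/[?#]/); // index of the first query or fragment marker
  return cut === -1 ? value : value.slice(0, cut);
}

// Assumption: emails, "Bearer <token>" pairs, and any 20+ character token-like run
// are treated as sensitive; maxLength is an illustrative cap, not a documented limit.
function sanitizeMessage(value?: string, maxLength = 500): string | undefined {
  if (!value) return undefined;
  return value
    .replace(/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/gi, '[redacted-email]')
    .replace(/\bBearer\s+\S+/gi, '[redacted-token]')
    .replace(/\b[A-Za-z0-9._-]{20,}\b/g, '[redacted-token]')
    .slice(0, maxLength);
}
```

Redaction by regex is best-effort: short secrets and unusual token formats will pass through, so dropping free-text fields from telemetry entirely may be the safer option where that is feasible.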


        if (json) {
          outputJson({ recorded: true, finding });
        } else {
          outputInfo(
            `📝 recorded ${finding.type} finding: ${finding.kind}${finding.status ? ` (${finding.status})` : ''}${finding.message ? ` — ${finding.message}` : ''}`,
          );
        }
      } catch (e) {
        handleError(e, json);
      }
    });
}
14 changes: 14 additions & 0 deletions src/commands/verify/index.ts
@@ -0,0 +1,14 @@
// src/commands/verify/index.ts
import type { Command } from 'commander';
import { registerVerifyRlsCommand } from './rls.js';
import { registerVerifyTruthCommand } from './truth.js';
import { registerVerifyFindingCommand } from './finding.js';

export function registerVerifyCommands(program: Command): void {
  const verify = program
    .command('verify', { hidden: true })
    .description('[experimental] Backend-truth & RLS probes + loud-error recording for insforge-verify');
  registerVerifyRlsCommand(verify);
  registerVerifyTruthCommand(verify);
  registerVerifyFindingCommand(verify);
}
77 changes: 77 additions & 0 deletions src/commands/verify/rls.test.ts
@@ -0,0 +1,77 @@
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { Command } from 'commander';
import type * as VerifyProbe from '../../lib/verify-probe.js';
import { registerVerifyRlsCommand } from './rls.js';

vi.mock('../../lib/config.js', () => ({
  getProjectConfig: vi.fn(() => ({
    project_id: 'p1', project_name: 'n', org_id: 'o1', region: 'us-east',
    api_key: 'key', oss_host: 'https://h',
  })),
}));
vi.mock('../../lib/api/oss.js', () => ({
  getAnonKey: vi.fn(async () => 'anon'),
  runRawSql: vi.fn(async () => ({ rows: [{ id: 'aid' }] })),
}));
vi.mock('../../lib/analytics.js', () => ({
  trackVerifyFinding: vi.fn(),
  shutdownAnalytics: vi.fn(async () => {}),
}));
// Keep the pure helpers (classifyRls / isSafeIdentifier / isLikelyEmail) real; mock the
// two network calls.
vi.mock('../../lib/verify-probe.js', async (importOriginal) => {
  const actual = await importOriginal<typeof VerifyProbe>();
  return { ...actual, login: vi.fn(async () => 'token'), recordsCount: vi.fn(async () => 0) };
});

function makeProgram() {
  const program = new Command().exitOverride();
  program.option('--json');
  registerVerifyRlsCommand(program.command('verify'));
  return program;
}

describe('verify rls (command)', () => {
  let exitSpy: ReturnType<typeof vi.spyOn>;
  beforeEach(() => {
    vi.clearAllMocks();
    process.exitCode = undefined;
    exitSpy = vi.spyOn(process, 'exit').mockImplementation(((code?: number) => {
      throw new Error(`exit:${code}`);
    }) as never);
    vi.spyOn(console, 'error').mockImplementation(() => {});
    vi.spyOn(console, 'log').mockImplementation(() => {});
  });
  afterEach(() => {
    exitSpy.mockRestore();
    vi.restoreAllMocks();
    process.exitCode = undefined;
  });

  it('rejects an --owner that smuggles PostgREST params, before any login', async () => {
    const { login } = await import('../../lib/verify-probe.js');
    await expect(
      makeProgram().parseAsync(['verify', 'rls', '--table', 'orders', '--owner', 'user_id&select=secret', '--json'], { from: 'user' }),
    ).rejects.toThrow(/exit:/);
    expect(login).not.toHaveBeenCalled();
Comment on lines +51 to +56

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Tighten assertions to verify the intended rejection cause.

On Lines 53-56 and Lines 61-64, asserting only /exit:/ can pass on unrelated early-exit paths. Add an assertion on the emitted validation message so these tests prove they’re hitting the --owner and --user-a guards specifically.

Suggested test hardening
 describe('verify rls (command)', () => {
   let exitSpy: ReturnType<typeof vi.spyOn>;
+  let errorSpy: ReturnType<typeof vi.spyOn>;
   beforeEach(() => {
     vi.clearAllMocks();
     process.exitCode = undefined;
     exitSpy = vi.spyOn(process, 'exit').mockImplementation(((code?: number) => {
       throw new Error(`exit:${code}`);
     }) as never);
-    vi.spyOn(console, 'error').mockImplementation(() => {});
+    errorSpy = vi.spyOn(console, 'error').mockImplementation(() => {});
     vi.spyOn(console, 'log').mockImplementation(() => {});
   });
@@
   it('rejects an --owner that smuggles PostgREST params, before any login', async () => {
     const { login } = await import('../../lib/verify-probe.js');
     await expect(
       makeProgram().parseAsync(['verify', 'rls', '--table', 'orders', '--owner', 'user_id&select=secret', '--json'], { from: 'user' }),
     ).rejects.toThrow(/exit:/);
+    expect(errorSpy.mock.calls.flat().join(' ')).toMatch(/--owner must be a bare column name/);
     expect(login).not.toHaveBeenCalled();
   });
@@
   it('rejects a non-email --user-a, before any login', async () => {
     const { login } = await import('../../lib/verify-probe.js');
     await expect(
       makeProgram().parseAsync(['verify', 'rls', '--table', 'orders', '--owner', 'user_id', '--user-a', 'not-an-email', '--json'], { from: 'user' }),
     ).rejects.toThrow(/exit:/);
+    expect(errorSpy.mock.calls.flat().join(' ')).toMatch(/--user-a and --user-b must be valid email addresses/);
     expect(login).not.toHaveBeenCalled();
   });

Also applies to: 59-64

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/commands/verify/rls.test.ts` around lines 51 - 56, The test assertions at
lines 51-56 and 59-64 are too loose - they only verify that an exit occurs with
`/exit:/` but don't confirm the rejection is specifically due to the `--owner`
or `--user-a` validation guards. Replace the generic `/exit:/` pattern matching
in the toThrow() calls with more specific validation error messages that confirm
the actual validation failure for each guard (the `--owner` parameter validation
in the first test and `--user-a` parameter validation in the second test) to
ensure these tests are hitting the intended rejection paths and not passing on
unrelated early-exit scenarios.

  });

  it('rejects a non-email --user-a, before any login', async () => {
    const { login } = await import('../../lib/verify-probe.js');
    await expect(
      makeProgram().parseAsync(['verify', 'rls', '--table', 'orders', '--owner', 'user_id', '--user-a', 'not-an-email', '--json'], { from: 'user' }),
    ).rejects.toThrow(/exit:/);
    expect(login).not.toHaveBeenCalled();
  });

  it('scopes the anonymous control to A\'s owner filter (not the whole table)', async () => {
    const { recordsCount } = await import('../../lib/verify-probe.js');
    await makeProgram().parseAsync(['verify', 'rls', '--table', 'orders', '--owner', 'user_id', '--json'], { from: 'user' });
    // 3 probes: B-of-A, A-own, anon — all must use the same owner-scoped filter.
    expect(recordsCount).toHaveBeenCalledTimes(3);
    // The anon probe (3rd call) must pass the filter + no token, NOT undefined for the filter.
    expect(recordsCount).toHaveBeenNthCalledWith(
      3, 'https://h', 'orders', expect.stringContaining('user_id=eq.'), undefined, 'anon',
    );
Comment on lines +70 to +75

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate filter equality across all three recordsCount probes.

Line 70 says all probes must use the same owner-scoped filter, but the test currently checks only the 3rd call and only with stringContaining. Compare the 3rd argument from all three calls for exact equality.

Suggested assertion improvement
   it('scopes the anonymous control to A\'s owner filter (not the whole table)', async () => {
     const { recordsCount } = await import('../../lib/verify-probe.js');
     await makeProgram().parseAsync(['verify', 'rls', '--table', 'orders', '--owner', 'user_id', '--json'], { from: 'user' });
     // 3 probes: B-of-A, A-own, anon — all must use the same owner-scoped filter.
     expect(recordsCount).toHaveBeenCalledTimes(3);
+    const calls = (recordsCount as ReturnType<typeof vi.fn>).mock.calls;
+    const [filter1, filter2, filter3] = [calls[0][2], calls[1][2], calls[2][2]];
+    expect(typeof filter1).toBe('string');
+    expect(filter1).toMatch(/^user_id=eq\./);
+    expect(filter2).toBe(filter1);
+    expect(filter3).toBe(filter1);
     // The anon probe (3rd call) must pass the filter + no token, NOT undefined for the filter.
     expect(recordsCount).toHaveBeenNthCalledWith(
       3, 'https://h', 'orders', expect.stringContaining('user_id=eq.'), undefined, 'anon',
     );
   });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/commands/verify/rls.test.ts` around lines 70 - 75, The test currently
validates only the 3rd recordsCount call with a partial filter match using
stringContaining, but the comment indicates all three probes must use the same
owner-scoped filter. Capture the filter argument from the 1st call to
recordsCount, then add assertions to verify that both the 2nd and 3rd calls use
the exact same filter string with strict equality (not partial matching). This
ensures consistency across all three probes as intended.

  });
});
87 changes: 87 additions & 0 deletions src/commands/verify/rls.ts
@@ -0,0 +1,87 @@
import type { Command } from 'commander';
import { CLIError, getRootOpts, handleError } from '../../lib/errors.js';
import { getProjectConfig } from '../../lib/config.js';
import { outputJson, outputInfo } from '../../lib/output.js';
import { shutdownAnalytics, trackVerifyFinding } from '../../lib/analytics.js';
import {
  classifyRls,
  isLikelyEmail,
  isSafeIdentifier,
  login,
  recordsCount,
} from '../../lib/verify-probe.js';
import { getAnonKey, runRawSql } from '../../lib/api/oss.js';

export function registerVerifyRlsCommand(verify: Command): void {
  verify
    .command('rls')
    .description('Cross-user RLS isolation probe — checks B cannot read A, A can read own (experimental)')
    .requiredOption('--table <name>', 'user-scoped table to probe')
    .requiredOption('--owner <column>', 'owner column on the table (e.g. user_id)')
    .option('--user-a <email>', 'seeded user A email', 'verify-a@example.com')
    .option('--user-b <email>', 'seeded user B email', 'verify-b@example.com')
    .option('--password <pw>', 'seeded users password', 'Test1234!pass')
    .action(async (opts, cmd) => {
      const { json } = getRootOpts(cmd);
      try {
        const config = getProjectConfig();
        if (!config) throw new CLIError('No linked project found — run `insforge link` first.');
        const baseUrl = config.oss_host;

        // --table/--owner are interpolated into a PostgREST resource path and filter; keep
        // them to bare identifiers so a value like `user_id&select=secret` can't inject extra
        // params. --user-a/-b go into a raw SQL lookup; require an email shape (the single-
        // quote escaping below already blocks string-literal injection — this removes the rest).
        if (!isSafeIdentifier(String(opts.table))) {
          throw new CLIError(`--table must be a bare table name (got ${JSON.stringify(opts.table)}).`);
        }
        if (!isSafeIdentifier(String(opts.owner))) {
          throw new CLIError(`--owner must be a bare column name (got ${JSON.stringify(opts.owner)}).`);
        }
        if (!isLikelyEmail(String(opts.userA)) || !isLikelyEmail(String(opts.userB))) {
          throw new CLIError('--user-a and --user-b must be valid email addresses.');
        }

        const aToken = await login(baseUrl, opts.userA, opts.password);
        const bToken = await login(baseUrl, opts.userB, opts.password);
        const anon = await getAnonKey();
        if (!aToken || !bToken || !anon) {
          throw new CLIError(
            'Login or anon-key fetch returned empty — seed BOTH users first. An empty token turns every probe into an anonymous request that silently "passes" isolation.',
          );
        }

        const { rows } = await runRawSql(
          `select id from auth.users where email='${String(opts.userA).replace(/'/g, "''")}'`,
        );
        const aId = (rows[0] as { id?: string })?.id;
        if (!aId) throw new CLIError(`Could not find user A (${opts.userA}) — seed it first.`);

        // All three probes use the SAME owner-scoped filter so we measure "can X read A's
        // rows", not "can X read any row". Checking the whole table for the anon control would
        // false-positive a leak on any table that intentionally exposes some public rows.
        const filter = `${opts.owner}=eq.${encodeURIComponent(aId)}`;
        const bReadRowsOfA = await recordsCount(baseUrl, opts.table, filter, bToken, anon);
        const aReadOwnRows = await recordsCount(baseUrl, opts.table, filter, aToken, anon);
        const anonReadRows = await recordsCount(baseUrl, opts.table, filter, undefined, anon);

        const { type, evidence } = classifyRls({ bReadRowsOfA, aReadOwnRows, anonReadRows });
        const finding = { type, table: opts.table as string, evidence };
        trackVerifyFinding(finding, config);
        await shutdownAnalytics(); // flush the PostHog event before exit

        if (json) {
          outputJson({ passed: type === 'none', finding });
        } else if (type === 'rls_leak') {
          outputInfo(`❌ rls_leak on ${opts.table}: B read ${bReadRowsOfA} of A's rows (anon read ${anonReadRows}).`);
        } else if (type === 'rls_overrestrict') {
          outputInfo(`❌ rls_overrestrict on ${opts.table}: A could not read its own rows (positive control empty).`);
        } else {
          outputInfo(`✅ isolation holds on ${opts.table}: B=0, anon=0, A=${aReadOwnRows}.`);
        }
        process.exitCode = type === 'none' ? 0 : 1;
      } catch (e) {
        handleError(e, json);
      }
    });
}
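
The helpers imported from `../../lib/verify-probe.js` are not part of this diff view, so the following is only a sketch of what they might look like, inferred from how `rls.ts` calls them and from the three output messages above. The regular expressions, the type names `RlsCounts` and `RlsFinding`, and the rule that an anonymous read also counts as a leak are assumptions, not the PR's actual implementation.

```typescript
// Hypothetical shapes; only the three count field names and the three finding
// types ('rls_leak', 'rls_overrestrict', 'none') appear in the diff itself.
type RlsCounts = { bReadRowsOfA: number; aReadOwnRows: number; anonReadRows: number };
type RlsFinding = { type: 'rls_leak' | 'rls_overrestrict' | 'none'; evidence: RlsCounts };

// Assumption: a "bare identifier" is letters, digits and underscores, not starting with a digit.
function isSafeIdentifier(value: string): boolean {
  return /^[A-Za-z_][A-Za-z0-9_]*$/.test(value);
}

// Assumption: a loose shape check (one '@', a dot in the domain, no spaces or quotes).
function isLikelyEmail(value: string): boolean {
  return /^[^\s@'"]+@[^\s@'"]+\.[^\s@'"]+$/.test(value);
}

// Assumption: any row of A visible to B or to an anonymous caller is a leak;
// otherwise, A seeing none of its own rows means the policy over-restricts.
function classifyRls(counts: RlsCounts): RlsFinding {
  if (counts.bReadRowsOfA > 0 || counts.anonReadRows > 0) {
    return { type: 'rls_leak', evidence: counts };
  }
  if (counts.aReadOwnRows === 0) {
    return { type: 'rls_overrestrict', evidence: counts };
  }
  return { type: 'none', evidence: counts };
}
```

Checking the leak condition first matters: if both B and A read zero rows the result is `rls_overrestrict`, which also covers the case where A simply has no seeded rows, so the positive control needs seeded data to be meaningful.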
81 changes: 81 additions & 0 deletions src/commands/verify/truth.test.ts
@@ -0,0 +1,81 @@
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { Command } from 'commander';
import { registerVerifyTruthCommand } from './truth.js';

vi.mock('../../lib/config.js', () => ({
  getProjectConfig: vi.fn(() => ({
    project_id: 'p1', project_name: 'n', org_id: 'o1', region: 'us-east',
    api_key: 'key', oss_host: 'https://h',
  })),
}));
vi.mock('../../lib/api/oss.js', () => ({ runRawSql: vi.fn() }));
vi.mock('../../lib/analytics.js', () => ({
  trackVerifyFinding: vi.fn(),
  shutdownAnalytics: vi.fn(async () => {}),
}));

function makeProgram() {
  const program = new Command().exitOverride();
  program.option('--json');
  registerVerifyTruthCommand(program.command('verify'));
  return program;
}

describe('verify truth (command)', () => {
  let exitSpy: ReturnType<typeof vi.spyOn>;
  beforeEach(async () => {
    vi.clearAllMocks();
    process.exitCode = undefined;
    const { runRawSql } = await import('../../lib/api/oss.js');
    (runRawSql as ReturnType<typeof vi.fn>).mockResolvedValue({ rows: [] });
    exitSpy = vi.spyOn(process, 'exit').mockImplementation(((code?: number) => {
      throw new Error(`exit:${code}`);
    }) as never);
    vi.spyOn(console, 'error').mockImplementation(() => {});
    vi.spyOn(console, 'log').mockImplementation(() => {});
  });
  afterEach(() => {
    exitSpy.mockRestore();
    vi.restoreAllMocks();
    process.exitCode = undefined;
  });

  it('rejects a non-read query before touching the DB', async () => {
    const { runRawSql } = await import('../../lib/api/oss.js');
    await expect(
      makeProgram().parseAsync(['verify', 'truth', '--query', 'delete from t', '--expect', '1', '--json'], { from: 'user' }),
    ).rejects.toThrow(/exit:/);
    expect(runRawSql).not.toHaveBeenCalled();
  });

  it('rejects when both --expect and --expect-count are given', async () => {
    const { runRawSql } = await import('../../lib/api/oss.js');
    await expect(
      makeProgram().parseAsync(['verify', 'truth', '--query', 'select 1', '--expect', '1', '--expect-count', '1', '--json'], { from: 'user' }),
    ).rejects.toThrow(/exit:/);
    expect(runRawSql).not.toHaveBeenCalled();
  });

  it('rejects a non-integer --expect-count', async () => {
    await expect(
      makeProgram().parseAsync(['verify', 'truth', '--query', 'select count(*) from t', '--expect-count', 'abc', '--json'], { from: 'user' }),
    ).rejects.toThrow(/exit:/);
  });

  it('passes (exit 0) + records & flushes a finding when DB matches the claim', async () => {
    const oss = await import('../../lib/api/oss.js');
    (oss.runRawSql as ReturnType<typeof vi.fn>).mockResolvedValue({ rows: [{ n: 3 }] });
    await makeProgram().parseAsync(['verify', 'truth', '--query', 'select n', '--expect', '3', '--json'], { from: 'user' });
    expect(process.exitCode).toBe(0);
    const { trackVerifyFinding, shutdownAnalytics } = await import('../../lib/analytics.js');
    expect(trackVerifyFinding).toHaveBeenCalledTimes(1);
    expect(shutdownAnalytics).toHaveBeenCalled();
  });

  it('flags false_pass (exit 1) when DB differs from the claim', async () => {
    const oss = await import('../../lib/api/oss.js');
    (oss.runRawSql as ReturnType<typeof vi.fn>).mockResolvedValue({ rows: [{ n: 1 }] });
    await makeProgram().parseAsync(['verify', 'truth', '--query', 'select n', '--expect', '3', '--json'], { from: 'user' });
    expect(process.exitCode).toBe(1);
  });
Comment on lines +77 to +82

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Verify analytics behavior on the false_pass path, not just exit code.

Line 75 covers process.exitCode, but the PR contract also says findings are emitted and flushed on failures. Add trackVerifyFinding and shutdownAnalytics assertions here to lock that behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/commands/verify/truth.test.ts` around lines 75 - 80, The test case `flags
false_pass (exit 1) when DB differs from the claim` currently only validates the
exit code but should also verify analytics behavior on the failure path. Add
expect assertions after the parseAsync call to verify that trackVerifyFinding
was called with the appropriate finding details and that shutdownAnalytics was
called to ensure findings are properly emitted and analytics are properly
flushed when verification fails.

});
68 changes: 68 additions & 0 deletions src/commands/verify/truth.ts
@@ -0,0 +1,68 @@
import type { Command } from 'commander';
import { CLIError, getRootOpts, handleError } from '../../lib/errors.js';
import { getProjectConfig } from '../../lib/config.js';
import { outputJson, outputInfo } from '../../lib/output.js';
import { shutdownAnalytics, trackVerifyFinding } from '../../lib/analytics.js';
import { classifyTruth, isReadOnlyQuery } from '../../lib/verify-probe.js';
import { runRawSql } from '../../lib/api/oss.js';

export function registerVerifyTruthCommand(verify: Command): void {
  verify
    .command('truth')
    .description('Backend-truth cross-check — compare a DB read to what the UI claimed (experimental)')
    .requiredOption('--query <sql>', 'a read proving what the UI showed; compares the first column of the first row')
    .option('--expect <value>', 'the value the UI displayed (compared as a scalar)')
    .option('--expect-count <n>', 'expect this many rows instead of a scalar value')
    .option('--table <name>', 'table name, for the finding label')
    .action(async (opts, cmd) => {
      const { json } = getRootOpts(cmd);
      try {
        const config = getProjectConfig();
        if (!config) throw new CLIError('No linked project found — run `insforge link` first.');
        if (!isReadOnlyQuery(opts.query)) {
          throw new CLIError(
            'verify truth expects a single read query — it must start with SELECT or WITH and not chain statements. (This guard blocks common destructive forms, not a hard read-only guarantee — pass a plain read.)',
          );
        }
        if (opts.expect !== undefined && opts.expectCount !== undefined) {
          throw new CLIError('Provide either --expect <value> or --expect-count <n>, not both.');
        }

        const { rows } = await runRawSql(opts.query);

        let result: { type: 'false_pass' | 'none'; evidence: Record<string, unknown> };
        if (opts.expectCount !== undefined) {
          // Compare as a number so `--expect-count 03` matches 3 rows (string compare wouldn't).
          const expected = Number(opts.expectCount);
          if (!Number.isInteger(expected) || expected < 0) {
            throw new CLIError(`--expect-count must be a non-negative integer (got ${JSON.stringify(opts.expectCount)}).`);
          }
          result = classifyTruth(rows.length, String(expected));
        } else if (opts.expect !== undefined) {
          const first = rows[0];
          const dbValue =
            first && typeof first === 'object' ? Object.values(first as Record<string, unknown>)[0] : first;
          result = classifyTruth(dbValue, String(opts.expect));
        } else {
          throw new CLIError('Provide --expect <value> (scalar) or --expect-count <n> (row count).');
        }

        const finding = { type: result.type, table: opts.table as string | undefined, evidence: result.evidence };
        trackVerifyFinding(finding, config);
        await shutdownAnalytics(); // flush the PostHog event before exit

        if (json) {
          outputJson({ passed: result.type === 'none', finding });
        } else if (result.type === 'false_pass') {
          outputInfo(
            `❌ false_pass${opts.table ? ` on ${opts.table}` : ''}: UI claimed ${JSON.stringify(result.evidence.ui_claimed)} but DB has ${JSON.stringify(result.evidence.db_actual)}.`,
          );
        } else {
          outputInfo(`✅ backend truth matches: ${JSON.stringify(result.evidence.db_actual)}.`);
        }
        process.exitCode = result.type === 'none' ? 0 : 1;
      } catch (e) {
        handleError(e, json);
      }
    });
}
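
`classifyTruth` and `isReadOnlyQuery` live in `../../lib/verify-probe.js`, which this diff view does not show. The sketch below is a guess at their behavior based on the call sites, the error message above, and the commit titles that mention CTE-hidden DML, `MERGE`, and `SELECT INTO`. The keyword list, the string comparison, and the `TruthResult` type name are assumptions; only the evidence keys `ui_claimed` and `db_actual` are taken from `truth.ts`.

```typescript
type TruthResult = {
  type: 'false_pass' | 'none';
  evidence: { ui_claimed: string; db_actual: unknown };
};

// Assumption: values are compared as strings, so the number 3 matches the claim "3".
function classifyTruth(dbValue: unknown, uiClaimed: string): TruthResult {
  const matches = dbValue !== undefined && dbValue !== null && String(dbValue) === uiClaimed;
  return { type: matches ? 'none' : 'false_pass', evidence: { ui_claimed: uiClaimed, db_actual: dbValue } };
}

// Assumption: a coarse keyword guard, in the spirit of the error message
// ("blocks common destructive forms, not a hard read-only guarantee").
function isReadOnlyQuery(sql: string): boolean {
  const trimmed = sql.trim().replace(/;\s*$/, ''); // tolerate one trailing semicolon
  if (!/^(select|with)\b/i.test(trimmed)) return false; // must start with SELECT or WITH
  if (trimmed.includes(';')) return false; // no chained statements
  // Reject write/DDL keywords anywhere, which also catches DML hidden inside a CTE.
  if (/\b(insert|update|delete|merge|drop|alter|truncate|create|grant|revoke)\b/i.test(trimmed)) return false;
  if (/\bselect\b[\s\S]*\binto\b/i.test(trimmed)) return false; // SELECT ... INTO creates a table
  return true;
}
```

A keyword scan like this errs on the side of rejecting: a harmless read such as `select 'delete'` is refused because the word appears in a string literal. Running the probe under a read-only database role would give a stronger guarantee than any text-level check.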