chore: drop command-line-args/usage and @swc/helpers from runtime deps#441
chore: drop command-line-args/usage and @swc/helpers from runtime deps#441maorleger wants to merge 2 commits into
Conversation
7669f4e to
9cb1479
Compare
|
Long term, separating out the CLI into a separate package will allow further removals but that seems a larger undertaking. This feels like a good compromise as it removes some of the heftier packages and their transitive dependencies Remaining:
|
|
I just noticed |
There was a problem hiding this comment.
Pull request overview
This PR reduces apache-arrow’s runtime dependency footprint by removing command-line-args/command-line-usage from dependencies and moving @swc/helpers to devDependencies, replacing the CLI parsing/usage formatting with a small internal wrapper around node:util.parseArgs.
Changes:
- Introduce
src/bin/cli.ts(parseCliArgs,formatUsage) to replace external CLI deps while preserving greedymultiple: trueparsing behavior. - Migrate
arrow2csvand dev scripts to the new CLI helper /parseArgs, and add unit tests for the helper. - Update
package.json/package-lock.jsonto dropcommand-line-args/command-line-usage(and their types) from runtime deps and mark@swc/helpersas dev-only.
Reviewed changes
Copilot reviewed 2 out of 8 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
src/bin/cli.ts |
Adds internal parseArgs wrapper and plain-text usage formatter replacing external CLI dependencies. |
test/unit/bin/cli-tests.ts |
Adds unit tests covering greedy multi-flag parsing, unknown-flag behavior, and usage formatting. |
src/bin/arrow2csv.ts |
Switches arrow2csv to the internal CLI helper for arg parsing and usage output. |
bin/json-to-arrow.ts |
Migrates dev script arg parsing and usage rendering to the internal helper. |
bin/integration.ts |
Migrates dev script arg parsing and usage rendering to the internal helper. |
gulp/argv.js |
Replaces command-line-args with node:util.parseArgs for gulp task argument handling. |
package.json |
Removes CLI libs from runtime deps and moves @swc/helpers to devDependencies. |
package-lock.json |
Reflects dependency graph changes (CLI libs removed; @swc/helpers marked dev). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
I noticed that everything we list under root `dependencies` gets propagated verbatim into every published per-target package.json via `gulp/package-task.js`, so each entry is something every downstream `npm install apache-arrow` has to pull in. This PR trims that list from 9 down to 4 (`@types/node`, `flatbuffers`, `json-with-bigint`, `tslib` stay). Two pieces here: Moving to devDependencies. The published library and CLI are both compiled by TSC (`gulp-typescript` + `tsconfig.bin.cjs.json`), and `importHelpers: true` in our tsconfig means the emitted code imports from `tslib`, never `@swc/helpers` — `grep -r '@swc/helpers' targets/` on a freshly built tree returns zero matches in any `.js`/`.mjs`/`.d.ts`. The only path that touches SWC at all is the dev-time bin shebangs that go through `@swc-node/register/esm-register`, and that path keeps working with `@swc/helpers` in devDeps (verified by removing it from `dependencies`, blowing away node_modules + the lockfile, reinstalling, and running both the test suite and `bin/file-to-stream.ts` end-to-end). I considered deleting it entirely since `@swc-node/register` indirectly pulls it in, but keeping it as an explicit devDep felt safer in case upstream changes that. Per the [SWC docs](https://swc.rs/docs/configuration/compilation#jscexternalhelpers), `externalHelpers` is a bundle-size knob — it doesn't enable any features — and most of our source needs zero helper emissions anyway because the targets are modern enough that `async`/`await`/`class extends` compile to themselves. The only piece of the published package that actually needs an argv parser is the `arrow2csv` CLI. `node:util.parseArgs` has been stable since Node 18.3, which is well below our `engines.node: '>=20.0'` floor, so we don't strictly need the two libraries (plus their transitive trees of `chalk`, `lodash`, `table-layout`, etc.) to be installed alongside `apache-arrow` for every consumer. This PR adds a small internal helper at `src/bin/cli.ts` that wraps `parseArgs` and provides a plain-text `formatUsage`. The one tricky bit was that `command-line-args` treats `multiple: true` greedily — `-s a b c` puts all three into `schema` — while `parseArgs` only honours the repeated-flag form. I wanted to keep `arrow2csv` behaviour identical, so the helper opts in to `tokens: true` and reattributes positionals that follow a `multiple: true` flag back to that flag's array. The unit tests in `test/unit/bin/cli-tests.ts` lock in the greedy and non-greedy cases side-by-side. The dev-only `bin/integration.ts`, `bin/json-to-arrow.ts`, and `gulp/argv.js` get migrated over in the same pass. The new helper is intentionally not re-exported from `src/Arrow.ts`, so it isn't public API. A few small things that `command-line-args` and `command-line-usage` support that this helper doesn't (none currently used by anything in `arrow2csv`): - Custom `type: (val) => ...` coercion functions - `lazyMultiple`, `defaultOption`, `group`, `camelCase`, `stopAtFirstUnknown` - ANSI styling (`{bold ...}` / `{underline ...}` markers are stripped to plain text rather than coloured) - Text wrapping of descriptions and `typeLabel` rendering in the help screen — the `typeLabel` field is accepted on the spec but not rendered. None of our help text is wide enough to need wrapping. Happy to add any of these back if reviewers prefer; my read is that the unused-feature surface isn't worth re-implementing. Validation: - `npm run lint:ci` clean - `npm test -- -t src` passes (46 suites, ~11.7k tests, plus the new cli-tests.ts) - `npm run build` succeeds for all 10 target/format combos - `npm run test:bundle` (esbuild + rollup + webpack) all run clean - End-to-end smoke tests of `node targets/apache-arrow/bin/arrow2csv.js` pass for `--help`, `-f <file>`, greedy `-s id name -f <file>`, and `--sep ' , '`
9cb1479 to
4fc1621
Compare
Sure thing, added! |
| // Rebuild the `_unknown` array that command-line-args' `{ partial: true }` mode | ||
| // used to produce. gulp/test-task.js forwards _unknown to Jest, so any flag or | ||
| // positional we don't recognise here (e.g. `--testNamePattern foo`, `--bail`) | ||
| // needs to round-trip through with its original raw form preserved. parseArgs | ||
| // only flags positionals as such; unknown options land in `values` and need to | ||
| // be reconstructed from the token stream. | ||
| const knownNames = new Set(Object.keys(options)); | ||
| const unknown = []; | ||
| for (const tok of tokens) { | ||
| if (tok.kind === 'positional') { | ||
| unknown.push(tok.value); | ||
| } else if (tok.kind === 'option' && !knownNames.has(tok.name)) { | ||
| unknown.push(tok.rawName); | ||
| // `--foo=bar` parses to a single token with `inlineValue: true`; the | ||
| // separate-arg form `--foo bar` puts `bar` in a following positional | ||
| // token, which we'll pick up on the next loop iteration. | ||
| if (tok.inlineValue) unknown.push(tok.value); | ||
| delete values[tok.name]; | ||
| } |
There was a problem hiding this comment.
Yeah, it's one of those things that's technically accurate but (IIUC) largely meaningless in practice.
Here's my understanding, correct me if I'm wrong:
gulp/argv.js is dev-only, and _unknown only has one reader (test-task.js) - all that reader does is forward it to jest. Jest treats --foo=bar and --foo bar identically (I used --testNamePattern argument to test this), so the two forms give exactly the same outcome.
So I think a change here is not necessary, but I can change the code comment a bit to something like:
- // needs to round-trip through with its original raw form preserved.
+ // needs to round-trip through in a form Jest will acceptWhich would more accurately describe the contract
If you feel strongly that I should fix this or if I misunderstood something, let me know, but it doesn't seem necessary. I'll defer to y'all as owners 🙇
This PR trims the list of runtime dependencies from 9 down to 4
(
@types/node,flatbuffers,json-with-bigint,tslibstay).Two pieces here:
@swc/helpers
Moving to devDependencies. The published library and CLI are both
compiled by TSC (
gulp-typescript+tsconfig.bin.cjs.json), andimportHelpers: truein our tsconfig means the emitted code importsfrom
tslib, never@swc/helpers-grep -r '@swc/helpers' targets/on a freshly built tree returns zero matches in any
.js/.mjs/.d.ts.The only path that touches SWC at all is the dev-time bin shebangs that
go through
@swc-node/register/esm-register, and that path keepsworking with
@swc/helpersin devDeps (verified by removing it fromdependencies, blowing away node_modules + the lockfile, reinstalling,and running both the test suite and
bin/file-to-stream.tsend-to-end).I considered deleting it entirely since
@swc-node/registerindirectlypulls it in, but keeping it as an explicit devDep felt safer in case
upstream changes that.
command-line-args and command-line-usage
The only piece of the published package that actually needs an argv
parser is the
arrow2csvCLI.node:util.parseArgshas been stablesince Node 18.3, which is well below the min version specified here,
so we don't strictly need the two libraries (plus their
transitive trees of
chalk,lodash,table-layout, etc.) to beinstalled alongside
apache-arrowfor every consumer.This PR adds a small internal helper at
src/bin/cli.tsthat wrapsparseArgsand provides a plain-textformatUsage.Callout
The one tricky bit was that
command-line-argstreatsmultiple: truegreedily --s a b cputs all three intoschema- whileparseArgsonlyhonors the repeated-flag form. I wanted to keep
arrow2csvbehavioridentical, so the helper opts in to
tokens: trueand reattributespositionals that follow a
multiple: trueflag back to that flag'sarray. The unit tests in
test/unit/bin/cli-tests.tslock in thegreedy and non-greedy cases side-by-side. The dev-only
bin/integration.ts,bin/json-to-arrow.ts, andgulp/argv.jsget migrated over in thesame pass.
We could significantly simplify the cli.ts file if we can forgo that requirement,
but I opted to keep full compat for now.
A few small things that
command-line-argsandcommand-line-usagesupport that this helper doesn't (none currently used by anything in
arrow2csv):type: (val) => ...coercion functionslazyMultiple,defaultOption,group,camelCase,stopAtFirstUnknown{bold ...}/{underline ...}markers are strippedto plain text rather than colored)
typeLabelrendering in the helpscreen - the
typeLabelfield is accepted on the spec but notrendered. None of our help text is wide enough to need wrapping.
Happy to add any of these back if reviewers prefer; my read is that
the unused-feature surface isn't worth re-implementing.
Validation:
npm run lint:cicleannpm test -- -t srcpasses (46 suites, ~11.7k tests, plus the newcli-tests.ts)
npm run buildsucceeds for all 10 target/format combosnpm run test:bundle(esbuild + rollup + webpack) all run cleannode targets/apache-arrow/bin/arrow2csv.jspass for
--help,-f <file>, greedy-s id name -f <file>,and
--sep ' , 'Resolves #55