Skip to content

chore: drop command-line-args/usage and @swc/helpers from runtime deps#441

Open
maorleger wants to merge 1 commit into
apache:mainfrom
maorleger:chore/reduce-runtime-dependencies
Open

chore: drop command-line-args/usage and @swc/helpers from runtime deps#441
maorleger wants to merge 1 commit into
apache:mainfrom
maorleger:chore/reduce-runtime-dependencies

Conversation

@maorleger
Copy link
Copy Markdown

@maorleger maorleger commented Jun 4, 2026

This PR trims the list of runtime dependencies from 9 down to 4
(@types/node, flatbuffers, json-with-bigint, tslib stay).

Two pieces here:

@swc/helpers

Moving to devDependencies. The published library and CLI are both
compiled by TSC (gulp-typescript + tsconfig.bin.cjs.json), and
importHelpers: true in our tsconfig means the emitted code imports
from tslib, never @swc/helpers - grep -r '@swc/helpers' targets/
on a freshly built tree returns zero matches in any .js/.mjs/.d.ts.

The only path that touches SWC at all is the dev-time bin shebangs that
go through @swc-node/register/esm-register, and that path keeps
working with @swc/helpers in devDeps (verified by removing it from
dependencies, blowing away node_modules + the lockfile, reinstalling,
and running both the test suite and bin/file-to-stream.ts end-to-end).

I considered deleting it entirely since @swc-node/register indirectly
pulls it in, but keeping it as an explicit devDep felt safer in case
upstream changes that.

command-line-args and command-line-usage

The only piece of the published package that actually needs an argv
parser is the arrow2csv CLI. node:util.parseArgs has been stable
since Node 18.3, which is well below the min version specified here,
so we don't strictly need the two libraries (plus their
transitive trees of chalk, lodash, table-layout, etc.) to be
installed alongside apache-arrow for every consumer.

This PR adds a small internal helper at src/bin/cli.ts that wraps
parseArgs and provides a plain-text formatUsage.

Callout

The one tricky bit was that command-line-args treats multiple: true greedily -
-s a b c puts all three into schema - while parseArgs only
honors the repeated-flag form. I wanted to keep arrow2csv behavior
identical, so the helper opts in to tokens: true and reattributes
positionals that follow a multiple: true flag back to that flag's
array. The unit tests in test/unit/bin/cli-tests.ts lock in the
greedy and non-greedy cases side-by-side. The dev-only bin/integration.ts,
bin/json-to-arrow.ts, and gulp/argv.js get migrated over in the
same pass.

We could significantly simplify the cli.ts file if we can forgo that requirement,
but I opted to keep full compat for now.

A few small things that command-line-args and command-line-usage
support that this helper doesn't (none currently used by anything in
arrow2csv):

  • Custom type: (val) => ... coercion functions
  • lazyMultiple, defaultOption, group, camelCase,
    stopAtFirstUnknown
  • ANSI styling ({bold ...} / {underline ...} markers are stripped
    to plain text rather than colored)
  • Text wrapping of descriptions and typeLabel rendering in the help
    screen - the typeLabel field is accepted on the spec but not
    rendered. None of our help text is wide enough to need wrapping.

Happy to add any of these back if reviewers prefer; my read is that
the unused-feature surface isn't worth re-implementing.

Validation:

  • npm run lint:ci clean
  • npm test -- -t src passes (46 suites, ~11.7k tests, plus the new
    cli-tests.ts)
  • npm run build succeeds for all 10 target/format combos
  • npm run test:bundle (esbuild + rollup + webpack) all run clean
  • End-to-end smoke tests of node targets/apache-arrow/bin/arrow2csv.js
    pass for --help, -f <file>, greedy -s id name -f <file>,
    and --sep ' , '

Resolves #55

I noticed that everything we list under root `dependencies` gets
propagated verbatim into every published per-target package.json via
`gulp/package-task.js`, so each entry is something every downstream
`npm install apache-arrow` has to pull in. This PR trims that list
from 9 down to 4 (`@types/node`, `flatbuffers`, `json-with-bigint`,
`tslib` stay).

Two pieces here:

Moving to devDependencies. The published library and CLI are both
compiled by TSC (`gulp-typescript` + `tsconfig.bin.cjs.json`), and
`importHelpers: true` in our tsconfig means the emitted code imports
from `tslib`, never `@swc/helpers` — `grep -r '@swc/helpers' targets/`
on a freshly built tree returns zero matches in any `.js`/`.mjs`/`.d.ts`.

The only path that touches SWC at all is the dev-time bin shebangs that
go through `@swc-node/register/esm-register`, and that path keeps
working with `@swc/helpers` in devDeps (verified by removing it from
`dependencies`, blowing away node_modules + the lockfile, reinstalling,
and running both the test suite and `bin/file-to-stream.ts` end-to-end).

I considered deleting it entirely since `@swc-node/register` indirectly
pulls it in, but keeping it as an explicit devDep felt safer in case
upstream changes that. Per the [SWC docs](https://swc.rs/docs/configuration/compilation#jscexternalhelpers),
`externalHelpers` is a bundle-size knob — it doesn't enable any
features — and most of our source needs zero helper emissions anyway
because the targets are modern enough that `async`/`await`/`class extends`
compile to themselves.

The only piece of the published package that actually needs an argv
parser is the `arrow2csv` CLI. `node:util.parseArgs` has been stable
since Node 18.3, which is well below our `engines.node: '>=20.0'`
floor, so we don't strictly need the two libraries (plus their
transitive trees of `chalk`, `lodash`, `table-layout`, etc.) to be
installed alongside `apache-arrow` for every consumer.

This PR adds a small internal helper at `src/bin/cli.ts` that wraps
`parseArgs` and provides a plain-text `formatUsage`. The one tricky
bit was that `command-line-args` treats `multiple: true` greedily —
`-s a b c` puts all three into `schema` — while `parseArgs` only
honours the repeated-flag form. I wanted to keep `arrow2csv` behaviour
identical, so the helper opts in to `tokens: true` and reattributes
positionals that follow a `multiple: true` flag back to that flag's
array. The unit tests in `test/unit/bin/cli-tests.ts` lock in the
greedy and non-greedy cases side-by-side. The dev-only `bin/integration.ts`,
`bin/json-to-arrow.ts`, and `gulp/argv.js` get migrated over in the
same pass.

The new helper is intentionally not re-exported from `src/Arrow.ts`, so
it isn't public API.

A few small things that `command-line-args` and `command-line-usage`
support that this helper doesn't (none currently used by anything in
`arrow2csv`):
- Custom `type: (val) => ...` coercion functions
- `lazyMultiple`, `defaultOption`, `group`, `camelCase`,
  `stopAtFirstUnknown`
- ANSI styling (`{bold ...}` / `{underline ...}` markers are stripped
  to plain text rather than coloured)
- Text wrapping of descriptions and `typeLabel` rendering in the help
  screen — the `typeLabel` field is accepted on the spec but not
  rendered. None of our help text is wide enough to need wrapping.

Happy to add any of these back if reviewers prefer; my read is that
the unused-feature surface isn't worth re-implementing.

Validation:
- `npm run lint:ci` clean
- `npm test -- -t src` passes (46 suites, ~11.7k tests, plus the new
  cli-tests.ts)
- `npm run build` succeeds for all 10 target/format combos
- `npm run test:bundle` (esbuild + rollup + webpack) all run clean
- End-to-end smoke tests of `node targets/apache-arrow/bin/arrow2csv.js`
  pass for `--help`, `-f <file>`, greedy `-s id name -f <file>`,
  and `--sep ' , '`
@maorleger maorleger force-pushed the chore/reduce-runtime-dependencies branch from 7669f4e to 9cb1479 Compare June 4, 2026 22:03
@maorleger maorleger marked this pull request as ready for review June 4, 2026 22:11
@maorleger
Copy link
Copy Markdown
Author

maorleger commented Jun 4, 2026

Long term, separating out the CLI into a separate package will allow further removals but that seems a larger undertaking. This feels like a good compromise as it removes some of the heftier packages and their transitive dependencies

Remaining:

  • json-with-bigint: 0 dependencies
  • flatbuffer: 0 dependencies
  • @types/node: 0 runtime cost but 1 dependency
  • tslib: 0 dependencies

@trxcllnt
Copy link
Copy Markdown
Contributor

trxcllnt commented Jun 4, 2026

I just noticed bin/arrow2csv.cjs is missing a comma here, would you mind adding that in this PR as well?

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces apache-arrow’s runtime dependency footprint by removing command-line-args/command-line-usage from dependencies and moving @swc/helpers to devDependencies, replacing the CLI parsing/usage formatting with a small internal wrapper around node:util.parseArgs.

Changes:

  • Introduce src/bin/cli.ts (parseCliArgs, formatUsage) to replace external CLI deps while preserving greedy multiple: true parsing behavior.
  • Migrate arrow2csv and dev scripts to the new CLI helper / parseArgs, and add unit tests for the helper.
  • Update package.json/package-lock.json to drop command-line-args/command-line-usage (and their types) from runtime deps and mark @swc/helpers as dev-only.

Reviewed changes

Copilot reviewed 2 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/bin/cli.ts Adds internal parseArgs wrapper and plain-text usage formatter replacing external CLI dependencies.
test/unit/bin/cli-tests.ts Adds unit tests covering greedy multi-flag parsing, unknown-flag behavior, and usage formatting.
src/bin/arrow2csv.ts Switches arrow2csv to the internal CLI helper for arg parsing and usage output.
bin/json-to-arrow.ts Migrates dev script arg parsing and usage rendering to the internal helper.
bin/integration.ts Migrates dev script arg parsing and usage rendering to the internal helper.
gulp/argv.js Replaces command-line-args with node:util.parseArgs for gulp task argument handling.
package.json Removes CLI libs from runtime deps and moves @swc/helpers to devDependencies.
package-lock.json Reflects dependency graph changes (CLI libs removed; @swc/helpers marked dev).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread gulp/argv.js
Comment on lines +18 to +41
import { parseArgs } from 'node:util';

const { values, tokens } = parseArgs({
options: {
all: { type: 'boolean' },
verbose: { type: 'boolean', short: 'v' },
target: { type: 'string', default: '' },
module: { type: 'string', default: '' },
coverage: { type: 'boolean', default: false },
tests: { type: 'string', multiple: true, default: ['test/unit/'] },
targets: { type: 'string', short: 't', multiple: true, default: [] },
modules: { type: 'string', short: 'm', multiple: true, default: [] },
},
args: process.argv.slice(2),
strict: false,
allowPositionals: true,
tokens: true,
});

// Expose unknown/positional tokens under `_unknown` for compatibility with
// the previous command-line-args partial mode used by gulp/test-task.js.
values._unknown = tokens
.filter((t) => t.kind === 'positional')
.map((t) => t.value);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maorleger is the AI correct about this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[JS] 'apache-arrow' adds a bunch of dev dependencies

3 participants