Validate module spec against JSON schema #7094

Open
pditommaso wants to merge 4 commits into master from module-validate-schema

@pditommaso
Member

Summary

  • nextflow module validate now formally validates meta.yml against the upstream module JSON schema (nextflow-io/schemas/main/module/v1/schema.json) using com.networknt:json-schema-validator.
  • Schema validation runs between the structural check and the Nextflow-specific spec check; if it fails, validation stops there.
  • Hand-coded spec checks that the schema already covers (name/description required, per-param type/description required, TODO_TYPE placeholder) are removed — the schema is the single source of truth for those rules. Checks the current schema does not cover stay in ModuleSpec.validate(): version/license required, semver pattern, namespace/name regex, and the TODO_DESCRIPTION placeholder.
  • New --schema <url-or-path> flag on nextflow module validate to override the default schema location. Accepts an http(s):// URL, a file: URI, or a local file path.
  • Schema load failures abort with a clear message pointing the user at --schema as the escape hatch.

A follow-up PR against nextflow-io/schemas will tighten the schema (require version/license, add the semver and namespace/name patterns, reject TODO_DESCRIPTION) so the remaining hand-coded checks can also be removed.
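
The stage ordering described in the summary (structural check, then schema validation, then the Nextflow-specific spec check, stopping at the first stage that fails) can be sketched as follows. This is a minimal illustration using only the JDK; `StagedValidation` and the stage labels are hypothetical names, not the classes touched by this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch of staged validation; not the actual Nextflow code. */
class StagedValidation {

    /**
     * Runs the stages in the map's iteration order (use a LinkedHashMap) and
     * returns the errors of the first failing stage, prefixed with its label.
     * Later stages are not executed once a stage reports errors.
     */
    static List<String> run(Map<String, Supplier<List<String>>> stages) {
        for (Map.Entry<String, Supplier<List<String>>> stage : stages.entrySet()) {
            List<String> errors = stage.getValue().get();
            if (!errors.isEmpty()) {
                List<String> labelled = new ArrayList<>();
                for (String error : errors) {
                    labelled.add(stage.getKey() + ": " + error);
                }
                return labelled; // stop here; the remaining stages never run
            }
        }
        return List.of(); // every stage passed
    }
}
```

Stopping at the first failing stage keeps the report focused: a document that violates the schema would otherwise also trigger follow-on errors from the spec check.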

Test plan

  • ./gradlew :nextflow:test --tests 'nextflow.module.ModuleSpecTest' --tests 'nextflow.module.ModuleSpecFactoryTest' --tests 'nextflow.module.ModuleSchemaValidatorTest' --tests 'nextflow.cli.module.CmdModuleValidateTest' — all green
  • New ModuleSchemaValidatorTest: pass, missing required field, invalid enum value, file: URI schema, hard-fail on missing schema
  • CmdModuleValidateTest rewritten to use a hermetic local schema fixture (no network in CI) and to split the missing-fields case into schema-level vs. nextflow-only scenarios
  • Manual: nextflow module validate <module> against a real module
  • Manual: nextflow module validate <module> --schema ./local-schema.json

Add formal JSON Schema validation for `nextflow module validate`,
backed by the upstream Nextflow module schema. Schema validation runs
between structure and Nextflow-specific spec checks, and overlapping
hand-coded checks (name/description required, per-param type/description
required, TODO type placeholder) are removed in favour of the schema as
single source of truth.

A `--schema` flag accepts a remote URL, a `file:` URI, or a local path
to override the default schema location; load failures abort with a
clear error.

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
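
One way to normalise the three accepted forms of the `--schema` value (remote URL, `file:` URI, local path) into a single URI is sketched below. `SchemaLocation.resolve` is a hypothetical helper written for illustration with JDK classes only; the PR's actual resolution logic may differ, for example in how it treats relative paths.

```java
import java.net.URI;
import java.nio.file.Paths;

/** Hypothetical helper; name and behaviour are assumptions, not the PR's code. */
class SchemaLocation {

    /** Turns a --schema argument into an absolute URI. */
    static URI resolve(String location) {
        if (location == null || location.isBlank()) {
            throw new IllegalArgumentException("Schema location must not be empty");
        }
        String lower = location.toLowerCase();
        // Remote URLs and file: URIs are already URIs; use them unchanged
        if (lower.startsWith("http://") || lower.startsWith("https://") || lower.startsWith("file:")) {
            return URI.create(location);
        }
        // Anything else is treated as a local path, resolved against the working directory
        return Paths.get(location).toAbsolutePath().normalize().toUri();
    }
}
```

With this shape, a call such as `SchemaLocation.resolve("./local-schema.json")` yields a `file:` URI, so the loading code only has to deal with URIs.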
@netlify

netlify Bot commented Apr 30, 2026

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit 6027327
🔍 Latest deploy log https://app.netlify.com/projects/nextflow-docs-staging/deploys/69fa2033247b19000713ab8e

Comment thread modules/nextflow/src/main/groovy/nextflow/module/ModuleSpec.groovy Outdated
Comment thread modules/nextflow/src/main/groovy/nextflow/module/ModuleSchemaValidator.groovy Outdated
Contributor

@jorgee jorgee left a comment

In #7056 we added $schema to meta.yml, but this PR does not take it into account: it only uses DEFAULT_SCHEMA_URL or the URL provided by the user. That is not a big deal now because both have the same value, but in the future we could have meta.yml files pointing to a schema version that is not the default.

I initially thought the implementation in this PR was wrong and that we should use the schema declared in $schema. However, that could also point to a wrong schema and break the validation. @pditommaso, have you considered this? What should we do when $schema differs from DEFAULT_SCHEMA_URL: should the validation fail, or should we ignore it (or warn) and always validate against DEFAULT_SCHEMA_URL (or the user-specified schema)?
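
The options raised in this comment (ignore, warn, or fail when the declared $schema differs from the schema used for validation) could be expressed as an explicit policy. The sketch below only illustrates the choice and does not reflect a decision made in this PR; `DeclaredSchemaCheck` and `Policy` are hypothetical names.

```java
import java.util.Optional;

/** Hypothetical sketch of the mismatch options; not part of the PR. */
class DeclaredSchemaCheck {

    enum Policy { IGNORE, WARN, FAIL }

    /**
     * Compares the $schema declared in meta.yml with the schema actually used.
     * Returns a warning message for WARN, an empty Optional when there is
     * nothing to report, and throws when the policy is FAIL and the two differ.
     */
    static Optional<String> check(String declared, String effective, Policy policy) {
        // No declaration, identical values, or an IGNORE policy: nothing to report
        if (declared == null || declared.equals(effective) || policy == Policy.IGNORE) {
            return Optional.empty();
        }
        String message = "meta.yml declares $schema '" + declared
                + "' but validation uses '" + effective + "'";
        if (policy == Policy.FAIL) {
            throw new IllegalStateException(message);
        }
        return Optional.of(message); // WARN: the caller decides how to log it
    }
}
```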

pditommaso added 3 commits May 5, 2026 18:34
…sMap [ci fast]

Drop the duplicated literal so the default schema URL has a single source
of truth, per review feedback on #7094.

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>

Replace the hardcoded SpecVersion.VersionFlag.V202012 with
SpecVersionDetector.detect, so the validator follows whatever draft
the schema declares via $schema and aborts with a clear message if
the draft is missing or unsupported. Per review feedback on #7094.

Decompose validate() into self-contained helpers (parseSchema,
detectSpecVersion, buildSchema, loadMeta), each with its own
contextual error handling.

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
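
The idea behind draft detection, following the draft a schema declares via $schema and aborting when it is missing or unsupported, can be illustrated without the library. The commit relies on the library's SpecVersionDetector.detect; the sketch below does not call it and uses a hypothetical `DraftDetection` class with the published JSON Schema meta-schema identifiers.

```java
import java.util.Map;

/** Library-free illustration of draft detection; names are hypothetical. */
class DraftDetection {

    enum Draft { V4, V6, V7, V201909, V202012 }

    // Meta-schema identifiers, stored without the trailing '#' used by older drafts
    private static final Map<String, Draft> KNOWN = Map.of(
            "http://json-schema.org/draft-04/schema", Draft.V4,
            "http://json-schema.org/draft-06/schema", Draft.V6,
            "http://json-schema.org/draft-07/schema", Draft.V7,
            "https://json-schema.org/draft/2019-09/schema", Draft.V201909,
            "https://json-schema.org/draft/2020-12/schema", Draft.V202012);

    /** Maps the value of a schema's $schema keyword to a draft, or throws. */
    static Draft detect(String schemaUri) {
        if (schemaUri == null || schemaUri.isBlank()) {
            throw new IllegalArgumentException("Schema does not declare a $schema draft");
        }
        // Strip the empty fragment so both '...schema' and '...schema#' match
        String key = schemaUri.endsWith("#")
                ? schemaUri.substring(0, schemaUri.length() - 1)
                : schemaUri;
        Draft draft = KNOWN.get(key);
        if (draft == null) {
            throw new IllegalArgumentException("Unsupported JSON Schema draft: " + schemaUri);
        }
        return draft;
    }
}
```

Failing fast on a missing or unknown draft matches the behaviour described in the commit message: the user sees a specific error instead of validation silently running under the wrong draft.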