feat(integrations): unify catalog into single JSON source of truth (JS + Python)#346
Conversation
… catalog JSON asset
Add a Python package (openhands_extensions) that reads the OAuth provider
catalog from the same JSON asset as the JS package, so Python services (e.g.
the integrations-hub backend) can import the catalog instead of vendoring a
hand-maintained snapshot.
Single source of truth: integrations/oauth-provider-catalog.json, generated
from the JS authoring source by
> @openhands/extensions@0.5.0 build:integration-catalog
> node scripts/build-integration-catalog.mjs
Generated /tmp/extensions/integrations/oauth-provider-catalog.json with 52 providers and 2 default managed connectors
(scripts/build-integration-catalog.mjs). The asset is embedded into both the
JS package (exported via package.json) and the Python package (bundled via
hatchling, loaded via importlib.resources).
Python API mirrors the JS API:
- list_oauth_provider_catalog() -> listOAuthProviderCatalog()
- get_oauth_provider_registration_defaults(slug) -> getOAuthProviderRegistrationDefaults()
- default_managed_connectors() -> snapshot defaultManagedConnectors
- hubspot_mcp_* / hubspot_*_scopes -> hubspot constants
- INTEGRATION_CATALOG_SNAPSHOT -> { providers, defaultManagedConnectors }
Tests:
- tests/test_integration_catalog_in_sync.py regenerates the JSON from the JS
source via Node and asserts the checked-in asset matches (catches drift),
plus asserts the Python package reads the same asset.
- tests/test_version_alignment.py asserts package.json, pyproject.toml, and
_version.py versions stay in lock-step (release-please bumps all three).
pyproject.toml is now an installable package (hatchling build backend) while
preserving the existing uv test group. CI (tests.yml) gains a Node setup step
so the parity test can run node.
Co-authored-by: openhands <openhands@all-hands.dev>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
The sync-sdk-skill CI check regenerates skills/openhands-sdk/SKILL.md from the upstream software-agent-sdk repo. An upstream example was renamed (43_mixed_marketplace_skills -> 04_mixed_marketplace_skills) between the last main build and this PR, so the check failed on every PR opened today. Regenerating the file (script-generated, no manual edits) unblocks CI. Unrelated to the Python integrations package in this PR; included only to satisfy the repo-wide sync check. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable - The overall data flow is sensible: keep the catalog generated from the existing JS authoring source, bundle the JSON into the Python package, and add parity tests. However, two public-surface/release-automation issues need fixing before this is safe to merge.
[CRITICAL ISSUES]
- The runtime
__version__file is not actually wired into release-please, so the first release after this lands will leave the Python package version stale or force every release PR to fail the new alignment test. - The HubSpot helper names are exported as URL/scope values but implemented as callables, which is a confusing new public API and does not mirror the JS constants.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM
This adds a new installable Python package and release/versioning surface. The catalog generation and parity tests reduce data-drift risk, and the newhatchlingbuild dependency is older than 7 days, but release automation and public API semantics need correction before merge.
VERDICT:
❌ Needs rework: Fix the release-please version bump wiring and clarify the HubSpot exported API before merging.
KEY INSIGHT:
The architecture is right, but the new Python package must have release automation and exported symbols that are as deterministic as the catalog data it serves.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.- Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run
/iterateto automatically drive this PR through CI, review, and QA until it's merge-ready.Was this review helpful? React with 👍 or 👎 to give feedback.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
Re-running build:skills so skills/index.js reflects the regenerated openhands-sdk SKILL.md (required by test_skills_catalog). Unrelated to the Python integrations package; included to keep the repo-wide sync checks green. Co-authored-by: openhands <openhands@all-hands.dev>
1. _version.py: derive __version__ from installed package metadata (importlib.metadata) so pyproject.toml is the single source of truth, and add python/openhands_extensions/_version.py to release-please extra-files so the fallback string stays in lock-step too. Avoids release-please leaving the runtime version stale. 2. HubSpot exports: expose module-level string/list constants (HUBSPOT_MCP_SERVER_URL, HUBSPOT_MCP_AUTHORIZATION_URL, HUBSPOT_MCP_TOKEN_URL, HUBSPOT_REQUIRED_SCOPES, HUBSPOT_OPTIONAL_SCOPES) instead of callables, to match the JS constant semantics and avoid the function-vs-value confusion. Add tests guarding the constants match the asset and are values (not callables). Co-authored-by: openhands <openhands@all-hands.dev>
|
Addressed both review comments in c1f77b0 (version now metadata-derived + release-please extra-files; HubSpot exports are now module-level constants). Threads resolved. Ready for another look on this SHA. |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Summary
🟡 Acceptable, but one parity gap blocks merging.
The split-of-authoring/build/runtime story is the right shape: JS as authoring source, generated JSON as the single asset, Python package as a thin consumer that embeds the asset via importlib.resources. Both prior review threads (_version.py in release-please extra-files; HubSpot exports as plain values, not callables) are fully addressed, and the new parity test (test_integration_catalog_in_sync.py) is a strong regression guard against JS-asset drift.
One real bug remains: HUBSPOT_REQUIRED_SCOPES and HUBSPOT_OPTIONAL_SCOPES are empty lists at runtime, but the README documents them as mirroring the JS hubspotRequiredScopes / hubspotOptionalScopes, which carry real values. The root cause is that the JS authoring source keeps the HubSpot scopes as top-level exports rather than embedding them in registrationDefaults, so the generated JSON never carries them and the Python layer reads them as []. The new test_hubspot_constants_match_asset only asserts the Python constants match the (also empty) asset, so it passes while parity is broken.
Also: list_oauth_provider_catalog() returns the same cached dict on every call (functools.lru_cache(maxsize=1)), so a caller that mutates the result would corrupt the cache for everyone else. Worth either returning a copy or documenting the contract. Minor — not blocking on its own, but cheap to fix while you're touching this file.
Risk assessment
🟡 MEDIUM. Introduces a new public API that silently misrepresents itself (empty scopes where the docstring / README promise the actual scopes). Downstream consumers that follow the README and request no scopes will authenticate to HubSpot's MCP server without the documented oauth + crm.objects.contacts.read minimums — likely a silent loss of capability rather than an auth failure. Not a security issue; a correctness one.
Files reviewed
python/openhands_extensions/__init__.py,python/openhands_extensions/integrations.py,python/openhands_extensions/_version.pypyproject.toml(renamedextensions→openhands-extensions, hatchling backend,python>=3.12)integrations/oauth-provider-catalog.json(52 providers, 2defaultManagedConnectors)scripts/build-integration-catalog.mjs,package.json(build:integration-catalog,buildaggregate)release-please-config.json(_version.pycorrectly added toextra-files).github/workflows/tests.yml(Node.js 20 setup step for the catalog build)tests/test_integration_catalog_in_sync.py,tests/test_version_alignment.py- README Python section, regenerated
skills/index.jsandskills/openhands-sdk/SKILL.md
Verdict
❌ Needs rework. Fix the HubSpot scopes parity gap (preferred: embed the scopes in registrationDefaults so the JSON asset carries them; alternative: drop the empty Python constants until they can be populated) before merging. The shape of the package is sound — just one mismatch to close.
Key insight
Single-source-of-truth only works when every distinct constant the consumers need lives inside the serialized blob. Top-level JS exports that don't make it into the JSON silently disappear on the Python side; the parity test has to assert against the JS source (or against a JSON that includes them) to catch that.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
🔴 Taste Rating: Needs improvement - the package shape is close, but two generated/public API edges need fixing before this can safely merge.
[CRITICAL ISSUES]
- [python/openhands_extensions/integrations.py, Lines 22-32] Public API correctness:
__all__still advertises lowercase HubSpot names that no longer exist, sofrom openhands_extensions.integrations import *raises at import time. - [scripts/build-integration-catalog.mjs, Line 31] Single source of truth drift: the build script regenerates only the root JSON asset while the Python runtime reads a second committed copy under
python/openhands_extensions/; future catalog edits can pass the documented build command and still leave the wheel asset stale.
[TESTING GAPS]
- Add a smoke test for
from openhands_extensions.integrations import *so the exported public surface cannot drift from the actual symbols again. - Add coverage that the documented catalog build path updates every committed/bundled asset that the Python package will ship, or remove the duplicated asset and truly bundle the root JSON.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM
This introduces a new published Python package and a cross-language generated catalog asset. The blast radius is moderate because downstream services will consume this as a public API and packaging/build drift would ship stale integration data. The new build dependency minimum (hatchling>=1.21) points to an old, established release, so I do not see a 7-day supply-chain hold issue.
VERDICT:
❌ Needs rework: Fix the broken module export list and make the catalog generation/bundling path actually maintain one generated asset for both JS and Python consumers.
KEY INSIGHT:
The core idea is sound, but the implementation still has two sources of truth at the exact boundary this PR is trying to eliminate.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.- Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run
/iterateto automatically drive this PR through CI, review, and QA until it's merge-ready.Was this review helpful? React with 👍 or 👎 to give feedback.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
Address all-hands-bot review on c1f77b0: - Remove all HubSpot-specific constants (HUBSPOT_MCP_*, HUBSPOT_*_SCOPES) from the Python public API and __init__ re-exports. The package now exposes only the generic, provider-agnostic catalog accessors that mirror the JS package (list_oauth_provider_catalog, get_oauth_provider_registration_defaults) plus the snapshot helpers. This eliminates the misleading empty-scope constants the bot flagged. - Fix __all__ in integrations.py: it still listed the old lowercase hubspot_* names, so `from openhands_extensions.integrations import *` raised AttributeError. __all__ now matches the exported symbols. - Cached accessors (list_oauth_provider_catalog, get_oauth_provider_registration_defaults, default_managed_connectors) now return independent deep copies so a caller mutating a returned value cannot corrupt the shared lru_cache for other callers. - Build script now writes BOTH asset locations (integrations/ and python/openhands_extensions/) from a single generation pass, so there is one source and one command. Previously it wrote only the root asset, leaving a second committed copy in the wheel that could drift. - Add test_python_asset_is_byte_identical_to_root_asset (drift guard) and test_accessors_return_independent_copies (mutability guard); drop the now-obsolete hubspot-constant tests. Co-authored-by: openhands <openhands@all-hands.dev>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable — Clean design, all CI green, previous review feedback addressed. A few remaining improvement opportunities before this graduates from draft.
Summary of the change: Adds a openhands_extensions Python package that embeds integrations/oauth-provider-catalog.json and exposes the same catalog API as the JS package, so Python services can pip install and avoid maintaining their own snapshot. The single-source-of-truth design is sound, the importlib.resources + lru_cache approach is correct, and the test suite is solid.
What was fixed from previous reviews (all addressed in 2d3d6aa): ✅ HubSpot constants removed from public API, ✅ __all__ in integrations.py corrected, ✅ build script now writes both file locations, ✅ accessor functions return deepcopy so callers get independent copies.
Remaining issues (see inline comments):
INTEGRATION_CATALOG_SNAPSHOTis a direct reference to the LRU cache — mutating it corrupts the cache for all callers. Accessor functions are safely isolated but the snapshot itself isn't. Either expose a deepcopy or extend the test.test_get_oauth_provider_registration_defaults_round_tripusesprovider["registrationDefaults"](dict key access) — will KeyError if any future provider omits that field. Use.get()for consistency with the implementation.- PR description still mentions
hubspot_mcp_server_url/hubspot_required_scopesetc. that no longer exist. README.md is correct; update the PR body.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW
Purely additive change — new Python sub-package alongside the existing JS package. No modifications to existing public JS APIs. JSON asset is read-only at runtime. All CI checks pass.
VERDICT:
✅ Worth merging once the draft is lifted: core logic is sound. The INTEGRATION_CATALOG_SNAPSHOT mutability and test robustness points should be addressed before promoting to ready-for-review.
KEY INSIGHT: INTEGRATION_CATALOG_SNAPSHOT exposes the live LRU-cache dict directly — the isolation story for callers is complete at the accessor level but not at the module-level constant.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.- Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run
/iterateto automatically drive this PR through CI, review, and QA until it's merge-ready.Was this review helpful? React with 👍 or 👎 to give feedback.
Folds extensions#345 into this PR and generalizes it per the project's "no provider-specific code" direction. TS and Python now expose an exactly parallel, provider-agnostic API. JS (integrations): - Remove all hubspot* named exports: hubspotMcpServerUrl, hubspotMcpAuthorizationUrl, hubspotMcpTokenUrl, hubspotRequiredScopes, hubspotOptionalScopes (and the #345-added hubspotLegacyScopeBundle / hubspotLegacyScopeBundleWithoutOauth / hubspotManagedConnectorMigration consts, which are no longer needed). The hubspot MCP server URL is inlined into officialManagedMcpServerUrls and the registrationDefaults authorizationUrl/tokenUrl literals. - Declare the managed-connector migration descriptor + errorHints as INLINE DATA on the hubspot registrationDefaults entry (the generic managedConnectorMigration / errorHints fields), exactly the shape the integrations hub consumes (canonicalServerUrl + legacyScopeBundles. {required,optional,union,unionWithoutOauth}). No hubspot-named module symbols remain. - index.js / index.d.ts: drop the hubspot* re-exports/declarations; add the generic errorHints / managedConnectorMigration fields and the ManagedConnectorMigration interface (so any provider can declare one). Tests: - Replace #345's hubspot-specific test (which imported hubspot* symbols) with a provider-agnostic invariant test: every provider declaring a managedConnectorMigration has a well-formed descriptor (https canonical URL, all four bundles as string arrays, union == required+optional, unionWithoutOauth == union minus "oauth", errorHints declared). Asset: - Regenerated integrations/oauth-provider-catalog.json and the Python copy (byte-identical, single build pass). The hubspot entry now carries the migration descriptor + errorHints; scopes stays [] to preserve the hub's legacy-detection semantics. Supersedes extensions#345. The integrations-hub (#274) reads this via the generic accessors only — it never imported the removed hubspot* symbols. Co-authored-by: openhands <openhands@all-hands.dev>
Address all-hands-bot review on 769f0c7: - INTEGRATION_CATALOG_SNAPSHOT is now a copy.deepcopy of the cached snapshot at module load, so mutating it cannot corrupt the shared cache backing list_oauth_provider_catalog / get_oauth_provider_registration_defaults / default_managed_connectors. - test_accessors_return_independent_copies extended to assert mutating INTEGRATION_CATALOG_SNAPSHOT does not leak into the accessors. - test_get_oauth_provider_registration_defaults_round_trip now uses provider.get('registrationDefaults') to match the implementation's .get() (returns None when absent), guarding future catalog additions. Co-authored-by: openhands <openhands@all-hands.dev>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
Co-authored-by: openhands <openhands@all-hands.dev>
251ca21 to
33c54a1
Compare
…s into integrations The integration catalog is now a single hand-authored JSON asset (integrations/integration-catalog.json), not generated from any .mjs/.js source. Both the JS and Python packages read it at runtime. Changes: - Merge the top-level providers[] array into each integration as a minimal oauthProvider override (null when the integration has no OAuth provider). Only fields whose OAuth-context values differ from the integration's connector view (description, docsUrl, popularityRank) are stored; slug==id, availability==catalogStatus, and the rest were pure duplication. - Drop derived fields from the asset (supportsMcp/supportsOauth, providers[], defaultManagedConnectors) and derive them at runtime in index.js / integrations.py. Verified the reconstructed providers[] and defaultManagedConnectors are byte-identical to the previous output (0/52 provider mismatches; DMC content+order preserved via defaultManagedConnectorSlugs). - Remove the mjs/js authoring sources + build script (integration-catalog-source.mjs, oauth-provider-catalog-source.mjs, oauth-provider-registration-defaults-source.js, scripts/build-integration-catalog.mjs) and the build:integration-catalog npm script. - Create integrations/catalog/<id>.json for all 82 integrations (adds the 12 oauth-only entries that previously had no file); a CI parity test asserts every catalog file equals its master entry. - Update index.d.ts (oauthProvider type, derived flags optional), the Python module, README/integrations README, and tests. - JS<->Python parity confirmed across integrations, providers, defaultManagedConnectors, filters, and all 52 registration-defaults lookups. Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Summary
This PR makes
integrations/catalog/<id>.jsonthe single hand-authored source of truth for integration catalog data.There is no longer any
integrations/integration-catalog.jsonaggregate asset. The JS package uses a generated static import index over the individual JSON files, and the Python wheel packages/reads those same individual JSON files directly. Provider-specific knowledge lives only in the service's one JSON file, not runtime code or separate provider files.What changed
integrations/catalog/*.jsonis the only manual place to add or edit an integration.integrations/integration-catalog.jsonpython/openhands_extensions/integration-catalog.jsonintegrations/catalog-index.jsfor JS static imports, including an auto-generated warning header with the regeneration command.integrations/catalog/*.jsonas wheel data and reads those individual files directly.listIntegrationCatalog({ mcp, oauth })/getIntegrationCatalogEntrylist_integration_catalog(mcp=, oauth=)/get_integration_catalog_entryavailabilityandruntimeAvailability; every catalog entry with connection options is treated as available.Validation
npm run build:integrationsuv run pytest tests/test_integration_catalog_in_sync.py -q→13 passeduv run pytest -q→359 passed, and wheel inspection confirmed 82 individual catalog JSON files with 0 aggregateintegration-catalog.jsonfiles.57c7f72and validated:Related PRs
This PR description was updated by an AI agent (OpenHands) on behalf of the user.