Skip to content

Python: add agent-framework-hosting-activity-protocol channel#5641

Open
eavanvalkenburg wants to merge 6 commits intomicrosoft:feature/python-hostingfrom
eavanvalkenburg:feat/hosting-channel-activity-protocol
Open

Python: add agent-framework-hosting-activity-protocol channel#5641
eavanvalkenburg wants to merge 6 commits intomicrosoft:feature/python-hostingfrom
eavanvalkenburg:feat/hosting-channel-activity-protocol

Conversation

@eavanvalkenburg
Copy link
Copy Markdown
Member

Motivation and Context

Implements the Activity Protocol channel described in SPEC-002 §7 (merged via #5549). This is the lower-level Bot Service / Activity Protocol surface — it covers any Activity-Protocol-speaking client (Teams, Direct Line, Web Chat, Slack via Bot Service, etc.).

A separate PR-5b adds a higher-level agent-framework-hosting-teams package built on the microsoft-teams-apps SDK for Teams-native affordances (Adaptive Cards, Citations, streaming via ctx.stream).

Description

Adds agent-framework-hosting-activity-protocol (python/packages/hosting-activity-protocol/):

  • ActivityProtocolChannel — mounts POST /activity (configurable), validates Bot Framework JWT, decodes Activity Protocol payloads into ChannelRequest, replies as outbound activities.
  • Token validation + identity derivation from the Activity sender.
  • Tests for channel wiring, JWT path, and request/response shaping.

This is the rename target of the previously-named hosting-teams package — the directory rename frees hosting-teams for the Teams-SDK-based channel in PR-5b. TeamsChannelActivityProtocolChannel, name="teams""activity", default mount /teams/activity.

Stack

PR-5a of 9. Depends on #PR-2 (feat/hosting-core). Independent of PR-5b — the two Teams-related packages are intentionally separate (different audiences, see PR-5b for the comparison table).

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? No — new package (the previous hosting-teams package was never released).

@moonbox3 moonbox3 added documentation Improvements or additions to documentation python labels May 5, 2026
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 2 | Confidence: 77%

✓ Test Coverage

The activity-protocol channel's most complex code path — _stream_to_conversation (~60 lines of concurrent async task logic with rate-limited edits) — has zero test coverage. The _make_teams helper accepts a stream parameter but no test ever passes stream=True, and the _FakeAgent fixture always returns a plain coroutine rather than a ResponseStream, meaning streaming cannot be exercised with the current test infrastructure. The run_hook and stream_transform_hook extensibility points are also untested. Non-streaming paths have reasonable coverage with meaningful assertions.

✓ Design Approach

No actionable issues found in this dimension.


Automated review by eavanvalkenburg's agents

Comment thread python/packages/hosting-activity-protocol/tests/test_channel.py
@eavanvalkenburg eavanvalkenburg force-pushed the feat/hosting-channel-activity-protocol branch from 6ace616 to 1a97b16 Compare May 5, 2026 09:00
New ``agent-framework-hosting`` package implementing ADR 0026 / SPEC-002:
the channel-neutral host that lets a single ``Agent`` (or ``Workflow``)
fan out across multiple wire protocols ("channels") behind one Starlette
ASGI app.

Surface (re-exported from ``agent_framework_hosting``):

- ``AgentFrameworkHost`` — wraps a hostable target, mounts channels onto
  an ASGI app, owns per-isolation-key ``AgentSession`` reuse, threads
  request context (``response_id`` / ``previous_response_id``) into
  context providers via an ``ExitStack`` of ``bind_request_context``
  calls, and exposes an opt-in Hypercorn ``serve()`` helper (extra
  ``[serve]``).
- ``Channel`` protocol + ``ChannelContribution`` — the surface a channel
  package implements (routes, lifespans, identity hooks, …).
- ``ChannelRequest`` / ``ChannelSession`` / ``ChannelIdentity`` /
  ``ChannelPush`` / ``ChannelCommand[Context]`` / ``ChannelRunHook`` /
  ``ChannelStreamTransformHook`` / ``DeliveryReport`` /
  ``HostedRunResult`` / ``ResponseTarget`` / ``ResponseTargetKind`` /
  ``apply_run_hook`` — channel-side dataclasses + helpers.
- ``IsolationKeys`` + ``ISOLATION_HEADER_USER`` / ``..._CHAT`` +
  ``get/set/reset_current_isolation_keys`` — the host's ASGI middleware
  reads the ``x-agent-{user,chat}-isolation-key`` headers off each
  inbound request and exposes them to the agent stack via a
  ``ContextVar`` so storage-side providers (e.g.
  ``FoundryHostedAgentHistoryProvider``) can apply per-tenant
  partitioning without channels having to forward anything.

Includes 45 unit tests covering the host, channel contributions,
isolation contextvar, and shared types. Registers the package in
``python/pyproject.toml`` ``[tool.uv.sources]`` and adds the matching
pyright ``executionEnvironments`` entry for tests.

Hypercorn is an optional dependency (``[serve]`` extra); the soft import
in ``serve()`` is annotated for pyright since it isn't on the default
install.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@eavanvalkenburg eavanvalkenburg force-pushed the feat/hosting-channel-activity-protocol branch from 1a97b16 to e8616b7 Compare May 5, 2026 09:10
Source-code changes
- _suppress_already_consumed: narrow contract — RuntimeError now logs
  at WARNING with exc_info; non-RuntimeError still logs at exception().
  Docstring clarifies that any non-clean teardown is observable.
- _BoundResponseStream: add aclose() and route __await__ through
  get_final_response() so the binding is always released — fixes
  contextvar leak when channels abandon the stream or use the
  await-the-stream convenience.
- Lifespan: aggregate startup/shutdown callback errors; every callback
  runs, all failures are logged with their qualname, and the first
  error is re-raised so Starlette still aborts boot.
- _build_run_kwargs: switch session-cache write to dict.setdefault so
  concurrent racers cannot orphan a session if create_session ever
  yields.
- _deliver_response: introduce DeliveryReport.failed for push outages
  vs explicit "no link" drops; an outage no longer triggers an
  originating fallback so the channel can decide degraded behaviour.

Test additions
- tests/test_isolation.py (new): full coverage of IsolationKeys, the
  contextvar helpers, header constants, and end-to-end ASGI
  middleware lift / reset / passthrough.
- tests/test_host.py: TestBindRequestContext, TestBoundResponseStream
  (aclose / __await__ / __getattr__ forwarding / double-close
  idempotency), TestWrapInputListMessages (list[Message] LAST
  precedence), TestLifespanAggregation (startup + shutdown).
- tests/test_types.py: TestApplyRunHook (sync/async/None), and
  TestDeliveryReport (new failed field).
- Updated test_push_exception_marks_skipped ->
  test_push_exception_lands_in_failed_no_fallback to match the new
  delivery contract.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread python/packages/hosting-activity-protocol/tests/test_channel.py
Comment thread python/packages/hosting-activity-protocol/tests/test_channel.py
eavanvalkenburg and others added 4 commits May 7, 2026 16:09
- Refactor workflow checkpoint restoration into shared helpers
  (_restore_workflow_checkpoint for blocking; the streaming sibling
  drains the rehydration stream) so the blocking and streaming paths
  rehydrate identically — clarifies the previously inline _maybe_restore
  by hoisting the pattern next to the blocking call site.
- Document that blocking workflow output is text-only by design;
  richer modalities ride the streaming AgentResponseUpdate channel,
  which preserves all content parts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
These review comments were filed on PR-4 (microsoft#5640) but target lines that
live in the hosting-core package (PR-2 / microsoft#5638), so the fixes land here
and PR-4's stack will pick them up on rebase.

- _suppress_already_consumed: narrow the RuntimeError catch to the two
  documented benign messages (`Inner stream not available`, `Event loop
  is closed`); any other RuntimeError now logs at ERROR with a full
  traceback so executor bugs / runner-context state errors / checkpoint
  RuntimeErrors during the post-run flush no longer masquerade as
  benign cleanup noise. Still no propagation (we're in an
  async-generator finally during teardown) — see the docstring.
- _restore_workflow_checkpoint{,_streaming}: log a WARNING when a
  non-None latest checkpoint drains to zero events, so a stale or
  partially-written checkpoint_id surfaces as an operator signal
  instead of a silent state-loss.

(The `deliver_response` "no destinations resolvable" vs "every
destination errored" concern raised in 3198268038 is already addressed
by the existing `failed` vs `skipped` distinction surfaced through
`DeliveryReport.failed` — see lines 1080-1102 and the
`DeliveryReport` docstring.)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…vityProtocolChannel

The existing Bot-Framework-via-Azure-Bot-Service channel was previously
shipped under the name ``hosting-teams`` / ``TeamsChannel``. That name
is misleading for what the channel actually does -- it speaks the Bot
Framework Activity Protocol against Azure Bot Service, which fans out
across MS Teams, Slack, Webex, Telegram-via-Bot-Service, etc., and does
not provide any Teams-specific affordances.

This PR renames the package atomically and frees the ``hosting-teams``
name for a future Teams-native channel built on
``microsoft-teams-apps`` (PR-5b, spec req microsoft#28).

Renames (all in one commit):

- Package: ``agent-framework-hosting-teams`` ->
  ``agent-framework-hosting-activity-protocol``
- Module: ``agent_framework_hosting_teams`` ->
  ``agent_framework_hosting_activity_protocol``
- Channel class: ``TeamsChannel`` -> ``ActivityProtocolChannel``
- Helper: ``teams_isolation_key`` -> ``activity_protocol_isolation_key``
  (isolation key prefix ``teams:`` -> ``activity:``)
- Channel name: ``"teams"`` -> ``"activity"``; default mount path
  ``/teams`` -> ``/activity``
- Internal helper: ``_parse_teams_activity`` -> ``_parse_activity``
- Worker task name + a couple of error strings updated for consistency

Updates README.md and the module docstring to call out:

- this is the channel-neutral Activity Protocol channel,
- it surfaces what every Bot-Service-connected channel has in common
  (text in / text out),
- a forthcoming ``agent-framework-hosting-teams`` package will layer
  Teams-specific affordances (adaptive cards, message extensions,
  dialogs, SSO, ...) on the same Bot Service transport.

Workspace: registers ``agent-framework-hosting-activity-protocol`` in
``python/pyproject.toml`` and adds the matching pyright
``executionEnvironments`` entry.

Behavior is unchanged. Pyright + mypy clean, 11 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- security (#3198327004): add `service_url_allowed_hosts` constructor
  option (default `botframework.com` + `smba.trafficmanager.net`) and
  reject inbound activities whose `serviceUrl` host falls outside it
  with HTTP 400 — without this gate a malicious caller could redirect
  outbound replies (and the attached bearer token) to an
  attacker-controlled host
- security (#3198324219): add `inbound_auth_validator` async callback;
  log a loud WARNING at startup when no validator AND no operator
  reverse-proxy is configured so the dev-mode bypass cannot
  accidentally ship to production. Document the contract: prototype
  intentionally does not ship JWT validation (out of scope); operators
  must plug a validator or terminate auth in front of the channel
- retry semantics (#3198328746): distinguish transient outbound
  failures (httpx network errors, non-2xx from Bot Service) — return
  502 so Bot Service retries — from deterministic agent failures —
  return 200 so Bot Service does not retry the same broken activity
  in a loop
- bug (#3198330424): fix the placeholder-failure deadlock. When
  `send_initial_placeholder` fails, `activity_id` stays `None`, the
  edit-worker loop exit condition (`accumulated == last_sent`) is
  unreachable while no PUT is possible, and the worker would deadlock
  on `wake.wait()` forever after `worker_done` is set. Now: skip the
  worker entirely on placeholder failure and POST a single final
  activity at the end with whatever accumulated
- tests (#3198334465, #3187178091, #3198336045): add coverage for
  - `_is_service_url_allowed` allow/deny matrix + webhook 400 on
    disallowed serviceUrl
  - `inbound_auth_validator` allow/deny/raises paths
  - outbound `Authorization: Bearer <token>` header presence in
    production mode and absence in dev mode
  - the streaming path (`_stream_to_conversation`): placeholder +
    final edit, placeholder-failure fallback (with timeout guard
    against deadlock regression), and empty-stream `(no response)`
    placeholder replacement
  - retry-signal differentiation: outbound `httpx.ConnectError` →
    502; deterministic `ValueError` from the agent → 200

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@eavanvalkenburg eavanvalkenburg force-pushed the feat/hosting-channel-activity-protocol branch from e8616b7 to f60466c Compare May 7, 2026 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants