Add retroactive OTel execution lifecycle tracing #252

Open: morgan-wowk wants to merge 1 commit into master from execution-tracing-core

@morgan-wowk (Collaborator) commented May 22, 2026

Summary

Emits a root execution span and one execution.status child span per
status history entry when an ExecutionNode reaches a terminal state. All
span timestamps are derived from the existing status history so durations
reflect actual time spent, not when this code ran.

  • New module: cloud_pipelines_backend/instrumentation/execution_tracing.py
  • Hook: metrics._handle_before_commit calls try_emit_execution_trace
  • Orchestrator: otel.setup_providers() so the exporter is active
  • Tests: InMemorySpanExporter-backed suite in tests/instrumentation/
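
To make the timestamp derivation described above concrete, here is a minimal sketch of turning a status history into span intervals. The `StatusEntry` and `SpanInterval` types, the `build_span_intervals` helper, the status names in the usage note, and the rule that each status span ends where the next entry begins are all assumptions for illustration, not code from this PR. The OTel SDK accepts explicit nanosecond timestamps (`start_time` on `start_span`, `end_time` on `end`), so intervals like these could be passed through once converted.

```python
import dataclasses
import datetime
from typing import List, Optional


@dataclasses.dataclass(frozen=True)
class StatusEntry:
    """Hypothetical shape of one status history entry."""
    status: str
    time: datetime.datetime


@dataclasses.dataclass(frozen=True)
class SpanInterval:
    """Hypothetical description of the time range a retroactive span covers."""
    name: str
    status: Optional[str]
    start: datetime.datetime
    end: datetime.datetime


def build_span_intervals(history: List[StatusEntry]) -> List[SpanInterval]:
    """Return the root interval followed by one interval per status entry."""
    if not history:
        return []
    entries = sorted(history, key=lambda entry: entry.time)
    first_time, last_time = entries[0].time, entries[-1].time
    # The root span covers the whole recorded lifecycle.
    intervals = [SpanInterval("execution", None, first_time, last_time)]
    # Assumption: a status lasts until the next entry is recorded, so the
    # terminal entry gets a zero-length interval.
    for idx, entry in enumerate(entries):
        end = entries[idx + 1].time if idx + 1 < len(entries) else last_time
        intervals.append(
            SpanInterval("execution.status", entry.status, entry.time, end)
        )
    return intervals
```

For a history of three entries (say QUEUED, RUNNING, SUCCEEDED), this yields four intervals: one root plus three children, none of which depend on when the tracing code itself runs.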

Screenshots

Screenshot 2026-05-22 at 8.09.01 PM.png

Screenshot 2026-05-22 at 8.13.33 PM.png

Screenshot 2026-05-22 at 8.14.49 PM.png

@morgan-wowk force-pushed the execution-tracing-core branch from e9fe6e0 to 059c47b on May 23, 2026 02:57
@morgan-wowk marked this pull request as ready for review on May 23, 2026 03:22
@morgan-wowk requested a review from Ark-kun as a code owner on May 23, 2026 03:22

def _ns(*, dt: datetime.datetime) -> int:
    """Return *dt* as nanoseconds since the Unix epoch (required by OTel SDK).

    Uses integer arithmetic on timedelta components to avoid float64 precision

Contributor commented:

I'm not sure timedelta.total_seconds induces loss of precision... Can you elaborate?

But not a blocker.
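
On the precision question above, a small comparison may help. This is a sketch, not the PR's implementation: `ns_exact` and `ns_via_float` are hypothetical names, and both assume a timezone-aware datetime. `timedelta.total_seconds()` keeps microsecond accuracy for intervals of this size; the loss shows up when the float is scaled to nanoseconds, because values near 1.8e18 exceed 2^53 and float64 can only represent multiples of 256 in that range.

```python
import datetime

_EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def ns_exact(*, dt: datetime.datetime) -> int:
    """Nanoseconds since the Unix epoch using only integer arithmetic."""
    # Assumes dt is timezone-aware; subtracting a naive datetime would raise.
    delta = dt - _EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


def ns_via_float(*, dt: datetime.datetime) -> int:
    """Same conversion routed through float seconds, for comparison only."""
    return int((dt - _EPOCH).total_seconds() * 1e9)


dt = datetime.datetime(
    2026, 5, 22, 20, 9, 1, 123457, tzinfo=datetime.timezone.utc
)
exact = ns_exact(dt=dt)
approx = ns_via_float(dt=dt)
# The difference stays below one microsecond but is not zero.
print(exact, approx, approx - exact)
```

The drift is well under a microsecond, so it would rarely matter in practice, but the integer version guarantees that span timestamps round-trip exactly to the microsecond values stored in the status history.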


        root.end(end_time=_ns(dt=last_time))
    except Exception:
        _logger.warning(

Contributor commented:

It's OK to do _logger.exception
