feat: otel thread ctx FFI #1915

Open
yannham wants to merge 13 commits into main from yannham/otel-thread-ctx-ffi

@yannham yannham commented Apr 23, 2026

What does this PR do?

This PR adds a basic FFI for the OTel thread-level context feature: create a new context, attach, detach, and update in place.

We also make ThreadContextRecord public, or at least exposed in the FFI. The rationale is that:

  1. its layout is imposed by the spec, so exposing it should not be a liability regarding breaking changes: we can't really touch it anyway.
  2. as mentioned in the FFI documentation, SDKs may update contexts themselves after publication, without going through libdatadog at all. In this usage mode, exporting the C struct ThreadContextRecord is a way to document its expected memory layout.
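
For SDKs that write records directly, the layout documented by the exported struct can be pinned at compile time. The following is a minimal sketch, not part of this PR: `example_record` is a local mirror of `ddog_ThreadContextRecord` from the generated header, and `EXAMPLE_MAX_ATTRS_DATA_SIZE` is a hypothetical placeholder, since the real value of `ddog_MAX_ATTRS_DATA_SIZE` is not shown here. The offsets follow from the field order, assuming an ABI where `uint16_t` has 2-byte alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; stands in for ddog_MAX_ATTRS_DATA_SIZE. */
#define EXAMPLE_MAX_ATTRS_DATA_SIZE 64

/* Local mirror of ddog_ThreadContextRecord, field for field. */
typedef struct example_record {
  uint8_t trace_id[16];
  uint8_t span_id[8];
  uint8_t valid;
  uint8_t _reserved;
  uint16_t attrs_data_size;
  uint8_t attrs_data[EXAMPLE_MAX_ATTRS_DATA_SIZE];
} example_record;

/* Compile-time checks: a build fails if a field moves. */
static_assert(offsetof(example_record, trace_id) == 0, "trace_id offset");
static_assert(offsetof(example_record, span_id) == 16, "span_id offset");
static_assert(offsetof(example_record, valid) == 24, "valid offset");
static_assert(offsetof(example_record, _reserved) == 25, "_reserved offset");
static_assert(offsetof(example_record, attrs_data_size) == 26, "attrs_data_size offset");
static_assert(offsetof(example_record, attrs_data) == 28, "attrs_data offset");

/* Size of the fixed part that precedes the packed attributes. */
static size_t example_header_size(void) {
  return offsetof(example_record, attrs_data);
}
```

A binding written in another language could apply the same offsets to its own view of the record.
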
Generated C header
// Copyright 2026-Present Datadog, Inc. https://www.datadoghq.com/
// SPDX-License-Identifier: Apache-2.0


#ifndef DDOG_OTEL_THREAD_CTX_H
#define DDOG_OTEL_THREAD_CTX_H

#pragma once

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/**
 * In-memory layout of a thread-level context.
 *
 * **CAUTION**: The structure MUST match exactly the OTel thread-level context specification.
 * It is read by external, out-of-process code. Do not re-order fields or modify in any way,
 * unless you know exactly what you're doing.
 *
 * # Synchronization
 *
 * Readers are async-signal handlers. The writer is always stopped while a reader runs.
 * Sharing memory with a signal handler still requires some form of synchronization, which is
 * achieved through atomics and compiler fences, using `valid` and/or the TLS slot as
 * synchronization points.
 *
 * - The writer stores `valid = 0` *before* modifying fields in-place, guarded by a fence.
 * - The writer stores `valid = 1` *after* all fields are populated, guarded by a fence.
 * - `valid` starts at `1` on construction and is never set to `0` except during an in-place
 *   update.
 */
typedef struct ddog_ThreadContextRecord {
  /**
   * Trace identifier; all-zeroes means "no trace".
   */
  uint8_t trace_id[16];
  /**
   * Span identifier.
   */
  uint8_t span_id[8];
  /**
   * Whether the record is ready/consistent. Always set to `1` except during in-place update
   * of the current record.
   */
  uint8_t valid;
  uint8_t _reserved;
  /**
   * Number of populated bytes in `attrs_data`.
   */
  uint16_t attrs_data_size;
  /**
   * Packed variable-length key-value records.
   *
   * It's a contiguous list of blocks with layout:
   *
   * 1. 1-byte `key_index`
   * 2. 1-byte `val_len`
   * 3. `val_len` bytes of a string value.
   *
   * # Size
   *
   * Currently, we always allocate the max recommended size. This potentially wastes a few
   * hundred bytes per thread, but it guarantees that we can modify the context in-place
   * without (re)allocation in the hot path. Having a hybrid scheme (starting smaller and
   * resizing up a few times) is not out of the question.
   */
  uint8_t attrs_data[ddog_MAX_ATTRS_DATA_SIZE];
} ddog_ThreadContextRecord;

#ifdef __cplusplus
extern "C" {
#endif // __cplusplus

/**
 * Allocate and initialise a new thread context.
 *
 * Returns a non-null owned handle that must eventually be released with
 * `ddog_otel_thread_ctx_free`.
 */
struct ddog_ThreadContextRecord *ddog_otel_thread_ctx_new(const uint8_t (*trace_id)[16],
                                                          const uint8_t (*span_id)[8],
                                                          const uint8_t (*local_root_span_id)[8]);

/**
 * Free an owned thread context.
 *
 * # Safety
 *
 * `ctx` must be a valid non-null pointer obtained from `ddog_otel_thread_ctx_new` or
 * `ddog_otel_thread_ctx_detach`, and must not be used after this call. In particular, `ctx`
 * must not be currently attached to a thread.
 */
void ddog_otel_thread_ctx_free(struct ddog_ThreadContextRecord *ctx);

/**
 * Attach `ctx` to the current thread. Returns the previously attached context if any, or null
 * otherwise.
 *
 * # Safety
 *
 * `ctx` must be a valid non-null pointer obtained from this API. Ownership of `ctx` is
 * transferred to the TLS slot: the caller must not drop `ctx` while it is still actively
 * attached.
 *
 * ## In-place update
 *
 * The preferred method to update the thread context in place is [ddog_otel_thread_ctx_update].
 *
 * If calling into native code is too costly, it is possible to update an attached context
 * directly in-memory without going through libdatadog (contexts are guaranteed to have a
 * stable address through their lifetime). **HOWEVER, IF DOING SO, PLEASE BE VERY CAUTIOUS OF
 * THE FOLLOWING POINTS**:
 *
 * 1. The update process requires a [seqlock](https://en.wikipedia.org/wiki/Seqlock)-like
 *    pattern: [ThreadContextRecord::valid] must be first set to `0` before the update and set
 *    to `1` again at the end. Additionally, depending on your language's memory model, you
 *    might need specific synchronization primitives (compiler fences, atomics, etc.), since
 *    the context can be read by an asynchronous signal handler at any point in time. See the
 *    [Otel thread context
 *    specification](https://github.com/open-telemetry/opentelemetry-specification/pull/4947)
 *    for more details.
 * 2. Only update the context from the thread it's attached to. Contexts are designed to be
 *    attached, written to and read from on the same thread (whether from signal code or
 *    program code). Thus, they are NOT thread-safe. Given the current specification, I don't
 *    think it's possible to safely update an attached context from a different thread, since
 *    the signal handler doesn't assume the context can be written to concurrently from another
 *    thread.
 */
struct ddog_ThreadContextRecord *ddog_otel_thread_ctx_attach(struct ddog_ThreadContextRecord *ctx);

/**
 * Remove the currently attached context from the TLS slot.
 *
 * Returns the detached context (caller now owns it and must release it with
 * `ddog_otel_thread_ctx_free`), or null if the slot was empty.
 */
struct ddog_ThreadContextRecord *ddog_otel_thread_ctx_detach(void);

/**
 * Update the currently attached context in-place.
 *
 * If no context is currently attached, one is created and attached, equivalent to calling
 * `ddog_otel_thread_ctx_new` followed by `ddog_otel_thread_ctx_attach`.
 */
void ddog_otel_thread_ctx_update(const uint8_t (*trace_id)[16],
                                 const uint8_t (*span_id)[8],
                                 const uint8_t (*local_root_span_id)[8]);

#ifdef __cplusplus
}  // extern "C"
#endif  // __cplusplus

#endif  /* DDOG_OTEL_THREAD_CTX_H */
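
The in-place update protocol described in the `ddog_otel_thread_ctx_attach` documentation, together with the packed `attrs_data` layout, can be sketched in C as follows. This is an illustration under stated assumptions, not the libdatadog implementation: `example_record` mirrors the header's struct, `EXAMPLE_MAX_ATTRS_DATA_SIZE` and every `example_*` function are hypothetical, and the `volatile` store combined with `atomic_signal_fence` is only one possible reading of "atomics and compiler fences" for a reader that is a signal handler on the same thread. Other readers or memory models may need stronger primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capacity; stands in for ddog_MAX_ATTRS_DATA_SIZE. */
#define EXAMPLE_MAX_ATTRS_DATA_SIZE 64

/* Local mirror of ddog_ThreadContextRecord. */
typedef struct example_record {
  uint8_t trace_id[16];
  uint8_t span_id[8];
  uint8_t valid;
  uint8_t _reserved;
  uint16_t attrs_data_size;
  uint8_t attrs_data[EXAMPLE_MAX_ATTRS_DATA_SIZE];
} example_record;

/* Append one block: key_index, val_len, then val_len value bytes.
 * Returns 0 on success, -1 if the block does not fit. */
static int example_push_attr(example_record *rec, uint8_t key_index,
                             const char *val, size_t val_len) {
  size_t used = rec->attrs_data_size;
  if (val_len > UINT8_MAX || used + 2 + val_len > EXAMPLE_MAX_ATTRS_DATA_SIZE) {
    return -1;
  }
  rec->attrs_data[used] = key_index;
  rec->attrs_data[used + 1] = (uint8_t)val_len;
  memcpy(&rec->attrs_data[used + 2], val, val_len);
  rec->attrs_data_size = (uint16_t)(used + 2 + val_len);
  return 0;
}

/* Rewrite the record from the thread it is attached to.
 * `valid` is 0 for the whole duration of the field writes. */
static int example_update_in_place(example_record *rec,
                                   const uint8_t trace_id[16],
                                   const uint8_t span_id[8],
                                   uint8_t key_index,
                                   const char *val, size_t val_len) {
  volatile uint8_t *valid = &rec->valid;
  *valid = 0;                                /* mark the record inconsistent */
  atomic_signal_fence(memory_order_seq_cst); /* keep field writes after it */
  memcpy(rec->trace_id, trace_id, 16);
  memcpy(rec->span_id, span_id, 8);
  rec->attrs_data_size = 0;                  /* drop the previous attributes */
  int rc = example_push_attr(rec, key_index, val, val_len);
  atomic_signal_fence(memory_order_seq_cst); /* keep field writes before it */
  *valid = 1;                                /* publish the record again */
  return rc;
}

/* Reader side: count attribute blocks, or return -1 when the record is
 * mid-update or its attribute area is malformed. */
static int example_count_attrs(const example_record *rec) {
  if (rec->valid != 1) return -1;
  size_t size = rec->attrs_data_size;
  if (size > EXAMPLE_MAX_ATTRS_DATA_SIZE) return -1;
  size_t pos = 0;
  int count = 0;
  while (pos < size) {
    if (size - pos < 2) return -1;           /* truncated block header */
    size_t val_len = rec->attrs_data[pos + 1];
    if (size - pos - 2 < val_len) return -1; /* truncated value */
    pos += 2 + val_len;
    count++;
  }
  return count;
}
```

The whole rewrite, attributes included, happens between the two stores to `valid`, so a reader that interrupts the writer sees either `valid == 0` or a fully populated record.
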

Motivation

OTel thread-level context was implemented in #1791 to provide better interop with the OTel eBPF profiler. The first user is expected to be dd-trace-rs, but the dotnet SDK team is interested in using it as well (and other non-Rust SDKs will eventually use it too, hence the need for an FFI).

Additional Notes

N/A

How to test the change?

There's a test that checks that the TLS symbol is properly handled. Real-world usage will be validated when integrating into dotnet (or whichever SDK uses it first).

yannham and others added 6 commits April 22, 2026 18:18
Rust's cdylib linker emits a version script with `local: *` that hides
all non-Rust symbols, preventing `custom_labels_current_set_v2` from
appearing in the dynamic symbol table. Without a dynsym entry, external
readers (e.g. the eBPF profiler) cannot locate the thread-local slot.

Add a supplementary version script with an explicit `global:` entry for
the symbol, which takes precedence over the `local: *` wildcard. Also
force lld explicitly, since merging multiple version scripts is not
supported by GNU ld.

Also adds a temporary dummy FFI wrapper around `ThreadContext::attach`
to keep the TLSDESC access live during verification.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

  • Base Branch: origin/main
  • PR Branch: origin/yannham/otel-thread-ctx-ffi

Summary by Rule

Rule Base Branch PR Branch Change

Annotation Counts by File

File Base Branch PR Branch Change

Annotation Stats by Crate

| Crate | Base Branch | PR Branch | Change |
|-------|-------------|-----------|--------|
| clippy-annotation-reporter | 5 | 5 | No change (0%) |
| datadog-ffe-ffi | 1 | 1 | No change (0%) |
| datadog-ipc | 21 | 21 | No change (0%) |
| datadog-live-debugger | 6 | 6 | No change (0%) |
| datadog-live-debugger-ffi | 10 | 10 | No change (0%) |
| datadog-profiling-replayer | 4 | 4 | No change (0%) |
| datadog-remote-config | 3 | 3 | No change (0%) |
| datadog-sidecar | 56 | 56 | No change (0%) |
| libdd-common | 10 | 10 | No change (0%) |
| libdd-common-ffi | 12 | 12 | No change (0%) |
| libdd-data-pipeline | 5 | 5 | No change (0%) |
| libdd-ddsketch | 2 | 2 | No change (0%) |
| libdd-dogstatsd-client | 1 | 1 | No change (0%) |
| libdd-profiling | 13 | 13 | No change (0%) |
| libdd-telemetry | 19 | 19 | No change (0%) |
| libdd-tinybytes | 4 | 4 | No change (0%) |
| libdd-trace-normalization | 2 | 2 | No change (0%) |
| libdd-trace-obfuscation | 8 | 8 | No change (0%) |
| libdd-trace-stats | 1 | 1 | No change (0%) |
| libdd-trace-utils | 15 | 15 | No change (0%) |
| Total | 198 | 198 | No change (0%) |

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.

datadog-datadog-prod-us1-2 Bot commented Apr 23, 2026

Tests

⚠️ Warnings

🧪 2 Tests failed

otel_thread_ctx_v1_in_dynsym from libdd-otel-thread-ctx-ffi::elf_properties
Test has failed
otel_thread_ctx_v1_tlsdesc_reloc from libdd-otel-thread-ctx-ffi::elf_properties
Test has failed

ℹ️ Info

No other issues found (see more)

❄️ No new flaky tests detected

🔗 Commit SHA: 2b39efc

@yannham yannham marked this pull request as ready for review April 23, 2026 14:48
@yannham yannham requested review from a team as code owners April 23, 2026 14:48
…mbol

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 05868a50b9

const SYMBOL: &str = "otel_thread_ctx_v1";

fn cdylib_path() -> PathBuf {
    PathBuf::from(env!("CDYLIB_PROFILE_DIR")).join("liblibdd_otel_thread_ctx_ffi.so")

P1: Point ELF tests at Cargo's deps output directory

During cargo test, Cargo emits cdylib artifacts under target/<profile>/deps, but cdylib_path() currently looks in target/<profile>. That means readelf is invoked on a non-existent file in normal test runs, so these new ELF-property tests fail even when the library is built correctly. Resolve the path from the deps directory (or otherwise discover the real artifact path) so the assertions run against the actual .so.

@yannham yannham requested a review from ivoanjo April 23, 2026 15:42
@yannham yannham force-pushed the yannham/otel-thread-ctx-ffi branch from e2bb632 to e98d81f Compare April 23, 2026 16:06
@yannham yannham force-pushed the yannham/otel-thread-ctx-ffi branch from e98d81f to 75add55 Compare April 23, 2026 16:36