Skip to content

feat: replace v1 custom labels with OTEP 4947 TLS record format#1

Open
scottgerring wants to merge 12 commits intomasterfrom
feat/otel-ctx-v1
Open

feat: replace v1 custom labels with OTEP 4947 TLS record format#1
scottgerring wants to merge 12 commits intomasterfrom
feat/otel-ctx-v1

Conversation

@scottgerring
Copy link
Copy Markdown
Owner

@scottgerring scottgerring commented Mar 31, 2026

Summary

  • Replaces v1 string-based labelset format with OTEP 4947 compact TLS record format (trace context + inline attributes)
  • Adds process context module (OTEP 4719) for cross-process context sharing via memfd/mmap
  • Adds OTel TLS Rust module with record builder, reader, and key table configuration

@scottgerring scottgerring changed the title feat: replace v1 custom labels with v2 TLS record format feat: replace v1 custom labels with OTEP 4947 TLS record format Mar 31, 2026
Comment thread src/customlabels.h
@@ -1,165 +1,83 @@
/**
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we forked originally, we grabbed this when it was still C. I've switched back to C because we don't need anything from C++ and it keeps things simpler.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C++-related incantations might come in handy if someone wants to pull this file from a C++ project (and they might not want to have to pull in a C compiler just for these bits), so I can see some usefulness in them eventually (the process context reference impl does this).

Having said that, we can re-add it any time so I think we're good with the simplification for now. (And in particular, if we want to keep it clean in both C and C++ we probably want a setup like the process context where we test-build with both, as I found that it's very easy to do a change that makes building with a C++ compiler unhappy)

Comment thread js/addon.cpp
@@ -1,281 +0,0 @@
// addon.c
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed all the js stuff as its out of scope for us

Comment thread examples/spin/Cargo.toml
@@ -0,0 +1,13 @@
[package]
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had to make this a standalone crate so we can emit the build instructions

Copy link
Copy Markdown

@yannham yannham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed by commit, so apologies of some of them are outdated/irrelevant.

Comment thread src/customlabels.c Outdated
Comment thread src/customlabels.c Outdated
Comment thread src/customlabels.c Outdated
Comment thread src/customlabels.c
) {
custom_labels_tl_record_t *old_record = otel_thread_ctx_v1;
BARRIER;
otel_thread_ctx_v1 = new_record;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this store need to be either volatile or atomic (or rather, should the otel_thread_ctx_v1 be a volatile sig_atomic_t ?) Otherwise I believe the store might be optimized away, or the signal handler might come in the middle of a non-atomic store to the variable (all of that is quite theoretical: we do read back the value in our case, so I don't think it'll be optimized away, and usize stores are atomic on all reasonable platforms, but I've grown very afraid of what advanced optimizing compilers can do 😅 )

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I don't think it'll be optimized away, and usize stores are atomic on all reasonable platforms
Famous last words and no harm in being overly cautious here I think!

@yannham : It looks like sig_atomic_t is typically a 32 bit integer? I've tried __thread _Atomic(custom_labels_tl_record_t *) otel_thread_ctx_v1 . I think this captures the spirit of your comment. Unfortunately, bindgen chokes on this, but we don't use this from the Rust side of things, so I can use the preprocessor to hide it when bindgen runs.

This was a bit more hectic than I expected; lmk what you think!

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Answered on another channel, but short answer, this works as well 🙂

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need any barrier here? Either the reader observes the write before or after; this is a pointer after all, and we document the semantics as "equivalent to a signal handler" (e.g. a thread observes its own things).

Copy link
Copy Markdown

@yannham yannham Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR: I think some barrier is required (or some ordering), at least it would be in Rust. But I agree that the second barrier is not needed.

The semantics of signal handlers is a bit different than code running of the same program running in the same thread, basically because the compiler is oblivious to it. So what could happen is something like:

    custom_labels_tl_record_t *record = malloc(...);
    record->xxx = yyy;
    // other stores...
    custom_labels_set_current_record(record);

the function (say without the barrier) then gets inlined, so the compiler sees:

    custom_labels_tl_record_t *record = malloc(...);
    record->xxx = yyy;
    // other stores...
    custom_labels_tl_record_t *old_record = otel_thread_ctx_v1;
    otel_thread_ctx_v1 = record;

Those are just a bunch of stores with no atomics around (there's actually one read for old_record, but it doesn't change the reasoning), so nothing prevents them from being re-ordered, say as:

    custom_labels_tl_record_t *record = malloc(...);
    custom_labels_tl_record_t *old_record = otel_thread_ctx_v1; // moved up
    otel_thread_ctx_v1 = record; // moved up
    record->xxx = yyy;
    // other stores...

Now if your signal handler comes in at L3 it will see an uninitialized record.

Using an atomic with relaxed order isn't sufficient because it doesn't create any synchronization. The compiler is still free to re-order stuff past or before it. However a release barrier just before setting the pointer (as the first BARRIER here), or setting otel_thread_ctx_v1 atomically with release order should prevent that. With the caveat that I don't know what's the exact semantics of C's BARRIER here, but if it is at least as strong as a C++ compiler-only release fence, this should be good.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, good call, I missed that.

In theory the reader would be able to adapt (since it's expected to use safe memory reading APIs), and when it tried to interpret the data it would probably realize it doesn't make sense. In practice it's much better if we avoid that happening altogether, and the cost of this barrier should be really low so it's definitely worth it.

Comment thread src/reader.rs Outdated
Comment thread src/writer.rs Outdated
Comment thread src/process_context/reader.rs Outdated
Comment thread src/process_context/reader.rs Outdated
Comment thread src/process_context_ext.rs Outdated
Comment thread src/process_context/reader.rs Outdated
Copy link
Copy Markdown

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a pass but didn't finish. Need to switch gears for a bit!

Comment thread .gitignore
@@ -1,4 +1,5 @@
/target
.idea
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THE FORBIDDEN DATADOG FRUIT 👀 (Sorry couldn't resist)

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True story: I won a jetbrains license at a conference last week, and immediately used it to restore my access to rust rover

Comment thread build.rs Outdated
Comment on lines +12 to +14
// dynamic-list is Linux-only
#[cfg(target_os = "linux")]
println!("cargo:rustc-link-arg=-Wl,--dynamic-list=./dlist");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is slightly weird -- isn't all of custom-labels linux-only?

Copy link
Copy Markdown
Owner Author

@scottgerring scottgerring Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it is. I thought I needed this at some point to have some modicum of LSP/build access on mac but it doesn't seem necessary anymore so it can definitely go.

Comment thread Cargo.toml
Comment on lines 1 to 10
[package]
name = "custom-labels"
version = "0.4.5"
version = "0.5.0"
edition = "2021"
license = "Apache-2.0"
keywords = ["profiling", "pprof", "polarsignals", "parca"]
description = "Custom labels for profilers"
homepage = "https://polarsignals.com"
repository = "https://github.com/polarsignals/custom-labels"
readme = "README.md"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we maybe update these? E.g. maybe call it custom-labels-otep-wip or something like that? (and then change the metadata)

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think custom-labels really works now that it's "labels and some fixed metadata"; similar to the TLS var rename.

How about otel-thread-ctx

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! We can always adjust later if needed (e.g. before this becomes "stable")

Comment thread README.md

## Acknowledgements

* This is a fork of [polarsignals/custom-labels](https://github.com/polarsignals/custom-labels), extended to support the formalisation of the thread-local context sharing mechanism in [OTEP 4947](https://github.com/open-telemetry/opentelemetry-specification/pull/4947).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we could call this out in a bit of a louder form at the beginning of the README. Something along the lines of

"This is a temporary fork of https://github.com/polarsignals/custom-labels to match up with the open-telemetry/opentelemetry-specification#4947 . It is experimental. This work will be incorporated/moved back upstream once the OTEP is accepted. Thanks to Polar signals for providing a great shoulder for us to stand on -- this work is (C) Polar Signals ..."

...or something like that?

Comment thread custom-labels-v1.md Outdated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spotted a few more files that we should probably remove: GOVERNANCE.md, tsconfig.json. The Makefile is also outdated, although I'd also suggest removing and we can always re-add later.

Comment thread src/process_context/encoding.rs Outdated
Comment on lines +514 to +524
mod tests {
use super::*;

#[test]
fn test_roundtrip_string_values() {
let ctx = ProcessContext::new()
.with_resource("service.name", "my-service")
.with_resource("service.version", "1.2.3")
.with_resource("deployment.environment", "production");

let encoded = encode(&ctx).unwrap();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'll be honest and say I didn't look at the encoder very closely -- I think if we do change to a full decoder, the tests below kinda have us covered for "output of encoder is valid protobuf".

Comment thread src/process_context/model.rs Outdated
Comment on lines +7 to +8
/// Maximum varint value (14 bits)
pub const UINT14_MAX: u16 = 16383;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I think this could be local to the encoder (this limitation is actually in the encoder itself, not protobuf)

Comment on lines +17 to +27
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Value {
/// String value (AnyValue.string_value)
String(String),
/// Integer value (AnyValue.int_value)
Int(i64),
/// Key-value list (AnyValue.kvlist_value)
KvList(Vec<KeyValue>),
/// Array value (AnyValue.array_value)
Array(Vec<Value>),
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Since this implementation exists only for the thread context, I'd probably just simplify it even more -- support only strings and arrays which is actually the only thing we're making use of right now.

Comment thread src/process_context/writer.rs Outdated
Comment on lines +6 to +39
#[cfg(target_os = "linux")]
use std::ptr;
#[cfg(target_os = "linux")]
use std::sync::atomic::{AtomicU64, fence, Ordering};
#[cfg(target_os = "linux")]
use libc::timespec;

use tracing::info;

#[cfg(target_os = "linux")]
use std::sync::Mutex;

#[cfg(target_os = "linux")]
use libc::{
c_void, close, ftruncate, madvise, mmap, munmap, prctl, sysconf, MADV_DONTFORK, MAP_ANONYMOUS,
MAP_FAILED, MAP_PRIVATE, MAP_SHARED, PROT_READ, PROT_WRITE, _SC_PAGESIZE,
};

// prctl constants for naming anonymous mappings (Linux 5.17+)
#[cfg(target_os = "linux")]
const PR_SET_VMA: i32 = 0x53564d41;
#[cfg(target_os = "linux")]
const PR_SET_VMA_ANON_NAME: u64 = 0;

// memfd_create flags
#[cfg(target_os = "linux")]
const MFD_CLOEXEC: libc::c_uint = 0x0001;
#[cfg(target_os = "linux")]
const MFD_ALLOW_SEALING: libc::c_uint = 0x0002;
#[cfg(target_os = "linux")]
const MFD_NOEXEC_SEAL: libc::c_uint = 0x0008; // Linux 6.3+

/// Wrapper for memfd_create syscall
#[cfg(target_os = "linux")]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this #[cfg(target_os = "linux")] is kinda redundant; is it maybe possible to do it at the file or crate level?

Comment thread src/process_context/writer.rs Outdated
Comment on lines +56 to +64
/// Get the mapping size (1 page, as per PR #34)
#[cfg(target_os = "linux")]
fn mapping_size() -> Result<usize> {
let page_size = unsafe { sysconf(_SC_PAGESIZE) };
if page_size < 4096 {
return Err(Error::MappingFailed("failed to get page size".to_string()));
}
Ok(page_size as usize)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This... is weird -- is this 34 a reference to open-telemetry/sig-profiling#34? We actually don't need page size anymore, that PR was where we removed it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants