Attach values to CPU samples#63163
Draft
szegedi wants to merge 4 commits into nodejs:main
Adds a feature where users of the V8 CPU profiler can associate arbitrary values with CPU samples. Typical uses for such values are labels, trace IDs, and so on. We'll refer to these values as the "sample context".
This work is essentially a reimplementation of Datadog's equivalent functionality in the https://github.com/DataDog/pprof-nodejs project, only implemented more elegantly and without the workarounds we had to resort to there, because that code lives outside of both Node.js and V8.
It can be easily separated into two parts: support in the V8 profiler code, and a Node.js-specific implementation on top of it.
V8 code to support associating values with samples
I tried to keep this part to a minimum, for the obvious reasons of easier future patch management and upstreaming potential. It essentially just allows an opaque `void*` to be attached to each sample. In the public API, this means there is now:

- `CpuProfilingOptions::sample_context_extractor`, with which you can specify a callback that takes the current isolate and returns a `void*`. It is invoked from the sampling code (either the Unix signal handler or the Windows thread-state mechanism).
- `void* CpuProfile::GetSampleContext(index)`, with which the user can read the data assigned to every sample.
- A `LookupCpedMapAlignedPointer` function. This is a signal-safe lookup implementation for the common pattern of using a JS map in the CPED to store multiple independent bits of data (which is what Node.js also does with `AsyncContextFrame`), and then having an object with an internal field, bound to a key in that map, carry the native `void*` pointer to embedder data. This function lives in the V8 code because hosting it in Node.js code would be quite unfeasible. TL;DR: if we include `embedder-data-slot-inl.h`, `js-collection-inl.h`, `ordered-hash-table-inl.h`, etc. in Node.js code, they pull in tons of build infrastructure that only V8's own compilation is set up for. In V8 code, on the other hand, it has a very nice, minimal implementation, as it can rely on internal signal-safe APIs that read hash table data.
Node.js code

With the above V8 bits in place, the Node.js implementation itself becomes fairly straightforward.
- Methods on `SyncCPUProfileHandle` that set the current sample context in the internal ALS: `enterWithContext()` for `enterWith`-like semantics and `runWithContext()` for `run`-like semantics.
- A `stopAndCapture()` method on the handle (because `stop()` is already taken) that returns an object tree basically mapping the `v8::CpuProfile` structure, and that also carries the sample contexts. The existing `stop()`, which preemptively serializes to JSON, is not suitable when you want to inspect the contexts, and the object tree is also a nice object-oriented way to get the data upon which arbitrary serializers (I'm thinking pprof and/or OTLP) can be built later.
- A `snapshot()` method on the handle that returns the same data as `stopAndCapture()` but doesn't stop profiling; rather, it internally starts a new profiling session immediately. This allows for the creation of gapless continuous profiler solutions (like Datadog's).

Internally, most of the interesting details pertain to the needs of signal-safe capture and storage of sample contexts. In C++ we consistently hold a reference to a sample context using a `std::shared_ptr<v8::Global<v8::Value>>`. When profiling starts, we allocate a vector of these large enough to hold contexts for all captured samples, as we are not allowed to allocate anything in a signal handler. Holding a `Global` reference to the value keeps it alive in V8. However, we could not copy globals directly from the CPED-bound context holders into our internal buffer, hence the `std::shared_ptr`, which we can safely copy from the context holder's internal field into our buffer. Finally, we must not chase pointers from the CPED to our context holder while GC is in progress. For this reason, we register GC prologue and epilogue callbacks to observe GC, and furthermore capture and cache the current sample context in the prologue (when it is still safe to do so) and use it for all GC samples. This way, not even GC samples lose their context.
Anyhow, as I said, this whole approach is pretty much what we have already been using in Datadog's `pprof-nodejs`, successfully and for quite some time now. That approach unfortunately had to resort to hijacking the PROF signal handler (and thus doesn't work on Windows); we had to reimplement the internal V8 hashmap lookup from scratch (which is brittle as V8 evolves); and we had to resort to timestamp bracketing to associate samples with the contexts in our buffer. None of these kludges are needed here, so this implementation is as clean as it gets, IMHO.

Review away :-)