Add support for NATS protocol #1712

Open
marctc wants to merge 12 commits into open-telemetry:main from grafana:add-nats

@marctc
Contributor

@marctc marctc commented Apr 2, 2026

Summary

Add support for capturing spans from clients that use the NATS protocol.

Fixes #1143

Validation

@codecov

codecov Bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 78.72861% with 87 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.13%. Comparing base (d5c3c0b) to head (59c311f).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/ebpf/common/nats_detect_transform.go | 73.01% | 56 Missing and 22 partials ⚠️ |
| pkg/ebpf/common/common.go | 77.77% | 4 Missing and 2 partials ⚠️ |
| pkg/ebpf/common/tcp_detect_transform.go | 85.00% | 2 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1712      +/-   ##
==========================================
- Coverage   78.14%   78.13%   -0.01%     
==========================================
  Files         278      279       +1     
  Lines       34240    34620     +380     
==========================================
+ Hits        26758    27052     +294     
- Misses       6229     6293      +64     
- Partials     1253     1275      +22     

| Flag | Coverage Δ | |
|---|---|---|
| integration-test | 55.09% <15.15%> (-0.50%) | ⬇️ |
| integration-test-arm | 28.61% <12.71%> (-0.12%) | ⬇️ |
| integration-test-vm-x86_64-5.15.152 | 28.62% <10.75%> (-0.56%) | ⬇️ |
| integration-test-vm-x86_64-6.10.6 | 29.22% <12.22%> (+0.14%) | ⬆️ |
| k8s-integration-test | 41.25% <10.75%> (-0.91%) | ⬇️ |
| oats-test | 37.26% <44.98%> (-0.36%) | ⬇️ |
| unittests | 58.55% <77.99%> (+0.21%) | ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

@marctc marctc force-pushed the add-nats branch 2 times, most recently from 1f87cbe to e86822a, April 2, 2026 13:44
Contributor

@grcevski grcevski left a comment

I just had a few minor comments, but this is looking pretty good!

		return errors.New("invalid NATS sid")
	}
	for _, b := range field {
		if (b < '0' || b > '9') && (b < 'a' || b > 'z') && (b < 'A' || b > 'Z') {
Contributor

sql_detect_transform has the same code, maybe we can combine this into a common helper?
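
A minimal sketch of what such a shared helper could look like. The name `isASCIIAlphanumeric` and the choice to reject empty input are assumptions for illustration; they are not taken from this PR or from sql_detect_transform.

```go
package main

import "fmt"

// isASCIIAlphanumeric reports whether field is non-empty and every byte
// is an ASCII digit or letter. Hypothetical helper name; rejecting empty
// input is an assumption made for this sketch.
func isASCIIAlphanumeric(field []byte) bool {
	if len(field) == 0 {
		return false
	}
	for _, b := range field {
		isDigit := b >= '0' && b <= '9'
		isLower := b >= 'a' && b <= 'z'
		isUpper := b >= 'A' && b <= 'Z'
		if !isDigit && !isLower && !isUpper {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isASCIIAlphanumeric([]byte("sid42")))  // letters and digits only
	fmt.Println(isASCIIAlphanumeric([]byte("sid-42"))) // contains '-'
}
```

Both parsers could then call the helper and keep their own error messages at the call site.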


		return natsFrame{clientID: meta.Name, valid: true}, nil
	case "SUB":
		if len(fields) != 3 && len(fields) != 4 {
Contributor

Nit: I found these statements confusing with != 3 && !=4. It might be better to write them as !(len==3 || len==4)

Contributor Author

This is how it was before, but the linter is not happy:

Error: pkg/ebpf/common/nats_detect_transform.go:139:6: QF1001: could apply De Morgan's law (staticcheck)
		if !(len(fields) == 3 || len(fields) == 4) {

Contributor

ah ok, cool!
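
For reference, the two spellings discussed in this thread are equivalent by De Morgan's law, which is why staticcheck's QF1001 suggests rewriting the negated form. The sketch below compares them side by side and adds a third variant that names the condition, which may read more clearly without the outer negation of a parenthesized expression. All function names are hypothetical.

```go
package main

import "fmt"

// rejectExpanded is the form currently in the PR.
func rejectExpanded(n int) bool { return n != 3 && n != 4 }

// rejectNegated is the form the reviewer suggested; staticcheck reports
// QF1001 on it.
func rejectNegated(n int) bool { return !(n == 3 || n == 4) }

// rejectNamed names the valid case first, then negates the name.
func rejectNamed(n int) bool {
	validCount := n == 3 || n == 4
	return !validCount
}

func main() {
	for n := 0; n <= 6; n++ {
		fmt.Printf("fields=%d expanded=%t negated=%t named=%t\n",
			n, rejectExpanded(n), rejectNegated(n), rejectNamed(n))
	}
}
```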

)))
}

func TestParseNATSFrame(t *testing.T) {
Contributor

Can we add more tests with all the variations of !=4 && !=5?
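
A sketch of the kind of table-driven coverage this asks for. In the NATS client protocol, MSG and HPUB control lines have four or five space-separated fields (the reply-to field is optional), while PUB and SUB have three or four. `fieldCountOK` is a hypothetical stand-in for the parser's per-verb length checks, not the function from this PR; in the repository the same table would normally drive `t.Run` subtests against the real parser.

```go
package main

import (
	"bytes"
	"fmt"
)

// fieldCountOK is a hypothetical stand-in for the per-verb field-count
// checks. It only validates the number of fields, not their content.
func fieldCountOK(line []byte) bool {
	fields := bytes.Fields(line)
	n := len(fields)
	if n == 0 {
		return false
	}
	switch string(fields[0]) {
	case "PUB", "SUB":
		return n == 3 || n == 4
	case "MSG", "HPUB":
		return n == 4 || n == 5
	default:
		return false
	}
}

func main() {
	cases := []struct {
		name string
		line string
		want bool
	}{
		{"MSG too short", "MSG orders 1", false},
		{"MSG without reply", "MSG orders 1 5", true},
		{"MSG with reply", "MSG orders 1 inbox 5", true},
		{"MSG too long", "MSG orders 1 inbox 5 extra", false},
		{"HPUB too short", "HPUB orders 12", false},
		{"HPUB without reply", "HPUB orders 12 17", true},
		{"HPUB with reply", "HPUB orders inbox 12 17", true},
		{"HPUB too long", "HPUB orders inbox 12 17 extra", false},
	}
	for _, tc := range cases {
		got := fieldCountOK([]byte(tc.line))
		fmt.Printf("%-20s got=%t want=%t\n", tc.name, got, tc.want)
	}
}
```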

@marctc marctc marked this pull request as ready for review April 2, 2026 13:57
@marctc marctc requested a review from a team as a code owner April 2, 2026 13:57
@marctc marctc requested a review from Copilot April 2, 2026 14:07
Contributor

Copilot AI left a comment

Pull request overview

This PR adds end-to-end support for capturing NATS (plain TCP) messaging spans, including protocol detection/parsing in the eBPF TCP pipeline, wiring NATS into instrumentation selection/config/schema, and exporting NATS spans/metrics via Prometheus and OTEL (plus an OATS integration suite).

Changes:

  • Add NATS protocol parsing/detection in the TCP userspace parser and map it to new NATS span event types.
  • Export NATS spans/attributes and messaging publish/process duration metrics in both Prometheus and OTEL exporters.
  • Add documentation + config schema updates + an OATS NATS integration test suite (Python client).
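
To illustrate the detection step in the first bullet, here is a minimal sketch of a prefix heuristic for NATS control lines. It is not the code from this PR: `looksLikeNATS` and `natsVerbs` are hypothetical names, and the sketch assumes upper-case verbs only, whereas the protocol treats operation names as case-insensitive.

```go
package main

import (
	"bytes"
	"fmt"
)

// natsVerbs lists the operation names of the NATS client protocol.
// Hypothetical variable for this sketch.
var natsVerbs = [][]byte{
	[]byte("INFO"), []byte("CONNECT"), []byte("PUB"), []byte("HPUB"),
	[]byte("SUB"), []byte("UNSUB"), []byte("MSG"), []byte("HMSG"),
	[]byte("PING"), []byte("PONG"), []byte("+OK"), []byte("-ERR"),
}

// looksLikeNATS reports whether buf starts with a known verb followed by
// whitespace or CRLF. It is a cheap first filter, not a full parser.
func looksLikeNATS(buf []byte) bool {
	for _, verb := range natsVerbs {
		if !bytes.HasPrefix(buf, verb) {
			continue
		}
		rest := buf[len(verb):]
		if len(rest) == 0 {
			continue // verb with no terminator yet: not enough data
		}
		if rest[0] == ' ' || rest[0] == '\t' || bytes.HasPrefix(rest, []byte("\r\n")) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(looksLikeNATS([]byte("PUB orders.created 11\r\nhello world\r\n")))
	fmt.Println(looksLikeNATS([]byte("PING\r\n")))
	fmt.Println(looksLikeNATS([]byte("GET / HTTP/1.1\r\n")))
}
```

A real detector would likely follow this filter with full frame parsing, since a verb prefix alone can match unrelated text protocols.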

Reviewed changes

Copilot reviewed 40 out of 41 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pkg/obi/config.go Enable NATS in default instrumentation list.
pkg/obi/config_test.go Update config tests to include NATS defaults.
pkg/export/prom/prom.go Record messaging publish/process histograms for NATS spans.
pkg/export/prom/prom_test.go Add Prom metrics expectations and span fixtures for NATS.
pkg/export/otel/tracesgen/tracesgen.go Gate NATS spans on instrumentation selection; emit NATS messaging attributes; set span kinds.
pkg/export/otel/traces_test.go Add NATS trace attribute test vectors and instrumentation filtering expectations.
pkg/export/otel/metrics.go Record OTEL messaging publish/process metrics for NATS spans.
pkg/export/otel/metrics_test.go Add NATS metric expectations and connection-type coverage.
pkg/export/otel/metrics_svc_graph_test.go Include NATS in service graph connection type validation.
pkg/export/instrumentations/instr_options.go Add InstrumentationNATS, selection flag, and NATSEnabled/MQEnabled logic.
pkg/export/instrumentations/instr_options_test.go Add selection tests for NATSEnabled and MQEnabled behavior.
pkg/ebpf/common/tcp_detect_transform.go Detect NATS traffic, parse it, and emit main + optional extra span.
pkg/ebpf/common/tcp_detect_transform_test.go Add NATS parsing tests and coalesced publish+process extra-span emission test.
pkg/ebpf/common/nats_detect_transform.go New NATS protocol parser, heuristic detection, and span conversion.
pkg/ebpf/common/nats_detect_transform_test.go Unit tests for NATS heuristics, frame parsing, and event handling.
pkg/ebpf/common/common.go Add protocol type placeholder and mark NATS client spans as client events.
pkg/appolly/app/request/span.go Add NATS event types and integrate into span kind, naming, attrs, and service graph connection type.
pkg/appolly/app/request/span_test.go Extend span tests for new NATS event types and behaviors.
pkg/appolly/app/request/span_getters.go Add OTEL getter support for NATS messaging attributes.
pkg/appolly/app/request/span_getters_test.go Add getter test coverage for NATS messaging attributes.
Makefile Add oats-test-nats and include it in oats-test.
internal/test/oats/nats/yaml/oats_python_nats.yaml OATS spec validating NATS spans + messaging duration metrics.
internal/test/oats/nats/oats_test.go Register NATS OATS suite.
internal/test/oats/nats/go.mod Go module for the NATS OATS suite.
internal/test/oats/nats/go.sum Dependency lockfile for the NATS OATS suite.
internal/test/oats/nats/docker-compose-obi-python-nats.yml Compose stack: NATS server + python test app + instrumenter.
internal/test/oats/nats/docker-compose-include-base.yml Template include helper for OATS compose generation.
internal/test/oats/nats/docker-compose-generic-template.yml Generic OATS infra stack template (grafana/prom/tempo/collector).
internal/test/oats/nats/configs/tempo-config.yaml Tempo config for OATS NATS suite.
internal/test/oats/nats/configs/prometheus-config.yml Prometheus config for OATS NATS suite.
internal/test/oats/nats/configs/otelcol-config.yaml Collector pipeline for traces/metrics in OATS NATS suite.
internal/test/oats/nats/configs/instrumenter-config-traces.yml Instrumenter routes config used by the suite.
internal/test/oats/nats/configs/grafana-datasources.yaml Grafana datasources for the suite stack.
internal/test/integration/components/pythonnats/requirements.txt Pinned python dependency (nats-py) for integration component.
internal/test/integration/components/pythonnats/requirements.in Input dependency list for pip-compile.
internal/test/integration/components/pythonnats/main.py Python app that publishes to NATS and serves a /publish endpoint.
internal/test/integration/components/pythonnats/Dockerfile Container image for the python NATS test component.
docs/config-schema.json Add nats to instrumentation enums in the config schema.
devdocs/protocols/tcp/README.md Link NATS protocol parser documentation.
devdocs/protocols/tcp/nats.md New documentation for the NATS TCP parser and limitations.
devdocs/features.md Document NATS support in features matrix.

Comment thread pkg/ebpf/common/nats_detect_transform.go
Comment thread pkg/ebpf/common/tcp_detect_transform.go Outdated
Comment thread pkg/ebpf/common/nats_detect_transform.go Outdated
Contributor

@grcevski grcevski left a comment

LGTM!

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 42 out of 43 changed files in this pull request and generated no new comments.

Contributor

@rafaelroquetto rafaelroquetto left a comment

Overall, I think we should do another pass to split the code for clarity, and also ensure no unnecessary string allocations are taking place (that's one of the reasons why I refactored the SQL code when doing the large buffers API, so I would prefer if we did not drift from that approach).
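
As an illustration of the allocation concern, here is a minimal sketch of tokenizing a control line by slicing the input buffer instead of converting it to strings. It is not the approach used in the SQL parser or the large buffers API, which are not shown in this thread; `nextField` and `parseSize` are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
)

// nextField returns the first whitespace-delimited token of line and the
// remainder. Both results are subslices of line, so nothing is copied.
func nextField(line []byte) (field, rest []byte) {
	line = bytes.TrimLeft(line, " \t")
	i := bytes.IndexAny(line, " \t")
	if i < 0 {
		return line, nil
	}
	return line[:i], line[i+1:]
}

// parseSize parses a non-negative decimal number directly from bytes,
// avoiding strconv.Atoi(string(b)). Inputs longer than 9 digits are
// rejected to keep the result well inside the int range.
func parseSize(b []byte) (int, bool) {
	if len(b) == 0 || len(b) > 9 {
		return 0, false
	}
	n := 0
	for _, c := range b {
		if c < '0' || c > '9' {
			return 0, false
		}
		n = n*10 + int(c-'0')
	}
	return n, true
}

func main() {
	line := []byte("PUB orders.created 11")
	verb, rest := nextField(line)
	subject, rest := nextField(rest)
	size, _ := nextField(rest)
	n, ok := parseSize(size)
	fmt.Println(bytes.Equal(verb, []byte("PUB")), len(subject), n, ok)
}
```

The subject would only be turned into a string once, at the point where a span attribute is actually built.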

It might be worth doing a local review and telling your agent to look at AGENTS.md; for some reason, I think Copilot (the GitHub one) has been sloppy and is missing a lot of obvious cases.

Comment thread pkg/ebpf/common/nats_detect_transform.go Outdated
Comment thread pkg/ebpf/common/nats_detect_transform.go Outdated
Comment thread pkg/ebpf/common/nats_detect_transform.go Outdated
Comment thread pkg/ebpf/common/nats_detect_transform.go Outdated
Comment thread pkg/ebpf/common/tcp_detect_transform.go Outdated
Comment thread pkg/ebpf/common/nats_detect_transform.go Outdated
Comment thread pkg/ebpf/common/nats_detect_transform.go Outdated
@github-actions
Contributor

github-actions Bot commented Apr 22, 2026

CI Supervisor

| Workflow | Job | Last state | Re-running? | Attempt |
|---|---|---|---|---|
| PR OATS test | nats | failure | No | 2/2 |
| Pull request checks | Generate and checks | failure | No | 2/2 |
| Pull request integration tests | shard-5 (11 tests) | failure | No | 2/2 |

@marctc marctc requested a review from rafaelroquetto April 22, 2026 14:26

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Instrument and collect telemetry for NATS

4 participants