[2/3] Integrate XSK maps into the core XDP datapath#6003

Open
ProjectsByJackHe wants to merge 43 commits into
mainfrom
jackhe/xdp_map_dp_int_pr_2_core

Conversation

@ProjectsByJackHe

@ProjectsByJackHe ProjectsByJackHe commented May 18, 2026

Contributor

Description

Full E2E demo using tools from this PR : DEMO

Part 2 of the plan: #5982
Fixes #5972

Adds the core msquic datapath integrations for the new API contract defined in #5983

Key design decisions:

  • If map mode is enabled, then we do not create the WinSock or Epoll or .... datapaths. No more best effort!
  • Init failures in map mode hard-fail datapath initialization. No silent failures!
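
A minimal sketch of the two decisions above, not the code in this PR. `EXAMPLE_INIT_POLICY`, `ExampleInitPolicy`, and `ExampleFinalStatus` are hypothetical names introduced for illustration. It assumes map mode means "at least one map config was supplied" and that raw init outside map mode stays best effort.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EXAMPLE_SUCCESS = 0, EXAMPLE_FAILURE = 1 } EXAMPLE_STATUS;

// Hypothetical summary of the two design decisions.
typedef struct EXAMPLE_INIT_POLICY {
    bool CreateOsDatapath;   // create the WinSock/epoll-style datapath
    bool RawFailureIsFatal;  // propagate raw (XDP) init failures to the caller
} EXAMPLE_INIT_POLICY;

// Assumption: map mode is active when at least one map config is supplied.
EXAMPLE_INIT_POLICY ExampleInitPolicy(size_t XdpMapConfigCount) {
    bool MapMode = XdpMapConfigCount > 0;
    EXAMPLE_INIT_POLICY Policy = { !MapMode, MapMode };
    return Policy;
}

// Combines the policy with the result of raw datapath initialization.
EXAMPLE_STATUS ExampleFinalStatus(EXAMPLE_INIT_POLICY Policy, EXAMPLE_STATUS RawStatus) {
    assert(RawStatus == EXAMPLE_SUCCESS || RawStatus == EXAMPLE_FAILURE);
    if (RawStatus != EXAMPLE_SUCCESS && Policy.RawFailureIsFatal) {
        return EXAMPLE_FAILURE;  // map mode: no silent fallback
    }
    return EXAMPLE_SUCCESS;      // outside map mode: raw init is best effort (assumed)
}
```

With this shape, a caller that supplies maps either gets a working raw datapath or an error, never a quiet fallback to the OS sockets.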

Testing

Added datapath unit tests. See the E2E demo.

Additionally, the plan is to leverage some supporting PRs that include code to ingest the latest XDP version, and to kick off a couple of manual CI runs based on that: https://github.com/microsoft/msquic/actions/runs/27313685557/job/80690420168

Documentation

Will come as part 3 in the plan: #5982

@codecov

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 60.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.51%. Comparing base (f87bfd3) to head (3432322).
⚠️ Report is 19 commits behind head on main.

Files with missing lines Patch % Lines
src/core/settings.c 33.33% 2 Missing ⚠️

❌ Your patch check has failed because the patch coverage (60.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6003      +/-   ##
==========================================
+ Coverage   84.88%   85.51%   +0.62%     
==========================================
  Files          60       60              
  Lines       18797    18846      +49     
==========================================
+ Hits        15956    16116     +160     
+ Misses       2841     2730     -111     

Comment thread submodules/xdp-for-windows
Comment thread src/inc/quic_datapath.h Outdated
Comment thread src/inc/quic_datapath.h Outdated
Comment thread src/platform/datapath_raw_win.c Outdated
Comment thread src/platform/datapath_raw_xdp_win.c Outdated
Comment thread src/platform/platform_internal.h
Comment thread src/platform/platform_internal.h Outdated
@ProjectsByJackHe

Contributor Author

Planning on waiting until #6006 gets resolved to address the next wave of feedback.

Comment thread src/platform/datapath_raw_xdp_win.c Outdated
Comment thread src/platform/datapath_raw_xdp_win.c Outdated
@mtfriesen

Contributor

It's not clear to me why we don't have end-to-end tests exercising everything from providing the XSKMAP before creating a registration through actual XDP data paths flowing? Since we intend to officially support this feature, it is risky for both us and our partners to supply/depend on an API that does not have regression test coverage and a paved pattern to follow.

@ProjectsByJackHe

ProjectsByJackHe commented May 20, 2026

Contributor Author

> It's not clear to me why we don't have end-to-end tests exercising everything from providing the XSKMAP before creating a registration through actual XDP data paths flowing? Since we intend to officially support this feature, it is risky for both us and our partners to supply/depend on an API that does not have regression test coverage and a paved pattern to follow.

That's in the works. I took the approach of building up an E2E flow manually first to see what the datapath integration would look like. There are likely going to be some limitations in the CI environment that prevent some scenarios from being exercised, but now that the skeleton of the general changes is mapped out in the current PR iteration, I am adding coverage for the parts that can be automated.

@mtfriesen

Contributor

Let's ensure we add tests with each chunk of work we do. I understand it may be a lot of work to add tests, but the best time to protect the code with automation and review the coverage is when making the change.

@ProjectsByJackHe

Contributor Author

> Let's ensure we add tests with each chunk of work we do. I understand it may be a lot of work to add tests, but the best time to protect the code with automation and review the coverage is when making the change.

XDP needs to release a new driver version consumable by MsQuic in the CI before I can push my tests.

I have them working right now on a local VM with the latest XDP drivers installed, and everything is working.

@ProjectsByJackHe

ProjectsByJackHe commented May 28, 2026

Contributor Author

We have two options: either we merge #6034 (or wait until the next XDP release) to add automated coverage, or we accept a one-off manual dispatch of the tests plus manual local VM runs as good enough to give confidence to merge this integration. Let's decide on this first and align timelines. Currently, the map tests are pushed for review, but disabled.

Comment thread .github/workflows/test.yml Outdated
Comment thread src/test/lib/precomp.h Outdated
Comment thread src/test/lib/TestListener.h Outdated
Comment thread src/test/bin/quic_gtest.cpp Outdated
Comment thread src/core/settings.c Outdated
Comment thread scripts/test.ps1
Comment thread src/test/lib/HandshakeTest.cpp Outdated
Comment thread src/test/lib/HandshakeTest.cpp Outdated
Comment thread src/test/lib/DataTest.cpp Outdated
Comment thread src/test/bin/quic_gtest.cpp Outdated
Comment thread src/test/bin/quic_gtest.cpp
Comment thread scripts/run-gtest.ps1
Comment thread src/platform/unittest/DataPathTest.cpp Outdated
Comment thread src/platform/datapath_raw.h Outdated

_IRQL_requires_max_(PASSIVE_LEVEL)
QUIC_STATUS
CxPlatDpRawInsertXskByMapConfigs(

Contributor

Is there a corresponding cleanup?

Contributor Author

Are you asking where in MsQuic we close/clean up the map handles given to MsQuic? If so, the ownership of the map handles is external; the app should own creation/deletion. When XSKs get destroyed upon msquic cleanup, it is the responsibility of the app to handle cleaning up the map handles as well; otherwise it will be left with a bunch of stale entries in the map, and XDP will keep directing traffic to dead sockets (effectively dropping it).

Contributor

> When XSKs get destroyed upon msquic cleanup, it is the responsibility of the app to handle cleaning up the map handles as well

I'm not sure I follow this - usually symmetry for create/delete is preferred. If MsQuic is inserting entries into the maps, it seems ideal for it to revert that during its teardown.

Collaborator

+1 to this.
We recently had to do something similar for rules. We should have a best effort removal from the map on cleanup and failure.
The app owns the map, but MsQuic owns the XSK, so let's make sure we don't leave them in the map.
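
A rough sketch of the symmetry being asked for here, using an in-memory stand-in rather than a real XDP map. Every name below (`EXAMPLE_XSK_MAP`, `ExampleInsertXskIntoMaps`, `ExampleRemoveXskFromMaps`) is hypothetical; the actual map API and the behavior of `CxPlatDpRawInsertXskByMapConfigs` are not shown in this thread. The point is only the pattern: roll back partial inserts on failure, and run the same removal at teardown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAP_CAPACITY 4

// Hypothetical stand-in for an app-owned XSK map: key -> socket id (0 = empty).
typedef struct EXAMPLE_XSK_MAP {
    uint32_t Entries[EXAMPLE_MAP_CAPACITY];
} EXAMPLE_XSK_MAP;

static bool ExampleMapInsert(EXAMPLE_XSK_MAP* Map, uint32_t Key, uint32_t XskId) {
    if (Key >= EXAMPLE_MAP_CAPACITY || Map->Entries[Key] != 0) {
        return false;  // key out of range or slot already taken
    }
    Map->Entries[Key] = XskId;
    return true;
}

// Best effort: clears the slot only if it still holds our socket.
static void ExampleMapRemove(EXAMPLE_XSK_MAP* Map, uint32_t Key, uint32_t XskId) {
    if (Key < EXAMPLE_MAP_CAPACITY && Map->Entries[Key] == XskId) {
        Map->Entries[Key] = 0;
    }
}

// Removes XskId from the first Count maps. Used for teardown and for rollback.
void ExampleRemoveXskFromMaps(EXAMPLE_XSK_MAP** Maps, size_t Count, uint32_t Key, uint32_t XskId) {
    for (size_t i = 0; i < Count; i++) {
        ExampleMapRemove(Maps[i], Key, XskId);
    }
}

// Inserts XskId into every map. On failure, undoes the inserts already made.
bool ExampleInsertXskIntoMaps(EXAMPLE_XSK_MAP** Maps, size_t Count, uint32_t Key, uint32_t XskId) {
    assert(XskId != 0);  // 0 is reserved as "empty" in this sketch
    for (size_t i = 0; i < Count; i++) {
        if (!ExampleMapInsert(Maps[i], Key, XskId)) {
            ExampleRemoveXskFromMaps(Maps, i, Key, XskId);
            return false;
        }
    }
    return true;
}
```

The app still owns the maps themselves; the library only removes the entries it added, so an entry written by someone else is left alone.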

Comment thread src/platform/datapath_xplat.c
Comment thread src/test/bin/quic_gtest.cpp Outdated
Comment thread src/test/bin/quic_gtest.cpp Outdated
Comment thread .github/workflows/test.yml Outdated
}

const uint32_t FakeIfIndex = 0xDEAD;
const QUIC_XDP_MAP_HANDLE FakeHandle = (QUIC_XDP_MAP_HANDLE)(intptr_t)-1;

@ProjectsByJackHe ProjectsByJackHe Jun 12, 2026

Contributor Author

You might be wondering: why -1? It was previously 0x1234 (which I thought was out of bounds for Windows handles).
Why not NULL or INVALID_HANDLE (0)? Indeed, 0 is invalid on Windows but not on POSIX.

Given that the general abstraction of raw-only-datapath mode is cross-platform, POSIX runs these tests too, and so we can add that coverage.

@mtfriesen mtfriesen Jun 12, 2026

Contributor

That's a good point. Maybe we need a cross-plat abstraction for "invalid [socket|map|file] handle" which can either be a value we know must be invalid at compile time, or open some dynamically created object (e.g., an event object) that we expect will mismatch the expected handle type.
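
One possible shape for the compile-time variant of that abstraction, as a sketch only. `EXAMPLE_MAP_HANDLE`, `EXAMPLE_INVALID_MAP_HANDLE`, and `ExampleMapHandleIsInvalid` are hypothetical names, not part of MsQuic. It relies on two facts: POSIX file descriptors are non-negative, and Windows defines `INVALID_HANDLE_VALUE` as -1. One caveat: -1 is also the Windows pseudo-handle for the current process, so whether it mismatches "the expected handle type" depends on the API that consumes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical handle type, wide enough for a Windows HANDLE or a POSIX fd.
typedef intptr_t EXAMPLE_MAP_HANDLE;

static_assert(sizeof(EXAMPLE_MAP_HANDLE) >= sizeof(void*), "must hold a HANDLE");
static_assert(sizeof(EXAMPLE_MAP_HANDLE) >= sizeof(int), "must hold a POSIX fd");

// -1 is never a valid POSIX fd and equals INVALID_HANDLE_VALUE numerically.
#define EXAMPLE_INVALID_MAP_HANDLE ((EXAMPLE_MAP_HANDLE)-1)

// Only recognizes the sentinel; it cannot tell whether other values are live.
static inline bool ExampleMapHandleIsInvalid(EXAMPLE_MAP_HANDLE Handle) {
    return Handle == EXAMPLE_INVALID_MAP_HANDLE;
}
```

The dynamic variant (opening some unrelated object such as an event and passing its handle) would need platform-specific code and is not sketched here.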

@mtfriesen mtfriesen left a comment

Contributor

We need to solve the testing problem - map mode should run all existing MsQuic tests that are not intrinsically incompatible with an XDP map.

Nearly all MsQuic protocol level tests should be compatible with XDP maps, so getting those exercised is a min bar.

CxPlatInitialize();

- CXPLAT_DATAPATH* Datapath;
+ CXPLAT_DATAPATH* Datapath = NULL;

Collaborator

nitnit: this is C++, so nullptr or default init should be preferred.

Suggested change
- CXPLAT_DATAPATH* Datapath = NULL;
+ CXPLAT_DATAPATH* Datapath{};

Comment thread src/core/settings.c
if (Source->IsSet.XdpEnabled && !Source->XdpEnabled && MsQuicLib.XdpMapConfigCount > 0) {
QuicTraceLogError(
SettingXdpDisabledInMapMode,
"[ lib] Error: XdpEnabled cannot be set to FALSE when XDP map mode is active.");

Collaborator

nit: let's make logs slightly more actionable for users. "xdp map mode" is our current name for this feature, but we don't expose it in the API. Let's also avoid double negatives.

Suggested change
- "[ lib] Error: XdpEnabled cannot be set to FALSE when XDP map mode is active.");
+ "[ lib] Error: Xdp must be enabled when an XDP map was configured.");

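For readers following the thread, the check under discussion can be sketched in isolation as below. `EXAMPLE_SETTINGS` and `ExampleSettingsAreValid` are hypothetical stand-ins; only the condition itself (setting `XdpEnabled` to FALSE while map configs exist is rejected) is taken from the quoted diff. What status the real code returns, and how it logs, is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical, trimmed-down settings struct with an IsSet flag per field.
typedef struct EXAMPLE_SETTINGS {
    struct { bool XdpEnabled; } IsSet;
    bool XdpEnabled;
} EXAMPLE_SETTINGS;

// Returns false when the caller explicitly disables XDP while at least one
// XDP map was configured; every other combination is accepted.
bool ExampleSettingsAreValid(const EXAMPLE_SETTINGS* Source, uint32_t XdpMapConfigCount) {
    assert(Source != NULL);
    if (Source->IsSet.XdpEnabled && !Source->XdpEnabled && XdpMapConfigCount > 0) {
        return false;  // "Xdp must be enabled when an XDP map was configured."
    }
    return true;
}
```

Note that leaving `XdpEnabled` unset passes the check, which matches the `IsSet` guard in the quoted condition.
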
Comment thread src/inc/quic_datapath.h
//
// N.B. Currently only supported for Windows user-mode.
//
const struct QUIC_XDP_MAP_CONFIG* XdpMapConfigs;

Collaborator

nit: consider SAL to annotate the length.
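
A sketch of what that could look like, assuming the count lives in a sibling field named `XdpMapConfigCount` (that name appears elsewhere in this PR; the struct layout and the fields of the config entry below are guesses for illustration). `_Field_size_` is a real SAL annotation from `<sal.h>`; the no-op fallback is there so the snippet also builds where SAL is unavailable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// SAL comes from <sal.h> on Windows toolchains; fall back to a no-op elsewhere.
#ifndef _Field_size_
#define _Field_size_(Count)
#endif

// Hypothetical entry layout; the real QUIC_XDP_MAP_CONFIG may differ.
typedef struct EXAMPLE_XDP_MAP_CONFIG {
    uint32_t InterfaceIndex;
    intptr_t MapHandle;
} EXAMPLE_XDP_MAP_CONFIG;

typedef struct EXAMPLE_ANNOTATED_INIT_CONFIG {
    uint32_t XdpMapConfigCount;
    _Field_size_(XdpMapConfigCount)  // tells static analysis the array length
    const EXAMPLE_XDP_MAP_CONFIG* XdpMapConfigs;
} EXAMPLE_ANNOTATED_INIT_CONFIG;

// Runtime counterpart of the annotation: count and pointer must agree.
bool ExampleMapConfigsAreConsistent(const EXAMPLE_ANNOTATED_INIT_CONFIG* Config) {
    assert(Config != NULL);
    return (Config->XdpMapConfigCount == 0) == (Config->XdpMapConfigs == NULL);
}
```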

Comment thread src/inc/quic_datapath.h
//
// N.B. Currently only supported for Windows user-mode.
//
const struct QUIC_XDP_MAP_CONFIG* XdpMapConfigs;

Collaborator

style: We don't repeat the "struct" keyword on use:

Suggested change
- const struct QUIC_XDP_MAP_CONFIG* XdpMapConfigs;
+ const QUIC_XDP_MAP_CONFIG* XdpMapConfigs;

Comment thread src/inc/quic_datapath.h
// The map configs must remain valid for the lifetime of the datapath.
//
// N.B. Currently only supported for Windows user-mode.
//

Collaborator

nit: This comment is a bit overly detailed for the config parameter; it is likely to fall out of sync.
The details would be better in a design doc file or at the relevant point in the code.

Suggested change
//
//
// External XDP map configurations. When present, the datapath inserts XSK sockets in
// the provided maps instead of configuring per-connection rules.
// The map configs must remain valid for the lifetime of the datapath.
//
// N.B. Currently only supported for Windows user-mode.
//

// since we are skipping OS platform specific initializations.
//
CXPLAT_DBG_ASSERT(InitConfig->XdpMapConfigs != NULL);
if (NewDataPath == NULL) {

Collaborator

Already checked above?

if (NewDataPath == NULL) {
return QUIC_STATUS_INVALID_PARAMETER;
}
if (UdpCallbacks != NULL) {

Collaborator

If UdpCallbacks == NULL, shouldn't we fail? Or is it intentional to support a scenario with no callbacks?
I suspect this is an artifact of the "TCP or UDP callbacks are needed" logic, but we only have UDP here.

DataPathInitialize(
if (InitConfig->XdpMapConfigCount > 0) {
//
// Raw-only datapath: the raw datapath must initialize successfully

Collaborator

The logic in that first if branch would be better factored in a function.
Either having a RawOnlyDataPathInitialize function, or extracting the logic from DataPathInitialize that is always needed, so you can do a DataPathCommonInitialize followed by RawDataPathInitialize here.

CxPlatDataPathInitialize should only deal with dispatching, not have low level logic.
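
To make the second option concrete, here is a hedged sketch of a dispatcher that only validates and routes, with the common and raw steps factored out. All `Ex*` names and the `EX_*` types are invented for this illustration and do not correspond to real MsQuic functions; `RawInitShouldFail` is a knob that exists only so the failure path can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EX_OK = 0, EX_INVALID_PARAMETER, EX_RAW_FAILED } EX_STATUS;

typedef struct EX_CONFIG {
    size_t XdpMapConfigCount;
    bool RawInitShouldFail;  // sketch-only knob to simulate a raw init failure
} EX_CONFIG;

typedef struct EX_DATAPATH {
    bool CommonReady;
    bool OsReady;
    bool RawReady;
} EX_DATAPATH;

// State every datapath flavor needs.
static void ExCommonInitialize(EX_DATAPATH* Dp) { Dp->CommonReady = true; }
static void ExCommonUninitialize(EX_DATAPATH* Dp) { Dp->CommonReady = false; }

static EX_STATUS ExRawInitialize(const EX_CONFIG* Config, EX_DATAPATH* Dp) {
    assert(Dp->CommonReady);  // raw init builds on the common state
    if (Config->RawInitShouldFail) {
        return EX_RAW_FAILED;
    }
    Dp->RawReady = true;
    return EX_OK;
}

// Raw-only flavor: common init, then raw init, with rollback on failure.
static EX_STATUS ExRawOnlyInitialize(const EX_CONFIG* Config, EX_DATAPATH* Dp) {
    ExCommonInitialize(Dp);
    EX_STATUS Status = ExRawInitialize(Config, Dp);
    if (Status != EX_OK) {
        ExCommonUninitialize(Dp);
    }
    return Status;
}

// Default flavor: OS datapath plus best-effort raw init (assumed behavior).
static EX_STATUS ExDefaultInitialize(const EX_CONFIG* Config, EX_DATAPATH* Dp) {
    ExCommonInitialize(Dp);
    Dp->OsReady = true;
    (void)ExRawInitialize(Config, Dp);
    return EX_OK;
}

// The dispatcher: parameter checks and routing only, no low-level logic.
EX_STATUS ExDataPathInitialize(const EX_CONFIG* Config, EX_DATAPATH* Dp) {
    if (Config == NULL || Dp == NULL) {
        return EX_INVALID_PARAMETER;
    }
    *Dp = (EX_DATAPATH){ false, false, false };
    return Config->XdpMapConfigCount > 0
        ? ExRawOnlyInitialize(Config, Dp)
        : ExDefaultInitialize(Config, Dp);
}
```

Keeping the rollback inside the raw-only helper means the dispatcher never has to know which partial state a failed init left behind.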


CXPLAT_DBG_ASSERT(Datapath->UdpHandlers.Receive != NULL || Config->Flags & CXPLAT_SOCKET_FLAG_PCP);
- CXPLAT_DBG_ASSERT(IsServerSocket || Config->PartitionIndex < Datapath->PartitionCount);
+ CXPLAT_DBG_ASSERT(CxPlatDpRawIsRawDatapathOnly(Datapath->RawDataPath) || IsServerSocket || Config->PartitionIndex < Datapath->PartitionCount);

Collaborator

Question: Why do we need this change? It isn't clear to me why XDP maps would interfere with partitioning.



Development

Successfully merging this pull request may close these issues.

The XDP datapath must use XDP maps instead of XDP programs when maps are provided

3 participants