Skip to content

Fix CancelOnLossSend test race by arming loss helper before send#6079

Open
guhetier wants to merge 3 commits into
mainfrom
guhetier/fix-cancel-on-loss-test-race_copilot
Open

Fix CancelOnLossSend test race by arming loss helper before send#6079
guhetier wants to merge 3 commits into
mainfrom
guhetier/fix-cancel-on-loss-test-race_copilot

Conversation

@guhetier

@guhetier guhetier commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Description

The Misc/WithCancelOnLossArgs.CancelOnLossSend/1 test (the DropPackets: true variant) was racy. It armed the loss helper after calling MsQuicStream::Send(QUIC_SEND_FLAG_CANCEL_ON_LOSS):

TEST_QUIC_SUCCEEDED(ClientContext.Stream->Send(&MessageBuffer, 1, QUIC_SEND_FLAG_CANCEL_ON_LOSS));
if (DropPackets) {
    LossHelper.DropPackets(1);   // armed too late
}

Send is asynchronous: it queues the data to an MsQuic worker thread. On a fast loopback datapath (this manifested in the UseXdp + UseQtip BVT configuration), the worker can transmit the stream data and the server can dispatch QUIC_STREAM_EVENT_RECEIVE before the test thread executes DropPackets(1). The server callback then sets ExitCode = SuccessExitCode (42) and signals SendPhaseEndedEvent; the main test thread wakes and asserts 42 != ErrorExitCode (24).

Trace from the failing run (quic.log lines, same TID 2784.2064):

Time Event
22:28:27.381556 QUIC_STREAM_EVENT_RECEIVE [23 bytes] (server)
22:28:27.407382 [test][hook] Selective packet drop (26 ms too late)
22:28:27.468536 QUIC_STREAM_EVENT_PEER_SEND_ABORTED (0x18) (server)

The fix moves the DropPackets(1) call to before Send, so the very next packet hitting the server receive path (the stream data) is the one dropped. This causes the client to detect loss and fire QUIC_STREAM_EVENT_CANCEL_ON_LOSS as intended, which in turn delivers PEER_SEND_ABORTED to the server with ErrorExitCode.

Fixes #6065

Testing

Existing test Misc/WithCancelOnLossArgs.CancelOnLossSend/* covers this change. Ran the parameterized test 20 times locally with the Debug + schannel build — 20/20 passed (previously intermittent in CI).

Documentation

No documentation impact.

The Misc/WithCancelOnLossArgs.CancelOnLossSend/1 test was racy: it called LossHelper.DropPackets(1) AFTER MsQuicStream::Send(QUIC_SEND_FLAG_CANCEL_ON_LOSS) had already queued the data to the worker thread. On a fast loopback datapath (e.g. UseXdp + UseQtip), the worker can transmit the stream data and the server can dispatch QUIC_STREAM_EVENT_RECEIVE before the test thread arms the drop. The test then observes SuccessExitCode (42) instead of the expected ErrorExitCode (24) coming from QUIC_STREAM_EVENT_CANCEL_ON_LOSS.

Arm the loss helper before Send so the next packet on the server receive path (the stream data) is the one dropped, which triggers loss detection on the client and fires CANCEL_ON_LOSS as intended.

Fixes #6065

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@codecov

codecov Bot commented Jun 9, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.62%. Comparing base (a40688d) to head (f51ca8d).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6079      +/-   ##
==========================================
+ Coverage   85.32%   85.62%   +0.29%     
==========================================
  Files          60       60              
  Lines       18798    18841      +43     
==========================================
+ Hits        16040    16132      +92     
+ Misses       2758     2709      -49     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread src/test/lib/DataTest.cpp Outdated
@guhetier guhetier marked this pull request as ready for review June 9, 2026 21:18
@guhetier guhetier requested a review from a team as a code owner June 9, 2026 21:18
@mtfriesen mtfriesen requested a review from Copilot June 10, 2026 17:48

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a race in the Misc/WithCancelOnLossArgs.CancelOnLossSend/1 test by ensuring the selective packet-drop hook is armed before calling MsQuicStream::Send(..., QUIC_SEND_FLAG_CANCEL_ON_LOSS), so the stream data packet is reliably the one dropped on fast loopback datapaths.

Changes:

  • Move LossHelper.DropPackets(1) to occur before Send(...) in the DropPackets: true variant to eliminate a timing window.
  • Expand the in-test comment explaining why arming after Send is racy.

Comment thread src/test/lib/DataTest.cpp Outdated
Comment thread src/test/lib/DataTest.cpp Outdated
mtfriesen
mtfriesen previously approved these changes Jun 10, 2026
Comment thread src/test/lib/DataTest.cpp Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants