[CIP-14][CELEBORN-2220] Support Slow Start Push in C++ Client#3730
Open
afterincomparableyum wants to merge 1 commit into
Open
[CIP-14][CELEBORN-2220] Support Slow Start Push in C++ Client#3730afterincomparableyum wants to merge 1 commit into
afterincomparableyum wants to merge 1 commit into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3730 +/- ##
============================================
- Coverage 57.53% 57.47% -0.05%
Complexity 214 214
============================================
Files 396 396
Lines 27857 27857
Branches 2710 2710
============================================
- Hits 16025 16009 -16
- Misses 10682 10696 +14
- Partials 1150 1152 +2 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Implement SlowStartPushStrategy in the C++ client, mirroring the Java client's TCP-like congestion control for push speed: - Track a per worker CongestControlContext that grows the in-flight request limit on success (slow start, then congestion avoidance once the limit reaches the threshold) and halves it on congestion, falling back to sleeping when congestion persists at a limit of 1. - Select the strategy through celeborn.client.push.limitStrategy=SLOWSTART - Add configs celeborn.client.push.slowStart.initialSleepTime (default 500ms) and celeborn.client.push.slowStart.maxSleepTime (default 2s) to control sleep intervals while ramping up. - Add PushStrategyTest covering strategy creation, sleep time calculation, congestion control, and per host isolation, and add to CelebornConfTest with the new configs.
dd40144 to
f36a0c3
Compare
Contributor
Author
|
Ping @SteNicholas @RexXiong could one of y'all take a look please. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
Implement SlowStartPushStrategy in the C++ client, mirroring the Java client's TCP-like congestion control for push speed:
Track a per worker CongestControlContext that grows the in-flight request limit on success (slow start, then congestion avoidance once the limit reaches the threshold) and halves it on congestion, falling back to sleeping when congestion persists at a limit of 1.
Select the strategy through celeborn.client.push.limitStrategy=SLOWSTART
Add configs celeborn.client.push.slowStart.initialSleepTime (default 500ms) and celeborn.client.push.slowStart.maxSleepTime (default 2s) to control sleep intervals while ramping up.
Add PushStrategyTest covering strategy creation, sleep time calculation, congestion control, and per host isolation, and add to CelebornConfTest with the new configs.
Why are the changes needed?
This is needed to bridge the gap between features that the java client has vs what the c++ client has.
Does this PR resolve a correctness bug?
Does this PR introduce any user-facing change?
How was this patch tested?
CI/CD