feat: Support new 2026 retries #3379
Detected 1 possible performance regressions:
# Minimum `aws-sdk-core` version for new gem builds
-MINIMUM_CORE_VERSION = "3.247.0"
+MINIMUM_CORE_VERSION = "3.248.0"
Note to self to check this before merging.
jterapin left a comment
Nice - you should check with everyone else about when these changes should go out. Also, is there a blog post or documentation on how the new standard retries work?
# make JSON parsing errors on 200-range responses retryable
response.error = Seahorse::Client::NetworkingError.new(e)
This is a new addition since the last review, so I have some questions:
- Do other parser handlers need the same treatment, such as XML? Is there a reason why JSON is being specifically targeted here?
- If this is the final attempt, does this `NetworkingError` get surfaced to the user? Will this be helpful to the user?
The SCP tests didn't cover the other parser handlers, but I believe they should also be updated. I'll make the changes.
With these changes, the error raised on the final attempt will be the `NetworkingError`, but it wraps the parsing error message, so the root cause will still be surfaced.
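The wrapping described above can be sketched as follows. This is a minimal illustration that runs with only the standard library: `WrappedNetworkingError` and `parse_body` are hypothetical stand-ins, not the SDK's `Seahorse::Client::NetworkingError` or its parser handlers. The sketch assumes the wrapper reuses the original error's message and keeps a reference to the original error.

```ruby
require 'json'

# Hypothetical stand-in for a networking error wrapper; not the SDK class.
class WrappedNetworkingError < StandardError
  attr_reader :original_error

  def initialize(error, msg = nil)
    # Reuse the wrapped error's message unless one is given explicitly.
    super(msg || error.message)
    @original_error = error
  end
end

# Hypothetical helper: parse a response body and convert parse failures
# into the wrapper, so a retry layer could treat them as retryable.
def parse_body(body)
  JSON.parse(body)
rescue JSON::ParserError => e
  raise WrappedNetworkingError.new(e)
end

begin
  parse_body('{"truncated": ')
rescue WrappedNetworkingError => e
  puts e.class
  puts e.original_error.class
  puts e.message == e.original_error.message
end
```

If the final attempt fails this way, the caller sees the wrapper class, while the message and `original_error` still point at the underlying parse failure.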
> With these changes, the error raised on the final attempt will be the `NetworkingError`, but it wraps the parsing error message, so the root cause will still be surfaced.
Is this how other SDKs are handling this case? Labeling it as a service error. Just curious 🤔
Unreleased Changes
------------------

* Feature - Add `AWS_NEW_RETRIES_2026` environment variable to opt-in to updated `standard` retry mode with reduced backoff intervals.
Nice. Have we considered adding a similar changelog entry for DynamoDB, since it has specific retry behavior, or is that not needed?
Now that I think about it - are service-specific behaviors documented anywhere?
I also have a question about users who ARE already on `standard` for their `retry_mode` - will they notice any changes from this update?
Can I add a DynamoDB changelog entry even though only core was updated? We could extend the core changelog entry to mention that DynamoDB defaults change if the new retry behavior is enabled. I don't believe any service-specific behaviors are documented anywhere yet. Externally, I think the blog post should mention this new DynamoDB behavior; internally, I could add more comments or documentation?
Customers who are already on standard and opt in to new retries will feel a difference. Due to the updated backoff timing, retries will be much faster. Throttling behavior will be the same, and due to the updated retry quota draining, customers will fail faster during sustained service errors, but this is intentional to help services recover faster.
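The "fail faster" effect of retry quota draining can be illustrated with a small token-bucket sketch. Everything here is hypothetical: the `RetryQuota` class, the capacity of 500, and the per-retry costs of 5 and 14 are illustrative values chosen for the arithmetic, not the numbers used by the SDK or this PR.

```ruby
# Hypothetical token-bucket retry quota; names and numbers are illustrative,
# not taken from the SDK or this PR.
class RetryQuota
  attr_reader :available

  def initialize(capacity:, retry_cost:)
    @capacity = capacity
    @retry_cost = retry_cost
    @available = capacity
  end

  # Returns true and withdraws tokens if a retry is allowed.
  def acquire
    return false if @available < @retry_cost

    @available -= @retry_cost
    true
  end

  # Called after a successful request to refill the bucket, up to capacity.
  def release(amount = 1)
    @available = [@available + amount, @capacity].min
  end
end

# Count how many consecutive retries the quota permits before refusing.
def retries_until_exhausted(quota)
  count = 0
  count += 1 while quota.acquire
  count
end

slow_drain = RetryQuota.new(capacity: 500, retry_cost: 5)
fast_drain = RetryQuota.new(capacity: 500, retry_cost: 14)

puts retries_until_exhausted(slow_drain) # 500 / 5  => 100
puts retries_until_exhausted(fast_drain) # 500 / 14 => 35 (integer division)
```

With a higher cost per retry, the bucket empties after fewer consecutive failures, so a client facing a sustained outage stops retrying sooner and puts less load on the struggling service.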
> DynamoDB
I personally feel that a blog post is not enough for documentation. Not everyone will read blog post releases or aws-sdk-core's CHANGELOG entries.
Rethinking this...
This might be a good use case where we should have service-specific plugins. With plugins, you can be specific about this behavior + documentation. Now that I think about this - we might need to do something about the autogenerated config. See:
:max_attempts (Integer) — default: 3 — An integer representing the maximum number attempts that will be made for a single request, including the initial attempt. For example, setting this value to 5 will result in a request being retried up to 4 times. Used in standard and adaptive retry modes.
Above is what I see when I run codegen. Let's talk offline.
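The arithmetic in the generated `:max_attempts` description (5 attempts means up to 4 retries) can be checked with a small loop. This is a hypothetical retry loop, not the SDK's implementation; `call_with_retries` is a name made up for this sketch.

```ruby
# Hypothetical retry loop; not the SDK's implementation.
# max_attempts counts the initial attempt plus all retries.
def call_with_retries(max_attempts:)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    retry if attempts < max_attempts
    raise
  end
end

attempts_made = 0
begin
  call_with_retries(max_attempts: 5) do |n|
    attempts_made = n
    raise 'simulated failure'
  end
rescue RuntimeError
  retries = attempts_made - 1
  puts "attempts=#{attempts_made} retries=#{retries}" # attempts=5 retries=4
end
```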
> Customers who are already on standard ... will feel a difference.
We should probably add a separate entry about this. The way I read the above entry is: "OK, so if I don't use that env var, I'm still on the old standard retries mechanism."
Sure, I can try adding a DynamoDB plugin for retries instead.
> We should probably add a separate entry about this. The way I read the above entry is: "OK, so if I don't use that env var, I'm still on the old standard retries mechanism."
Your original understanding is correct: customers who are already on standard and opt in to new retries will feel a difference. If they do not set the environment variable, there will be no difference. The new retry behavior is disabled by default and is opt-in only.
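The opt-in gating described above might look roughly like this. The variable name comes from the changelog entry; the helper `new_retries_enabled?` is hypothetical, and treating the string `true` as the enabling value is an assumption, since the thread does not say which values the SDK accepts.

```ruby
# Hypothetical opt-in check. The variable name comes from the changelog entry;
# treating the string "true" as the enabling value is an assumption.
def new_retries_enabled?(env = ENV)
  env.fetch('AWS_NEW_RETRIES_2026', 'false').strip.downcase == 'true'
end

# Unset variable: the existing standard retry behavior stays in effect.
puts new_retries_enabled?({})                                 # false

# Variable set: the updated behavior is opted in to.
puts new_retries_enabled?({ 'AWS_NEW_RETRIES_2026' => 'true' }) # true
```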
This PR adds support for new 2026 retries behavior, including changing the default retry mode, updating retry quotas, and introducing new behavior for long-polling operations and retry backoff headers.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.