feat: Support new 2026 retries #3379

Open
richardwang1124 wants to merge 16 commits into version-3 from feature/new-retries

Conversation

@richardwang1124
Contributor

This PR adds support for the new 2026 retry behavior, including a changed default retry mode, updated retry quotas, and new behavior for long-polling operations and retry backoff headers.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@richardwang1124 richardwang1124 requested a review from a team as a code owner May 12, 2026 16:49
@github-actions

Detected 1 possible performance regressions:

  • aws-sdk-core.gem_size_kb - z-score regression: 407.5 -> 408.5. Z-score: Infinity

Comment thread build_tools/services.rb Outdated

# Minimum `aws-sdk-core` version for new gem builds
-MINIMUM_CORE_VERSION = "3.247.0"
+MINIMUM_CORE_VERSION = "3.248.0"
Contributor Author

@richardwang1124 richardwang1124 May 12, 2026

Note to self to check this before merging.

Contributor

@jterapin jterapin left a comment

Nice. You should check with everyone else about when these changes should go out. Also, is there a blog post or documentation on how the new standard retries work?

Comment thread gems/aws-sdk-core/lib/aws-sdk-core/plugins/retry_errors.rb Outdated
Comment thread gems/aws-sdk-core/lib/aws-sdk-core/plugins/retry_errors.rb Outdated
Comment thread gems/aws-sdk-core/lib/aws-sdk-core/plugins/retry_errors.rb Outdated
Comment thread gems/aws-sdk-core/lib/aws-sdk-core/plugins/retry_errors.rb Outdated
Comment thread gems/aws-sdk-core/CHANGELOG.md Outdated
Comment thread gems/aws-sdk-core/lib/aws-sdk-core/plugins/retry_errors.rb
Comment thread gems/aws-sdk-core/spec/aws/plugins/retries/retry_quota_spec.rb
Comment on lines +39 to +40
# make JSON parsing errors on 200-range responses retryable
response.error = Seahorse::Client::NetworkingError.new(e)
Contributor

This is a new addition since the last review, so I have some questions:

  • Do other parser handlers, such as XML, need the same treatment? Is there a reason why JSON is specifically targeted here?
  • If this is the final attempt, does this `NetworkingError` get surfaced to the user? Will it be helpful to the user?

Contributor Author

The SCP tests didn't cover the other parser handlers, but I believe they should also be updated. I'll make the changes.

With these changes, the error raised on the final attempt will be the `NetworkingError`, but it wraps the parsing error message, so the root cause will still be surfaced.
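
As a rough illustration of that wrapping, here is a minimal standalone sketch. `WrappedNetworkingError` and `parse_body` are hypothetical stand-ins for this example only, not the SDK's `Seahorse::Client::NetworkingError` or its parser handlers; only Ruby's standard `json` library is used.

```ruby
require 'json'

# Hypothetical stand-in for a networking error that wraps another error.
class WrappedNetworkingError < StandardError
  attr_reader :original_error

  def initialize(error)
    super(error.message)            # reuse the root cause's message
    set_backtrace(error.backtrace)  # keep the root cause's backtrace
    @original_error = error
  end
end

# Hypothetical helper: parse a response body and convert parse failures
# into the wrapper type, which a retry layer could treat as retryable.
def parse_body(body)
  JSON.parse(body)
rescue JSON::ParserError => e
  raise WrappedNetworkingError.new(e)
end
```

In this sketch, a caller that rescues `WrappedNetworkingError` on the final attempt can still read the parser's message from `message` and inspect the underlying exception through `original_error`.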

Contributor

> With these changes, the error message raised on the final attempt will be the NetworkingErr, but it wraps the parsing error message so the root cause will still be surfaced.

Is this how other SDKs handle this case, labeling it as a service error? Just curious 🤔

Comment thread gems/aws-sdk-core/lib/aws-sdk-core/plugins/retry_errors.rb
Comment thread gems/aws-sdk-core/lib/aws-sdk-core/plugins/retry_errors.rb Outdated
Comment thread gems/aws-sdk-core/lib/aws-sdk-core/plugins/retry_errors.rb Outdated
Unreleased Changes
------------------

* Feature - Add `AWS_NEW_RETRIES_2026` environment variable to opt-in to updated `standard` retry mode with reduced backoff intervals.
Contributor

Nice. Have we considered adding a similar changelog entry for DynamoDB, since it has specific retry behavior, or is that not needed?

Now that I think about it - are service-specific behaviors documented anywhere?

Contributor

I also have a question about users who ARE already using standard for their retry_mode: will they notice any changes from this update?

Contributor Author

@richardwang1124 richardwang1124 May 15, 2026

Can I add a DynamoDB changelog entry even though only core was updated? We could extend the core changelog entry to mention that DynamoDB defaults change when the new retry behavior is enabled. I don't believe any service-specific behaviors are documented anywhere yet. Externally, I think the blog post should mention this new DynamoDB behavior; internally, could I add more comments or documentation?

Customers who are already on standard and opt in to new retries will feel a difference. Due to the updated backoff timing, retries will be much faster. Throttling behavior will be the same, and due to the updated retry quota draining, customers will fail faster during sustained service errors, but this is intentional to help services recover faster.
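
To make the quota-draining point concrete, here is a small standalone sketch of a token-bucket retry quota and a jittered exponential backoff. The class and method names (`SketchRetryQuota`, `sketch_backoff`) and every number in it are assumptions chosen for illustration; they are not the values or the implementation in this PR.

```ruby
# Illustrative token-bucket retry quota. All names and numbers here are
# assumptions made for this example, not the SDK's actual values.
class SketchRetryQuota
  attr_reader :available

  def initialize(capacity:, retry_cost:, success_refund:)
    @capacity = capacity
    @available = capacity
    @retry_cost = retry_cost
    @success_refund = success_refund
  end

  # Returns true and deducts tokens if a retry is allowed; returns false
  # once the bucket cannot cover the cost of another retry.
  def acquire_retry
    return false if @available < @retry_cost

    @available -= @retry_cost
    true
  end

  # A successful call puts a few tokens back, capped at capacity.
  def record_success
    @available = [@available + @success_refund, @capacity].min
  end
end

# Exponential backoff with full jitter: a random delay in [0, cap), where
# the cap doubles with each attempt until it reaches max_delay.
def sketch_backoff(attempt, base:, max_delay:, rng: Random.new)
  cap = [base * (2**attempt), max_delay].min
  rng.rand * cap
end
```

With `capacity: 20` and `retry_cost: 5`, four consecutive retries are granted and the fifth is refused, which is the "fail faster during sustained errors" effect in miniature. A smaller `base` in `sketch_backoff` shortens every delay, which corresponds to retries feeling faster.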

Contributor

> DynamoDB

I personally feel that a blog post is not enough for documentation. Not everyone will read blog post releases or aws-sdk-core's CHANGELOG entries.

Rethinking this...
This might be a good use case where we should have service-specific plugins. With plugins, you can be specific about this behavior + documentation. Now that I think about this - we might need to do something about the autogenerated config. See:

:max_attempts (Integer) — default: 3 — An integer representing the maximum number attempts that will be made for a single request, including the initial attempt. For example, setting this value to 5 will result in a request being retried up to 4 times. Used in standard and adaptive retry modes.

Above is what I see when I run codegen. Let's talk offline.

> Customers who are already on standard ... will feel a difference.

We should probably add a separate entry about this. The way I read the above entry is: "OK, so if I don't use that env var, I'm still on the old standard retries mechanism."

Contributor Author

Sure, I can try adding a DynamoDB plugin for retries instead.

> We should probably add a separate entry about this. The way I read the above entry is like: "Ok so if I don't use that env var, i'm still on old standard retries mechanism"

Your original understanding is correct. Customers who are already on standard and opt in to new retries will feel a difference. If they do not set the environment variable, there will be no difference. The new retry behavior is disabled by default and is opt-in only.
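
A minimal sketch of what such an opt-in check could look like is below. The helper name `sketch_new_retries_enabled?` is hypothetical, and the assumption that the value `true` (case-insensitive) enables the behavior is made purely for illustration; the accepted values are not stated in this thread.

```ruby
# Hypothetical opt-in check. Treating "true" as the enabling value is an
# assumption made for this sketch, not confirmed behavior.
def sketch_new_retries_enabled?(env = ENV)
  env.fetch('AWS_NEW_RETRIES_2026', '').strip.casecmp?('true')
end
```

Under that assumption, an unset variable leaves the existing behavior in place, matching the "disabled by default" description.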

Contributor

@jterapin jterapin left a comment

LGTM!
