fix(limit-count): upgrade redis-cluster lib so NOSCRIPT is not treated as node failure by shreemaan-abhishek · Pull Request #13579 · apache/apisix

shreemaan-abhishek · 2026-06-20T05:00:51Z

Description

The limit-count redis-cluster path executes the rate-limit script via evalsha with a fallback to eval when the server returns NOSCRIPT (added in #13363).

However, the bundled resty-redis-cluster = 1.05-1 treats any command error other than MOVED/ASK/CLUSTERDOWN (including NOSCRIPT) as a possible node failure and triggers a full cluster slot refresh:

elseif string.sub(err, 1, 11) == "CLUSTERDOWN" then
    return nil, "Cannot executing command, cluster status is failed!"
else
    --There might be node fail, we should also refresh slot cache
    self:refresh_slots()   -- NOSCRIPT lands here
    return nil, err
end

So in cluster mode every NOSCRIPT (first call on a fresh node, after SCRIPT FLUSH, or after failover to a replica that does not yet have the script cached) forces an unnecessary refresh_slots() over the whole cluster before the eval fallback runs. The request still succeeds, but each cache miss pays for a needless cluster-wide topology refresh.

lua-resty-redis-cluster = 1.3.3-0 handles NOSCRIPT explicitly and returns the error without refreshing slots, which matches the plugin's evalsha fallback:

elseif string.sub(err, 1, 8) == "NOSCRIPT" then
    return nil, err

This PR upgrades the dependency accordingly and adds a cluster test that flushes the script cache and verifies the request still succeeds through the eval fallback.

Note: APISIX already depends on the API7-maintained api7-lua-resty-redis-connector, so adopting the maintained lua-resty-redis-cluster fork (published on luarocks.org) is consistent with existing dependencies.

Checklist

I have explained the need for this PR and the problem it solves
I have explained the changes or the new features added to this PR
I have added tests corresponding to this change
I have updated the documentation to reflect this change
I have verified that this change is backward compatible (if not, please discuss on the APISIX mailing list first)

…d as node failure The limit-count redis-cluster path runs evalsha with a NOSCRIPT fallback to eval (apache#13363). The bundled resty-redis-cluster 1.05 treats any error other than MOVED/ASK/CLUSTERDOWN, including NOSCRIPT, as a node failure and triggers a full cluster slot refresh. So every NOSCRIPT (fresh node, SCRIPT FLUSH, or failover to a replica missing the script) forces an unnecessary refresh_slots() before the eval fallback runs. lua-resty-redis-cluster 1.3.3 handles NOSCRIPT explicitly and returns the error without refreshing slots, matching the plugin's evalsha fallback. Add a cluster test that flushes the script cache and asserts the request still succeeds via the eval fallback.

Copilot

Pull request overview

This PR upgrades the Redis Cluster Lua client dependency used by APISIX so that NOSCRIPT errors from EVALSHA are no longer treated as node failures (avoiding unnecessary cluster slot refreshes), and adds a regression test to ensure the limit-count Redis Cluster path correctly falls back to EVAL after script cache flush.

Changes:

Upgrade LuaRocks dependency from resty-redis-cluster to lua-resty-redis-cluster (v1.3.3-0) to avoid slot refresh on NOSCRIPT.
Add a Redis Cluster test that flushes scripts across the cluster and asserts the EVALSHA→EVAL fallback path succeeds.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`apisix-master-0.rockspec`	Updates the Redis Cluster client dependency to a version that handles `NOSCRIPT` without forcing a slot refresh.
`t/plugin/limit-count-redis-cluster.t`	Adds coverage for `NOSCRIPT` fallback behavior in Redis Cluster by flushing script caches and validating a successful request + expected log.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

membphis · 2026-06-22T03:36:03Z

Blocking concern: this dependency upgrade appears to change Redis Cluster timeout/retry behavior implicitly, beyond the intended NOSCRIPT fix.

The NOSCRIPT handling in lua-resty-redis-cluster = 1.3.3-0 matches the limit-count evalsha -> eval fallback, so that part looks good. However, the replacement is not behavior-neutral compared with resty-redis-cluster = 1.05-1:

The old library defaults to DEFAULT_MAX_REDIRECTION = 5 and DEFAULT_MAX_CONNECTION_ATTEMPTS = 3; the new library defaults to 2 and 2.
APISIX currently passes connect_timeout/read_timeout/send_timeout = conf.redis_timeout from apisix/utils/rediscluster.lua, and the plugin docs/schema expose redis_timeout as applying to redis-cluster.
In the new library, try_hosts_slots() and handle_command_with_retry() use fixed connect timeouts (100ms for the first attempt, 800ms for later attempts) instead of consistently using config.connect_timeout.

This can make policy = "redis-cluster" fail earlier than before in high-latency, TLS, cross-AZ, rolling replacement, MOVED/ASK-heavy, or transient node-failure scenarios. More importantly, increasing redis_timeout may no longer fully affect the Redis Cluster connection path as documented.

Could we either preserve the previous APISIX behavior in the wrapper, for example by explicitly setting the old retry defaults and ensuring redis_timeout is still honored by the cluster connect paths, or document the compatibility change and add tests that cover the new timeout/retry semantics? Without that, this dependency upgrade has a user-visible compatibility risk that is not disclosed by the PR title/description.

dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. dependencies Pull requests that update a dependency file labels Jun 20, 2026

shreemaan-abhishek requested a review from Copilot June 22, 2026 02:13

Copilot started reviewing on behalf of shreemaan-abhishek June 22, 2026 02:14 View session

Copilot AI reviewed Jun 22, 2026

View reviewed changes

Comment thread t/plugin/limit-count-redis-cluster.t

nic-6443 approved these changes Jun 22, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(limit-count): upgrade redis-cluster lib so NOSCRIPT is not treated as node failure#13579

fix(limit-count): upgrade redis-cluster lib so NOSCRIPT is not treated as node failure#13579
shreemaan-abhishek wants to merge 1 commit into
apache:masterfrom
shreemaan-abhishek:fix/redis-cluster-noscript

shreemaan-abhishek commented Jun 20, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

membphis commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

shreemaan-abhishek commented Jun 20, 2026

Description

Checklist

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

membphis commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants