fix: Optimization of Rate Limiting Logic for Cluster, AI Token and WASM Plugin #2997

hanxiantao · 2025-10-12T07:11:25Z

Ⅰ. Describe what this PR did

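Optimizes the logic of cluster rate limiting and AI Token rate limiting: the limit statistics are now accumulated, so changing the limit value no longer resets the request count or the token usage.

The current implementation has one small drawback, although there is probably no better solution: in the cluster rate limiting scenario, after the limit threshold is modified, the number of requests actually allowed is one fewer than the threshold. See the key-based limiting mode in the cluster-key-rate-limit test case.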

Ⅱ. Does this pull request fix one issue?

fixes #2996

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

1) cluster-key-rate-limit

Global rate limiting mode:

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: test
  namespace: higress-system
spec:
  defaultConfig:
    rule_name: default_rule
    global_threshold:
      query_per_hour: 30
    redis:
      service_name: "redis.default.svc.cluster.local"
      service_port: 6379
    show_limit_quota_header: true
    rejected_msg: '{"code":-1,"msg":"Too many requests"}'
  url: oci://registry.cn-hangzhou.aliyuncs.com/wasm-plugin/wasm-plugin:cluster-key-rate-limit-101202
  imagePullSecret: aliyun

After 3 requests.

After changing the limit threshold to 50, the reported threshold becomes 50.
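The global-mode check above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: `quotaHeaders` is a hypothetical helper, the remaining quota is assumed to be the threshold minus the cumulative count, and the header names `X-RateLimit-Limit` and `X-RateLimit-Remaining` are assumed to be what `show_limit_quota_header` exposes.

```go
package main

import "fmt"

// quotaHeaders is a hypothetical helper: it derives the quota header values
// from a cumulative request count and the currently configured threshold.
func quotaHeaders(used, threshold int) (limit, remaining int) {
	remaining = threshold - used
	if remaining < 0 {
		remaining = 0 // never report a negative remaining quota
	}
	return threshold, remaining
}

func main() {
	used := 3 // three requests already counted in the current window

	limit, remaining := quotaHeaders(used, 30)
	fmt.Printf("X-RateLimit-Limit=%d X-RateLimit-Remaining=%d\n", limit, remaining)

	// The threshold is raised to 50; the cumulative count is kept, not reset.
	limit, remaining = quotaHeaders(used, 50)
	fmt.Printf("X-RateLimit-Limit=%d X-RateLimit-Remaining=%d\n", limit, remaining)
}
```

Because only the cumulative count is stored, raising the threshold changes the reported limit while the three requests already made remain counted.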

Key-based rate limiting mode:

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: test
  namespace: higress-system
spec:
  defaultConfig:
    rule_name: default_rule
    rule_items:
      - limit_by_param: apikey
        limit_keys:
          - key: 9a342114-ba8a-11ec-b1bf-00163e1250b5
            query_per_hour: 3
          - key: a6a6d7f2-ba8a-11ec-bec2-00163e1250b5
            query_per_hour: 100
      - limit_by_per_param: apikey
        limit_keys:
          # Regular expression matching all strings that start with a; each apikey is allowed 10 requests per hour
          - key: "regexp:^a.*"
            query_per_hour: 10
          # Regular expression matching all strings that start with b; each apikey is allowed 100 requests per hour
          - key: "regexp:^b.*"
            query_per_hour: 100
          # Fallback that matches all requests; each apikey is allowed 1000 requests per hour
          - key: "*"
            query_per_hour: 1000
    redis:
      service_name: "redis.default.svc.cluster.local"
      service_port: 6379
    show_limit_quota_header: true
  url: oci://registry.cn-hangzhou.aliyuncs.com/wasm-plugin/wasm-plugin:cluster-key-rate-limit-101210
  imagePullSecret: aliyun

curl -kvv -X GET 'http://localhost:8082/foo?apikey=9a342114-ba8a-11ec-b1bf-00163e1250b5'

Three requests are allowed. Once rate limiting is triggered, the request count no longer keeps accumulating.

After changing the threshold to 6, two more requests are still allowed.
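The numbers in this scenario (three requests at threshold 3, then two more at threshold 6) are consistent with the following minimal Go model. It is a sketch under an assumption, not the plugin's actual Lua script: the first rejected request is assumed to be counted once, and later rejected requests are assumed not to be counted. The `counter` type and its methods are hypothetical.

```go
package main

import "fmt"

// counter models one rate-limit window for a single key.
type counter struct {
	used int // cumulative requests counted in this window
}

// allow applies the assumed rule: a request is rejected without being counted
// once used already exceeds the threshold; otherwise it is counted and
// allowed only if the new count is still within the threshold.
func (c *counter) allow(threshold int) bool {
	if c.used > threshold {
		return false // already over the limit: do not count again
	}
	c.used++
	return c.used <= threshold
}

// allowedOutOf sends n requests and reports how many of them passed.
func allowedOutOf(c *counter, threshold, n int) int {
	passed := 0
	for i := 0; i < n; i++ {
		if c.allow(threshold) {
			passed++
		}
	}
	return passed
}

func main() {
	c := &counter{}
	fmt.Println(allowedOutOf(c, 3, 10)) // 3: the fourth request is counted but rejected
	fmt.Println(allowedOutOf(c, 6, 10)) // 2: only two more pass after raising the threshold
}
```

Under this model the rejected fourth request leaves the count at 4, so raising the threshold to 6 admits five requests in total, one fewer than the new threshold, which matches the drawback described in the PR description.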

2) ai-token-ratelimit

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: ai-token-ratelimit-1.0.0
  namespace: higress-system
spec:
  defaultConfigDisable: true
  failStrategy: FAIL_OPEN
  imagePullPolicy: UNSPECIFIED_POLICY
  imagePullSecret: aliyun
  matchRules:
    - config:
        global_threshold:
          token_per_hour: 100
        redis:
          service_name: redis.default.svc.cluster.local
          service_port: 6379
        rule_name: default_rule
      configDisable: false
      ingress:
        - ai-route-qwen.internal
  phase: UNSPECIFIED_PHASE
  priority: 600
  url: >-
    oci://registry.cn-hangzhou.aliyuncs.com/wasm-plugin/wasm-plugin:ai-token-ratelimit-101201

Rate limiting is triggered.

After changing the threshold to 300, requests can continue.
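The token scenario can be sketched the same way. The following Go model assumes a two-phase flow in which the request phase only compares cumulative usage against the threshold and the response phase adds the tokens that were consumed. The `tokenWindow` type, the strict `<` comparison, and the figure of 120 tokens are illustrative assumptions.

```go
package main

import "fmt"

// tokenWindow models cumulative token usage within one rate-limit window.
type tokenWindow struct {
	used int // tokens consumed so far in this window
}

// onRequest is the request phase: the token cost is not known yet, so the
// request is admitted only if cumulative usage is still below the threshold.
func (w *tokenWindow) onRequest(threshold int) bool {
	return w.used < threshold
}

// onResponse is the response phase: add the tokens reported for the response.
func (w *tokenWindow) onResponse(tokens int) {
	w.used += tokens
}

func main() {
	w := &tokenWindow{}

	fmt.Println(w.onRequest(100)) // true: nothing has been used yet
	w.onResponse(120)             // hypothetical response that consumed 120 tokens

	fmt.Println(w.onRequest(100)) // false: 120 tokens used, threshold is 100
	fmt.Println(w.onRequest(300)) // true: same usage, threshold raised to 300
}
```

Since usage is accumulated rather than stored as a remaining budget, raising `token_per_hour` takes effect immediately without discarding the tokens already counted.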

Ⅴ. Special notes for reviews

lingma-agents · 2025-10-12T07:11:56Z

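Optimize cluster rate limiting and AI Token rate limiting logic to support dynamic threshold adjustment

Summary of changes

Bug fixes
- Fixed an issue where cluster rate limiting and AI Token rate limiting reset their counters when the limit threshold was modified; request counts and token usage are now tracked cumulatively.
- Adjusted the Redis script logic so that existing counter state is handled correctly when the limit threshold changes.
- Improved the format of the generated rate-limit keys to make them more readable and consistent.
Refactoring
- Refactored the Redis key generation logic in the ai-token-ratelimit and cluster-key-rate-limit plugins to use shared format-string constants.
- Reorganized the Redis Lua scripts to separate request-phase and response-phase handling, improving maintainability.
Performance optimization
- Streamlined the Redis operations to reduce unnecessary key lookups and writes, making rate-limit checks more efficient.
- Added handling for abnormal TTL values in the Lua scripts to keep the rate-limit window valid.
Test updates
- Updated the related test case descriptions to verify that dynamic threshold adjustment works correctly.
- Provided detailed test scenario descriptions covering the global rate limiting mode and the parameter-based rate limiting mode.

Changed files

| File path | Change description |
| --- | --- |
| plugins/wasm-go/extensions/ai-token-ratelimit/main.go | Optimized the AI Token rate limiting logic to support changing the limit threshold without resetting the count. Refactored the Redis key generation and the Lua script logic to improve limiting accuracy. |
| plugins/wasm-go/extensions/cluster-key-rate-limit/main.go | Improved the cluster rate limiting mechanism to count requests cumulatively, avoiding a counter reset when the threshold changes. Updated the Redis key format and the Lua script implementation. |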
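Two of the points in this summary, format-string key constants and handling of abnormal TTL values, can be illustrated with a small Go sketch. Everything named here is hypothetical: the key layout `example-rate-limit:%s:global:%d` is not the plugin's real format, and leaving the threshold out of the key is an assumption of this sketch rather than a confirmed detail of the change. The one fact relied on is that the Redis `TTL` command replies -2 for a missing key and -1 for a key without an expiry.

```go
package main

import "fmt"

// globalKeyFormat is a hypothetical key layout: rule name and window length.
// The threshold is deliberately not part of the key in this sketch, so a
// threshold change keeps pointing at the same counter.
const globalKeyFormat = "example-rate-limit:%s:global:%d"

// limitKey builds the counter key from the shared format constant.
func limitKey(ruleName string, windowSeconds int64) string {
	return fmt.Sprintf(globalKeyFormat, ruleName, windowSeconds)
}

// effectiveTTL guards against abnormal TTL replies. Redis TTL returns -2 when
// the key does not exist and -1 when it exists without an expiry; in both
// cases the full window is used and the caller should set the expiry again.
func effectiveTTL(ttl, windowSeconds int64) (seconds int64, mustExpire bool) {
	if ttl < 0 {
		return windowSeconds, true
	}
	return ttl, false
}

func main() {
	fmt.Println(limitKey("default_rule", 3600))

	ttl, fix := effectiveTTL(-1, 3600) // counter exists but has lost its expiry
	fmt.Println(ttl, fix)

	ttl, fix = effectiveTTL(1200, 3600) // healthy counter, 20 minutes left
	fmt.Println(ttl, fix)
}
```

Without such a guard, a counter that lost its expiry would never roll over to a new window.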

codecov-commenter · 2025-10-12T07:16:07Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.97%. Comparing base (ef31e09) to head (36b1ad9).
⚠️ Report is 739 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2997      +/-   ##
==========================================
+ Coverage   35.91%   44.97%   +9.06%     
==========================================
  Files          69       82      +13     
  Lines       11576    13378    +1802     
==========================================
+ Hits         4157     6017    +1860     
+ Misses       7104     7014      -90     
- Partials      315      347      +32

see 80 files with indirect coverage changes

rinfx

LGTM

hanxiantao added 2 commits October 12, 2025 08:38

Optimize cluster rate limiting and AI Token rate limiting logic

5c17584

Optimize cluster rate limiting and AI Token rate limiting logic

eb9ea80

hanxiantao requested review from CH3CHO, erasernoob, johnlanni and rinfx as code owners October 12, 2025 07:11

Fix unit tests

36b1ad9

rinfx approved these changes Oct 15, 2025

rinfx merged commit 1f301be into alibaba:main Oct 15, 2025
17 checks passed
