@hanxiantao (Collaborator) commented Oct 12, 2025

Ⅰ. Describe what this PR did

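Optimizes the logic of cluster rate limiting and AI token rate limiting: usage is now tracked cumulatively, so changing the rate-limit threshold no longer resets the request count or token usage.

The current implementation has one small drawback, though there is probably no better solution: in the cluster rate-limiting scenario, when the threshold is modified again, the number of requests actually allowed is one fewer than the threshold. See the key-based rate-limiting mode in the cluster-key-rate-limit test case below.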
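The drawback described above can be reproduced with a small in-memory model. This is only a sketch under an assumption: that the request which first exceeds the threshold is itself counted once, and that later rejected requests are not counted. The class `CumulativeLimiter` and its fields are hypothetical and are not taken from the plugin's Go or Lua code.

```python
class CumulativeLimiter:
    """In-memory stand-in for the Redis-backed counter (hypothetical model)."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.count = 0  # kept when the threshold changes within the same window

    def allow(self) -> bool:
        # Already over the limit: reject without counting again.
        if self.count > self.threshold:
            return False
        # Count this request, then compare against the threshold.
        self.count += 1
        return self.count <= self.threshold


def allowed_requests(limiter: CumulativeLimiter, attempts: int) -> int:
    """Send `attempts` requests and return how many were allowed."""
    return sum(limiter.allow() for _ in range(attempts))


limiter = CumulativeLimiter(threshold=3)
first = allowed_requests(limiter, 5)   # threshold 3: three requests pass
limiter.threshold = 6                  # raise the threshold, counter is kept
second = allowed_requests(limiter, 5)  # only two more pass, five in total
```

In this model the rejected fourth request moves the counter to 4, which is why raising the threshold to 6 leaves room for two more requests rather than three.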

Ⅱ. Does this pull request fix one issue?

fixes #2996

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

1) cluster-key-rate-limit

Global rate-limiting mode:

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: test
  namespace: higress-system
spec:
  defaultConfig:
    rule_name: default_rule
    global_threshold:
      query_per_hour: 30
    redis:
      service_name: "redis.default.svc.cluster.local"
      service_port: 6379
    show_limit_quota_header: true
    rejected_msg: '{"code":-1,"msg":"Too many requests"}'
  url: oci://registry.cn-hangzhou.aliyuncs.com/wasm-plugin/wasm-plugin:cluster-key-rate-limit-101202
  imagePullSecret: aliyun

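After 3 requests:
[Screenshot: cluster-key-rate-limit test case 1]

[Screenshot: cluster-key-rate-limit test case 2]

After changing the rate-limit threshold to 50, the threshold in effect becomes 50:
[Screenshot: cluster-key-rate-limit test case 3]

[Screenshot: cluster-key-rate-limit test case 4]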
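The effect of the global-mode test above can be written out as simple arithmetic. The sketch below assumes the remaining quota is the threshold minus the accumulated count, floored at zero; the function `quota_view` and its field names are hypothetical and do not reproduce the actual response headers or the values in the screenshots.

```python
def quota_view(threshold: int, used: int) -> dict:
    """Hypothetical view of the quota reported to the client."""
    return {"limit": threshold, "remaining": max(threshold - used, 0)}


used = 3  # three requests already counted in the current window
before = quota_view(30, used)  # query_per_hour: 30
after = quota_view(50, used)   # threshold raised to 50, count not reset
```

Because the count is kept across the threshold change, only the limit moves; the three requests already made still count against the new quota.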

Key-based rate-limiting mode:

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: test
  namespace: higress-system
spec:
  defaultConfig:
    rule_name: default_rule
    rule_items:
      - limit_by_param: apikey
        limit_keys:
          - key: 9a342114-ba8a-11ec-b1bf-00163e1250b5
            query_per_hour: 3
          - key: a6a6d7f2-ba8a-11ec-bec2-00163e1250b5
            query_per_hour: 100
      - limit_by_per_param: apikey
        limit_keys:
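          # Regular expression matching all strings that start with "a"; each apikey is limited to 10 requests per hour
          - key: "regexp:^a.*"
            query_per_hour: 10
          # Regular expression matching all strings that start with "b"; each apikey is limited to 100 requests per hour
          - key: "regexp:^b.*"
            query_per_hour: 100
          # Fallback that matches all requests; each apikey is limited to 1000 requests per hour
          - key: "*"
            query_per_hour: 1000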
    redis:
      service_name: "redis.default.svc.cluster.local"
      service_port: 6379
    show_limit_quota_header: true
  url: oci://registry.cn-hangzhou.aliyuncs.com/wasm-plugin/wasm-plugin:cluster-key-rate-limit-101210
  imagePullSecret: aliyun

curl -kvv -X GET 'http://localhost:8082/foo?apikey=9a342114-ba8a-11ec-b1bf-00163e1250b5'
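The three kinds of `limit_keys` entries in the configuration above (exact value, `regexp:` prefix, and the `*` fallback) can be illustrated with a short matcher. This is a sketch, not the plugin's implementation: the function `match_threshold` is hypothetical, and it assumes entries are tried in listed order with the first match winning.

```python
import re
from typing import Optional


def match_threshold(value: str, limit_keys: list) -> Optional[int]:
    """Return query_per_hour of the first entry matching `value`, else None."""
    for entry in limit_keys:
        key = entry["key"]
        if key == "*":  # fallback entry matches everything
            return entry["query_per_hour"]
        if key.startswith("regexp:"):
            if re.search(key[len("regexp:"):], value):
                return entry["query_per_hour"]
        elif key == value:  # exact match
            return entry["query_per_hour"]
    return None


# Entries copied from the limit_by_param rule item above.
exact_keys = [
    {"key": "9a342114-ba8a-11ec-b1bf-00163e1250b5", "query_per_hour": 3},
    {"key": "a6a6d7f2-ba8a-11ec-bec2-00163e1250b5", "query_per_hour": 100},
]
# Entries copied from the limit_by_per_param rule item above.
pattern_keys = [
    {"key": "regexp:^a.*", "query_per_hour": 10},
    {"key": "regexp:^b.*", "query_per_hour": 100},
    {"key": "*", "query_per_hour": 1000},
]

curl_key_limit = match_threshold("9a342114-ba8a-11ec-b1bf-00163e1250b5", exact_keys)
```

Under these assumptions the apikey used in the curl command resolves to a threshold of 3 requests per hour, which is the value the test below relies on.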

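Three requests are allowed; once rate limiting is triggered, the request count shown here no longer keeps accumulating:
[Screenshot: cluster-key-rate-limit key test case 1]

[Screenshot: cluster-key-rate-limit key test case 2]

After changing the threshold to 6, two more requests can still be made:

[Screenshot: cluster-key-rate-limit key test case 3]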

2) ai-token-ratelimit

apiVersion: extensions.higress.io/v1alpha1
kind: WasmPlugin
metadata:
  name: ai-token-ratelimit-1.0.0
  namespace: higress-system
spec:
  defaultConfigDisable: true
  failStrategy: FAIL_OPEN
  imagePullPolicy: UNSPECIFIED_POLICY
  imagePullSecret: aliyun
  matchRules:
    - config:
        global_threshold:
          token_per_hour: 100
        redis:
          service_name: redis.default.svc.cluster.local
          service_port: 6379
        rule_name: default_rule
      configDisable: false
      ingress:
        - ai-route-qwen.internal
  phase: UNSPECIFIED_PHASE
  priority: 600
  url: >-
    oci://registry.cn-hangzhou.aliyuncs.com/wasm-plugin/wasm-plugin:ai-token-ratelimit-101201

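Rate limiting triggered:
[Screenshot: ai-token-ratelimit rate limiting triggered]

After changing the threshold to 300, requests can continue:
[Screenshot: ai-token-ratelimit after raising the threshold]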
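The token-limit test above can be modelled with a two-phase counter. The sketch assumes the check happens in the request phase, before the token cost is known, and that usage is added in the response phase; `TokenWindow` and the token amounts are hypothetical and are not taken from the plugin's code.

```python
class TokenWindow:
    """In-memory model of cumulative token usage in one window (hypothetical)."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.used = 0  # kept when the threshold changes

    def on_request(self) -> bool:
        # Request phase: allow while accumulated usage is below the threshold.
        return self.used < self.threshold

    def on_response(self, tokens: int) -> None:
        # Response phase: add the tokens actually consumed by this call.
        self.used += tokens


window = TokenWindow(threshold=100)  # token_per_hour: 100
results = []
for tokens in (60, 70):              # illustrative token counts only
    if window.on_request():
        window.on_response(tokens)
        results.append(True)
    else:
        results.append(False)

blocked = not window.on_request()    # 130 tokens used, threshold 100
window.threshold = 300               # raise the threshold, usage is kept
resumed = window.on_request()        # 130 < 300, requests continue
```

Because the cost is only known after the response, usage in this model can overshoot the threshold by the size of the last allowed call.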

Ⅴ. Special notes for reviews

lingma-agents bot commented Oct 12, 2025

Optimize cluster rate limiting and AI token rate limiting to support dynamic threshold adjustment

Summary of changes
  • Bug fixes

    • Fixed cluster rate limiting and AI token rate limiting resetting their counters when the rate-limit threshold is modified; request counts and token usage are now accumulated.
    • Optimized the Redis script logic so that existing counter state is handled correctly when the threshold changes.
    • Improved the format of generated rate-limit keys to make them more readable and consistent.
  • Refactoring

    • Refactored the Redis key generation logic in the ai-token-ratelimit and cluster-key-rate-limit plugins to use formatted string constants consistently.
    • Reorganized the Redis Lua scripts to separate request-phase and response-phase handling, improving maintainability.
  • Performance optimization

    • Optimized Redis operations by removing unnecessary key lookups and writes, making rate-limit checks more efficient.
    • Added handling of abnormal TTL values in the Lua scripts so that the rate-limit window stays valid.
  • Test updates

    • Updated the test case descriptions to verify that dynamic threshold adjustment works correctly.
    • Provided detailed test scenarios covering the global rate-limiting mode and the parameter-based rate-limiting mode.

Changed files

| File path | Change description |
| --- | --- |
| plugins/wasm-go/extensions/ai-token-ratelimit/main.go | Optimized the AI token rate-limiting logic to support dynamic threshold adjustment without resetting the count. Refactored Redis key generation and the Lua script logic to improve rate-limiting accuracy. |
| plugins/wasm-go/extensions/cluster-key-rate-limit/main.go | Improved the cluster rate-limiting mechanism to count requests cumulatively, avoiding counter resets when the threshold changes. Updated the Redis key format and the Lua script implementation. |
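The TTL handling mentioned in the summary can be sketched as follows. This is an illustration under assumptions, not the plugin's Lua script: `FakeRedis` is a hypothetical in-memory stand-in, and `count_request` assumes the fix is to restore the window expiry whenever a counter key is found without one. The return values -1 (no expiry) and -2 (missing key) follow the documented behaviour of the Redis TTL command.

```python
import time


class FakeRedis:
    """Tiny in-memory stand-in with INCRBY/TTL/EXPIRE-like methods."""

    def __init__(self, clock=time.monotonic):
        self._values = {}
        self._deadlines = {}
        self._clock = clock

    def _purge(self, key):
        # Drop the key if its expiry deadline has passed.
        deadline = self._deadlines.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._deadlines.pop(key, None)

    def incrby(self, key, amount):
        self._purge(key)
        self._values[key] = self._values.get(key, 0) + amount
        return self._values[key]

    def ttl(self, key):
        self._purge(key)
        if key not in self._values:
            return -2  # key does not exist
        if key not in self._deadlines:
            return -1  # key exists but has no expiry
        return max(int(self._deadlines[key] - self._clock()), 0)

    def expire(self, key, seconds):
        if key in self._values:
            self._deadlines[key] = self._clock() + seconds


def count_request(store, key, window_seconds):
    """Increment the counter and make sure the window always has an expiry."""
    count = store.incrby(key, 1)
    if store.ttl(key) < 0:  # counter without expiry would never reset
        store.expire(key, window_seconds)
    return count
```

Without the expiry check, a counter that lost its TTL would keep accumulating forever and the window would never roll over.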

codecov-commenter commented Oct 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 44.97%. Comparing base (ef31e09) to head (36b1ad9).
⚠️ Report is 739 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2997      +/-   ##
==========================================
+ Coverage   35.91%   44.97%   +9.06%     
==========================================
  Files          69       82      +13     
  Lines       11576    13378    +1802     
==========================================
+ Hits         4157     6017    +1860     
+ Misses       7104     7014      -90     
- Partials      315      347      +32     

see 80 files with indirect coverage changes

@rinfx (Collaborator) left a comment

LGTM

@rinfx rinfx merged commit 1f301be into alibaba:main Oct 15, 2025
17 checks passed