diff --git a/docs/src/en/filter/filter.md b/docs/src/en/filter/filter.md
index 88125e50..ec5c5bed 100644
--- a/docs/src/en/filter/filter.md
+++ b/docs/src/en/filter/filter.md
@@ -2,47 +2,82 @@
 outline: deep
 ---
 # Built-in Filter Rules
-RedisShake provides various built-in filter rules that users can choose from according to their needs.
-## Filtering Keys
-RedisShake supports filtering by key name, key name prefixes, and suffixes. You can set the following options in the configuration file, for example:
+RedisShake evaluates filter rules after commands are parsed but before anything is sent to the destination. The filter therefore controls which commands ever leave RedisShake, and only the commands that pass this stage are eligible for further processing by the optional [function](./function.md) hook.
+
+## Where filtering happens
+
+```
+source reader --> filter rules --> (optional Lua function) --> writer / target
+```
+
+* Commands enter the filter after RedisShake has parsed the RESP payload from the reader. At this point the request is already considered valid and would be forwarded if no filters were configured.
+* Filtering happens before any other transformation stage, so blocked commands never reach the optional Lua function or the writer.
+* The stage operates on the same command representation that writers use, which keeps behaviour consistent for all readers.
+
+## How Filter Evaluation Works
+
+1. **Block rules run first.** If a key, database, command, or command group matches a `block_*` rule, the entire entry is dropped immediately.
+2. **Allow lists are optional.** When no `allow_*` rule is configured for a category, everything is permitted by default. As soon as you define an allow list, only the explicitly listed items will pass.
+3. **Multi-key consistency.** Commands with multiple keys (for example, `MSET`) must either pass for all keys or the entry is discarded. RedisShake also emits logs when a mixed result is detected to help you troubleshoot your patterns.
+
+Combining allow and block lists lets you quickly express exceptions such as “allow user keys except temporary cache variants.” Block rules take precedence, so avoid listing the same pattern in both allow and block lists.
+
+## Key Filtering
+
+RedisShake supports filtering by key names, prefixes, suffixes, and regular expressions. For example:
+
 ```toml
 [filter]
-allow_keys = ["user:1001", "product:2001"] # allowed key names
-allow_key_prefix = ["user:", "product:"] # allowed key name prefixes
-allow_key_suffix = [":active", ":valid"] # allowed key name suffixes
-allow_key_regex = [":\\d{11}:"] # allowed key name regex, 11-digit mobile phone number
-block_keys = ["temp:1001", "cache:2001"] # blocked key names
-block_key_prefix = ["temp:", "cache:"] # blocked key name prefixes
-block_key_suffix = [":tmp", ":old"] # blocked key name suffixes
-block_key_regex = [":test:\\d{11}:"] # blocked key name regex, 11-digit mobile phone number with "test" prefix
+allow_keys = ["user:1001", "product:2001"] # allow-listed key names
+allow_key_prefix = ["user:", "product:"] # allow-listed key prefixes
+allow_key_suffix = [":active", ":valid"] # allow-listed key suffixes
+allow_key_regex = [":\\d{11}:"] # allow-listed key regex (11-digit phone numbers)
+block_keys = ["temp:1001", "cache:2001"] # block-listed key names
+block_key_prefix = ["temp:", "cache:"] # block-listed key prefixes
+block_key_suffix = [":tmp", ":old"] # block-listed key suffixes
+block_key_regex = [":test:\\d{11}:"] # block-listed key regex with "test" prefix
 ```
-If these options are not set, all keys are allowed by default.
-## Filtering Databases
-You can specify allowed or blocked database numbers, for example:
+Regular expressions follow Go’s syntax. Escape backslashes carefully when writing inline TOML strings. Regex support allows complex tenant-isolation scenarios, such as filtering phone numbers or shard identifiers.
+
+## Database Filtering
+
+Limit synchronization to specific logical databases or skip known noisy ones:
+
 ```toml
 [filter]
 allow_db = [0, 1, 2]
 block_db = [3, 4, 5]
 ```
-If these options are not set, all databases are allowed by default.
-## Filtering Commands
-RedisShake allows you to filter specific Redis commands, for example:
+If neither `allow_db` nor `block_db` is set, all databases are synchronized.
+
+## Command and Command-Group Filtering
+
+Restrict the traffic by command name or by the Redis command group. This is useful when the destination lacks support for scripting or cluster administration commands.
+
 ```toml
 [filter]
 allow_command = ["GET", "SET"]
 block_command = ["DEL", "FLUSHDB"]
-```
-
-## Filtering Command Groups
-You can also filter by command groups. Available command groups include:
-SERVER, STRING, CLUSTER, CONNECTION, BITMAP, LIST, SORTED_SET, GENERIC, TRANSACTIONS, SCRIPTING, TAIRHASH, TAIRSTRING, TAIRZSET, GEO, HASH, HYPERLOGLOG, PUBSUB, SET, SENTINEL, STREAM
-For example:
-```toml
-[filter]
 allow_command_group = ["STRING", "HASH"]
 block_command_group = ["SCRIPTING", "PUBSUB"]
 ```
+
+Command groups follow the [Redis command key specifications](https://redis.io/docs/reference/key-specs/). Use groups to efficiently exclude entire data structures (for example, block `SCRIPTING` to avoid unsupported Lua scripts when synchronizing to a cluster).
+
+## Configuration Reference
+
+| Option | Type | Description |
+| --- | --- | --- |
+| `allow_keys` / `block_keys` | `[]string` | Exact key names to allow or block. |
+| `allow_key_prefix` / `block_key_prefix` | `[]string` | Filter keys by prefix. |
+| `allow_key_suffix` / `block_key_suffix` | `[]string` | Filter keys by suffix. |
+| `allow_key_regex` / `block_key_regex` | `[]string` | Regular expressions evaluated against the full key. |
+| `allow_db` / `block_db` | `[]int` | Logical database numbers to include or exclude. |
+| `allow_command` / `block_command` | `[]string` | Redis command names. |
+| `allow_command_group` / `block_command_group` | `[]string` | Redis command groups such as `STRING`, `HASH`, `SCRIPTING`. |
+
+All options are optional. When both an allow and block rule apply to the same category, block rules win. Keep configurations symmetrical across active/standby clusters to avoid asymmetric data drops during failover.
diff --git a/docs/src/en/filter/function.md b/docs/src/en/filter/function.md
index d79cdfea..3c594e5c 100644
--- a/docs/src/en/filter/function.md
+++ b/docs/src/en/filter/function.md
@@ -4,15 +4,28 @@ outline: deep
 # What is function
-RedisShake provides a function feature that implements the `transform` capability in [ETL (Extract-Transform-Load)](https://en.wikipedia.org/wiki/Extract,_transform,_load). By utilizing functions, you can achieve similar functionalities:
-* Change the `db` to which data belongs, for example, writing data from source `db 0` to destination `db 1`.
-* Filter data, for instance, only writing source data with keys starting with `user:` to the destination.
-* Modify key prefixes, such as writing a source key `prefix_old_key` to a destination key `prefix_new_key`.
-* ...
+The **function** option extends the `[filter]` section with a Lua hook. Built-in filter rules run first to decide whether a command should leave RedisShake; only the surviving commands enter the Lua function, where you can reshape, split, or enrich them before they reach the destination. This hook is intended for lightweight adjustments that are difficult to express with static allow/block lists.
-To use the function feature, you only need to write a Lua script. After RedisShake retrieves data from the source, it converts the data into Redis commands. Then, it processes these commands, parsing information such as `KEYS`, `ARGV`, `SLOTS`, `GROUP`, and passes this information to the Lua script. The Lua script processes this data and returns the processed commands. Finally, RedisShake writes the processed data to the destination.
+With the function feature you can:
+
+* Change the database (`db`) to which data belongs (for example, write source `db 0` into destination `db 1`).
+* Filter or drop specific data, keeping only keys that match custom business rules.
+* Rewrite commands, such as expanding `MSET` into multiple `SET` commands or adding new key prefixes.
+* Emit additional commands (for metrics or cache warming) derived from the incoming data stream.
+
+## Execution Flow
+
+1. RedisShake retrieves commands from the reader and parses metadata such as command name, keys, key slots, and group.
+2. Built-in filter rules evaluate the command. Anything blocked here never reaches Lua or the writer.
+3. For the remaining entries, RedisShake creates a Lua state and exposes read-only context variables (`DB`, `CMD`, `KEYS`, and so on) plus helper functions under the `shake` table.
+4. Your Lua code decides which commands to send downstream by calling `shake.call` zero or more times.
+
+If your script does not invoke `shake.call`, the original command is suppressed. This makes it easy to implement drop-and-replace logic, but also means forgetting a `shake.call` will silently discard data. Always add logging while testing.
+
+## Quick Start
+
+Place the Lua script inline in the `[filter]` section of the configuration file:
-Here's a specific example:
 ```toml
 [filter]
 function = """
@@ -30,32 +43,37 @@ address = "127.0.0.1:6379"
 [redis_writer]
 address = "127.0.0.1:6380"
 ```
-`DB` is information provided by RedisShake, indicating the db to which the current data belongs. `shake.log` is used for logging, and `shake.call` is used to call Redis commands. The purpose of the above script is to discard data from source `db 0` and write data from other `db`s to the destination.
-In addition to `DB`, there is other information such as `KEYS`, `ARGV`, `SLOTS`, `GROUP`, and available functions include `shake.log` and `shake.call`. For details, please refer to [function API](#function-api).
+`DB` is information provided by RedisShake, indicating the database to which the current data belongs. `shake.log` is used for logging, and `shake.call` emits a Redis command to the destination. The above script discards data from source `db 0` and forwards data from the other databases.
 ## function API
 ### Variables
-Because some commands contain multiple keys, such as the `mset` command, the variables `KEYS`, `KEY_INDEXES`, and `SLOTS` are all array types. If you are certain that a command has only one key, you can directly use `KEYS[1]`, `KEY_INDEXES[1]`, `SLOTS[1]`.
+Because some commands contain multiple keys, such as `MSET`, the variables `KEYS`, `KEY_INDEXES`, and `SLOTS` are all array types. If you are certain that a command has only one key, you can directly use `KEYS[1]`, `KEY_INDEXES[1]`, and `SLOTS[1]`.
 | Variable | Type | Example | Description |
-|-|-|-|-----|
-| DB | number | 1 | The `db` to which the command belongs |
-| GROUP | string | "LIST" | The `group` to which the command belongs, conforming to [Command key specifications](https://redis.io/docs/reference/key-specs/). You can check the `group` field for each command in [commands](https://github.com/tair-opensource/RedisShake/tree/v4/scripts/commands) |
-| CMD | string | "XGROUP-DELCONSUMER" | The name of the command |
-| KEYS | table | {"key1", "key2"} | All keys of the command |
-| KEY_INDEXES | table | {2, 4} | The indexes of all keys in `ARGV` |
-| SLOTS | table | {9189, 4998} | The [slots](https://redis.io/docs/reference/cluster-spec/#key-distribution-model) to which all keys of the current command belong |
-| ARGV | table | {"mset", "key1", "value1", "key2", "value2"} | All parameters of the command |
+| --- | --- | --- | --- |
+| `DB` | number | `1` | The database to which the command belongs. |
+| `CMD` | string | `"XGROUP-DELCONSUMER"` | The name of the command. |
+| `GROUP` | string | `"LIST"` | The command group, conforming to [Command key specifications](https://redis.io/docs/reference/key-specs/). You can check the `group` field for each command in [commands](https://github.com/tair-opensource/RedisShake/tree/v4/scripts/commands). |
+| `KEYS` | table | `{"key1", "key2"}` | All keys of the command. |
+| `KEY_INDEXES` | table | `{2, 4}` | Indexes of all keys inside `ARGV`. |
+| `SLOTS` | table | `{9189, 4998}` | Hash slots of the keys (cluster mode). |
+| `ARGV` | table | `{"mset", "key1", "value1", "key2", "value2"}` | All command arguments, including the command name at index `1`. |
 ### Functions
-* `shake.call(DB, ARGV)`: Returns a Redis command that RedisShake will write to the destination.
-* `shake.log(msg)`: Prints logs.
+
+* `shake.call(db, argv_table)`: Emits a command to the writer. The first element of `argv_table` must be the command name. You can call `shake.call` multiple times to split one input into several outputs (for example, expand `MSET` into multiple `SET`).
+* `shake.log(msg)`: Prints logs prefixed with `lua log:` in `shake.log`. Use this to verify script behaviour during testing.
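The variables and functions listed above can be exercised outside RedisShake with a small dry-run harness. The sketch below is only an illustration, not part of RedisShake: the `emitted` table and the stubbed `shake` functions are hypothetical stand-ins, and the sample values are copied from the Variables table.

```lua
-- Offline dry-run harness (hypothetical): stubs the globals the docs describe
-- so a filter script can be exercised with a plain Lua interpreter.
emitted = {}

shake = {
  -- Record each emitted command instead of sending it to a writer.
  call = function(db, argv)
    emitted[#emitted + 1] = { db = db, argv = argv }
  end,
  -- Print with the prefix the docs mention for RedisShake's own log.
  log = function(msg)
    print("lua log: " .. tostring(msg))
  end,
}

-- Sample context for one MSET command, copied from the Variables table.
DB = 1
CMD = "MSET"
GROUP = "STRING"
ARGV = { "mset", "key1", "value1", "key2", "value2" }
KEYS = { "key1", "key2" }
KEY_INDEXES = { 2, 4 }
SLOTS = { 9189, 4998 }

-- Script under test: expand MSET into one SET per key/value pair.
local function script()
  if CMD == "MSET" then
    for i = 2, #ARGV, 2 do
      shake.call(DB, { "SET", ARGV[i], ARGV[i + 1] })
    end
    return
  end
  shake.call(DB, ARGV)
end

script()
shake.log("emitted " .. #emitted .. " command(s)")
```

Running a script this way only checks its own branching logic; it says nothing about how RedisShake itself parses commands or enforces read-only variables.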
 ## Best Practices
+### General Recommendations
+
+* **Keep scripts idempotent.** RedisShake may retry commands, so ensure the emitted commands do not rely on side effects.
+* **Guard against missing keys.** Always check whether `KEYS[1]` exists before slicing to avoid runtime errors with keyless commands such as `PING`.
+* **Prefer simple logic.** Complex loops increase Lua VM time and can slow down synchronization. Offload heavy transformations to upstream processes when possible.
 ### Filtering Keys
@@ -63,14 +81,14 @@ Because some commands contain multiple keys, such as the `mset` command, the var
 local prefix = "user:"
 local prefix_len = #prefix
-if string.sub(KEYS[1], 1, prefix_len) ~= prefix then
+if not KEYS[1] or string.sub(KEYS[1], 1, prefix_len) ~= prefix then
   return
 end
 shake.call(DB, ARGV)
 ```
-The effect is to only write source data with keys starting with `user:` to the destination. This doesn't consider cases of multi-key commands like `mset`.
+The effect is to only write source data with keys starting with `user:` to the destination. This does not consider cases of multi-key commands like `MSET`.
 ### Filtering DB
@@ -85,12 +103,12 @@ shake.call(DB, ARGV)
 The effect is to discard data from source `db 0` and write data from other `db`s to the destination.
-
 ### Filtering Certain Data Structures
-You can use the `GROUP` variable to determine the data structure type. Supported data structure types include: `STRING`, `LIST`, `SET`, `ZSET`, `HASH`, `SCRIPTING`, etc.
+You can use the `GROUP` variable to determine the data structure type. Supported data structure types include `STRING`, `LIST`, `SET`, `ZSET`, `HASH`, `SCRIPTING`, and more.
 #### Filtering Hash Type Data
+
 ```lua
 if GROUP == "HASH" then
   return
@@ -100,7 +118,7 @@ shake.call(DB, ARGV)
 The effect is to discard `hash` type data from the source and write other data to the destination.
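When several groups need to be dropped at once, the single `GROUP == "HASH"` check can be generalized to a table lookup. This is a minimal sketch under stated assumptions: `blocked_groups`, `run`, and `calls` are illustrative names, and the first section only provides stand-ins so the snippet runs in a plain Lua interpreter.

```lua
-- Stand-ins so the sketch runs outside RedisShake; drop this section
-- when pasting the script into the `function` option.
calls = 0
shake = { call = function(db, argv) calls = calls + 1 end }
DB, GROUP, ARGV = 0, "HASH", { "hset", "h", "f", "v" }

-- Hypothetical set of groups to drop; a lookup table replaces chained `if`s.
local blocked_groups = { HASH = true, SCRIPTING = true }

local function run()
  if blocked_groups[GROUP] then
    return -- not calling shake.call drops the command
  end
  shake.call(DB, ARGV)
end

run()                        -- GROUP is "HASH": dropped
GROUP = "STRING"
ARGV = { "set", "k", "v" }
run()                        -- GROUP is "STRING": forwarded
print(calls)
```

For purely static lists like this one, the built-in `block_command_group` option is likely the simpler choice; a Lua lookup is mainly useful when the decision also depends on other variables.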
-#### Filtering [LUA Scripts](https://redis.io/docs/interact/programmability/eval-intro/)
+#### Filtering [Lua Scripts](https://redis.io/docs/interact/programmability/eval-intro/)
 ```lua
 if GROUP == "SCRIPTING" then
@@ -109,7 +127,22 @@ end
 shake.call(DB, ARGV)
 ```
-The effect is to discard `lua` scripts from the source and write other data to the destination. This is common when synchronizing from master-slave to cluster, where there are LUA scripts not supported by the cluster.
+The effect is to discard Lua scripts from the source and write other data to the destination. This is common when synchronizing from master-slave to cluster, where there are Lua scripts not supported by the cluster.
+
+### Splitting Commands
+
+```lua
+if CMD == "MSET" then
+  for i = 2, #ARGV, 2 do
+    shake.call(DB, {"SET", ARGV[i], ARGV[i + 1]})
+  end
+  return
+end
+
+shake.call(DB, ARGV)
+```
+
+This pattern expands one `MSET` into several `SET` commands to improve compatibility with destinations that prefer single-key writes.
 ### Modifying Key Prefixes
@@ -119,9 +152,9 @@ local prefix_new = "prefix_new_"
 shake.log("old=" .. table.concat(ARGV, " "))
-for i, index in ipairs(KEY_INDEXES) do
+for _, index in ipairs(KEY_INDEXES) do
   local key = ARGV[index]
-  if string.sub(key, 1, #prefix_old) == prefix_old then
+  if key and string.sub(key, 1, #prefix_old) == prefix_old then
     ARGV[index] = prefix_new .. string.sub(key, #prefix_old + 1)
   end
 end
@@ -129,10 +162,11 @@ end
 shake.log("new=" .. table.concat(ARGV, " "))
 shake.call(DB, ARGV)
 ```
+
 The effect is to write the source key `prefix_old_key` to the destination key `prefix_new_key`.
 ### Swapping DBs
-
+
 ```lua
 local db1 = 1
 local db2 = 2
@@ -146,3 +180,9 @@ shake.call(DB, ARGV)
 ```
 The effect is to write source `db 1` to destination `db 2`, write source `db 2` to destination `db 1`, and leave other `db`s unchanged.
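The DB swap described above can also be written as a lookup table, which extends to more than two databases without extra branches. This is a sketch rather than the documented script: `db_map`, `run`, and `emitted_db` are hypothetical names, and the stubbed `shake.call` exists only so the snippet runs on its own.

```lua
-- Stand-ins so the sketch runs outside RedisShake; drop this section
-- when pasting the script into the `function` option.
emitted_db = nil
shake = { call = function(db, argv) emitted_db = db end }
ARGV = { "set", "k", "v" }

-- Hypothetical source-to-destination mapping; unlisted databases pass through.
local db_map = { [1] = 2, [2] = 1 }

local function run()
  -- 0 is truthy in Lua, so a mapping to db 0 would also work with `or`.
  shake.call(db_map[DB] or DB, ARGV)
end

DB = 1
run()
print(emitted_db) -- source db 1 goes to destination db 2

DB = 3
run()
print(emitted_db) -- db 3 is not in the map and stays db 3
```

Keeping the mapping in one table makes it easier to review than a chain of `if`/`elseif` branches when the number of remapped databases grows.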
+
+## Troubleshooting
+
+* **Script fails to compile:** RedisShake validates the Lua code during startup and panics on syntax errors. Check the configuration logs for the exact line number.
+* **No data reaches the destination:** Ensure that `shake.call` is invoked for every branch. Adding `shake.log` statements helps confirm which code path runs.
+* **Performance drops:** Heavy scripts may become CPU-bound. Consider narrowing the scope with filters or moving expensive operations out of RedisShake.
diff --git a/docs/src/en/guide/config.md b/docs/src/en/guide/config.md
index 53b353d6..f1bc69ea 100644
--- a/docs/src/en/guide/config.md
+++ b/docs/src/en/guide/config.md
@@ -39,7 +39,12 @@ RedisShake provides different Writers to interface with different targets, see t
 ## filter Configuration
-You can set filter rules through the configuration file. Refer to [Filter and Processing](../filter/filter.md) and [function](../filter/function.md).
+The `[filter]` section contains two layers:
+
+* **Rule engine:** Configure `allow_*` and `block_*` lists to keep or drop keys, databases, commands, and command groups. See [Filter and Processing](../filter/filter.md) for detailed semantics and examples.
+* **Lua function hook:** Provide inline Lua code via the `function` option to rewrite commands after they pass the rule engine. See [function](../filter/function.md) for API details and best practices.
+
+Filters always run before the Lua hook. Commands blocked by the rule engine never enter the script or reach the writer, so you can reserve the Lua layer for the smaller, approved subset of traffic.
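 ## advanced Configuration
diff --git a/docs/src/zh/filter/filter.md b/docs/src/zh/filter/filter.md
index ddc000e6..797ee1ec 100644
--- a/docs/src/zh/filter/filter.md
+++ b/docs/src/zh/filter/filter.md
@@ -2,47 +2,82 @@
 outline: deep
 ---
 # Built-in Filter Rules
-RedisShake provides a variety of built-in filter rules; users can choose the ones that fit their needs.
+
+RedisShake applies filter rules after a command has been parsed and before it is written to the target. The filter decides which commands can leave RedisShake, and only commands that pass this stage go on to the optional [function](./function.md) hook for further processing.
+
+## Where Filtering Happens
+
+```
+source reader --> filter rules --> (optional Lua function) --> writer / target
+```
+
+* Commands enter the filter stage after the reader has parsed the RESP payload. At this point the request is already known to be valid and would be forwarded directly if no filters were configured.
+* Filtering runs before any other processing stage, so intercepted commands are not passed to the optional Lua script or the writer.
+* This stage uses the same command representation as the writer, so behavior is consistent across all readers.
+
+## How Filtering Works
+
+1. **Block rules run first.** As soon as any `block_*` rule (key, DB, command, or command group) matches, the entire command is dropped immediately.
+2. **Allow lists are optional.** When no `allow_*` rule is configured for a category, everything in that category passes by default. Once an allow list is configured, only explicitly listed items pass.
+3. **Multi-key commands must pass in full.** For multi-key commands such as `MSET`, every key must satisfy the filter conditions; otherwise the whole command is dropped and the mixed result is reported in the log to help troubleshoot the configuration.
+
+Combining allow and block rules makes it quick to express requirements such as "allow the user prefix but exclude temporary caches". Block rules take precedence, so avoid putting the same pattern in both the allow and block lists.
 ## Filtering Keys
-RedisShake supports filtering by key name, key name prefix, and key name suffix. You can set the following options in the configuration file, for example:
+
+RedisShake supports filtering by key name, prefix, suffix, and regular expression, for example:
+
 ```toml
 [filter]
-allow_keys = ["user:1001", "product:2001"] # allowed key names
-allow_key_prefix = ["user:", "product:"] # allowed key name prefixes
-allow_key_suffix = [":active", ":valid"] # allowed key name suffixes
-allow_key_regex = [":\\d{11}:"] # allowed key name regex, 11-digit mobile phone number
-block_keys = ["temp:1001", "cache:2001"] # blocked key names
-block_key_prefix = ["temp:", "cache:"] # blocked key name prefixes
-block_key_suffix = [":tmp", ":old"] # blocked key name suffixes
-block_key_regex = [":test:\\d{11}:"] # blocked key name regex, 11-digit mobile phone number with "test" prefix
+allow_keys = ["user:1001", "product:2001"] # allowed key names
+allow_key_prefix = ["user:", "product:"] # allowed key name prefixes
+allow_key_suffix = [":active", ":valid"] # allowed key name suffixes
+allow_key_regex = [":\\d{11}:"] # allowed key name regex (11-digit mobile phone number)
+block_keys = ["temp:1001", "cache:2001"] # blocked key names
+block_key_prefix = ["temp:", "cache:"] # blocked key name prefixes
+block_key_suffix = [":tmp", ":old"] # blocked key name suffixes
+block_key_regex = [":test:\\d{11}:"] # blocked key name regex, with test prefix
 ```
-If these options are not set, all keys are allowed by default.
+
+Regular expressions use Go syntax; take care with backslash escaping when writing inline TOML. Regular expressions make it possible to implement scenarios such as tenant isolation or filtering by number.
 ## Filtering Databases
-You can specify allowed or blocked database numbers, for example:
+
+You can restrict which logical databases are synchronized, or skip databases known to be noisy:
+
 ```toml
 [filter]
 allow_db = [0, 1, 2]
 block_db = [3, 4, 5]
 ```
-If these options are not set, all databases are allowed by default.
-## Filtering Commands
-RedisShake lets you filter specific Redis commands, for example:
+If neither `allow_db` nor `block_db` is configured, all databases are synchronized by default.
+
+## Filtering Commands and Command Groups
+
+You can restrict traffic by command name or Redis command group, which is commonly used when the target does not support certain scripting or administration commands.
+
 ```toml
 [filter]
 allow_command = ["GET", "SET"]
 block_command = ["DEL", "FLUSHDB"]
-```
-
-## Filtering Command Groups
-You can also filter by command group. The available command groups include:
-SERVER, STRING, CLUSTER, CONNECTION, BITMAP, LIST, SORTED_SET, GENERIC, TRANSACTIONS, SCRIPTING, TAIRHASH, TAIRSTRING, TAIRZSET, GEO, HASH, HYPERLOGLOG, PUBSUB, SET, SENTINEL, STREAM
-For example:
-```toml
-[filter]
 allow_command_group = ["STRING", "HASH"]
 block_command_group = ["SCRIPTING", "PUBSUB"]
 ```
+
+Command groups follow the [Redis command key specifications](https://redis.io/docs/reference/key-specs/). Command groups let you quickly filter out a whole class of data structures, for example blocking `SCRIPTING` when migrating to a cluster to avoid Lua scripts the target does not support.
+
+## Configuration Quick Reference
+
+| Option | Type | Description |
+| --- | --- | --- |
+| `allow_keys` / `block_keys` | `[]string` | Allow list / block list of exactly matched key names. |
+| `allow_key_prefix` / `block_key_prefix` | `[]string` | Filter by key name prefix. |
+| `allow_key_suffix` / `block_key_suffix` | `[]string` | Filter by key name suffix. |
+| `allow_key_regex` / `block_key_regex` | `[]string` | Match the full key name with a regular expression. |
+| `allow_db` / `block_db` | `[]int` | Logical database numbers to include or exclude. |
+| `allow_command` / `block_command` | `[]string` | Command names to allow or block. |
+| `allow_command_group` / `block_command_group` | `[]string` | Command groups to allow or block, such as `STRING`, `HASH`, `SCRIPTING`. |
+
+All of the options above are optional; when an allow rule and a block rule both match, the block rule wins. Keep the filter configuration consistent between primary and standby instances to avoid data differences after a switchover.
diff --git a/docs/src/zh/filter/function.md b/docs/src/zh/filter/function.md
index 069ad562..751c5c66 100644
--- a/docs/src/zh/filter/function.md
+++ b/docs/src/zh/filter/function.md
@@ -4,16 +4,30 @@ outline: deep
 # What is function
-RedisShake provides the function feature to implement the `transform` capability of [ETL (Extract-Transform-Load)](https://en.wikipedia.org/wiki/Extract,_transform,_load). With function you can implement features such as:
-* Changing the `db` to which data belongs, for example writing source `db 0` to destination `db 1`.
-* Filtering data, for example writing only source data whose key starts with `user:` to the target.
-* Changing key prefixes, for example writing the source key `prefix_old_key` to the target key `prefix_new_key`.
-* ...
+The **function** option is the Lua hook of the `[filter]` configuration section. Built-in filter rules first decide whether a command is allowed to leave RedisShake; only commands that pass the filter enter the Lua script, where they can be rewritten, split, or enriched before being written to the target. The hook is suited to lightweight transformations that static allow/block rules cannot easily cover.
-To use the function feature, you only need to write a Lua script. After RedisShake fetches data from the source, it converts the data into Redis commands. It then processes these commands, parses out information such as `KEYS`, `ARGV`, `SLOTS`, and `GROUP`, and passes this information to the Lua script. The Lua script processes the data and returns the processed commands. Finally, RedisShake writes the processed data to the target.
+With function you can:
+
+* Change the `db` to which data belongs, for example writing source `db 0` to target `db 1`.
+* Filter or drop part of the data according to business rules.
+* Rewrite commands, for example splitting `MSET` into multiple `SET` commands or adding a new key prefix.
+* Generate additional commands based on the input data (such as writing metrics or warming a cache).
+
+## Execution Flow
+
+1. RedisShake fetches commands from the Reader and parses out the command name, keys, slots, command group, and similar information.
+2. Built-in filter rules evaluate the command first; commands that do not pass are not handed to Lua or the writer.
+3. For the remaining data, RedisShake creates a Lua virtual machine and injects read-only context variables (such as `DB`, `CMD`, `KEYS`) along with the `shake` helper functions.
+4. The Lua script can call `shake.call` multiple times to decide which commands are sent downstream.
+
+If the script does not call `shake.call`, the original command is not written to the target. This makes drop-and-replace logic easy to implement, but it also means that forgetting to call `shake.call` silently loses data, so be sure to use logging while testing.
+
+## Quick Start
+
+Write the Lua script in the `[filter]` section of the configuration file:
-Here is a concrete example:
 ```toml
+[filter]
 function = """
 shake.log(DB)
 if DB == 0
@@ -29,32 +43,37 @@ address = "127.0.0.1:6379"
 [redis_writer]
 address = "127.0.0.1:6380"
 ```
-`DB` is information provided by RedisShake, indicating the db to which the current data belongs. `shake.log` prints logs, and `shake.call` calls Redis commands. The purpose of the script above is to drop data from source `db` 0 and write data from the other `db`s to the target.
-Besides `DB`, there is other information such as `KEYS`, `ARGV`, `SLOTS`, and `GROUP`; the callable functions are `shake.log` and `shake.call`. For details, see [function API](#function-api).
+`DB` indicates the database to which the current command belongs; `shake.log` prints logs, and `shake.call` writes a command to the target. The script above drops data from source `db 0` and synchronizes data from the other databases.
 ## function API
 ### Variables
-Because some commands contain multiple keys, such as `mset`, the three variables `KEYS`, `KEY_INDEXES`, and `SLOTS` are all array types. If you are sure a command has only one key, you can use `KEYS[1]`, `KEY_INDEXES[1]`, and `SLOTS[1]` directly.
+Because commands such as `MSET` contain multiple keys, `KEYS`, `KEY_INDEXES`, and `SLOTS` are all array types. If you are sure a command has only one key, you can use `KEYS[1]`, `KEY_INDEXES[1]`, and `SLOTS[1]` directly.
 | Variable | Type | Example | Description |
-|-|-|-|-----|
-| DB | number | 1 | The `db` to which the command belongs |
-| GROUP | string | "LIST" | The `group` to which the command belongs, conforming to [Command key specifications](https://redis.io/docs/reference/key-specs/). You can look up the `group` field of each command in [commands](https://github.com/tair-opensource/RedisShake/tree/v4/scripts/commands) |
-| CMD | string | "XGROUP-DELCONSUMER" | The name of the command |
-| KEYS | table | \{"key1", "key2"\} | All keys of the command |
-| KEY_INDEXES | table | \{2, 4\} | The indexes of all keys of the command in `ARGV` |
-| SLOTS | table | \{9189, 4998\} | The [slot](https://redis.io/docs/reference/cluster-spec/#key-distribution-model) to which each key of the current command belongs |
-| ARGV | table | \{"mset", "key1", "value1", "key2", "value2"\} | All arguments of the command |
+| --- | --- | --- | --- |
+| `DB` | number | `1` | The database to which the command belongs |
+| `CMD` | string | `"XGROUP-DELCONSUMER"` | The command name |
+| `GROUP` | string | `"LIST"` | The group to which the command belongs, conforming to [Command key specifications](https://redis.io/docs/reference/key-specs/); see [commands](https://github.com/tair-opensource/RedisShake/tree/v4/scripts/commands) |
+| `KEYS` | table | `{"key1", "key2"}` | All keys contained in the command |
+| `KEY_INDEXES` | table | `{2, 4}` | The indexes of all keys in `ARGV` |
+| `SLOTS` | table | `{9189, 4998}` | The slots of all keys of the current command (cluster mode) |
+| `ARGV` | table | `{"mset", "key1", "value1", "key2", "value2"}` | All arguments of the command; index `1` is the command name |
 ### Functions
-* `shake.call(DB, ARGV)`: Returns a Redis command, which RedisShake writes to the target.
-* `shake.log(msg)`: Prints logs.
+
+* `shake.call(db, argv_table)`: Emits a command to the writer. The first element of `argv_table` must be the command name; it can be called multiple times to split a command, for example splitting `MSET` into multiple `SET` commands.
+* `shake.log(msg)`: Writes a log entry prefixed with `lua log:` to `shake.log`; useful for debugging scripts.
 ## Best Practices
+### General Recommendations
+
+* **Stay idempotent.** RedisShake may retry commands, so scripts should avoid relying on side effects that cannot be repeated.
+* **Watch for missing keys.** Some commands (such as `PING`) have no key; check for nil before accessing `KEYS[1]` to avoid script runtime errors.
+* **Keep scripts as simple as possible.** Complex loops increase Lua VM execution time; consider narrowing the scope with filters or doing heavy computation outside the pipeline.
 ### Filtering Keys
@@ -62,14 +81,14 @@ address = "127.0.0.1:6380"
 local prefix = "user:"
 local prefix_len = #prefix
-if string.sub(KEYS[1], 1, prefix_len) ~= prefix then
+if not KEYS[1] or string.sub(KEYS[1], 1, prefix_len) ~= prefix then
   return
 end
 shake.call(DB, ARGV)
 ```
-The effect is to write only source data whose key starts with `user:` to the target. Multi-key commands such as `mset` are not considered.
+The effect is to write only source data whose key starts with `user:` to the target; multi-key cases such as `MSET` are not considered.
 ### Filtering DB
@@ -82,14 +101,14 @@ end
 shake.call(DB, ARGV)
 ```
-The effect is to drop data from source `db` 0 and write data from the other `db`s to the target.
-
+The effect is to drop data from source `db 0` and write data from the other `db`s to the target.
 ### Filtering Certain Data Structures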
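-You can use the `GROUP` variable to determine the data structure type. Supported data structure types are: `STRING`, `LIST`, `SET`, `ZSET`, `HASH`, `SCRIPTING`, etc.
+You can use the `GROUP` variable to determine the data structure type; `STRING`, `LIST`, `SET`, `ZSET`, `HASH`, `SCRIPTING`, and others are supported.
 #### Filtering Hash Type Data
+
 ```lua
 if GROUP == "HASH" then
   return
@@ -99,7 +118,7 @@ shake.call(DB, ARGV)
 The effect is to drop `hash` type data from the source and write other data to the target.
-#### Filtering [LUA Scripts](https://redis.io/docs/interact/programmability/eval-intro/)
+#### Filtering [Lua Scripts](https://redis.io/docs/interact/programmability/eval-intro/)
 ```lua
 if GROUP == "SCRIPTING" then
@@ -108,9 +127,24 @@ end
 shake.call(DB, ARGV)
 ```
-The effect is to drop `lua` scripts from the source and write other data to the target. This is common when synchronizing from master-slave to a cluster, where there are LUA scripts the cluster does not support.
+The effect is to drop Lua scripts from the source and write other data to the target. This is common when synchronizing from master-slave to a cluster, where the target does not support some scripts.
+
+### Splitting Commands
+
+```lua
+if CMD == "MSET" then
+  for i = 2, #ARGV, 2 do
+    shake.call(DB, {"SET", ARGV[i], ARGV[i + 1]})
+  end
+  return
+end
+
+shake.call(DB, ARGV)
+```
+
+This pattern splits one `MSET` into several `SET` commands, which suits targets that can only handle single-key writes.
-### Modifying the Prefix of Keys
+### Modifying Key Prefixes
 ```lua
 local prefix_old = "prefix_old_"
@@ -118,9 +152,9 @@ local prefix_new = "prefix_new_"
 shake.log("old=" .. table.concat(ARGV, " "))
-for i, index in ipairs(KEY_INDEXES) do
+for _, index in ipairs(KEY_INDEXES) do
   local key = ARGV[index]
-  if string.sub(key, 1, #prefix_old) == prefix_old then
+  if key and string.sub(key, 1, #prefix_old) == prefix_old then
     ARGV[index] = prefix_new .. string.sub(key, #prefix_old + 1)
   end
 end
@@ -128,10 +162,11 @@ end
 shake.log("new=" .. table.concat(ARGV, " "))
 shake.call(DB, ARGV)
 ```
+
 The effect is to write the source key `prefix_old_key` to the target key `prefix_new_key`.
 ### Swapping DBs
-
+
 ```lua
 local db1 = 1
 local db2 = 2
@@ -144,4 +179,10 @@ end
 shake.call(DB, ARGV)
 ```
-The effect is to write source `db 1` to target `db 2`, write source `db 2` to target `db 1`, and leave other `db`s unchanged.
\ No newline at end of file
+The effect is to write source `db 1` to target `db 2`, write source `db 2` to target `db 1`, and leave other `db`s unchanged.
+
+## Troubleshooting
+
+* **Script fails to compile:** RedisShake compiles the script ahead of time at startup and exits on syntax errors; check the line number given in the log.
+* **No data at the target:** Make sure every branch calls `shake.call`, and use `shake.log` to print key information for verification.
+* **Performance drops:** An overly heavy script can become a bottleneck; narrow the input with filters or move complex computation out of RedisShake.
diff --git a/docs/src/zh/guide/config.md b/docs/src/zh/guide/config.md
index 4f311451..630134f6 100644
--- a/docs/src/zh/guide/config.md
+++ b/docs/src/zh/guide/config.md
@@ -41,7 +41,12 @@ RedisShake 提供了不同的 Writer 用来对接不同的目标端,配置详
 ## filter Configuration
-Filter rules can be set through the configuration file; see [Filter and Processing](../filter/filter.md) and [function](../filter/function.md).
+The `[filter]` configuration section provides two layers of capability:
+
+* **Rule filter:** Use the `allow_*` and `block_*` lists to control which keys, databases, commands, or command groups are synchronized. See [Filter and Processing](../filter/filter.md) for detailed semantics and examples.
+* **Lua function hook:** Write inline Lua code in the `function` option to rewrite or split commands that pass the rule filter. See [function](../filter/function.md) for more on the API and best practices.
+
+Filters always run before Lua. Commands intercepted by the rules neither enter the script nor get written to the target, which limits Lua processing to the small amount of traffic that has already been allowed.
 ## advanced Configuration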