Skip to content

bugfix: fix RM_CHANNELS get NPE when resourceId is empty#5579

Merged
slievrly merged 10 commits intoapache:2.xfrom
dmego:fix_rm_channels_get_npe(2.x)
Jun 14, 2023
Merged

bugfix: fix RM_CHANNELS get NPE when resourceId is empty#5579
slievrly merged 10 commits intoapache:2.xfrom
dmego:fix_rm_channels_get_npe(2.x)

Conversation

@dmego
Copy link
Contributor

@dmego dmego commented May 15, 2023

  • I have registered the PR changes.

Ⅰ. Describe what this PR did

  • RM 注册的时候加了一个 resourceId 的校验,为空时抛出异常。
  • ChannelManager#getChannel 方法中,也加了一个 resourceId 不为空的校验,当为空时直接返回 null

Ⅱ. Does this pull request fix one issue?

fixes #5575

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

}

// ResourceId can not be null or empty
if (StringUtils.isBlank(resourceId)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

是不是在服务端的rmregistry request接收处理时抛异常回去更好,因为这个问题是在seata-go发生的,在服务端处理后就可以避免其他语言都要类似加一个该判断

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

嗯,我改一下

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@a364176773 在服务端RM注册的时候也加了校验,不过发现两个问题:

  1. TM 和 RM 的注册使用的 sendAsyncRequest, 是没有返回值的,所以保留了客户端的校验。
  2. 测试的时候发现 AbstractIdentifyResponseCodec 编解码方法没有先调父类的。导致在客户端获取RegisterRMResponseRegisterTMResponse 时拿不到错误消息和请求成功结果。这个问题我也改了

Copy link
Contributor

@funky-eyes funky-eyes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@funky-eyes funky-eyes added this to the 2.0.0 milestone May 15, 2023
@funky-eyes funky-eyes added type: bug Category issues or prs related to bug. module/core core module module/server server module module/serializer serializer module labels May 15, 2023

@Override
public <T> void encode(T t, ByteBuf out) {
super.encode(t, out);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is it compatible with high and low versions?

low client -> current server
current client -> high server

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

问题应该不在这,而在server注册的时候抛的异常,因为本身旧客户端也没有执行父类的encode,要做的应该是不同版本下对连接时,resourceid为null的处理方式进行改变

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Protocol incompatibility in the current version is not acceptable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已回退这块不兼容代码

}

private static void checkDbKeySet(Set<String> dbKeySet) {
if (dbKeySet == null || dbKeySet.isEmpty()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why throw an exception?
Registering a connection to RM and registering a DBKey are two processes.
The DBKey is allowed to be null when the connection is registered, and the DBKey can be appended to the connection later. Maybe the Datasource was not initialized when RM reconnect registered the connection.


为什么抛出异常?
RM的注册连接和注册DBKey是两个过程。
注册连接时DBKey允许为空,可以在这个连接上后续追加DBKey。因为RM reconnect注册连接时可能Datasource并未初始化完成。

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我看了代码确实是这样,先init连接,后datasource加载再发送registry,rmregistry在注册的时候会合并本地的resourceid,这块逻辑应该先去除比较好,不知道有没有好一点的办法可以提前阻断不正确的连接方式,而不是事后才知晓

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

重新调整了一下代码,如果 "注册连接时DBKey允许为空" 的话,Server 端的校验就不能加上,也不能抛异常。这样的话只能在客户端去加以控制了:

  • 对于 TCC Resource 这种需要手工配置的 resourceId,在注册时加以校验,为空直接抛出异常
  • 对于其他事务的 RM 端在注册时,当校验到 resourceId 为空时,打印 warn 日志,提示风险,待用户发现后可以进行检查
  • TC 端在发起二阶段选择 channel 时,先校验 resourceId 是否为空,是则直接返回 null,避免 NPE 问题

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}
RegisterTMResponse response = new RegisterTMResponse(isSuccess);
response.setIdentified(isSuccess);
if (!isSuccess) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A ResultCode is a system identifier used to indicate whether the process was successful.
identified indicates the business result of the business process.


ResultCode用于表示处理是否成功,是个过程中的系统标识。
identified 表示业务处理的业务结果。

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

我看在 RegisterTMResponse.class 的构造方法里,确实是直接调用的 setIdentified(isSuccess),

Copy link
Member

@slievrly slievrly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@slievrly slievrly merged commit 1dc6a3c into apache:2.x Jun 14, 2023
@slievrly slievrly changed the title bugfix: fix RM_CHANNELS get npe when resourceId is empty bugfix: fix RM_CHANNELS get NPE when resourceId is empty Jun 14, 2023
YvCeung pushed a commit to YvCeung/incubator-seata that referenced this pull request Dec 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module/core core module module/serializer serializer module module/server server module type: bug Category issues or prs related to bug.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RM_CHANNELS get npe

3 participants