[WIP] High: fenced: Registration of a STONITH device without placement constraints fails. #3849
Conversation
@HideoYamauchi What do you mean by placement constraint? A location constraint (in the CIB's constraints section)?
Sorry, yes. It seems that the problem occurs if you do not specify a location rule (placement score). Since there are no issues up to 2.1.9, I think this is some kind of problem with the state transition calculation.

Thinking about it a bit more: with just this fix, I don't think it will work well when -INFINITY constraints are mixed in. ---> I checked a bit, and it seems that including a -INFINITY location constraint doesn't cause any problems.

Best Regards,
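For readers unfamiliar with the terminology, here is a minimal sketch of the kind of placement constraint being discussed, assuming a hypothetical STONITH resource named scsi-fence and node name node1 (both illustrative, not taken from the attached configuration):

```xml
<!-- Node-scoped location constraint with a positive placement score -->
<rsc_location id="loc-scsi-fence" rsc="scsi-fence" node="node1" score="100"/>

<!-- Equivalent rule-based form; a -INFINITY score would instead
     forbid the device from running on the matched node -->
<rsc_location id="loc-scsi-fence-rule" rsc="scsi-fence">
  <rule id="loc-scsi-fence-rule-1" score="100">
    <expression id="loc-scsi-fence-rule-1-expr"
                attribute="#uname" operation="eq" value="node1"/>
  </rule>
</rsc_location>
```

The report is that device registration fails only when no such constraint exists for the STONITH resource.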
When you don't see the fence resource started, would a fencing action like off or reboot work?
Thank you for your comment. The problem here is the registration of STONITH devices with fenced. I know I'm repeating myself, but this problem is probably a regression. Best Regards,
Yep, that answers my question. It seems to be an issue when fenced is calling the scheduler code, not when it is called in the context of the scheduler process (maybe there as well, but it may be worth checking whether the context is somehow the difference).
@HideoYamauchi Do you know if it would be possible to write a scheduler regression test to reproduce the failure? If so, we could use that to …
Thanks for your comment. If you're referring to CTS, I'm not familiar with the details of CTS. However, I think the problem can be reproduced by following the steps below.

Step 1) Assuming that this is the initial cluster startup, clear everything in /var/lib/pacemaker/.
Step 2) Configure the initial cluster with pcs cluster start --all.

fence_scsi.xml

Best Regards,
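For illustration only (this is not the attached fence_scsi.xml; the resource ID, device path, and host list below are placeholders), a fence_scsi STONITH device configured without any location constraint might look like:

```xml
<primitive id="scsi-fence" class="stonith" type="fence_scsi">
  <instance_attributes id="scsi-fence-params">
    <!-- Shared SCSI device used for persistent reservations -->
    <nvpair id="scsi-fence-devices" name="devices"
            value="/dev/disk/by-id/example-shared-disk"/>
    <nvpair id="scsi-fence-hosts" name="pcmk_host_list"
            value="node1 node2"/>
  </instance_attributes>
  <!-- fence_scsi is normally used with unfencing -->
  <meta_attributes id="scsi-fence-meta">
    <nvpair id="scsi-fence-provides" name="provides" value="unfencing"/>
  </meta_attributes>
</primitive>
```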
By the way, this problem does not seem to occur if cib.xml already exists in /var/lib/pacemaker. Best Regards, |
Maybe that's a pointer in the direction that it is the context the scheduler code is run in that matters.
Could be the empty nodes section.

I've been busy with other things and I have not personally looked at this in any detail yet. I just skimmed over the CIB that Hideo provided, along with the comment that the problem doesn't occur when cib.xml already exists.

Are you able to fill in the nodes section in the cib.xml that you import, and check whether the problem still occurs?
I agree with you that the cause is the timing and the presence or absence of content in the node section of cib.xml. Best Regards, |
A CIB change will immediately trigger fenced to react, while a scheduler run may not be triggered right away ...
If the cib.xml you import contains a nodes section with the nodes filled in, the cluster will be configured successfully. Best Regards,
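To make the two cases concrete, here is a sketch of the difference being described (node IDs and names are illustrative):

```xml
<!-- Imported boilerplate CIB: empty nodes section; the problem reproduces -->
<nodes/>

<!-- Imported CIB whose nodes section is already filled in;
     the cluster configures successfully -->
<nodes>
  <node id="1" uname="node1"/>
  <node id="2" uname="node2"/>
</nodes>
```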
Hi All,
This is a significant issue for our users.
I will close this PR for now and submit another one. I understand the difference in the 3.0 series that is causing the problem. Many thanks,
This PR will be closed and re-submitted as a new PR.
This fixes a regression introduced by bf7ffcd. As of that commit, the fake local node is created after all resources have been unpacked, so it doesn't get added to resources' allowed_nodes tables. This prevents registration of fencing devices when the fencer receives a CIB diff that doesn't contain the local node. For example, the user may have replaced the CIB with a boilerplate configuration that has an empty nodes section.

See the following pull requests from Hideo Yamauchi and their discussions:

ClusterLabs#3849
ClusterLabs#3852

Thanks to Hideo for the report and finding the cause.

Signed-off-by: Reid Wahl <[email protected]>
This fixes a regression introduced by bf7ffcd. As of that commit, the fake local node is created after all resources have been unpacked, so it doesn't get added to resources' allowed_nodes tables. This prevents registration of fencing devices when the fencer receives a CIB diff that removes the local node. For example, the user may have replaced the CIB with a boilerplate configuration that has an empty nodes section. Registering a fencing device requires that the local node be in the resource's allowed nodes table.

One option would be to add the fake local node to all resources' allowed nodes tables immediately after creating it. However, it shouldn't necessarily be an allowed node for all resources. For example, if symmetric-cluster=false, then a node should not be placed in a resource's allowed nodes table by default. It's difficult to ensure correctness of the allowed nodes tables when a fake local node is involved. It may involve repeated code or a fragile and confusing dependence on the order of unpacking. Since the fake local node is a hack in the first place, it seems better to avoid using it where possible. Currently the only code that even sets the local_node_name member of scheduler->priv is in:
* the fencer
* crm_mon when showing the "times" section

This commit works as follows. If the fencer receives a CIB diff notification that contains the nodes section, it triggers a full refresh of fencing device registrations. In our example above, where a user has replaced the CIB with a configuration that has an empty nodes section, this means all fencing device registrations will be removed.

However, the controller also has a CIB diff notification callback: do_cib_updated(). The controller's callback repopulates the nodes section with up-to-date information from the cluster layer (or its node cache) if it finds that an untrusted client like cibadmin has modified the nodes section. Then it updates the CIB accordingly. The fencer will receive this updated CIB and refresh fencing device registrations again, re-registering the fencing devices that were just removed.

Note that in the common case, we're not doing all this wasted work. The "remove and then re-register" sequence should happen only when a user has modified the CIB in a sloppy way (for example, by deleting nodes from the CIB's nodes section that have not been removed from the cluster). In short, it seems better to rely on the controller's maintenance of the CIB's node list than to rely on a "fake local node" hack in the scheduler.

See the following pull requests from Hideo Yamauchi and their discussions:

ClusterLabs#3849
ClusterLabs#3852

Thanks to Hideo for the report and finding the cause.

Signed-off-by: Reid Wahl <[email protected]>
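For context, a boilerplate replacement CIB of the kind described in the commit message (for example, one pushed with cibadmin --replace) might look roughly like the following; the exact attributes, such as the validate-with schema version, are illustrative:

```xml
<cib validate-with="pacemaker-3.0" admin_epoch="0" epoch="1" num_updates="0">
  <configuration>
    <crm_config/>
    <!-- Empty nodes section: the diff the fencer sees no longer
         contains the local node, which is what exposed the regression -->
    <nodes/>
    <resources/>
    <constraints/>
  </configuration>
  <status/>
</cib>
```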
Hi All,
Starting with Pacemaker 3.0.0, a STONITH device that has no placement constraints is not registered with fenced, so unfencing of the STONITH device fails.
Pacemaker 2.1.9 and earlier did not have this problem, so it seems that some change in the Pacemaker 3.0.0 series is responsible.
I have not tracked down the offending change, so I am sending a tentative fix. (It is probably a regression in some part of the processing.)
I will leave it to the community to come up with a fix that takes the regression into account.
Best Regards,
Hideo Yamauchi.