Skip to content

Conversation

@dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented Dec 7, 2022

What changes were proposed in this pull request?

This is a follow-up to remove PVC_COUNTER decrement part to handle exception cases.

Why are the changes needed?

This PR handles the following two cases.

  1. If pod creation API throws KubernetesClientException, we should not change PVC_COUNTER.
  2. In case of (1), Spark try to delete pod. If pod deletion API also throws KubernetesClientException, we should not change PVC_COUNTER.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the CIs with the newly added test case.

@dongjoon-hyun
Copy link
Member Author

cc @tedyu

@dongjoon-hyun
Copy link
Member Author

For reviewers, the following test case is added.

test("SPARK-41410: An exception during PVC creation should not increase PVC counter")

@dongjoon-hyun
Copy link
Member Author

I'll test this PR more in the cluster.

@dongjoon-hyun
Copy link
Member Author

I already suggest you to use my test code to verify your PR, @tedyu .

Copy link
Member Author

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you review this follow-up, @viirya ?

.inNamespace(namespace)
.resource(createdExecutorPod)
.delete()
PVC_COUNTER.decrementAndGet()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for case 1, right? For case 2, we don't need to do anything as PVC_COUNTER was not changed for that case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, right for case 1 and 2. So, this is a complete and minimal patch, @viirya .

Copy link
Member Author

@dongjoon-hyun dongjoon-hyun Dec 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is used for driver-owned PVCs, the deletion of pods is irrelevant to the number of PVC. That was my logical bug at the initial patch.

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@dongjoon-hyun
Copy link
Member Author

Thank you so much, @viirya . Merged to master.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-41410-2 branch December 7, 2022 03:51
beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022
### What changes were proposed in this pull request?

This is a follow-up to remove `PVC_COUNTER` decrement part to handle exception cases.

### Why are the changes needed?

This PR handles the following two cases.

1. If pod creation API throws `KubernetesClientException`, we should not change `PVC_COUNTER`.
2. In case of (1), Spark try to delete pod. If pod deletion API also throws `KubernetesClientException`, we should not change `PVC_COUNTER`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs with the newly added test case.

Closes apache#38949 from dongjoon-hyun/SPARK-41410-2.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants