Add k8s join and disjoin test cases #16141

Merged
wangxin merged 11 commits into sonic-net:master from lixiaoyuner:dev/yunli1/add-k8s-join-disjoin-testcase
Dec 30, 2024

Conversation

@lixiaoyuner
Contributor

@lixiaoyuner lixiaoyuner commented Dec 18, 2024

Description of PR

Summary: This PR adds a test case that checks whether a SONiC device can join and disjoin a k8s cluster, and whether k8s daemonset pods can run on SONiC.
Fixes # (issue): N/A. This is a new test case for the k8s feature.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405
  • 202411

Approach

What is the motivation for this PR?

Test that a SONiC device can join and disjoin a k8s cluster.

How did you do it?

Set up a single-master-node cluster on the testbed server, have the SONiC device join and then disjoin the cluster to check that both operations work, and deploy a daemonset to check that its pods can run on SONiC.

How did you verify/test it?

Ran it on a KVM testbed and a physical testbed; it works.

Any platform specific information?

No, it should work on all platforms.

Supported testbed topology if it's a new test case?

Any

Documentation

In the PR md file.

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@lixiaoyuner lixiaoyuner marked this pull request as ready for review December 20, 2024 02:20
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Pull request contains merge conflicts.

skip:
  reason: "kubesonic feature is not supported in slim image"
  conditions:
    - "hwsku in ['Arista-7050-QX-32S', 'Arista-7050-Q16S64', 'Arista-7060CX-32S-C32', 'Arista-7060CX-32S-C32-T1', 'Arista-7060CX-32S-Q32', 'Arista-7060CX-32S-D48C8', 'Arista-7050QX-32S-S4Q31', 'Arista-7050QX32S-Q32', 'Celestica-E1031-T48S4']"

Arista-7050-QX32/Arista-720DT-G48S4/Nexus-3132-GX-Q32 should be using the slim image as well, right? Is there a more generic way to determine whether the test is running against a slim image?

Contributor Author

I took this SKU list from the firmware file, so these SKUs are definitely configured with the slim image. Of the three you mentioned, Arista-720DT-G48S4 does not use the slim image. Arista-7050-QX32 and Nexus-3132-GX-Q32 run old SONiC versions, and I don't see any nightly tests for these SKUs.
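For illustration, here is a minimal sketch of how a `hwsku in [...]` condition like the one above could be evaluated against a device's facts. It assumes the condition is treated as a Python expression with the facts available as variables; `should_skip`, `CONDITION`, and the `facts` dict are hypothetical names, not the actual sonic-mgmt plugin code.

```python
# SKU list copied from the skip condition under review.
SLIM_IMAGE_HWSKUS = [
    "Arista-7050-QX-32S", "Arista-7050-Q16S64", "Arista-7060CX-32S-C32",
    "Arista-7060CX-32S-C32-T1", "Arista-7060CX-32S-Q32",
    "Arista-7060CX-32S-D48C8", "Arista-7050QX-32S-S4Q31",
    "Arista-7050QX32S-Q32", "Celestica-E1031-T48S4",
]

# Rebuild the condition string in the same form as the YAML entry.
CONDITION = f"hwsku in {SLIM_IMAGE_HWSKUS!r}"


def should_skip(condition, facts):
    """Evaluate a condition string against device facts (hypothetical helper)."""
    # Builtins are disabled so the expression can only see the given facts.
    return bool(eval(condition, {"__builtins__": {}}, dict(facts)))


print(should_skip(CONDITION, {"hwsku": "Arista-7050-QX-32S"}))
print(should_skip(CONDITION, {"hwsku": "Arista-720DT-G48S4"}))
```

With this list, a SKU that is not listed (such as Arista-720DT-G48S4) is not skipped, which matches the reply above.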



def restore_vmhost_param(vmhost):
    logger.info("Start to restore vmhost param")

This is not guaranteed to be executed, right? The test can fail partway through, in which case this function won't run.

Contributor Author

It's in the teardown function, so it should be executed; pytest takes care of that.

Contributor Author

If the test plan is cancelled, the kernel parameter can't be restored. Even so, there should be no impact.
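The guarantee discussed in this thread can be sketched with a plain context manager. In a pytest yield fixture, the code after `yield` runs as teardown even when the test body fails, and the `try`/`finally` below mimics that behaviour. The `temporary_param` helper, the `host_params` dict, and the parameter name are all hypothetical stand-ins; the PR's real `restore_vmhost_param` is not shown here.

```python
from contextlib import contextmanager


@contextmanager
def temporary_param(params, key, value):
    """Set params[key] for the duration of the block, then restore it."""
    original = params.get(key)
    params[key] = value
    try:
        yield params
    finally:
        # Runs even when the body raises, like teardown after a failed test.
        if original is None:
            params.pop(key, None)
        else:
            params[key] = original


# Hypothetical stand-in for the kernel parameters on the vmhost.
host_params = {"net.ipv4.ip_forward": "0"}

try:
    with temporary_param(host_params, "net.ipv4.ip_forward", "1"):
        raise RuntimeError("simulated test failure")
except RuntimeError:
    pass

print(host_params)
```

As the reply notes, this only covers failures inside the process. If the whole run is cancelled, no teardown code gets a chance to execute.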


def check_dut_k8s_version_supported(duthost):
    logger.info("Check if the k8s version is supported")
    k8s_version = duthost.shell("kubeadm version -o short", module_ignore_errors=True)["stdout"]
Contributor

What if the kubeadm package is not installed on the SONiC image? I think the code still works, but the log would be misleading.

Contributor Author

Very helpful, this is critical. I removed module_ignore_errors=True, so if kubeadm is not installed the test now fails immediately.
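To illustrate the concern, here is a minimal sketch of parsing the output of `kubeadm version -o short` so that empty or unexpected output produces a clear error instead of a misleading log line. It assumes the short output looks like `v1.22.2`; `parse_kubeadm_version` and `is_version_at_least` are hypothetical helpers, and the minimum version is supplied by the caller because the supported range used in the PR is not shown here.

```python
import re


def parse_kubeadm_version(output):
    """Parse short version output such as 'v1.22.2' into a tuple of ints."""
    match = re.match(r"v(\d+)\.(\d+)\.(\d+)", output.strip())
    if match is None:
        # Empty output usually means the command failed or is not installed.
        raise ValueError(f"unexpected kubeadm version output: {output!r}")
    return tuple(int(part) for part in match.groups())


def is_version_at_least(output, minimum):
    """Return True if the parsed version is >= the caller-supplied minimum."""
    return parse_kubeadm_version(output) >= tuple(minimum)


print(parse_kubeadm_version("v1.22.2\n"))
```

Raising on empty output mirrors the effect of removing `module_ignore_errors=True`: a missing kubeadm surfaces as an explicit failure rather than as an "unsupported version" message.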



@pytest.fixture()
def setup_and_teardown(duthost, vmhost, creds):
Contributor

conftest.py can be used to share fixtures across multiple files. This could be helpful as we add more test cases to this directory. It can be done here or in a future PR.

ref: https://docs.pytest.org/en/stable/reference/fixtures.html#conftest-py-sharing-fixtures-across-multiple-files
https://github.com/sonic-net/sonic-mgmt/blob/master/tests/generic_config_updater/conftest.py

Contributor Author

Okay, got it. Very useful info, let's do it in the future if we have more test cases.

logger.info(f"Minikube setup has started for {time_diff_seconds} seconds")
if time_diff_seconds > MINIKUBE_SETUP_MAX_SECOND:
    logger.info("Minikube setup timeout, need to re-setup")
    clean_up_and_setup_minikube(vmhost, creds)
Contributor

IIUC, this while loop is for the file lock described in the readme. Here, if the testbed that first requested the minikube master takes longer than expected, the minikube master setup is restarted by a second testbed. Is that right?

Depending on the reason for the delay, couldn't we still run into a minikube master setup conflict here if the first operation was not cancelled? I think testbed1 should fail if it requests minikube and minikube takes longer than expected to start. Without this failure, we may also run into an infinite loop.

Contributor Author

Yes, you are right. The download step and the setup step now each have a 6-minute timeout, so the test fails if either takes too long.
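The fail-on-timeout behaviour agreed on in this thread can be sketched as a bounded polling loop. This is only an illustration under stated assumptions: `wait_until` and `SetupTimeoutError` are hypothetical names, and the 360-second value is derived from the "6 mins" mentioned in the reply rather than read from the PR's code.

```python
import time

# Assumed from the "6 mins" in the reply above; the real constant may differ.
MINIKUBE_SETUP_MAX_SECOND = 6 * 60


class SetupTimeoutError(Exception):
    """Raised when a setup step does not finish before its deadline."""


def wait_until(condition, timeout_seconds, poll_seconds=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it is true, or raise once the deadline passes."""
    deadline = clock() + timeout_seconds
    while True:
        if condition():
            return
        if clock() >= deadline:
            # Fail loudly instead of retrying forever.
            raise SetupTimeoutError(
                f"condition not met within {timeout_seconds} seconds")
        sleep(poll_seconds)


# Usage: a condition that is already true returns without sleeping.
wait_until(lambda: True, MINIKUBE_SETUP_MAX_SECOND)
```

Because the loop raises once the deadline is reached, a slow setup on the first testbed ends in a failure rather than in an endless wait or a silent restart by another testbed. The `clock` and `sleep` parameters exist only so the loop can be exercised without real delays.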

@isabelmsft isabelmsft self-requested a review December 27, 2024 22:22
@wangxin wangxin merged commit a1f63f9 into sonic-net:master Dec 30, 2024
lixiaoyuner added a commit to lixiaoyuner/sonic-mgmt that referenced this pull request Dec 30, 2024
lixiaoyuner added a commit to lixiaoyuner/sonic-mgmt that referenced this pull request Jan 3, 2025
@yutongzhang-microsoft
Contributor

yutongzhang-microsoft commented Jan 3, 2025

@lixiaoyuner Can this newly added test script run on the KVM testbed?

@lixiaoyuner
Contributor Author

@lixiaoyuner Can this new added test script support kvm testbed?

Yes, it does.

yxieca pushed a commit that referenced this pull request Jan 6, 2025
* Add k8s join and disjoin test cases (#16141)

* Update the conditional mark for kubesonic test

* Filter the version number less than and equal 30 for broadcom
@mssonicbld
Collaborator

@lixiaoyuner PR conflicts with 202411 branch

kperumalbfn pushed a commit that referenced this pull request Jan 6, 2025
[202411] Add k8s join and disjoin test cases
lixiaoyuner added a commit to lixiaoyuner/sonic-mgmt that referenced this pull request Jan 7, 2025
bingwang-ms pushed a commit that referenced this pull request Jan 10, 2025
* Add k8s join and disjoin test cases (#16141)

* Update the condition mark for 202405

* Add the conditions logical operator

* Remove the asic filter
@yutongzhang-microsoft
Contributor

Hi, @lixiaoyuner , can you help me confirm if this script can run on kvm testbed?

nnelluri-cisco pushed a commit to nnelluri-cisco/sonic-mgmt that referenced this pull request Mar 15, 2025