Skip to content

Fix flaky tacacs authorization one server down case#7683

Merged
yejianquan merged 2 commits intosonic-net:masterfrom
wangxin:fix-tacacs-authorization
Mar 8, 2023
Merged

Fix flaky tacacs authorization one server down case#7683
yejianquan merged 2 commits intosonic-net:masterfrom
wangxin:fix-tacacs-authorization

Conversation

@wangxin
Copy link
Collaborator

@wangxin wangxin commented Mar 8, 2023

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205

Approach

What is the motivation for this PR?

The tacacs authorization test has a case to verify that authorization is still working if one of the tacacs servers is down.

In case the previous authorization method is local, after it is switched to tacacs+, the tacacs client will start to contact the tacacs servers for authorization. If we immediately run a command after the authorization method is changed, this command may fail with authorization because the client is still trying the invalid tacacs server.

How did you do it?

The fix is to add a delay which is longer than the configured tacacs timeout after authorization method is changed from local to tacacs+.

Extra improvements:

  • Fixed issues detected by pre-commit.
  • Created new test fixtures to configure and restore tacacs authorization configuration.
  • Improved the code for starting and stopping tacacs server. The existing method of stopping tacacs server takes up to 40 seconds. Overall testing time of test_authorization.py takes up to 270 seconds. After the improvement, overall test time of this script is around 70 seconds.

How did you verify/test it?

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

The tacacs authorization test has a case to verify that authorization
is still working if one of the tacacs servers is down.

In case the previous authorization method is local, after it is switched
to tacacs+, the tacacs client will start to contact the tacacs servers
for authorization. If we immediately run a command after the authorization
method is changed, this command may fail with authorization because the
client is still trying the invalid tacacs server.

The fix is to add a delay which is longer than the configured tacacs timeout
after authorization method is changed from local to tacacs+.

Extra improvements:
* Fixed issues detected by pre-commit.
* Created new test fixtures to configure and restore tacacs authorization
  configuration.
* Improved the code for starting and stopping tacacs server. The existing
  method of stopping tacacs server takes up to 40 seconds. Overall testing
  time of test_authorization.py takes up to 270 seconds. After the
  improvement, overall test time of this script is around 70 seconds.

Signed-off-by: Xin Wang <[email protected]>
@wangxin wangxin requested review from liuh-80 and yejianquan March 8, 2023 06:55
yejianquan
yejianquan previously approved these changes Mar 8, 2023
Copy link
Collaborator

@yejianquan yejianquan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

liuh-80
liuh-80 previously approved these changes Mar 8, 2023
@wangxin wangxin dismissed stale reviews from liuh-80 and yejianquan via 12a4821 March 8, 2023 08:29
@yejianquan yejianquan merged commit c73e4e9 into sonic-net:master Mar 8, 2023
yejianquan pushed a commit that referenced this pull request Mar 8, 2023
Approach
What is the motivation for this PR?
The tacacs authorization test has a case to verify that authorization is still working if one of the tacacs servers is down.

In case the previous authorization method is local, after it is switched to tacacs+, the tacacs client will start to contact the tacacs servers for authorization. If we immediately run a command after the authorization method is changed, this command may fail with authorization because the client is still trying the invalid tacacs server.

How did you do it?
The fix is to add a delay which is longer than the configured tacacs timeout after authorization method is changed from local to tacacs+.

Extra improvements:

Fixed issues detected by pre-commit.
Created new test fixtures to configure and restore tacacs authorization configuration.
Improved the code for starting and stopping tacacs server. The existing method of stopping tacacs server takes up to 40 seconds. Overall testing time of test_authorization.py takes up to 270 seconds. After the improvement, overall test time of this script is around 70 seconds.

Co-authorized by: [email protected]
yejianquan pushed a commit that referenced this pull request Mar 8, 2023
Approach
What is the motivation for this PR?
The tacacs authorization test has a case to verify that authorization is still working if one of the tacacs servers is down.

In case the previous authorization method is local, after it is switched to tacacs+, the tacacs client will start to contact the tacacs servers for authorization. If we immediately run a command after the authorization method is changed, this command may fail with authorization because the client is still trying the invalid tacacs server.

How did you do it?
The fix is to add a delay which is longer than the configured tacacs timeout after authorization method is changed from local to tacacs+.

Extra improvements:

Fixed issues detected by pre-commit.
Created new test fixtures to configure and restore tacacs authorization configuration.
Improved the code for starting and stopping tacacs server. The existing method of stopping tacacs server takes up to 40 seconds. Overall testing time of test_authorization.py takes up to 270 seconds. After the improvement, overall test time of this script is around 70 seconds.

Co-authorized by: [email protected]
@wangxin wangxin deleted the fix-tacacs-authorization branch July 12, 2023 02:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants