Decrease link probing interval after switchover to better determine the overhead of a toggle #43
Merged
zjswhhh merged 17 commits into sonic-net:master on Mar 22, 2022
Conversation
… decreaseLinkProberIntervalAfterSwitchover
lolyu
reviewed
Mar 22, 2022
src/link_prober/LinkProber.cpp
Outdated
    //
    // get link prober interval
    //
    uint32_t LinkProber::getProbingInterval()
Contributor
Would it be better to make this function inline?
Collaborator
Author
Updated accordingly.
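For readers following the review thread, here is a minimal sketch of what an inline getter could look like. It is not taken from the PR diff: the member name `mProbingIntervalMsec` and the constructor are assumptions made only so the example compiles on its own.

```cpp
#include <cstdint>

// Simplified stand-in for the real LinkProber class; only the getter matters here.
class LinkProber
{
public:
    // Hypothetical constructor, used only to seed the interval for this sketch.
    explicit LinkProber(uint32_t intervalMsec) : mProbingIntervalMsec(intervalMsec) {}

    // A member function defined inside the class body is implicitly inline,
    // so no separate definition in LinkProber.cpp is needed.
    uint32_t getProbingInterval() const { return mProbingIntervalMsec; }

private:
    uint32_t mProbingIntervalMsec; // assumed member name, in milliseconds
};
```

Defining a trivial accessor in the header lets the compiler inline the call at each use site, which is presumably the motivation behind the suggestion.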
zjswhhh
added a commit
that referenced
this pull request
Mar 22, 2022
…he overhead of a toggle (#43)

### Description of PR
Summary:
Fixes # (issue)

This PR is to get a more accurate timestamp of when a toggle completes on the mux. The method is to decrease the link probing interval to 10 ms after a switchover is triggered, and to write the timestamp of each link prober state change to the state db `LINK_PROBE_STATS` table. When the switchover is over, the probing interval change is reverted. If the switchover does not complete within 400 ms, the change is reverted as well.

### Type of change
- [x] New feature

### Approach
#### What is the motivation for this PR?
To better determine the overhead of a toggle.

#### How did you do it?
Decrease the link probing interval after a switchover is triggered.

#### How did you verify/test it?
Tested the cases below on a dual testbed:
1. Switchover succeeds, icmp_responder is on.
2. Switchover completes but icmp_responder is off.

In both cases, link prober events are posted to state db as expected. The link probing interval is decreased and reverted as expected.
zjswhhh
added a commit
to zjswhhh/sonic-linkmgrd
that referenced
this pull request
Mar 23, 2022
…he overhead of a toggle (sonic-net#43)
zjswhhh
added a commit
that referenced
this pull request
Mar 23, 2022
…he overhead of a toggle #43 (#48)

### Description of PR
Original commit & PR in master branch:

c43cf7a Jing Zhang Tue Mar 22 16:22:00 2022 -0700
Decrease link probing interval after switchover to better determine the overhead of a toggle (#43)

Summary:
Fixes # (issue)

This PR is to get a more accurate timestamp of when a toggle completes on the mux. The method is to decrease the link probing interval to 10 ms after a switchover is triggered, and to write the timestamp of each link prober state change to the state db `LINK_PROBE_STATS` table. When the switchover is over, the probing interval change is reverted. If the switchover does not complete within 400 ms, the change is reverted as well.

sign-off: Jing Zhang zhangjing@microsoft.com

### Type of change
- [x] New feature

### Approach
#### What is the motivation for this PR?
To better determine the overhead of a toggle.

#### How did you do it?
Decrease the link probing interval after a switchover is triggered.

#### How did you verify/test it?
Tested the cases below on a dual testbed:
1. Switchover succeeds, icmp_responder is on.
2. Switchover completes but icmp_responder is off.

In both cases, link prober events are posted to state db as expected. The link probing interval is decreased and reverted as expected.
zjswhhh
added a commit
that referenced
this pull request
Apr 1, 2022
### Description of PR
Summary:
Fixes # (issue)

Disable part of the feature introduced in #43. The link probing interval will NOT be decreased by default. Link prober state change events will still be posted in `LINK_PROBE_STATS|PORTNAME` in state db.

sign-off: Jing Zhang zhangjing@microsoft.com

### Type of change
- [x] New feature

### Approach
#### What is the motivation for this PR?
We need to reconsider the design of this feature. Specifically, this is a special case of decreasing the probing interval that exists for measurement purposes only. We still want to trigger the toggle in 300 ms when packet loss happens, so the negative count should be 30 instead of 3 when the interval is decreased to 10 ms.
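The arithmetic behind "30 instead of 3" can be sketched as follows. This is an illustration only: the helper `negativeCountFor` and the constant names are hypothetical, and the 100 ms default interval is not stated in the commit message but inferred from a negative count of 3 covering 300 ms.

```cpp
#include <cstdint>

// Number of consecutive missed probes needed to cover a detection window.
// intervalMsec must be non-zero.
constexpr uint32_t negativeCountFor(uint32_t detectionWindowMsec, uint32_t intervalMsec)
{
    // Round up so the covered window is never shorter than requested.
    return (detectionWindowMsec + intervalMsec - 1) / intervalMsec;
}

constexpr uint32_t kDetectionWindowMsec = 300;   // "trigger the toggle in 300 ms"
constexpr uint32_t kDefaultIntervalMsec = 100;   // assumption: 300 ms / negative count of 3
constexpr uint32_t kDecreasedIntervalMsec = 10;  // decreased interval from #43

static_assert(negativeCountFor(kDetectionWindowMsec, kDefaultIntervalMsec) == 3,
              "default interval keeps the negative count at 3");
static_assert(negativeCountFor(kDetectionWindowMsec, kDecreasedIntervalMsec) == 30,
              "10 ms interval needs a negative count of 30");
```

If the count stayed at 3 while the interval dropped to 10 ms, three missed probes would span only 30 ms, so a toggle would be triggered ten times sooner than intended.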
zjswhhh
added a commit
that referenced
this pull request
Apr 1, 2022
…switch overhead #49 (#54)

### Description of PR
Can't cleanly cherry pick the commit from master branch:

34a68d1 disable switchover measuring based on link prober (#49)

Summary:
Fixes # (issue)

Disable part of the feature introduced in #43. The link probing interval will NOT be decreased by default. Link prober state change events will still be posted in `LINK_PROBE_STATS|PORTNAME` in state db.

sign-off: Jing Zhang zhangjing@microsoft.com

### Type of change
- [x] New feature

### Approach
#### What is the motivation for this PR?
We need to reconsider the design of this feature. Specifically, this is a special case of decreasing the probing interval that exists for measurement purposes only. We still want to trigger the toggle in 300 ms when packet loss happens, so the negative count should be 30 instead of 3 when the interval is decreased to 10 ms.
zjswhhh
added a commit
to zjswhhh/sonic-linkmgrd
that referenced
this pull request
Apr 15, 2022
…switch overhead sonic-net#49 (sonic-net#54)
Description of PR
Summary:
Fixes # (issue)
This PR is to get a more accurate timestamp of when a toggle completes on the mux.
The method is to decrease the link probing interval to 10 ms after a switchover is triggered, and to write the timestamp of each link prober state change to the state db `LINK_PROBE_STATS` table.
When the switchover is over, the probing interval change is reverted. If the switchover does not complete within 400 ms, the change is reverted as well.
sign-off: Jing Zhang zhangjing@microsoft.com
Type of change
- [x] New feature
Approach
What is the motivation for this PR?
To better determine the overhead of a toggle.
How did you do it?
Decrease link probing interval after switchover is triggered.
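The decrease-and-revert behavior described in the summary (10 ms after a switchover is triggered, reverted on completion or after 400 ms) could be sketched roughly like this. The class `ProbingIntervalController` and its method names are hypothetical and are not the code in this PR; the sketch only illustrates the timing logic.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper that tracks which probing interval is in effect.
class ProbingIntervalController
{
public:
    using Clock = std::chrono::steady_clock;

    ProbingIntervalController(uint32_t normalMsec, uint32_t decreasedMsec, uint32_t timeoutMsec)
        : mNormalMsec(normalMsec), mDecreasedMsec(decreasedMsec), mTimeout(timeoutMsec) {}

    // Switchover triggered: start probing at the shorter interval and arm the deadline.
    void onSwitchoverTriggered(Clock::time_point now)
    {
        mDecreased = true;
        mDeadline = now + mTimeout;
    }

    // Switchover completed: go back to the normal interval.
    void onSwitchoverCompleted() { mDecreased = false; }

    // Interval to use for the next probe; reverts on its own once the deadline has passed.
    uint32_t currentIntervalMsec(Clock::time_point now)
    {
        if (mDecreased && now >= mDeadline) {
            mDecreased = false;
        }
        return mDecreased ? mDecreasedMsec : mNormalMsec;
    }

private:
    uint32_t mNormalMsec;
    uint32_t mDecreasedMsec;
    std::chrono::milliseconds mTimeout;
    bool mDecreased = false;
    Clock::time_point mDeadline{};
};
```

With `ProbingIntervalController controller(100, 10, 400)`, where the 100 ms normal interval is an assumed value, the controller reports 10 ms between `onSwitchoverTriggered` and either `onSwitchoverCompleted` or the 400 ms deadline, and the normal interval otherwise.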
How did you verify/test it?
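Tested the cases below on a dual testbed:
1. Switchover succeeds, icmp_responder is on.
2. Switchover completes but icmp_responder is off.

In both cases, link prober events are posted to state db as expected. The link probing interval is decreased and reverted as expected.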
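To make "posted to state db" concrete, here is a rough sketch of how an event timestamp might be recorded under the `LINK_PROBE_STATS|PORTNAME` key mentioned in the follow-up commit. It uses an in-memory map as a stand-in for the state db; `FakeStateDb`, `linkProbeStatsKey`, `postLinkProberEvent`, and the field name passed by the caller are all hypothetical and do not reflect the actual database API used by the daemon.

```cpp
#include <map>
#include <string>

// In-memory stand-in for the state db, used only for illustration.
using FieldValues = std::map<std::string, std::string>;
using FakeStateDb = std::map<std::string, FieldValues>;

// Key layout taken from the commit messages: LINK_PROBE_STATS|PORTNAME.
inline std::string linkProbeStatsKey(const std::string &portName)
{
    return "LINK_PROBE_STATS|" + portName;
}

// Record the time of a link prober state change under a caller-chosen field name.
inline void postLinkProberEvent(FakeStateDb &db, const std::string &portName,
                                const std::string &eventField, const std::string &timestamp)
{
    db[linkProbeStatsKey(portName)][eventField] = timestamp;
}
```

A test like the ones above would then read the entry for the port back and check that a timestamp is present for each expected state change.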
Any platform specific information?
Documentation