Skip to content

Add ActiveActiveStateMachine implementation#64

Merged
lolyu merged 30 commits intosonic-net:masterfrom
lolyu:addActiveActiveStateMachine
May 6, 2022
Merged

Add ActiveActiveStateMachine implementation#64
lolyu merged 30 commits intosonic-net:masterfrom
lolyu:addActiveActiveStateMachine

Conversation

@lolyu
Copy link
Contributor

@lolyu lolyu commented Apr 13, 2022

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • New feature
  • Doc/Design
  • Unit test

Approach

What is the motivation for this PR?

Add link manager state machine ActiveActiveStateMachine to work if the port is in active-active cable type.

How did you do it?

Add ActiveActiveStateMachine to cover the following scenarions:

  • if the mux state is active, and link prober receives unknown event, linkmgrd will try to switch the mux to standby.
  • if the mux state is standby, and link prober receives active event, linkmgrd will try to switch the mux to active.
  • if the mux state is active, and link down event received, linkmgrd will try to switch the mux to standby.
  • if the mux state is active, and mux config set to standby, linkmgrd will try to switch the mux to standby even with link prober active event.

How did you verify/test it?

Add unittests to cover the above scenarios.

Any platform specific information?

Documentation

@lolyu lolyu force-pushed the addActiveActiveStateMachine branch 5 times, most recently from 12e385d to a5af507 Compare April 18, 2022 10:30
lolyu added 4 commits April 18, 2022 10:33
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
@lolyu lolyu force-pushed the addActiveActiveStateMachine branch from a5af507 to 9cf2ae9 Compare April 18, 2022 10:33
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
@lolyu lolyu force-pushed the addActiveActiveStateMachine branch from 9cf2ae9 to 0b1c2e0 Compare April 19, 2022 08:34
lolyu added 15 commits April 19, 2022 08:40
…pe port

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
For port in `active-active` port cable type, use `Wait` state as initial
state.
1. Modify `Wait` state so it could change to `Unknown` state if port
cable type is `active-active`.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
1. Explicit inistantiate template function `postLinkProberEvent` to work
for `IcmpSelfEvent` and `IcmpPeerEvent`.
2. Use lambda functions to call `processEvent` instead of with
`boost::bind`, so it could use those overloads of `processEvent`.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
@lolyu lolyu requested review from yxieca and zjswhhh April 25, 2022 05:55
@lolyu lolyu marked this pull request as ready for review April 25, 2022 05:55
lolyu added 8 commits April 25, 2022 06:42
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
@lolyu lolyu force-pushed the addActiveActiveStateMachine branch from 8608abe to 83ddb52 Compare April 25, 2022 13:00
@sonic-net sonic-net deleted a comment from lgtm-com bot Apr 25, 2022
lolyu added 2 commits April 25, 2022 13:27
Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
Add a mux prober timer `mDeadlineTimer`, so that, after a mux probe, if
linkmgrd didn't receives a valid mux state probe response as either active
or standby, it will tries to probe again.

Signed-off-by: Longxiang Lyu <lolv@microsoft.com>
if (errorCode == boost::system::errc::success) {
if (ms(mCompositeState) == mux_state::MuxState::Label::Unknown ||
ms(mCompositeState) == mux_state::MuxState::Label::Error ||
ms(mCompositeState) == mux_state::MuxState::Label::Wait) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought we got rid of Wait in active-active?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Wait will only occur in the initial state (Wait, Wait, Down)

mDeadlineTimer.cancel();
startMuxWaitTimer();
} else {
probeMuxState();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if in non-auto mode, and probing back a mux state that is different than mux mode, will we switch to match mux mode?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the question is, is mux mode a "must-be" or a "preference"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the mux mode is more like a one-time-off setting, after setting the mux mode, linkmgrd now assumes the toggle must be a success, and won't react to any more state changes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it make more sense in active-active to keep mux state aligned with mux mode? But we can leave it as same as active-standby (one-time setting) for now.

Copy link
Collaborator

@zjswhhh zjswhhh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@lolyu lolyu merged commit df51322 into sonic-net:master May 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants