Update netlink messages handler #2233
Merged
Commits (6):
4f2cf4f Do not ignore netlink messages on interfaces belong to LAG (liorghub)
df2d788 Add VS test for fix in netlink messages handler (liorghub)
7f21453 Merge branch 'Azure:master' into fix_netdev_oper (liorghub)
a8228f6 Revert former fix and ignore netlink messages of a port only when it … (liorghub)
45232f2 Remove unneeded include (liorghub)
5bb10de Fix comment (liorghub)
AFAIK, this would be handled by teamsyncd. Can you check?
@prsunny I checked: teamsyncd handles the messages sent for the port-channel interface itself, and those messages are marked with type="team". The bug I fixed concerns the handling of messages for ports that belong to a port-channel. These messages are not marked with type="team".
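To illustrate the split described here, a minimal Python sketch (not the actual teamsyncd or portsyncd code, which is C++). The dict-based message shape, the `handled_by` helper, and the interface name `PortChannel0001` are assumptions made up for the example; only the type="team" marker comes from this thread.

```python
from typing import Dict


def handled_by(link_msg: Dict[str, str]) -> str:
    """Return the daemon that, per this thread, deals with a link message.

    `link_msg` is a hypothetical dict standing in for a netlink link
    message; "type" is only present when the kernel reports a link type.
    """
    if link_msg.get("type") == "team":
        # Messages for the port-channel interface itself.
        return "teamsyncd"
    # Member ports are not marked with type="team", so they are not
    # covered by teamsyncd and have to be handled on the port side.
    return "portsyncd"


port_channel_msg = {"ifname": "PortChannel0001", "type": "team"}
member_port_msg = {"ifname": "Ethernet0"}
```

Under these assumptions, `handled_by(port_channel_msg)` gives "teamsyncd" and `handled_by(member_port_msg)` gives "portsyncd", which is why the member-port case is not something teamsyncd would pick up.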
OK. @judyjoseph, can you check this? This seems to be a basic change that was missed. @liorghub, what is the functional impact?
The functional impact is in LLDP: before sending LLDP commands, we check that "netdev_oper_status" is up in the state DB PORT_TABLE. If "netdev_oper_status" is down, the LLDP command is not sent, which causes wrong LLDP behavior.
See the following code in lldpmgrd:
https://github.com/Azure/sonic-buildimage/blob/cc30771f6b97234a6dd19d8f97d5dfd44551cf20/dockers/docker-lldp/lldpmgrd#L170
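A rough sketch of that gating, for readers who do not want to open the link. This is not the lldpmgrd code: the state DB PORT_TABLE is stood in for by a plain dict, and the helper names `port_ready_for_lldp` and `ports_to_configure` are hypothetical. Only the "netdev_oper_status" field and its up/down values come from the comment above.

```python
from typing import Dict, List

# Hypothetical stand-in for state DB PORT_TABLE: port name -> field map.
PortTable = Dict[str, Dict[str, str]]


def port_ready_for_lldp(port_table: PortTable, port_name: str) -> bool:
    """True only when the port's netdev_oper_status is recorded as up."""
    entry = port_table.get(port_name, {})
    return entry.get("netdev_oper_status") == "up"


def ports_to_configure(port_table: PortTable, port_names: List[str]) -> List[str]:
    """Return the ports for which an LLDP command would be sent."""
    return [name for name in port_names if port_ready_for_lldp(port_table, name)]


# Ethernet0 is really up, but the table still holds the stale "down".
state_db = {
    "Ethernet0": {"netdev_oper_status": "down"},
    "Ethernet4": {"netdev_oper_status": "up"},
}
selected = ports_to_configure(state_db, ["Ethernet0", "Ethernet4"])
```

With the stale entry, `selected` contains only Ethernet4, so no LLDP command goes out for Ethernet0 even though the link is up.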
ok. lgtm. As Xu suggested, please add VS tests to cover this.
@prsunny I did a quick check, noting down the events from syslog. I find that 'netdev_oper_status' is set much earlier for an interface, as long as the interface is connected and up. The teamd member addition happens earlier.
@liorghub could you share a bit more detail on when you observe this behavior? Is it always seen with LLDP? Does it affect all port-channel member interfaces, or only interfaces that were initially oper down and became oper up after a while, as they became part of the port-channel?
@judyjoseph
Hi Judy,
The issue happens while the switch is booting.
Ethernet0 is part of a port-channel.
As you can see below, portsyncd gets several netlink messages for Ethernet0.
The last message that arrives without "master" (master:0) is at 07:19:15.359655, and it is oper down.
Later we get more messages for Ethernet0 with oper up, but we ignore them since they are marked with "master".
Interfaces that have a master can be either part of a VLAN bridge or part of a port-channel.
We want to ignore only VLAN bridge members (confirmed with @zhenggen-xu).
Since the last message we handle for Ethernet0 is oper down, the state DB holds "netdev_oper_status" = "down", which causes the wrong LLDP behaviour.
The issue is persistent and occurs after each reboot.
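The sequence above can be replayed in a small Python sketch. This is a model of the reasoning only, not the portsyncd code (which is C++) and not the exact condition used in the fix. The `LinkMsg` class, its `master_kind` field, and the two filter functions are hypothetical names introduced for the illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class LinkMsg:
    ifname: str
    oper_up: bool
    # Hypothetical field: None (no master), "bridge" or "team".
    master_kind: Optional[str] = None


def ignore_any_master(msg: LinkMsg) -> bool:
    """Behaviour described as buggy: drop every message that has a master."""
    return msg.master_kind is not None


def ignore_bridge_master_only(msg: LinkMsg) -> bool:
    """Intended behaviour per this thread: drop only VLAN bridge members."""
    return msg.master_kind == "bridge"


def replay(messages: List[LinkMsg],
           should_ignore: Callable[[LinkMsg], bool]) -> Dict[str, str]:
    """Apply messages in order and return netdev_oper_status per port."""
    state: Dict[str, str] = {}
    for msg in messages:
        if should_ignore(msg):
            continue
        state[msg.ifname] = "up" if msg.oper_up else "down"
    return state


# Boot sequence from the comment: oper down without master, then oper up
# after Ethernet0 has joined the port-channel.
boot_sequence = [
    LinkMsg("Ethernet0", oper_up=False),
    LinkMsg("Ethernet0", oper_up=True, master_kind="team"),
]

old_state = replay(boot_sequence, ignore_any_master)
new_state = replay(boot_sequence, ignore_bridge_master_only)
```

In this model `old_state` keeps Ethernet0 at "down" because the later oper-up message is skipped, while `new_state` ends at "up". A message with `master_kind="bridge"` is still skipped by both filters.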
See the logs below: