[action] [PR:3630] Enable FDB learning event after all ports removed from default 1Q bridge #79
Merged
mssonicbld merged 1 commit into Azure:202412 from May 9, 2025
**What I did**

This PR fixes an orchagent crash. The error logs are below.

```
2025 May 1 05:51:07.128331 str4-7060x6-64pe-7 ERR swss#orchagent: :- meta_generic_validation_remove: object 0x3a0000000000d7 reference count is 1, can't remove
2025 May 1 05:51:07.128331 str4-7060x6-64pe-7 ERR swss#orchagent: :- removeDefaultBridgePorts: Failed to remove bridge port, rv:-17
2025 May 1 05:51:07.128566 str4-7060x6-64pe-7 INFO swss#supervisord: orchagent terminate called after throwing an instance of 'std::runtime_error'
2025 May 1 05:51:07.128566 str4-7060x6-64pe-7 INFO swss#supervisord: orchagent what(): PortsOrch initialization failure
2025 May 1 05:51:07.815330 str4-7060x6-64pe-7 INFO swss#supervisord 2025-05-01 05:51:07,814 WARN exited: orchagent (terminated by SIGABRT (core dumped); not expected)
```

The crash happens because an FDB entry is learnt on the default bridge, which increases the reference count of the bridge port and causes the port removal to fail.

The issue is addressed by **not** setting `SAI_SWITCH_ATTR_FDB_EVENT_NOTIFY` when creating the switch, and enabling it only after all ports have been removed from the default bridge.

**Why I did it**

This PR fixes an orchagent crash.

**How I verified it**

1. The change is verified on multiple platforms. FDB learning works normally after this change.

**Broadcom**

```
admin@str4-7060x6-64pe-fan-4:~$ fdbshow
  No.    Vlan  MacAddress         Port         Type
-----  ------  -----------------  -----------  -------
    1    1234  B6:2C:7E:FC:80:00  Ethernet496  Dynamic
    2    1234  D6:5E:2C:C0:B8:0B  Ethernet496  Dynamic
    3    1235  CE:8F:2A:A1:00:01  Ethernet496  Dynamic
```

**Mellanox**

```
admin@str-msn2700-01:~$ fdbshow
  No.    Vlan  MacAddress         Port       Type
-----  ------  -----------------  ---------  -------
    1    1000  7C:FE:90:5E:60:01  Ethernet4  Dynamic
Total number of entries 1
```

**Cisco**

```
admin@str3-8101-03:~$ fdbshow
  No.    Vlan  MacAddress         Port         Type
-----  ------  -----------------  -----------  -------
    1    1000  9C:09:8B:B6:E6:00  Ethernet240  Dynamic
Total number of entries 1
```

2. The existing VS test `test_fdb.py` covers the change.

**Details if related**
Original PR: sonic-net/sonic-swss#3630

/azp run

Azure Pipelines successfully started running 1 pipeline(s).
mssonicbld added a commit that referenced this pull request May 26, 2025
```
* 32da647 - (HEAD -> 202503) Merge branch '202412' of https://github.com/Azure/sonic-swss.msft into 202503 (2025-05-26) [Sonic Automation]
* 06b16c3 - (base/202412) [fpmsyncd]Fixing blackhole route to publish protocol field to APPL_DB (#83) (2025-05-23) [Sudharsan Dhamal Gopalarathnam]
* b801f2d - [202412] [SRv6] add MySID counters support (#82) (2025-05-19) [Yakiv Huryk]
* a999b4d - Merge pull request #81 from r12f/code-sync-202412 (2025-05-17) [Dashuai Zhang]
|\
| * fd87e1f - Merge remote-tracking branch 'base/202411' into code-sync-202412 (2025-05-16) [r12f]
|/|
| * 623b018 - (origin/202411) [202411] Setting default nexthop weight to 1 in fpmsyncd (2025-05-15) [Kumaresh Perumal]
| |\
| | * a99088e - Removed logging code. (2025-05-15) [Mahdi Ramezani]
| | * 5cdc78e - Fixed a compile error. (2025-05-15) [Mahdi Ramezani]
| | * a79b7e0 - Set default nexthop weight to 1. Added unit tests for 'getNextHopWt'. (2025-05-15) [Mahdi Ramezani]
| |/
* | 2a0856b - Merge pull request #78 from nazariig/202412-trim-azure (2025-05-14) [Nazarii Hnydyn]
* | 2daf207 - Enable FDB learning event after all ports removed from default 1Q bridge (#79) (2025-05-09) [mssonicbld]
* | 3b70292 - Move timestamps out of counter table to avoid update too frequently (#75) (2025-04-28) [mssonicbld]
* | 3fa0d72 - Merge pull request #74 from mssonicbld/sonicbld/202412-merge (2025-04-23) [mssonicbld]
* | be436da - Merge branch '202411' of https://github.com/sonic-net/sonic-swss into 202412 (2025-04-23) [Sonic Automation]
|/
* 79f04e3 - Initialize the last fec ber computed values if not found (#3621) (2025-04-22) [mssonicbld]
```