Multi-ASIC implementation #3888

Merged
SuvarnaMeenakshi merged 62 commits into sonic-net:master from SuvarnaMeenakshi:pr_2_multiasic
Mar 31, 2020

@SuvarnaMeenakshi
Contributor

SuvarnaMeenakshi commented Dec 12, 2019

- What I did
To support SONiC on multi-ASIC platforms, the approach is to create a separate SONiC docker stack for each ASIC. The stacks are isolated from one another by giving each ASIC its own network namespace.
Each network namespace comprises the following key dockers:
database
swss
syncd
bgp
teamd
interfaces-config
lldp

- How I did it
The main changes to support SONiC on multi-ASIC platforms are:

  1. A systemd service template for multi-instance services is added, so that N instances of each service are started, where N is the number of ASICs in the platform.
  2. FRR config changes to support BGP between the ASIC namespaces. To support this, two new types, "InternalFrontend" and "InternalBackend", are added in DEVICE_METADATA to differentiate between a front-end ASIC and a back-end ASIC.
  3. docker_image_ctl.j2 is changed to support starting dockers with a specific ASIC instance number, which the systemd service passes to the <docker_name>.sh script.
  4. Apart from the per-ASIC instances of the database docker, a global database docker exists in the host namespace.
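
The multi-instance layout described in item 1 can be sketched roughly as follows. This is a minimal illustration, not code from this PR: the helper name `instance_units` is hypothetical, and it assumes that each templated service is instantiated as `<service>@<index>.service` with indices starting at 0 (the `${PEER}@$DEV` form in the review diff suggests the `@` naming, but the zero-based numbering is an assumption).

```shell
#!/bin/sh
# Hypothetical helper: print one systemd unit name per ASIC for a templated
# service. Assumes instances are named <service>@<index>.service and that
# the index starts at 0 (an assumption, not confirmed by the PR text).
instance_units() {
    service="$1"
    num_asics="$2"
    i=0
    while [ "$i" -lt "$num_asics" ]; do
        printf '%s@%s.service\n' "$service" "$i"
        i=$((i + 1))
    done
}

# Example: the swss instances expected on a platform with 3 ASICs.
instance_units swss 3
# prints:
#   swss@0.service
#   swss@1.service
#   swss@2.service
```

The same helper could be called for each of the services this PR makes multi-instance (swss, syncd, database, bgp, teamd, lldp) to list the units that should exist after boot.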

- How to verify it
Build a multi-ASIC VS image.
Follow the instructions in README.vsd to bring up the multi-ASIC VS image: set the number of ASICs in device/platform/asic.conf and add a topology.sh to the HWSKU directory to create the virtual connectivity between ASICs in the VS.
Once the VS image comes up, ensure the dockers come up and verify the BGP sessions between ASICs.
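
The last verification step could be scripted along these lines. This is a hedged sketch under several assumptions that the PR text does not confirm: that asic.conf holds the ASIC count in a line of the form `NUM_ASIC=<n>`, and that the per-ASIC BGP containers are named `bgp0`, `bgp1`, and so on. The helper names `read_num_asic` and `bgp_summary_cmds` are hypothetical. The sketch only prints the commands to run; it does not execute them.

```shell
#!/bin/sh
# Hypothetical helper: read the ASIC count from an asic.conf-style file.
# Assumes a line of the form NUM_ASIC=<n>; the key name is an assumption.
read_num_asic() {
    sed -n 's/^NUM_ASIC=\([0-9][0-9]*\)$/\1/p' "$1" | head -n 1
}

# Hypothetical helper: print (not run) one BGP summary command per ASIC.
# Assumes the per-ASIC BGP containers are named bgp0, bgp1, ...
bgp_summary_cmds() {
    num_asics="$1"
    i=0
    while [ "$i" -lt "$num_asics" ]; do
        printf 'docker exec bgp%s vtysh -c "show ip bgp summary"\n' "$i"
        i=$((i + 1))
    done
}
```

For example, `bgp_summary_cmds "$(read_num_asic /path/to/asic.conf)"` (the path is a placeholder) lists one command per ASIC; running each and checking that the sessions to the other ASICs are established would cover the BGP check described above.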

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

Lawrence Lee and others added 14 commits December 12, 2019 10:48
* create systemd target to start/stop all containers in a specific namespace at once

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
* add multi-instance option to redis-cli
* create database template service
* change docker_image_ctl to use new redis sockets
* increase database service timeout duration to 120s

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
* add instance number option and unix socket option to docker_image_ctl for swss
* change swss service to service template
* add instance number option and unix socket option to swss.sh
* force all redis-cli calls through redis-cli wrapper

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
* add syncd service template
* add instance number argument for syncd script
* add rsyslog service template compatible with network namespaces
* change rsyslog_config to use database instance unix socket

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
(cherry picked from commit e871f11)
* add wrapper for 'sudo ip netns exec' command

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
(cherry picked from commit 85745b7)
* increase from 2GB to 8GB to prevent OOM crashes
* necessary because multi-ASIC devices have increased memory requirements

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
(cherry picked from commit 845778e)
* add bgp service template
* create /etc/sonic/frr* for multiple instances of bgp docker container

[frr docker]: add multi-ASIC support to bgpd.conf
* add `redistribute connected` option to advertise routes to local interfaces
* add `all` option to next-hop-self
* add route-map to hide internal routes
* add route-map to allow duplicate frontend router id

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
(cherry picked from commit 4ea01b7)
* change teamd service file to template
* force teamd to restart when swss does

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
(cherry picked from commit 0297279)
* set submodule to point at newest commit

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
(cherry picked from commit ebab66d)
* create service template
* enable template instances in sonic_debian_extension
* use iproute2 in interfaces-config script if working with network namespaces
* make swss multi-instance dependent on multi-instance interfaces config

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
(cherry picked from commit 08af0db)
* use 'asic' as namespace prefix instead of 'namespace' in accordance
with PR sonic-net#3269

Signed-off-by: Lawrence Lee <t-lale@microsoft.com>
(cherry picked from commit c1444e3)
@lguohan lguohan requested a review from qiluo-msft January 29, 2020 19:05
@lguohan
Collaborator

lguohan commented Jan 29, 2020

@pavel-shirshov , can you check the bgp template?

@@ -0,0 +1,13 @@
[Unit]
Description=Update rsyslog configuration
Collaborator

I do not understand why we need multi-instance for rsyslog.

Contributor Author

Removed the multiple instances of the rsyslog service. The changes required to support the rsyslog service on multi-asic platforms will be added in a different PR.

if [[ x"$WARM_BOOT" != x"true" ]]; then
/bin/systemctl start ${PEER}
if [[ ! -z $DEV ]]; then
/bin/systemctl start ${PEER}@$DEV
Collaborator

need consistent spacing.

Contributor Author

Fixed

@@ -0,0 +1,16 @@
[Unit]
Description=BGP container
#Requires=updategraph.service
Contributor

@SuvarnaMeenakshi @lguohan: don't we need this dependency for multi-instance? Single-instance still has this dependency.

Contributor Author

Support for the updategraph and config-setup services on multi-asic platforms will be added in a separate PR. Because the updategraph service has not yet been updated to support multiple config_db instances, this dependency cannot be added here.

@qiluo-msft qiluo-msft self-assigned this Jan 31, 2020
@qiluo-msft
Collaborator

qiluo-msft commented Jan 31, 2020

Please resolve the conflicts #Closed

@SuvarnaMeenakshi
Contributor Author

retest vsimage please

paulnice left a comment

Looks good to me. Please wait for others.

@lguohan
Collaborator

lguohan commented Mar 20, 2020

retest vsimage please

@SuvarnaMeenakshi
Contributor Author

retest vsimage please

@lguohan
Collaborator

lguohan commented Mar 24, 2020

pending on vsimage test passing.

@SuvarnaMeenakshi
Contributor Author

retest vsimage please

@SuvarnaMeenakshi SuvarnaMeenakshi merged commit 4b8067e into sonic-net:master Mar 31, 2020
abdosi pushed a commit that referenced this pull request Apr 2, 2020
Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.
abdosi added a commit that referenced this pull request Apr 3, 2020
abdosi pushed a commit that referenced this pull request Apr 15, 2020
Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.
tiantianlv pushed a commit to SONIC-DEV/sonic-buildimage that referenced this pull request Apr 24, 2020
Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.
tiantianlv pushed a commit to SONIC-DEV/sonic-buildimage that referenced this pull request Apr 24, 2020
tiantianlv pushed a commit to SONIC-DEV/sonic-buildimage that referenced this pull request Apr 24, 2020
Changes made to support multi-asic platform. Added multi-instance support for swss, syncd, database, bgp, teamd and lldp.
@wendani
Contributor

wendani commented May 9, 2020

Multi-ASIC is not part of the 201911 release https://github.com/Azure/SONiC/wiki/Release-Progress-Tracking-201911

Why do we include it in 201911 @abdosi ?

@SuvarnaMeenakshi SuvarnaMeenakshi deleted the pr_2_multiasic branch December 14, 2020 09:28
mssonicbld added a commit that referenced this pull request Jul 23, 2025
…atically (#22686)

#### Why I did it
src/sonic-utilities
```
* e18640e - (HEAD -> master, origin/master, origin/HEAD) Switchport mode update for 'show interfaces status' (#3788) (3 hours ago) [Shivashankar C R]
* 809646a - Revert "Addition of prober_type in config and show commands for muxcable (#3884)" (#3979) (17 hours ago) [Xin Wang]
* 3db35d5 - `vnet_route_check.py` should not report VNET routes in APP DB but not in STATE DB and ASIC DB as mismatches (#3990) (26 hours ago) [mramezani95]
* 8647356 - [show][config][plugin] add processing of ModuleNotFoundError with log_warning (#3832) (32 hours ago) [Maksym Kovalchuk]
* 20976de - fix show bgp cli on multiple asic device (#3981) (5 days ago) [Liping Xu]
* 46c82ab - [db_migrator] Fix parse_xml fails when minigraph has SonicQosProfile (#3972) (6 days ago) [Xin Wang]
* 1c3f789 - Fix route_check.py to ignore local p2p IP prefixes (#3882) (7 days ago) [prabhataravind]
* 898a037 - Make 'show interface errors' lookup the correct oper_error_status count published by OA (#3956) (8 days ago) [Bobby McGonigle]
* eda6ada - [sonic-package-manager] Save tag that was used to install the application (#3917) (8 days ago) [DavidZagury]
* c409594 - [SPM] Add support for configuring systemd service Type in package manifests (#3946) (8 days ago) [DavidZagury]
* 09b4292 - [trim]: Add Packet Trimming Asym DSCP CLI (#3920) (9 days ago) [Nazarii Hnydyn]
* f751730 - Lodoga-Prime: lodogaprime platform support (#3954) (13 days ago) [NobutomoNakano]
* 0424ae0 - Add GCU Support for SKU Mellanox-SN4280-C48/O8C40/O8V40 (#3964) (13 days ago) [Sai Rama Mohan Reddy S]
* 57b9846 - fix issue #22476 remove quagga in show bgp cmd (#3947) (2 weeks ago) [Liping Xu]
* 5d11fc5 - Fix comparison error when replace (#3941) (3 weeks ago) [jingwenxie]
* f6d6d9a - Fix for 22138: Chassisd does not wait for the execution to complete for previous admin state change requests - Replaces PR: #3845 (#3937) (4 weeks ago) [rameshraghupathy]
* be72304 - [YANG] remove uses clause handling, now part of sonic-yang-mgmt (#3814) (4 weeks ago) [Brad House]
* 19a6b3c - Switch to using chrony instead of ntpd : gcu - services_validator.py (#3929) (4 weeks ago) [Anukul Verma]
* 5db9c27 - Fixed cli command for ECN config on voq switch (#3928) (4 weeks ago) [saksarav-nokia]
* 020f9d0 - Improved GCU's field validation logic for the WRED_PROFILE table (#3910) (4 weeks ago) [mramezani95]
* f15e2d0 - feat: support specific BP port info for show int (#3926) (4 weeks ago) [Chenyang Wang]
* 5a59f19 - [multi-asic] Fix the 'config reload' flow in case when multiple golden_config.json files provided (#3895) (5 weeks ago) [Vadym Hlushko]
* 82ec8f4 - fix show cmd for bgp (#3922) (5 weeks ago) [Liping Xu]
* e0f9da4 - Skip checking offload flags for static routes/sids in route check and add check_sids (#3919) (5 weeks ago) [Changrong Wu]
* 5ea861d - [copp]: Added CoPP show configuration commands (#3863) (6 weeks ago) [Ravi Minnikanti(Marvell)]
* 9fd8c3c - [sfputil] Use host lane mask as part of rx-output enable/disable (#3911) (6 weeks ago) [mihirpat1]
* 3e157a2 - Support reboot cause: Kernel Panic - Out of memory (#3918) (6 weeks ago) [byu343]
* 49d36ff - [gcu]: Add marvell-teralynx platform to gcu field validator (#3881) (6 weeks ago) [Ravi Minnikanti(Marvell)]
* 8415aee - [Mellanox] Collect sai.xml to sysdump (#3903) (6 weeks ago) [Sai Rama Mohan Reddy S]
* 6e26c8d - [intfstat] Align output format between cached/non-cached scenarios (#3902) (6 weeks ago) [Yair Raviv]
* 57d825e - Add version_202411_02 function (#3864) (6 weeks ago) [Ben Levi]
* d5051cd - [Smartswitch][reboot] Addition of pre shutdown and post startup function calls (#3900) (7 weeks ago) [Gagan Punathil Ellath]
* b3509b9 - Add CLI show commands to view bgp network, neighbors and summary on per-vrf basis (#3866) (7 weeks ago) [Navdha Jindal]
* dfa51d3 - Upgrade portstat to support nonzero option and sort heterogeneous interfaces names (#3894) (7 weeks ago) [Changrong Wu]
* ba255b6 - Issue #22407: ConfigReload fails when RADIUS statistics is enabled (#3860) (8 weeks ago) [Anders Linn]
* 7116edf - Fix warm-reboot script so it can be run via reboot DBus service (#3872) (8 weeks ago) [jkmar]
* 2f1c4e0 - config: Modify AAA config commands to use pass_db decorator (#3755) (8 weeks ago) [Anders Linn]
* d6d866f - show command for icmp echo offload sessions (#3889) (8 weeks ago) [manamand2020]
* 1b3498c - add TH5-512 hwsku into gcu support list (#3896) (8 weeks ago) [Dashuai Zhang]
* b106a82 - Addition of prober_type in config and show commands for muxcable (#3884) (9 weeks ago) [harjotsinghpawra]
* 733bdde - [smartswitch] Fix incorrect reboot status check and improve debug logging in reboot scripts (#3888) (9 weeks ago) [Vasundhara Volam]
* 60110fa - feat: support namespace arg for show mac (#3873) (9 weeks ago) [Chenyang Wang]
* aeba823 - feat: support namespace arg for show bfd (#3885) (9 weeks ago) [Chenyang Wang]
```
#### How I did it
#### How to verify it
#### Description for the changelog