HLD for Shutdown and Startup Fabric module#1694

Open
mlok-nokia wants to merge 3 commits intosonic-net:masterfrom
mlok-nokia:shutdown_startup_fabric_hld

@mlok-nokia

@mlok-nokia mlok-nokia commented May 8, 2024

This is the HLD for shutting down and starting up the Fabric Module.

@mlok-nokia mlok-nokia force-pushed the shutdown_startup_fabric_hld branch from 88063c1 to 04ba416 Compare May 10, 2024 13:57
@kenneth-arista

@jfeng-arista for awareness

2. Modify module_db_update() to call get_module_admin_status() to check the module's configured state. If module_cfg_status is not set to down, populate CHASSIS_FABRIC_ASIC_TABLE. Otherwise, skip it even if the SFM module is present. This mechanism prevents the event from being triggered in swss.sh when admin_status is set to down.
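The gating described in this step could look roughly like the sketch below. It is not the chassisd implementation: `FakeDb`, the module names, the ASIC entries, the default of "up" for unconfigured modules, and the table name `CHASSIS_FABRIC_ASIC_TABLE` are all assumptions made for illustration. Only the function names `module_db_update()` and `get_module_admin_status()` come from the HLD text.

```python
# Hypothetical sketch: names mirror the HLD text, not the real chassisd code.
FABRIC_ASIC_TABLE = "CHASSIS_FABRIC_ASIC_TABLE"  # assumed table name


class FakeDb:
    """In-memory stand-in for the config and state tables."""

    def __init__(self, admin_status):
        self.admin_status = dict(admin_status)  # module name -> "up" / "down"
        self.tables = {FABRIC_ASIC_TABLE: {}}


def get_module_admin_status(db, module_name):
    # Assumption: a module with no config entry is treated as "up".
    return db.admin_status.get(module_name, "up")


def module_db_update(db, present_modules):
    """Populate the fabric ASIC table only for SFMs that are not admin-down."""
    for name, asics in present_modules.items():
        if get_module_admin_status(db, name) == "down":
            # Skip the module even though it is present, so that no
            # downstream event is raised for an admin-down SFM.
            continue
        for asic_name, asic_info in asics.items():
            db.tables[FABRIC_ASIC_TABLE][asic_name] = asic_info


db = FakeDb({"FABRIC-CARD0": "down"})
module_db_update(db, {
    "FABRIC-CARD0": {"asic0": {"name": "FABRIC-CARD0"}},
    "FABRIC-CARD1": {"asic2": {"name": "FABRIC-CARD1"}},
})
print(sorted(db.tables[FABRIC_ASIC_TABLE]))  # ['asic2']
```

Only the ASIC of the module that is not admin-down ends up in the table, which is the behavior the step relies on to keep swss.sh from reacting to a shut-down SFM.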

# 3 Test Considerations
UTs are also added to simulate Fabric shutdown and startup.

We may need to consider a sonic-mgmt test to cover this. The tests should validate the effect on thermal and PCI devices.

@mlok-nokia
Author

@abdosi @judyjoseph I have updated the document with an investigation of the impact on PCIed and thermal sensors. Based on the current implementation, there is NO impact. Please review it.


# 3 Impact and Test Considerations
## 3.1 Impact on PCIed and Thermal sensors
For PCIed, the investigation shows that the current design of the Fabric module shutdown has NO impact. PCIed currently checks only the basic PCI components. For a Fabric slot that is shut down, if the platform supports PCI on the Fabric card, the platform code should check that the card is powered on before adding it to the PCIe check. That is how it is handled in the Arista vendor code.
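As a rough illustration of the power check described above, the sketch below builds the list of PCI devices to verify and leaves out any Fabric slot that is not powered. The function name, the slot names, and the PCI addresses are hypothetical; this is not the Arista vendor code or the pcied API.

```python
def devices_to_check(base_devices, fabric_devices, slot_powered):
    """Return the PCI devices to verify, skipping powered-off fabric slots.

    base_devices:   always-present PCI addresses
    fabric_devices: fabric slot name -> list of PCI addresses on that card
    slot_powered:   fabric slot name -> True if the card is powered on
    """
    devices = list(base_devices)
    for slot, slot_devices in fabric_devices.items():
        # A slot with unknown power state is treated as off (assumption).
        if slot_powered.get(slot, False):
            devices.extend(slot_devices)
    return devices


checked = devices_to_check(
    base_devices=["0000:00:1c.0"],
    fabric_devices={"Fabric1": ["0000:05:00.0"], "Fabric2": ["0000:06:00.0"]},
    slot_powered={"Fabric1": False, "Fabric2": True},
)
print(checked)  # ['0000:00:1c.0', '0000:06:00.0']
```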

When a Fabric module is shut down through the CLI, the vendor code needs to modify the PCI device list and remove the device from it. We would also need support for updating this list dynamically. Currently, it is loaded at boot when the pcied daemon starts, so in the current implementation the daemon must restart if the list is updated.
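One way to picture the dynamic update requested here is a device list that is re-read on every check cycle instead of once at startup. The sketch below is only an illustration: the `PciDeviceList` class, the JSON file format, the file name, and the PCI addresses are assumptions, and it does not reflect how pcied actually stores or loads its device list.

```python
import json
import os
import tempfile


class PciDeviceList:
    """Hypothetical device list that can be refreshed without a restart."""

    def __init__(self, path):
        self.path = path
        self._raw = None  # last file contents seen
        self.devices = []

    def refresh(self):
        """Re-read the file; return True if the device list changed."""
        with open(self.path) as f:
            raw = f.read()
        if raw == self._raw:
            return False
        self._raw = raw
        self.devices = json.loads(raw)
        return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pci_devices.json")  # stand-in file name
    with open(path, "w") as f:
        json.dump(["0000:00:1c.0", "0000:05:00.0"], f)

    plist = PciDeviceList(path)
    plist.refresh()  # initial load

    # Vendor code removes the fabric card's device after a CLI shutdown.
    with open(path, "w") as f:
        json.dump(["0000:00:1c.0"], f)

    changed = plist.refresh()  # picked up on the next check cycle
    print(changed, plist.devices)  # True ['0000:00:1c.0']
```

Comparing file contents rather than timestamps keeps the sketch independent of filesystem mtime resolution; a real daemon might prefer a database notification or a file watcher.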

judyjoseph pushed a commit to sonic-net/sonic-utilities that referenced this pull request May 29, 2024
…ule(SFM) by using "config chassis modules shutdown/startup" commands (#3283)

sudo config chassis modules shutdown/startup <module name>

The HLD for Shutdown and Startup of the Fabric Module is below:
sonic-net/SONiC#1694
arfeigin pushed a commit to arfeigin/sonic-utilities that referenced this pull request Jun 16, 2024
…ule(SFM) by using "config chassis modules shutdown/startup" commands (sonic-net#3283)
nmoray pushed a commit to nmoray/sonic-utilities that referenced this pull request Jun 25, 2025
…ule(SFM) by using "config chassis modules shutdown/startup" commands (sonic-net#3283)
5 participants