[Mellanox][Smartswitch] Changes for mounting dbus socket #20816

Merged
qiluo-msft merged 4 commits into sonic-net:master from gpunathilell:dbus_soc on Jan 18, 2025

@gpunathilell (Contributor) commented Nov 15, 2024

Why I did it

This PR is a temporary change. Once the rshim interface is replaced, it will no longer be required.

It mounts the dbus socket in the pmon container, because a systemctl command has to be executed from the PMON container to start or stop a service during admin state changes and reboot command execution.

  • dockers/docker-platform-monitor/Dockerfile.j2 - Adds the dbus package for the Mellanox platform so that the dbus-send command is available
  • files/build_templates/docker_image_ctl.j2 - Mounts the dbus socket, since the service must be started and stopped (as systemctl would do) from the pmon container
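
As a rough illustration of what the mount provides, the sketch below checks whether the host's system bus socket is visible at the path used by this PR. The helper name `has_bus_socket` is hypothetical and not part of SONiC; only the socket path comes from the change itself.

```shell
#!/usr/bin/env bash
# Socket path taken from the bind mount added in this PR.
BUS_SOCKET="/var/run/dbus/system_bus_socket"

# Return 0 if the given path exists and is a UNIX socket, non-zero otherwise.
# has_bus_socket is a hypothetical helper used only for this illustration.
has_bus_socket() {
    local path="${1:-$BUS_SOCKET}"
    [[ -S "$path" ]]
}

if has_bus_socket; then
    echo "system bus socket is available at $BUS_SOCKET"
else
    echo "system bus socket is missing at $BUS_SOCKET"
fi
```

Run inside the pmon container, the second message would suggest that the bind mount was not applied.
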

Work item tracking
  • Microsoft ADO (number only):

How I did it

How to verify it

dbus-send commands can be run in the pmon container to start or stop [email protected], the unit responsible for starting or stopping the rshim service.
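
A minimal sketch of such a check, assuming the standard systemd D-Bus interface (`org.freedesktop.systemd1.Manager` with its `StartUnit` and `StopUnit` methods). The function name `systemd_unit`, the `DRY_RUN` switch, and the unit name `example@dpu0.service` are placeholders introduced for illustration; substitute the real unit name.

```shell
#!/usr/bin/env bash
# Ask the host's systemd to start or stop a unit over the system bus.
# systemd_unit, DRY_RUN, and the example unit name are hypothetical.
systemd_unit() {
    local action="$1" unit="$2" method
    case "$action" in
        start) method="StartUnit" ;;
        stop)  method="StopUnit" ;;
        *) echo "usage: systemd_unit start|stop UNIT" >&2; return 1 ;;
    esac
    # StartUnit/StopUnit take two strings: the unit name and the job mode.
    local cmd=(dbus-send --system --print-reply
        --dest=org.freedesktop.systemd1
        /org/freedesktop/systemd1
        "org.freedesktop.systemd1.Manager.${method}"
        "string:${unit}" "string:replace")
    if [[ "${DRY_RUN:-0}" == "1" ]]; then
        echo "${cmd[*]}"      # only print the command
    else
        "${cmd[@]}"           # needs the mounted system bus socket
    fi
}

# Dry run: show the command without contacting the bus.
DRY_RUN=1 systemd_unit start "example@dpu0.service"
```

Without `DRY_RUN=1`, the call is sent to the host's systemd, so it only succeeds where the system bus socket is mounted and the caller is permitted to manage units.
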

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@gpunathilell
Contributor Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@gpunathilell
Contributor Author

/azpw ms_conflict

@gpunathilell
Contributor Author

@prgeor Please review

@oleksandrivantsiv
Collaborator

/azpw ms_conflict

liat-grozovik previously approved these changes Nov 27, 2024
@liat-grozovik
Collaborator

/azpw ms_conflict

@liushilongbuaa
Contributor

/azpw ms_conflict

-v /var/run/hw-management:/var/run/hw-management:rw \
-v mlnx_sdk_socket:/var/run/sx_sdk \
-v /tmp/nv-syncd-shared/:/tmp \
-v /var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket \
Collaborator

@qiluo-msft qiluo-msft Dec 2, 2024

system_bus_socket

Is it true that code inside pmon can manipulate any host service? #Closed

Collaborator

Yes, PMON can control host services. We need this for the creation/removal of the midplane interfaces.

Contributor

@gpunathilell Accessing the host from a docker container looks like a hacky approach to me. Instead, rshim can monitor the udev event for the PCIe link and create the interface once the link comes up when the DPU is up.

Collaborator

@prgeor DPU state is controlled by PMON. PMON knows when to create or remove the midplane interface (rshim). rshim can't subscribe to the event because the dependency runs the other way: we need to stop rshim before removing the PCI interface.

Collaborator

@prgeor, @qiluo-msft, we will use the rshim service to control the midplane interface only in the 202411 release. In 202505 it will be replaced with the physical function interface, and the dependency on rshim and host services will be removed.

Collaborator

@oleksandrivantsiv To reduce the attack surface, could you limit the changes to only Mellanox and only SmartSwitch?

Collaborator

@qiluo-msft It is limited to the PMON container on the Mellanox platform. We will add a restriction to Smart Switch.

Collaborator

@qiluo-msft Please check the updated implementation.

Collaborator

@qiluo-msft Kind reminder.

@oleksandrivantsiv
Collaborator

/azpw run

@gpunathilell
Contributor Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

# TODO: Mellanox will remove the --tmpfs exception after SDK socket path changed in new SDK version
{%- if docker_container_name == "pmon" %}
if [[ $NUM_DPU -gt 0 ]]; then
SMARTSWITCH_MNT= " -v /var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket"
Collaborator

@qiluo-msft qiluo-msft Jan 15, 2025

SMARTSWITCH_MNT

If the conditions are not met, please init SMARTSWITCH_MNT to empty string. #Closed

Contributor Author

Done
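
The request above can be sketched as follows: the mount variable starts out empty and is only filled when DPUs are present. This is an illustration under assumptions, not the merged template code; `smartswitch_mount` is a hypothetical helper, and the DPU count is passed as an argument rather than read from `NUM_DPU`.

```shell
#!/usr/bin/env bash
# Print the extra docker mount arguments for a SmartSwitch, or nothing otherwise.
# smartswitch_mount is a hypothetical helper; $1 is the number of DPUs.
smartswitch_mount() {
    local num_dpu="${1:-0}"
    local mnt=""   # initialized to empty so it is defined on every path
    if [[ "$num_dpu" -gt 0 ]]; then
        mnt="-v /var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket"
    fi
    printf '%s' "$mnt"
}

# With DPUs present the bind mount is included; with none it is omitted.
echo "docker create $(smartswitch_mount 4) ..."
echo "docker create $(smartswitch_mount 0) ..."
```

Initializing the variable first means a later expansion in the `docker create` command never sees an unset or stale value on platforms without DPUs.
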

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@gpunathilell
Contributor Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@qiluo-msft qiluo-msft merged commit 821c43c into sonic-net:master Jan 18, 2025
VladimirKuk pushed a commit to Marvell-switching/sonic-buildimage that referenced this pull request Jan 21, 2025