[Dynamic buffer calc] Support dynamic buffer calculation#6194
[Dynamic buffer calc] Support dynamic buffer calculation#6194qiluo-msft merged 10 commits intosonic-net:masterfrom
Conversation
1. add tables required for the feature: - ASIC_TABLE in platform/mellanox/asic_table.j2 (Mellanox specific) - PERIPHERAL_TABLE in platform/mellanox/peripheral_table.j2 (Mellanox specific) - PORT_PERIPHERAL_TABLE on a per-platform basis in device/mellanox/x86_64-mlnx_msn3800-r0/port_peripheral_config.j2 (Mellanox specific) - DEFAULT_LOSSLESS_BUFFER_PARAMETER and LOSSLESS_TRAFFIC_PATTERN in files/build_templates/buffers_config.j2 - Add lossless PGs (3-4) for each port in files/build_templates/buffers_config.j2 2. copy the newly introduced j2 files into image and rendering them when system starts 3. update the CLI options for buffermgrd so that it can start with dynamic mode 4. Adjust the order in which swss daemons start, making buffermgrd start before orchagent. This is to make sure buffermgrd can get everything ready before orchagent starts especially during warm reboot. 5. Optimize the way in which orchagent fetches the asic vendor name: - fetch the vendor name when creates the docker and pass it as a docker environment variable - orchagent and buffermgrd can use this passed-in variable 6. Clear buffer related tables from STATE_DB when swss docker starts 7. Update the src/sonic-config-engine/tests/sample_output/buffers-dell6100.json according to the buffer_config.j2 8. Remove buffer pool sizes for ingress pools and egress_lossy_pool Update the buffer settings for dynamic buffer calculation Signed-off-by: Stephen Sun <stephens@nvidia.com>
1. If all the buffer configuration aligns the default, use dynamic buffer calculation mode. Otherwise, use the traditional mode 2. Dynamic mode is adopted in switches newly installed from scratch and traditional mode in switches installed from ninigraph This is done by: - introducing a wrapper to buffermgrd, which enables it to test which mode is adopted and feed buffermgrd with corresponding CLI arguments - preparing default buffer configuration templates for both dynamic and traditional mode for each SKU - introduce "buffer_model" in DEVICE_METADATA|localhost for mellanox platform Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
- For Mellanox, dynamic by default - For other vendors, traditional mode. Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
…able Signed-off-by: Stephen Sun <stephens@nvidia.com>
Signed-off-by: Stephen Sun <stephens@nvidia.com>
| fi | ||
| if [ ! -f /etc/sonic/peripheral_table.json ] && [ -f /usr/share/sonic/device/$PLATFORM/port_peripheral_config.j2 ]; then | ||
| sonic-cfggen -d -t /usr/share/sonic/device/$PLATFORM/port_peripheral_config.j2 > /etc/sonic/peripheral_table.json | ||
| fi |
There was a problem hiding this comment.
Move this block into inside swss container? #Closed
There was a problem hiding this comment.
Given that the asic table and peripheral table won't be changed on-the-fly, I'd like to have them called once only when the docker is created instead of called each time the swss docker starts.
Let me know if there are other concerns.
There was a problem hiding this comment.
Seems here is called when the docker swss is started, not created.
You already check the files existing, so not too much time wasted after file created.
In reply to: 541481038 [](ancestors = 541481038)
There was a problem hiding this comment.
ok.
Let's move to buffermgrd.sh. Put the code generating asic_table.json and peripheral_table.json just ahead of the place where it is used.
There was a problem hiding this comment.
Just investigated and realized it’s will be very difficult to move this part into swss docker because it requires asic_table.j2 and peripheral_table.j2 to be copied in docker swss. However, it’s defined on a per vendor basis and in platform/<vendor> dir, which makes it difficult to copy into docker in the Dockerfile.j2.
So we have to have this part in docker_image_control.
Meanwhile, I also noticed that the PLATFORM isn't defined in this function. This can be achieved by moving this part into start() function.
There was a problem hiding this comment.
Regarding "it’s defined on a per vendor basis and in platform/ dir, which makes it difficult to copy into docker in the Dockerfile.j2"
I checked docker inspect swss and found /usr/share/sonic/device/$PLATFORM/ is already mapped into swss container.
{
"Type": "bind",
"Source": "/usr/share/sonic/device/x86_64-mlnx_msn2700-r0",
"Destination": "/usr/share/sonic/platform",
"Mode": "ro",
"RW": false,
"Propagation": "rprivate"
},
Mounting is happened after running a container, so I did not propose modify Dockerfile.j2, instead, modify dockers/docker-orchagent/docker-init.sh.
I think you can treat asic_table.j2 the same way as port_peripheral_config.j2. Why they are copied into different folders?
In reply to: 541513109 [](ancestors = 541513109,541481038)
There was a problem hiding this comment.
The port peripheral config is defined on a per platform basis because each platform has its unique configuration. But the asic_table.j2 is shared among all platforms of the same vendor. It should not be copied to each of the platform folders.
|
|
||
| {%- if docker_container_name == "swss" %} | ||
| # Obtain the vendor name | ||
| ASIC_VENDOR=`$SONIC_CFGGEN -y /etc/sonic/sonic_version.yml -v asic_type` |
There was a problem hiding this comment.
Move this into inside swss container? #Closed
There was a problem hiding this comment.
The same as the previous one. I'd like call this once during docker creation instead of many times.
| { | ||
| "DEVICE_METADATA": { | ||
| "localhost": { | ||
| "buffer_model": {% if default_buffer_model == "dynamic" %}"dynamic"{% else %}"traditional"{% endif %}, |
There was a problem hiding this comment.
traditional [](start = 90, length = 11)
dynamic vs static? #Closed
There was a problem hiding this comment.
dynamic / traditional is the term we already agreed on, isn't it?
| sudo cp $BUILD_TEMPLATES/qos_config.j2 $FILESYSTEM_ROOT_USR_SHARE_SONIC_TEMPLATES/ | ||
|
|
||
| # Copy the templates for dynamically buffer calculation | ||
| {% if sonic_asic_platform == "mellanox" or sonic_asic_platform == "vs" %} |
There was a problem hiding this comment.
sonic_asic_platform [](start = 43, length = 19)
What is the future expectation on other platform?
If nothing, why need it for vs? #Closed
There was a problem hiding this comment.
I believe other vendors will support the dynamic model in the future.
Meanwhile, we need this to verify this feature on vs platform.
By default, the dynamic model is disabled on vs platform and is enabled when the test cases of this feature are executed.
src/sonic-config-engine/tests/sample_output/py2/buffers-dell6100.json
Outdated
Show resolved
Hide resolved
src/sonic-config-engine/tests/sample_output/py3/buffers-dell6100.json
Outdated
Show resolved
Hide resolved
src/sonic-config-engine/tests/sample_output/py2/buffers-dell6100.json
Outdated
Show resolved
Hide resolved
Signed-off-by: Stephen Sun <stephens@nvidia.com>
- remove vs related stuff from Mellanox specific asic_table.json and peripheral_table.json - fix minor errors Signed-off-by: Stephen Sun <stephens@nvidia.com>
| {%- if DEVICE_METADATA is defined and DEVICE_METADATA['localhost']['platform'] is defined %} | ||
| {%- set platform = DEVICE_METADATA['localhost']['platform'] %} | ||
| {%- else -%} | ||
| {%- set platform = "vs-platform" %} |
There was a problem hiding this comment.
The same issue, move vs related part into platform/vs/peripheral_table.j2. Please don't mix platforms into mellanox folder. #Closed
Remove vs stuff from Mellanox related files and fix indent Signed-off-by: Stephen Sun <stephens@nvidia.com>
|
retest vs please |
| # Inject mlnx sdk libs to platform monitor | ||
| $(DOCKER_PLATFORM_MONITOR)_DEPENDS += $(APPLIBS) $(SX_COMPLIB) $(SXD_LIBS) $(SX_GEN_UTILS) $(PYTHON_SDK_API) $(APPLIBS_DEV) $(SX_COMPLIB_DEV) $(SXD_LIBS_DEV) $(SX_GEN_UTILS_DEV) | ||
|
|
||
| export SONIC_BUFFER_MODEL=dynamic |
There was a problem hiding this comment.
Do not change build time config in component makefile, if you need, change the pipeline command line.
There was a problem hiding this comment.
Hi Qi,
Yes, it works to change the CLI option but what's the benefit to change a working code?
- Why I did it
To support dynamic buffer calculation.
This PR also depends on the following PRs for sub modules
- How I did it
buffer_modelinDEVICE_METADATA|localhostto represent which buffer model is running in the system currently:dynamicfor the dynamic buffer calculation modeltraditionalfor the traditional model in which thepg_profile_lookup.iniis usedbuffermgrdcan use this passed-in variableUpdate the buffer settings for dynamic buffer calculation
Signed-off-by: Stephen Sun stephens@nvidia.com
- How to verify it
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)