Third party container management using the Sonic Application Framework#1286
Third party container management using the Sonic Application Framework#1286adyeung merged 1 commit intosonic-net:masterfrom
Conversation
681d1be to
e8caba9
Compare
|
Reviewed in community meeting. Recording https://zoom.us/rec/share/EXGfLxRpPT6buxU2we1OAKFqj-Nsq6vdQywuABzI7JhxNRcK-O_DSXY6xC8xVRUh.h2xG7KEZxjHlmV87 |
|
Microsoft and Nvidia will be the reviewer for this feature. |
|
@zhangyanzhao Please add me to the review. |
|
@zhangyanzhao can you add @stepanblyschak as a reviewer? it should not be me :) |
There was a problem hiding this comment.
Do we have the capability to run the script inside the TPC? e.g We may need to mount the specific config file changes in the host and use it inside TPC or for some dependencies I may need to execute some script before starting services inside the container, do we have an option to do that?
There was a problem hiding this comment.
Yes ,capability to run the script inside TPC is supported
There was a problem hiding this comment.
How about the connectivity for the TPC to the outside world? hope we dont expose the entire host networking to the TPC. What if my requirement is to run the TPC on the mgmt VRF?
There was a problem hiding this comment.
TPC with mgmt VRF support can be provided later but currently we don't support it.
There was a problem hiding this comment.
If this is not captured in the HLD, please add it in the restrictions/limitations section.
There was a problem hiding this comment.
Included in the Limitations section
There was a problem hiding this comment.
IMO, we should not differentiate like this, wherever possible we should design generically, in this case, App ext can be extended to support docker image without manifest file as well.
There was a problem hiding this comment.
Not sure why we need this tpcm flag? why cant have the generic design?
There was a problem hiding this comment.
we can have a flag as "type" instead of "tpcm"
There was a problem hiding this comment.
Cant we skip type field altogether?
There was a problem hiding this comment.
Skipped as suggested
There was a problem hiding this comment.
How do we make sure that TPC is not hijacking any of the control package meant for processing by SONiC system? do we have any validation in place?
There was a problem hiding this comment.
TPC docker wont run in "privileged" mode and runs with the default security settings and won't have escalated privileges.
There was a problem hiding this comment.
If we can skip '--tpcm' option in the command, would be good.
There was a problem hiding this comment.
Skipped as suggested
There was a problem hiding this comment.
I don't think various system resource limits for container shall be part of the manifest. The manifest, once associated with a docker container, will be the same for all devices, however, some devices have more powerful hardware and these limits may not be relevant for them. Instead, resources limits configuration is up to the user using a particular device. This configuration can be done e.g. through FEATURE table and for all containers, not just TPC.
There was a problem hiding this comment.
TPC overall memory limit can be restricted to 20% of system memory ( say, 3087MB from 15G system ram).
By default, if individual TPC memory limit is not specified, 20% of TPC overall memory limit can be assigned ( say, 617MB from 3087MB in a 15G system ram).
Idea is that summation of the memory limits configured for all TPCs must not go beyond the overall TPC memory limit set.
This way the resource limits gets adjusted as per the underlying hardware.
Agreed that configuration could be placed in config db.
There was a problem hiding this comment.
@sg893052 Could you please update the document to make memory/cpu limits configuration part of config db?
There was a problem hiding this comment.
The inclusion of resource limits for TPC within the CONFIG DB is a distinct feature that can also be extended to SONiC containers. Therefore, the resource limits aspect will be addressed in a subsequent phase.
There was a problem hiding this comment.
So it is not necesary to pre define these docker applications in SONiC build,
This is not true for regular application extensions. An application extension is not required to be defined at SONiC build.
FYI, dhpc-relay and macsec are not the only application extensions we have. There are some which are not even in SONiC source tree.
There was a problem hiding this comment.
The extended version of sonic extension is the TPCM where it need not comply with sonic.
There was a problem hiding this comment.
@sg893052 So I just suggest to change the wording:
So it is not necesary to pre define these docker applications in SONiC build
to
So it is not necesary to for these docker applications to be SONiC compliant
since the first one is not true for regular (non-TPC) dockers.
There was a problem hiding this comment.
Why do we need a manifest CLI tree? You could just pass in a JSON file to install command and then updating manifest is just an upgrade with a different manifest file.
There was a problem hiding this comment.
Creating manifest file for each TPC docker is tedious task and customizing parameter is also difficult for a user.
With the manifest cli tree, it can be easily created, viewed. End user need not worry about TPC container creation.
There was a problem hiding this comment.
Not sure it can be set to true for TPC. support-rate-limit requires having supervisor managed rsyslogd inside container. This is very specific to SONiC containers. It should be set to false
There was a problem hiding this comment.
@sg893052 Ok, could you please update the doc to set it to false?
There was a problem hiding this comment.
What is the command? How do you translate it for docker create command? Is it --entrypoint? If so, I suggest to rename this field to entrypoint.
There was a problem hiding this comment.
"command" attribute refers to the container startup arguments. As suggested, the attribute name could be updated accordingly.
There was a problem hiding this comment.
@sg893052 Ok, btw you may want to update manifest reference to include new attributes - https://github.com/sonic-net/SONiC/blob/82ca34c8548a065b5274ba9fdee6df93037215ea/doc/sonic-application-extension/sonic-application-extension-guide.md#manifest-reference
There was a problem hiding this comment.
Not sure I understand how the command to run inside container mounts the root filesystem to container. Please clarify this example.
There was a problem hiding this comment.
This was just an example, by specifying "command":"--path.rootfs=/host", you would configure the container to use the host filesystem as the root filesystem path. This adjustment can impact the way the container interacts with and accesses files and directories during its runtime.
As agreed, this field will be updated to "entrypoint"
There was a problem hiding this comment.
Why not deducing transport protocol from URL and having just --from-tarball? E.g curl can work with http(s),scp,sfpt deduced from input URL
There was a problem hiding this comment.
Please ensure that for non TPC containers, users cannot modify manifest. I don't see why it might be useful for the user. Container configuration like resource limits needs to be outside the manifest as I suggested in previous comment. Manifest is an immutable property of an image, chaning it might break the extension or the system and it would be hard to create any validation mechanism for that.
There was a problem hiding this comment.
Will ensure this is allowed only for TPC containers.
There was a problem hiding this comment.
Perhaps we could have an option to indicate whether manifest file update allowed or not during image creation, by default, we cant modify the manifest file but when we create the image for TPC container, we could set it to true.
There was a problem hiding this comment.
The inclusion of resource limits for TPC within the CONFIG DB is a distinct feature that can also be extended to SONiC containers. Therefore, the resource limits aspect will be addressed in a subsequent phase.
There was a problem hiding this comment.
As per discussion online, could you please summarize what are the benefits of preparing docker image on the box instead of preparing image outside the box?
There was a problem hiding this comment.
TPCM is the extended version of AEF(application extension framework) which will create manifest file and associate with TPC and manipulate the container resources through manifest file.
To summarize the benefits of preparing docker image on the box instead of preparing image outside the box.
- Leveraging existing TPC Docker images that lack a manifest file.
- Eliminating the need for manual steps to modify Docker images, such as unpacking, appending the manifest file, and repackaging.
- Facilitating seamless upgrade scenarios, bypassing the necessity of repackaging Docker images with the manifest file. Moreover, the updated manifest file, with resource adjustments, can be used for impending Docker image upgrades.
- At a per-device level, permitting users to supply manifest files for TPCs.
There was a problem hiding this comment.
@sg893052 Thanks, I see one advantage is that the user does not need to have a registry to place his re-built standard docker images and can pull them from a well known existing docker registries.
|
@stepanblyschak @lguohan @venkatmahalingam @stephenxs @adyeung @Kalimuthu-Velappan @babukr Could you please help review the answers for the comments provided. Please let us know if we could have a meeting to go over. |
There was a problem hiding this comment.
I think, resource limits are good options which need not be restricted only for TPC.
There was a problem hiding this comment.
@sg893052 @venkatmahalingam Agree, shouldn't be restricted for TPC. In fact, resource limits were discussed to be added to regular app.ext when it first was introduced but this functionality was delayed since we had no use for configuring limits.
There was a problem hiding this comment.
The inclusion of resource limits for TPC within the CONFIG DB is a distinct feature that can also be extended to SONiC containers. Therefore, the resource limits aspect will be addressed in a subsequent phase.
There was a problem hiding this comment.
If possible, please avoid tpcm name usage in the description.
There was a problem hiding this comment.
Is SONiC-2-SONiC upgrade considered? What will happen with TPC extension if we upgrade base SONiC image to new version?
|
@sg893052 The doc looks good to me, please make sure new manifest attributes are included in - https://github.com/sonic-net/SONiC/blob/82ca34c8548a065b5274ba9fdee6df93037215ea/doc/sonic-application-extension/sonic-application-extension-guide.md#manifest-reference. |
|
@stepanblyschak pls review and signoff if you are satisfied with the change |
|
moved to 202405 release |
|
All prior review comments addressed. If there are no further comments, this will go merge by end of the week. |
Uh oh!
There was an error while loading. Please reload this page.