[trim]: Update log level severity to avoid errors during capabilities query #3916
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
By logging these events at NOTICE level, we may miss some legitimate errors as well. While the sonic-mgmt log analyzer may still see the NOTICE level entries and report them, this won't help customers who rely solely on monitoring the ERROR log level.
@kperumalbfn please share your opinion
@kperumalbfn, @developfast, please review
This change is a nice improvement - toStr(status) helper and the reduction in log flooding both make the logs much more actionable.
That said, I share some of the concern raised about blanket use of NOTICE. By lowering all non-success statuses, we risk missing genuinely unexpected errors, especially for customers who monitor only ERROR.
Would it make sense to distinguish cases here? For example:
Log SAI_STATUS_NOT_SUPPORTED (expected) at INFO/DEBUG.
Log other non-success statuses (e.g., INVALID_PARAMETER, FAILURE) at ERROR.
This way we cut down the noise while still preserving visibility into real errors that operators should act on.
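To make the suggestion concrete, here is a minimal sketch of the severity split. `Status`, `Level`, and `severityFor` are hypothetical stand-ins written for illustration; they are not the SAI status codes or the swss logging API, and the real change would map onto whatever macros the orchagent code already uses.

```cpp
#include <string>

// Hypothetical stand-ins for the SAI status codes mentioned above.
enum class Status { Success, NotSupported, InvalidParameter, Failure };

// Hypothetical stand-ins for syslog severities.
enum class Level { Debug, Info, Notice, Error };

// Choose a severity for a capability-query result:
// "not supported" is an expected answer, any other failure is a real error.
Level severityFor(Status status)
{
    switch (status)
    {
        case Status::Success:      return Level::Debug;
        case Status::NotSupported: return Level::Info;
        default:                   return Level::Error;
    }
}

// Human-readable severity name, for building a log line.
std::string levelName(Level level)
{
    switch (level)
    {
        case Level::Debug:  return "DEBUG";
        case Level::Info:   return "INFO";
        case Level::Notice: return "NOTICE";
        default:            return "ERROR";
    }
}
```

With this split, a caller would log `levelName(severityFor(status))` next to the status string, so only unexpected statuses reach the ERROR level.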
NOTICE entries will still be part of the SONiC syslog, since the default log level for swss is NOTICE.
capList.resize(enumList.count);
enumList.list = capList.data();

return sai_query_attribute_enum_values_capability(gSwitchId, objType, attrId, &enumList);
@kperumalbfn @nazariig IMO, we should suppress the unsupported error log here and proceed by returning a SUCCESS status. The same handling should also be implemented in queryAttrCapabilitiesSai.
This would not mask the ERROR logs.
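A rough sketch of what this proposal could look like. `Status`, `EnumQuery`, and `queryEnumCapsLenient` are hypothetical names used only for illustration; the callback stands in for the SAI capability call, and clearing the list on "not supported" is an assumption about the desired behavior, not something the PR specifies.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-ins for the relevant SAI status codes.
enum class Status { Success, NotSupported, Failure };

// Hypothetical callback standing in for the SAI enum capability query.
using EnumQuery = std::function<Status(std::vector<int32_t>&)>;

// Treat "not supported" as a successful query that yields no capabilities.
// Every other status is forwarded unchanged, so real errors stay visible.
Status queryEnumCapsLenient(const EnumQuery& query, std::vector<int32_t>& capList)
{
    Status status = query(capList);

    if (status == Status::NotSupported)
    {
        capList.clear();        // assumption: no capabilities are reported
        return Status::Success; // the caller logs nothing at ERROR level
    }

    return status;
}
```

The caller then only needs to log at ERROR when the returned status is not a success, which keeps the unsupported case out of the error stream.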
@dhanasekar-arista by design, queryEnumCapabilitiesSai is a generic API: it simply forwards the result without adding any extra business logic. It is the client code's responsibility to handle the status properly and to log it at the relevant severity level.
Yes, but a customer's syslog monitoring tool may raise different alarms for different levels, and there is a chance that a valid error log may be missed. Hence I'm not in favor of this change.
Force-pushed: 08372cb to 495d00e
A number of trimming and macsec vstests are failing for reasons unrelated to this PR's changes. @kperumalbfn, should we bypass them?
Force-pushed: 495d00e to 0679665
…um capabilities query Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
Force-pushed: 0679665 to 762682f
…um capabilities query (sonic-net#3916) Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com> Co-authored-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com>
…um capabilities query (sonic-net#3916) Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com> Co-authored-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com> Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
…um capabilities query (sonic-net#3916) Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com> Co-authored-by: Sudharsan Dhamal Gopalarathnam <sudharsand@nvidia.com> Signed-off-by: Baorong Liu <96146196+baorliu@users.noreply.github.com>
Signed-off-by: Nazarii Hnydyn <nazariig@nvidia.com>
These changes help to avoid flooding the log with errors during the Packet Trimming attribute/enum capabilities check.
HLD: sonic-net/SONiC#2033
What I did
Why I did it
How I verified it
Details if related