[HLD] Event Driven TechSupport Invocation & CoreDump Mgmt #818
[HLD] Event Driven TechSupport Invocation & CoreDump Mgmt #818liat-grozovik merged 72 commits intosonic-net:masterfrom
Conversation
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
| state = enabled|disabled; # Enable this to make the Techsupport Invocation event driven based on core-dump generation | ||
| rate_limit_interval = 300; # Minimum Time in seconds, between two successive techsupport invocations. | ||
| Manual Invocations will be considered as well in the cooloff calculation | ||
| max_techsupport_size = 10; # A perentage value should be specified. |
There was a problem hiding this comment.
@vivekreddynv typo? Did you mean percentage?
| This signifies maximum Size to which /var/dump/ directory can be grown until. | ||
| The actual value in bytes is calculate based on the available space in the filesystem hosting /var/dump | ||
| When the limit is crossed, the older techsupport dumps are incrementally deleted | ||
| max_core_size = 5; # A perentage value should be specified. |
There was a problem hiding this comment.
@vivekreddynv typo? Did you mean percentage?
|
|
||
| Where `<comm>` value in the command name associated with a process. comm value of a running process can be read from `/proc/[pid]/comm` file | ||
|
|
||
| ## 4. Schema Additions |
There was a problem hiding this comment.
@vivekreddynv suggest to follow SONiC DB schema description conventions (ABNF):
| description "First Revision"; | ||
| } | ||
|
|
||
| typedef enable-knob { |
There was a problem hiding this comment.
@vivekreddynv we already have this type in SONiC:
https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/yang-models/sonic-types.yang#L54
typedef admin_status {
type enumeration {
enum up;
enum down;
}
}There was a problem hiding this comment.
I feel "enabled|disabled" makes better sense here over "up|down".
There was a problem hiding this comment.
@vivekreddynv sorry, a typo:
https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/yang-models/sonic-types.yang#L119
typedef admin_mode {
type enumeration {
enum enabled;
enum disabled;
}
}There was a problem hiding this comment.
Oh okay, I'll change this
| leaf feature_name { | ||
| description "The name of this feature"; | ||
| /* TODO: Leafref once the FEATURE YANG is added*/ | ||
| type string; |
There was a problem hiding this comment.
@vivekreddynv please add a key length constraint
|
@nazariig do you have any further comments? if not can you please approve? |
#### What I did sonic-utilities changes required for feature "Event Driven TechSupport Invocation & CoreDump Mgmt". [HLD](sonic-net/SONiC#818 ) Summary of the changes: - Added the AUTO GEN CLI for the CFG DB tables required for this feature - Added the coredump_gen_handler.py & techsupport_cleanup.py scripts. - Added the UT's required for these scripts. - Enhanced coredump-compress & generate-dump scripts
#### Why I did it Changes required for feature "Event Driven TechSupport Invocation & CoreDump Mgmt". [HLD](sonic-net/SONiC#818 ) Requires: sonic-net/sonic-utilities#1796. Merging in any order would be fine. Summary of the changes: - Added the YANG Models for the new tables introduces as a part of this feature. - Enhanced init_cfg.json with the default config required - Added a compile Time flag which enables/disables the config required for this feature inside the init_cfg.json - Enhanced the supervisor-proc-exit-listener script to populate `<feature>:<critical_proc> = <comm>:<pid>` info in the STATE_DB when it observes an proc exit notification for the critical processes running inside the docker.
#### What I did sonic-utilities changes required for feature "Event Driven TechSupport Invocation & CoreDump Mgmt". [HLD](sonic-net/SONiC#818 ) Summary of the changes: - Added the AUTO GEN CLI for the CFG DB tables required for this feature - Added the coredump_gen_handler.py & techsupport_cleanup.py scripts. - Added the UT's required for these scripts. - Enhanced coredump-compress & generate-dump scripts
Signed-off-by: Vivek Reddy Karri vkarri@nvidia.com
The high-level design document for the "Event Driven TechSupport Invocation & CoreDump Mgmt" enhancement