[201911] [multi_asic] Script to monitor errors on internal links #2966

Closed
tjchadaga wants to merge 1 commit into 201911 from int_link_monitor

Conversation

@tjchadaga
Contributor

What I did

Added a script to monitor SAI_PORT_STAT_IF_IN_ERRORS and SAI_PORT_STAT_IF_OUT_ERRORS on the internal (backend) ports of a multi-ASIC device.

How I did it

The script is added to monit so that it regularly scans the internal ports for errors. If a port's error count is above the configured threshold and its internal port-channel has more than min_links active links, the port is shut down and a syslog message is generated.
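
The check described above can be sketched roughly as follows. This is only an illustration, not the script from this PR: `select_ports_to_shut`, its arguments, and the in-memory dictionaries are hypothetical stand-ins for the database reads, and it assumes that the two counters are summed per port and that each shutdown reduces the port-channel's active link count before the next port is evaluated.

```python
import logging

log = logging.getLogger("int_link_monitor")

# Counter names taken from the PR description.
ERROR_COUNTERS = ("SAI_PORT_STAT_IF_IN_ERRORS", "SAI_PORT_STAT_IF_OUT_ERRORS")


def select_ports_to_shut(port_counters, port_to_lag, active_links,
                         threshold=0, min_links=1):
    """Return the internal ports that should be shut down (hypothetical helper).

    port_counters: port name -> {counter name: value}
    port_to_lag:   port name -> name of its internal port-channel
    active_links:  port-channel name -> number of operationally up members
    """
    remaining = dict(active_links)  # copy, so the caller's data is untouched
    to_shut = []
    for port in sorted(port_counters):
        counters = port_counters[port]
        # Assumption: in and out errors are added together.
        errors = sum(int(counters.get(name, 0)) for name in ERROR_COUNTERS)
        if errors <= threshold:
            continue
        lag = port_to_lag.get(port)
        # Only shut the port if the port-channel keeps more than min_links up.
        if lag is None or remaining.get(lag, 0) <= min_links:
            continue
        to_shut.append(port)
        remaining[lag] -= 1  # assumption: a shut port no longer counts as active
        log.warning("%s: %d errors on internal link, shutting port down",
                    port, errors)
    return to_shut
```

With three error-free or errored members in one port-channel and `min_links=2`, at most one errored port would be selected, so the port-channel never drops to `min_links` or fewer active links because of this check.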

How to verify it

Unit test cases are added to test the error detection.

Previous command output (if the output of a command-line utility has changed)

New command output (if the output of a command-line utility has changed)

self.error_counter_names = ['SAI_PORT_STAT_IF_IN_ERRORS',
                            'SAI_PORT_STAT_IF_OUT_ERRORS']
# Using fixed values for 201911
self.threshold = 0
Contributor

Is this comment applicable? Is the threshold configurable?

Contributor Author

It is fixed for 201911, but it would be read from ConfigDB for master and newer branches.


def get_active_lag_member_count(self, namespace, lag_name):
    ''' Returns number of member ports that are operationally up in the given portchannel '''
    lag_members = self.appdb[namespace].keys('APPL_DB', 'LAG_MEMBER_TABLE:{}:*'.format(lag_name))
Contributor

Can we use CONFIG_DB instead of APPL_DB?

Contributor Author

Since we are looking for active (operationally up) LAG members, does it matter if we check in APPL_DB?
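
To make the point concrete, here is a minimal sketch of counting operationally up members from APPL_DB-style entries. It does not use the real database connector: `appl_db` is a plain dictionary standing in for the table, `count_active_lag_members` is a hypothetical name, and it assumes that each `LAG_MEMBER_TABLE` entry carries a `status` field set to `enabled` for an active member.

```python
def count_active_lag_members(appl_db, lag_name):
    """Count members of lag_name whose status is 'enabled' (hypothetical helper).

    appl_db: key -> {field: value}, a stand-in for APPL_DB contents.
    """
    # The trailing ':' keeps 'PortChannel4001' from matching 'PortChannel40010'.
    prefix = "LAG_MEMBER_TABLE:{}:".format(lag_name)
    return sum(
        1
        for key, fields in appl_db.items()
        # Assumption: an active member is marked with status == 'enabled'.
        if key.startswith(prefix) and fields.get("status") == "enabled"
    )
```

A configuration-only source would list the configured members of the port-channel but would not say which of them are currently up, which is the information this count relies on.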

Contributor

@arlakshm arlakshm left a comment

See the inline comments.

3 participants