Skip to content

Improve MIBUpdater to re-connect DBConnector when re-init data.#290

Merged
liuh-80 merged 10 commits intosonic-net:masterfrom
liuh-80:dev/liuh/improve-reinit
Oct 30, 2023
Merged

Improve MIBUpdater to re-connect DBConnector when re-init data.#290
liuh-80 merged 10 commits intosonic-net:masterfrom
liuh-80:dev/liuh/improve-reinit

Conversation

@liuh-80
Copy link
Contributor

@liuh-80 liuh-80 commented Sep 14, 2023

Improve MIBUpdater to re-connect DBConnector when re-init data.

Work item tracking

Microsoft ADO (number only): 24705208

- What I did
Fix when redis restart, some MIBUpdater's db connection will broken and keeps report error to syslog issue.

There will be following message repeat in syslog:

#012  File "/usr/local/lib/python3.7/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 674, in _update_per_namespace_data#
#012    msg = pubsub.get_message()#
#012  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1626, in get_message#
#012    return _swsscommon.PubSub_get_message(self, timeout)#
#012RuntimeError: RedisError: Failed to select, err=3: errstr=Server closed the connection

- How I did it
Re-connect DBConnector in every MIBUpdater's reinit_data method.

- How to verify it
Pass all UT

Manually test with following steps:

  1. in database container kill redis server.
  2. start redis in database container later: service redis-server start
  3. check syslog, confirm following log exist but now new log few minutes later after mibs re-init:

sudo cat /var/log/syslog | grep rfc2737.py | grep PubSub_get_message

Sep 22 03:30:17.223944 vlab-01 ERR snmp#snmp-subagent [ax_interface] ERROR: MIBUpdater.start() caught an unexpected exception during update_data()#012Traceback (most recent call last):#12 File "/usr/local/lib/python3.9/dist-packages/ax_interface/mib.py", line 43, in start#012 self.update_data()#12 File "/usr/local/lib/python3.9/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 326, in update_data#012 updater.update_data(i, self.statedb[i])#12 File "/usr/local/lib/python3.9/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 666, in update_data#012 self._update_per_namespace_data(self.pub_sub_dict[db_index])#12 File "/usr/local/lib/python3.9/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 675, in _update_per_namespace_data#012 msg = pubsub.get_message()#12 File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1996, in get_message#012 return _swsscommon.PubSub_get_message(self, timeout, interrupt_on_signal)#012RuntimeError: RedisError: Failed to select, err=3: errstr=Server closed the connection

- Description for the changelog
Improve MIBUpdater to re-connect DBConnector when re-init data.

@liuh-80
Copy link
Contributor Author

liuh-80 commented Sep 15, 2023

Pipeline build break, already create a fix PR. pending for #293 merge first

@liuh-80
Copy link
Contributor Author

liuh-80 commented Sep 18, 2023

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@liuh-80 liuh-80 marked this pull request as ready for review September 19, 2023 01:03
@liuh-80 liuh-80 requested a review from qiluo-msft September 19, 2023 01:04
@qiluo-msft

This comment was marked as resolved.

@liuh-80

This comment was marked as resolved.


# run the background update task
self.update_data()
self.redis_exception_happen = False
Copy link
Contributor

@qiluo-msft qiluo-msft Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redis_exception_happen

I am worried about the contract of member redis_exception_happen between multiple classes. How about define a new argument reconnect=False in reinit_data(), and keep redis_exception_happen as a private member only in this class. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, move redis_exception_happen to local variable. also because the python method override issue, add new method reinit_connection() to MIBUpdater interface.

if self.update_counter > self.reinit_rate:
# reconnect when redis exception happen
if redis_exception_happen:
self.reinit_connection()
Copy link
Contributor

@qiluo-msft qiluo-msft Oct 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[](http://example.com/codeflow?start=24&length=4)

You can remove one extra level of indentation. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@liuh-80 liuh-80 merged commit 347a6e4 into sonic-net:master Oct 30, 2023
ssithaia-ebay pushed a commit to ssithaia-ebay/sonic-snmpagent that referenced this pull request May 23, 2025
…c-net#290)

Improve MIBUpdater to re-connect DBConnector when re-init data.

#### Work item tracking
Microsoft ADO (number only): 24705208

**- What I did**
Fix when redis restart, some MIBUpdater's db connection will broken and keeps report error to syslog issue.

There will be following message repeat in syslog:
```
sonic-net#12  File "/usr/local/lib/python3.7/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 674, in _update_per_namespace_data#
sonic-net#12    msg = pubsub.get_message()#
sonic-net#12  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1626, in get_message#
sonic-net#12    return _swsscommon.PubSub_get_message(self, timeout)#
#012RuntimeError: RedisError: Failed to select, err=3: errstr=Server closed the connection
```

**- How I did it**
Re-connect DBConnector in every MIBUpdater's reinit_data method.

**- How to verify it**
Pass all UT

Manually test with following steps:
1. in database container kill redis server.
2. start redis in database container later:  service redis-server start
2. check syslog, confirm following log exist but now new log few minutes later after mibs re-init:

sudo cat /var/log/syslog | grep rfc2737.py | grep PubSub_get_message

Sep 22 03:30:17.223944 vlab-01 ERR snmp#snmp-subagent [ax_interface] ERROR: MIBUpdater.start() caught an unexpected exception during update_data()#012Traceback (most recent call last):sonic-net#12  File "/usr/local/lib/python3.9/dist-packages/ax_interface/mib.py", line 43, in start#012    self.update_data()sonic-net#12  File "/usr/local/lib/python3.9/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 326, in update_data#012    updater.update_data(i, self.statedb[i])sonic-net#12  File "/usr/local/lib/python3.9/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 666, in update_data#012    self._update_per_namespace_data(self.pub_sub_dict[db_index])sonic-net#12  File "/usr/local/lib/python3.9/dist-packages/sonic_ax_impl/mibs/ietf/rfc2737.py", line 675, in _update_per_namespace_data#012    msg = pubsub.get_message()sonic-net#12  File "/usr/lib/python3/dist-packages/swsscommon/swsscommon.py", line 1996, in get_message#012    return _swsscommon.PubSub_get_message(self, timeout, interrupt_on_signal)#012RuntimeError: RedisError: Failed to select, err=3: errstr=Server closed the connection


**- Description for the changelog**
Improve MIBUpdater to re-connect DBConnector when re-init data.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants