[db_migrator.py] Fix issue while upgrading from 202205 to 202211 via fast reboot#2948
Conversation
scripts/db_migrator.py
Outdated
|
|
||
| if self.stateDB.keys(self.stateDB.STATE_DB, "FAST_REBOOT|system"): | ||
| enable_state = self.stateDB.get(self.stateDB.STATE_DB, 'FAST_RESTART_ENABLE_TABLE|system', 'enable') | ||
| if self.stateDB.keys(self.stateDB.STATE_DB, "FAST_REBOOT|system") or enable_state == 'true': |
There was a problem hiding this comment.
Why is this change needed? The check here was added (#2839) for old systems (like 201911) where new flag was not present. The newer branches (where FAST_RESTART_ENABLE_TABLE flag is present use below changes to delay flex counters:
sonic-net/sonic-swss#1877
#1768
@shlomibitton can you please review if the above analysis is correct?
There was a problem hiding this comment.
Hi @vaibhavhd , this issue is only reproduced from 202205 to 202211. In 202205, there is already a db migrator function version_3_0_5 which changes old flag to new flag. However, in 202211, the same logic is in version_4_0_0. While upgrading from 202205 to 202211, 202205 sets new flag and we have a dump.rdb; in 202211 db_migrator just sets flag to false.
There was a problem hiding this comment.
Hi @stepanblyschak and @arfeigin , could you please provide your comment here?
There was a problem hiding this comment.
@Junchao-Mellanox correct, good observation.
The suggested change for version_4_0_0 is in place and hence also this change in version_4_0_2 is needed to tie it together so it will all work as expected.
There was a problem hiding this comment.
@vaibhavhd As wrote since in version_4_0_0 in the db_migrator according to this PR we are adding new state-db entry for fast-reboot instead of the old version it will be more coherent to check for the newer version when migrating to even newer version (4_0_2) and ignore the former state-db entry.
There was a problem hiding this comment.
I still don't understand why these two lines need to be changed. I understand the change in 4_0_0, but w/o the change in 4_0_2, what is the issue?
I am not saying that this is doing any damage, just that I don't understand why we are fixing this if this is not needed.
There was a problem hiding this comment.
Actually it might as well be doing some damage? For newer systems where we use FAST_RESTART_ENABLE_TABLE|system flag we do not want to call migrate_config_db_flex_counter_delay_status, or do we? I think we do not, as flex counters in these newer systems are handled by:
sonic-net/sonic-swss#1877
#1768
There was a problem hiding this comment.
Hi @vaibhavhd, you are right, this change is not necessary. I have updated code, could you please review again?
arfeigin
left a comment
There was a problem hiding this comment.
LGTM, good catch and solution.
|
@vaibhavhd can you please review feedback and if possible aprrove? |
|
ADO: 25048636 |
…fast reboot (#2948) - What I did Fix issue while upgrading from 202205 to 202211 via fast reboot. This issue is caused by a mismatch version handling for fast reboot in db_migrator. This mismatch causes an incorrect sequence like this: 1. User issue fast-reboot under 202205 2. fast-reboot script set fast reboot flag by command "sonic-db-cli STATE_DB HSET "FAST_RESTART_ENABLE_TABLE|system" "enable" "true" &>/dev/null" 3. system boot to 202211 4. db_migrator found the database version is 3_0_6, it will run 4_0_0, however, it found FAST_REBOOT|system does not exist, and it set FAST_RESTART_ENABLE_TABLE|system enable to false 5. system incorrectly performs cold reboot - How I did it in db migrator if we see there is FAST_RESTART_ENABLE_TABLE already, we should skip fast reboot flag migration - How to verify it unit test manual test
|
PR for 202211: #2962 |
…2211 via fast reboot (#2962) Backport #2948 to 202211 What I did Fix issue while upgrading from 202205 to 202211 via fast reboot. This issue is caused by a mismatch version handling for fast reboot in db_migrator. In 202205, the db migrator for fast reboot flag handling is 3_0_5: sonic-utilities/scripts/db_migrator.py Line 920 in 56a1ae2 def version_3_0_5(self): However, in master(202211), the db migrator for fast reboot flag handling is 4_0_0: sonic-utilities/scripts/db_migrator.py Line 993 in 7435b1c def version_4_0_0(self): This mismatch causes an incorrect sequence like this: 1. User issue fast-reboot under 202205 2. fast-reboot script set fast reboot flag by command "sonic-db-cli STATE_DB HSET "FAST_RESTART_ENABLE_TABLE|system" "enable" "true" &>/dev/null" 3. system boot to 202211 4. db_migrator found the database version is 3_0_6, it will run 4_0_0, however, it found FAST_REBOOT|system does not exist, and it set FAST_RESTART_ENABLE_TABLE|system enable to false 5. system incorrectly performs cold reboot How I did it in db migrator if we see there is FAST_RESTART_ENABLE_TABLE already, we should skip fast reboot flag migration How to verify it unit test manual test
What I did
Fix issue while upgrading from 202205 to 202211 via fast reboot. This issue is caused by a mismatch version handling for fast reboot in db_migrator.
In 202205, the db migrator for fast reboot flag handling is 3_0_5:
sonic-utilities/scripts/db_migrator.py
Line 920 in 56a1ae2
However, in master(202211), the db migrator for fast reboot flag handling is 4_0_0:
sonic-utilities/scripts/db_migrator.py
Line 993 in 7435b1c
This mismatch causes an incorrect sequence like this:
How I did it
in db migrator if we see there is FAST_RESTART_ENABLE_TABLE already, we should skip fast reboot flag migration
How to verify it
unit test
manual test
Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)