Test case 3 of PFC watchdog against warm-reboot: random storming #837
Merged
wendani merged 20 commits into sonic-net:master on Oct 14, 2019
yxieca pushed a commit that referenced this pull request on Oct 15, 2019
* First test case of PFC watchdog against warm-reboot
* Add more comments for code readability
* Modify output message
* Allow log analyzer to take a specified start marker
* Use lookup('pipe', 'date +%H:%M:%S') in place of ansible_date_time.time, which uses cached time for a certain period of time (ansible/ansible#22561)
* Add the flexibility to not start storm at fanout link partner in running functional_test_storm.yml
* Dump only the current result and summary files for debugging and troubleshooting purposes
* Add the capability to check if the number of exact matches is equal to the target number
* Split the actual storm and restore tests into functional_test_storm_perq.yml and functional_test_restore_perq.yml, respectively
  Add the capability to storm multiple queues of a port
* Add test case 2 of PFC watchdog against warm-reboot:
  PFC storm started and detected before warm-reboot
  Storm is ongoing when warm-reboot is issued, and lasts past the warm-reboot finish
  PFC storm stopped and restored after warm-reboot
* Ignore trivial syncd ERR during the warm-reboot, e.g.,
  Mar 20 00:40:33.599212 str-a7050-acs-1 ERR syncd#syncd: _brcm_sai_cosq_stat_get:1146 cosq stat get failed with error Invalid parameter (0xfffffffc).
  Mar 20 00:40:33.599212 str-a7050-acs-1 DEBUG syncd#syncd: brcm_sai_get_queue_stats:724 cosq stat get failed with error -5 for port 1 qid 10
  Mar 20 00:40:33.599212 str-a7050-acs-1 NOTICE syncd#syncd: :- setQueueCounterList: Queue oid:0x102150000000b does not has supported counters
* Run apswitch action asynchronously
  Using include asynchronously with with_items is not supported
  From <ansible/ansible#22716>
* Add the flexibility to defer storm start and stop at fanout
* Randomly generate deferred time
* Move actual storming ops to per queue
* Clean debugging symbols
* Test case 3 of PFC watchdog against warm-reboot:
  PFC storm asynchronously starts at a random time and lasts a random period at fanout
  Warm-reboot is issued
  Wait for all the PFC storms to finish
  Verify that PFC storm detection and restoration are functional
* Specify reboot type to be 'warm-reboot'

Signed-off-by: Wenda Ni <[email protected]>
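The "number of exact matches is equal to the target number" check mentioned in the commit list can be illustrated with a minimal Python sketch. This is not the log analyzer used by the test suite; the function names, the pattern, and the sample log lines are all hypothetical and chosen only to show the idea of failing a test when the match count differs from an expected value.

```python
import re


def count_matches(log_lines, pattern):
    """Count the log lines that contain a match for the given regex."""
    regex = re.compile(pattern)
    return sum(1 for line in log_lines if regex.search(line))


def check_match_count(log_lines, pattern, target):
    """Return (ok, found): ok is True only when found equals target exactly."""
    found = count_matches(log_lines, pattern)
    return found == target, found


# Made-up log lines for illustration only; real syslog messages will differ.
sample_log = [
    "example: storm detected on port A queue 3",
    "example: unrelated message",
    "example: storm detected on port A queue 4",
]

ok, found = check_match_count(sample_log, r"storm detected", target=2)
```

Comparing against an exact target, rather than "at least one match", is what lets a test that storms several queues notice when one of the expected detections is missing.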
Test case 3:
Start PFC storms asynchronously at the fanout, each at a random time and lasting a random period
Issue the warm-reboot
Wait for all the PFC storms to finish
Verify that PFC storm detection and restoration are functional
Regression-tested against the regular PFC watchdog test with no breakage.
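The timing logic of test case 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the PR implements the test as Ansible playbooks driving the fanout switch, and the function names, the queue numbers, and the 60 s and 120 s bounds below are hypothetical, not values taken from the PR.

```python
import random


def plan_storms(queues, max_defer_s, max_duration_s, seed=None):
    """Pick a random start offset and duration for each queue's storm.

    Returns {queue: (start_offset_s, stop_offset_s)}, with offsets measured
    from the moment the storms are scheduled.
    """
    rng = random.Random(seed)
    plan = {}
    for queue in queues:
        start = rng.randint(0, max_defer_s)        # deferred start time
        duration = rng.randint(1, max_duration_s)  # how long the storm lasts
        plan[queue] = (start, start + duration)
    return plan


def seconds_until_all_storms_finish(plan):
    """The test must wait at least until the latest stop offset."""
    return max(stop for _, stop in plan.values())


# Hypothetical inputs: two lossless queues, bounds chosen for illustration.
plan = plan_storms(queues=[3, 4], max_defer_s=60, max_duration_s=120, seed=0)
wait_s = seconds_until_all_storms_finish(plan)
```

Because the storms are scheduled independently of the warm-reboot, a given storm may start before, during, or after the reboot; waiting for the latest stop offset before verifying covers all three cases.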
Infrastructure change:
Add the flexibility to defer the start and stop of PFC storm at Arista fanout
TOFIX: Mlnx fanout
Incremental commits on top of #834