[fast reboot] Revert fast-reboot script changes #982

Merged
qiluo-msft merged 1 commit into sonic-net:master from neethajohn:revert_fast_boot_script
Jun 27, 2019

@neethajohn
Contributor

Revert part of the changes made in PR #975. Remove the fast-reboot script and the corresponding changes made for its use.

Type of change

  • [] Bug fix
  • [] Testbed and Framework(new/improvement)
  • Test case(new/improvement)

@qiluo-msft qiluo-msft merged commit a383e46 into sonic-net:master Jun 27, 2019
@neethajohn neethajohn deleted the revert_fast_boot_script branch July 16, 2019 23:08
fraserg-arista pushed a commit to fraserg-arista/sonic-mgmt that referenced this pull request Feb 24, 2026
### Description of PR

Summary:
Fixes # (issue)
This PR addresses **non‑linear dataplane downtime behavior** observed in
high‑scale BGP IPv6 scenarios when running the port and session flapping
tests. When the number of connections to flap doubled, the dataplane
downtime increased by 450x.

This change refines the tests and helper logic to ensure that downtime
measurements:

- More accurately reflect real control‑plane and data‑plane outage
intervals,
- Scale more predictably with load and iterations, and
- Avoid over‑counting or under‑counting downtime due to measurement
artifacts and overlapping events.

### Type of change

- [x] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
    - [ ] Skipped for non-supported platforms
- [ ] Test case improvement


### Back port request
- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505

### Approach
#### What is the motivation for this PR?

While validating high‑scale BGP convergence, flap, and route‑programming
tests, we observed that:

- Dataplane downtime did not scale linearly with:
  - The number of flap iterations,
  - The number of routes or neighbors.

These issues were traced to the tests running sequentially while the
PTF dataplane packet‑filtering/counter state was never cleared between
runs. As a result, masks and counters kept accumulating, so each
subsequent run, especially one with a larger number of ports to flap,
saw an artificially inflated dataplane downtime.

In other words, the measured non‑linear increase in downtime was caused
by PTF dataplane state rather than actual BGP control‑plane behavior.
The goal of this PR is to:

- Properly reset/clean relevant PTF dataplane state between runs,
- Ensure that measured dataplane downtime reflects only the real BGP and
data‑plane behavior,
- Restore a linear and predictable relationship between test scale
(routes/neighbors/iterations) and observed downtime.
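
The effect described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that downtime is estimated as lost packets divided by the send rate; `RunCounters`, `downtime_seconds`, and `simulate` are hypothetical names and are not part of the actual tests or of PTF.

```python
from dataclasses import dataclass


@dataclass
class RunCounters:
    """Packets sent and received, as seen by a hypothetical counting layer."""
    sent: int = 0
    received: int = 0


def downtime_seconds(counters: RunCounters, packets_per_second: float) -> float:
    """Estimate downtime as lost packets divided by the send rate (assumed model)."""
    lost = counters.sent - counters.received
    return max(lost, 0) / packets_per_second


def simulate(runs, packets_per_second, reset_between_runs):
    """Return one downtime estimate per run.

    `runs` is a list of (sent, received) pairs, one pair per run.
    When `reset_between_runs` is False, counters carry over from
    earlier runs, which mimics state that is never cleared.
    """
    shared = RunCounters()
    results = []
    for sent, received in runs:
        if reset_between_runs:
            shared = RunCounters()  # start each run from a clean state
        shared.sent += sent
        shared.received += received
        results.append(downtime_seconds(shared, packets_per_second))
    return results
```

With two runs that lose 10 and 20 packets at 1000 packets per second, resetting between runs gives 0.01 s and 0.02 s, while carrying state over reports 0.03 s for the second run because the first run's loss is counted again.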

#### How did you do it?

- Added logic to explicitly **clear PTF dataplane state between runs**,
  including:
  - Flushing or re‑initializing PTF packet filters used for counting
    traffic to the prefixes under test.
  - Resetting relevant PTF counters so that each run starts with a clean
    environment.
- Updated the test flow so that:
  - Each scale/iteration configuration first ensures PTF dataplane state
    is clean before starting flaps and dataplane measurements.
  - Dataplane downtime is computed only from counters and observations
    collected **within** the current run, avoiding any contamination from
    previous runs.
- Adjusted/factored helper utilities (where appropriate) so that the PTF
  cleanup is:
  - Centralized and reusable across the convergence, flap, and
    route‑programming tests,
  - Invoked consistently whenever a new test scenario or iteration is
    started.
- Enhanced logging around:
  - When PTF dataplane state is cleared,
  - Per‑iteration dataplane downtime measurements after the fix, so it is
    easy to verify that:
    - Counters are reset when expected, and
    - The resulting downtime scales linearly with the number of
      ports/routes/iterations, reflecting actual BGP and dataplane behavior.
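
One possible shape for a centralized, logged cleanup is a context manager that every scenario enters before it starts measuring. This is a sketch under stated assumptions, not the code from this change: `StubDataplane` is a hypothetical stand-in that only mimics a `flush()`-style reset, and `clean_dataplane` is an illustrative helper name.

```python
import logging
from contextlib import contextmanager

logger = logging.getLogger(__name__)


class StubDataplane:
    """Hypothetical stand-in for a PTF dataplane object (not the real API)."""

    def __init__(self):
        self.queued_packets = []
        self.masks = []
        self.rx_count = 0

    def flush(self):
        """Drop queued packets, registered masks, and counters."""
        self.queued_packets.clear()
        self.masks.clear()
        self.rx_count = 0


@contextmanager
def clean_dataplane(dataplane, run_label):
    """Clear dataplane state before a run and log when that happens."""
    dataplane.flush()
    logger.info("Cleared dataplane state before run %s", run_label)
    try:
        yield dataplane
    finally:
        # Log what this run alone observed, so per-run values can be compared.
        logger.info("Run %s finished with rx_count=%d",
                    run_label, dataplane.rx_count)
```

Wrapping each scale or iteration in `with clean_dataplane(dp, label):` keeps the reset in one place, so no scenario can start with masks or counters left over from the previous one.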

#### How did you verify/test it?
- Re‑ran the high‑scale BGP convergence, flap, and route‑programming tests
  with the fixes applied:
  - Topology: `t0-isolated-d2u510s2`
  - Platform: Broadcom Arista-7060X6-64PE-B-C512S2
- Verified that:
  - Measured downtime per iteration is stable and scales predictably with
    load and iteration count.
  - Spurious spikes caused by measurement artifacts are eliminated:
    measured downtime now stays within milliseconds, compared with tens
    of seconds previously.
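
A check of the "scales predictably" claim could look like the sketch below. It assumes linear scaling means that downtime per unit of scale stays close to the value from the first sample; `scales_linearly` and its `tolerance` parameter are hypothetical, and the sample values in the note that follows are illustrative, not measurements from this change.

```python
def scales_linearly(samples, tolerance=0.5):
    """Return True if downtime per unit of scale stays near the first sample.

    `samples` is a list of (scale, downtime_seconds) pairs with positive
    scale values. `tolerance` is the allowed relative deviation from the
    first sample's downtime-per-unit ratio.
    """
    if len(samples) < 2:
        raise ValueError("need at least two (scale, downtime) samples")
    ratios = [downtime / scale for scale, downtime in samples]
    baseline = ratios[0]
    return all(abs(r - baseline) <= tolerance * abs(baseline) for r in ratios)
```

For example, `[(8, 0.004), (16, 0.008)]` passes, while `[(8, 0.004), (16, 1.8)]`, a 450x jump for a doubling in scale, fails.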
 
#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation

---------

Signed-off-by: Priyansh Tratiya <ptratiya@microsoft.com>
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
* [201811][sairedis][swss] advance sub modules head

Submodule src/sonic-sairedis 18ad5f9..4c75b7f:
  > Fixed conditional operator. (sonic-net#487)

Submodule src/sonic-swss 1e99c93..cd12d48:
  > [teamsyncd]: Add information for LAG membership changes (sonic-net#982)
  > Fix vlan incremental config and add vs test cases (sonic-net#799)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

* [swss] include more swss changes

Submodule src/sonic-swss cd12d48..f44029d:
  > [MirrorOrch]: Init the next hop ip with 0 instead of default constructor (sonic-net#953)
  > [AclOrch]: Fix the acl mirror counter doubled by inactive mirror and active again (sonic-net#952)

Signed-off-by: Ying Xie <ying.xie@microsoft.com>