Fix ExaBGP v4 hang during announce routes stress tests #16744

Merged
yejianquan merged 2 commits into sonic-net:master from opcoder0:fix/exabgp-v4-hang on Feb 1, 2025

@opcoder0
Contributor

Description of PR

The PY3-only docker-ptf image uses ExaBGP v4. ExaBGP v4+ requires the process API to acknowledge messages from ExaBGP; otherwise the server blocks after a while and hangs, which causes BGP timeouts in our stress tests. This PR fixes the issue.

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
    • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405
  • 202411

Backporting is not applicable, since the fix only applies to the PY3-only docker-ptf image.

Approach

What is the motivation for this PR?

The PY3-only docker-ptf image uses ExaBGP v4. ExaBGP v4+ requires the process API to acknowledge messages from ExaBGP; otherwise the server blocks after a while and hangs, which causes BGP timeouts in our stress tests. This PR fixes the issue.
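
For context, the handshake described above can be sketched as follows. This is not what this PR does (the PR turns acknowledgements off instead); it only illustrates what a helper process would have to do if acknowledgements stayed enabled. The function name `send_command` is hypothetical, and the `done` / `error` reply strings are an assumption about how ExaBGP v4 answers each API command.

```python
import io
import sys


def send_command(command, out=sys.stdout, inp=sys.stdin):
    """Write one API command and consume the reply line that follows it.

    Assumption: with acknowledgements enabled, ExaBGP v4 answers every
    command with a single line ("done" on success, "error" on failure).
    If the helper never reads these replies, they pile up in the pipe,
    which matches the blocking behaviour described in this PR.
    """
    out.write(command + "\n")
    out.flush()
    reply = inp.readline().strip()
    return reply == "done"


if __name__ == "__main__":
    # Simulated pipes, so the sketch runs without a live ExaBGP instance.
    fake_out = io.StringIO()
    fake_in = io.StringIO("done\n")
    ok = send_command("announce route 10.0.0.0/24 next-hop 192.0.2.1",
                      fake_out, fake_in)
    print("acknowledged" if ok else "rejected")
```

Reading one reply per command keeps the pipe drained, at the cost of a round trip per route; disabling acknowledgements avoids that round trip entirely.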

How did you do it?

Set `exabgp.api.ack` to `false`.
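
The diff itself is not shown in this conversation, so where the setting is applied in sonic-mgmt is not stated here. As a minimal sketch, assuming the option can be supplied either as an environment variable named `exabgp.api.ack` or as an `ack` key in an `[exabgp.api]` section of an INI-style env file, the two forms could be produced like this. The helper names `exabgp_environment` and `render_exabgp_env` are hypothetical.

```python
import configparser
import io
import os


def exabgp_environment(base=None):
    """Return a copy of the environment with API acknowledgements disabled.

    Assumption: ExaBGP reads the dotted name "exabgp.api.ack" from its
    process environment. Pass the result as env= to subprocess.Popen.
    """
    env = dict(os.environ if base is None else base)
    env["exabgp.api.ack"] = "false"
    return env


def render_exabgp_env(ack=False):
    """Render an INI fragment carrying the same setting.

    Assumption: the env file uses an [exabgp.api] section with an
    "ack" key.
    """
    parser = configparser.ConfigParser()
    parser["exabgp.api"] = {"ack": "true" if ack else "false"}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_exabgp_env())
    print(exabgp_environment({})["exabgp.api.ack"])
```

Dotted variable names cannot be exported from most shells directly, which is why the sketch passes them through a Python environment mapping.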

How did you verify/test it?

Ran `tests/stress/test_stress_routes.py` on a 7050cx3 to confirm the fix.

Any platform specific information?

No

Supported testbed topology if it's a new test case?

NA

Documentation

NA

@mssonicbld
Collaborator

/azp run

@opcoder0 requested a review from xwjiang-ms on January 31, 2025 07:14

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Collaborator

@yejianquan left a comment

LGTM

@opcoder0
Contributor Author

/azp run

@azure-pipelines

Commenter does not have sufficient privileges for PR 16744 in repo sonic-net/sonic-mgmt

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@yejianquan merged commit c5048d2 into sonic-net:master on Feb 1, 2025
16 checks passed
nnelluri-cisco pushed a commit to nnelluri-cisco/sonic-mgmt that referenced this pull request Mar 15, 2025
co-authorized by: jianquanye@microsoft.com
bingwang-ms added a commit to Azure/sonic-mgmt.msft that referenced this pull request Dec 10, 2025
This PR backports the ExaBGP changes from the master branch to 202503.
PR list
- sonic-net/sonic-mgmt#16744
- sonic-net/sonic-mgmt#17427
- sonic-net/sonic-mgmt#16834
- sonic-net/sonic-mgmt#18181
- sonic-net/sonic-mgmt#18476


These changes are required to run the test `bgp/test_traffic_shift.py`.
auspham pushed a commit to auspham/sonic-mgmt that referenced this pull request Feb 3, 2026