Skip to content

zebra: fix Out Of Memory issue when displaying large route tables in JSON#16093

Merged
ton31337 merged 5 commits intoFRRouting:masterfrom
louis-6wind:fix-show-route-memory
Jun 10, 2024
Merged

zebra: fix Out Of Memory issue when displaying large route tables in JSON#16093
ton31337 merged 5 commits intoFRRouting:masterfrom
louis-6wind:fix-show-route-memory

Conversation

@louis-6wind
Copy link
Contributor

The command show ip[v6] route [vrf (all|XX)] json may cause an out-of-memory error when attempting to display large route tables.

Currently, JSON objects are created for every prefix and stored in a table JSON object. Once all the prefixes are collected, the table JSON is dumped, and all associated JSON objects are freed. Since commit 0e2fc3d, table JSON objects for each VRF are stored in a root JSON object. Similarly, the root JSON is dumped, and all related JSON objects are freed.

When the route tables are large, this process allocates a significant amount of memory for all the JSON objects containing the prefixes. In some cases, this leads to a crash of the protocol daemon due to insufficient memory on the machine.

To address this, instead of storing all the prefix JSON objects in underlying JSON objects before displaying them, modify the approach to allocate a JSON object for the first prefix, display it, and then free it. Repeat this process for each subsequent prefix object.

@frrbot frrbot bot added bugfix libfrr tests Topotests, make check, etc zebra labels May 27, 2024
@louis-6wind louis-6wind force-pushed the fix-show-route-memory branch from 6465567 to 6e7f73c Compare May 27, 2024 10:02
Copy link
Member

@ton31337 ton31337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious in what kind of setup OOM is happening? 128Mb of memory? My point is if we really need this custom crafted stuff versus reusing what the library offers?

@louis-6wind
Copy link
Contributor Author

louis-6wind commented May 27, 2024

I'm curious in what kind of setup OOM is happening? 128Mb of memory? My point is if we really need this custom crafted stuff versus reusing what the library offers?

Tested with a full-route and "show ip route json". Got a peak of 8.6g, which is not reasonable.

# top -n 1 -p $(pidof zebra) |grep zebra
 1014 root      20   0 9779220   8.6g  11120 R  93.3  36.5   3:36.34 zebra   

@louis-6wind
Copy link
Contributor Author

We already do the same thing for "show bgp XX json"

vty_out(vty, "\"paths\": ");

@ton31337
Copy link
Member

Ah, GiB is not cool ;-)

@louis-6wind louis-6wind force-pushed the fix-show-route-memory branch from 6e7f73c to 70058a3 Compare May 28, 2024 08:37
@louis-6wind
Copy link
Contributor Author

@donaldsharp, the patch is improving a little bit the performances.

It was tested with 760K routes.

Before patches

# time vtysh -c 'show ip route json' >/dev/null
% Can't open configuration file /etc/frr/vtysh.conf due to 'No such file or directory'.
Configuration file[/etc/frr/frr.conf] processing failure: 11

real	0m14.506s
user	0m0.366s
sys	0m0.084s

After patches

# time vtysh -c 'show ip route json' >/dev/null
% Can't open configuration file /etc/frr/vtysh.conf due to 'No such file or directory'.
Configuration file[/etc/frr/frr.conf] processing failure: 11

real	0m10.818s
user	0m0.203s
sys	0m0.181s

@donaldsharp
Copy link
Member

tested this code locally and I am seeing significant savings in run time too. Let's get this in.

@ton31337 ton31337 added this to the 10.1 milestone May 29, 2024
@donaldsharp
Copy link
Member

Once CI finishes I'll get this in

@louis-6wind louis-6wind force-pushed the fix-show-route-memory branch from 70058a3 to f130e87 Compare June 4, 2024 08:17
Copy link
Contributor

@mjstapp mjstapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix your commits - we can't really have "revert", and "fix", "fix", "fix". please rebase the changes into a series of commits that's meaningful.

@louis-6wind louis-6wind force-pushed the fix-show-route-memory branch from f130e87 to 790e52e Compare June 4, 2024 12:33
@louis-6wind
Copy link
Contributor Author

please fix your commits - we can't really have "revert", and "fix", "fix", "fix". please rebase the changes into a series of commits that's meaningful.

"fix" commits were pushed by mistake. Done

@louis-6wind louis-6wind force-pushed the fix-show-route-memory branch from 790e52e to 718d130 Compare June 5, 2024 11:07
…command 'show ip route vrf all json'"

This reverts commit 0e2fc3d.

This fix was correct but not optimal for memory consumption at scale.

Signed-off-by: Louis Scalbert <[email protected]>
Add helpers to print json keys in order to prepare the next commits.

Signed-off-by: Louis Scalbert <[email protected]>
0e2fc3d ("vtysh, zebra: Fix malformed json output for multiple vrfs
in command 'show ip route vrf all json'") has been reverted in the
previous commit. Although the fix was correct, it was consuming too muca
memory when displaying large route tables.

A root JSON object was storing all the JSON objects containing the route
tables, each containing their respective prefixes in JSON objects. This
resulted in excessive memory allocation for JSON objects, potentially
leading to an out-of-memory error on the machine.

To Fix the memory consumption issue for the "show ip[v6] route vrf all
json" command, display the tables one by one and free the memory of each
JSON object after it has been displayed.

Signed-off-by: Louis Scalbert <[email protected]>
When displaying a route table in JSON, a table JSON object is storing
all the prefix JSON objects containing the prefix information. This
results in excessive memory allocation for JSON objects, potentially
leading to an out-of-memory error on the machine with large routing
tables.

To Fix the memory consumption issue for the "show ip[v6] route [vrf XX]
json" command, display the prefixes one by one and free the memory of
each JSON object after it has been displayed.

Signed-off-by: Louis Scalbert <[email protected]>
Check that "show ip route vrf XXX json" and the JSON at key "XXX" of
"show ip route vrf all json" gives the same output.

Signed-off-by: Louis Scalbert <[email protected]>
@louis-6wind louis-6wind force-pushed the fix-show-route-memory branch from 718d130 to 2d6dcc0 Compare June 7, 2024 08:13
Copy link
Member

@ton31337 ton31337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@louis-6wind
Copy link
Contributor Author

ci:rerun

@ton31337 ton31337 merged commit 19c3e0e into FRRouting:master Jun 10, 2024
r12f pushed a commit to Azure/sonic-buildimage-msft that referenced this pull request Jul 30, 2025
…oute commands in scaled setups (#1410)

<!--
Please make sure you've read and understood our contributing guidelines:
     https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

** Make sure all your commits include a signature generated with `git
commit -s` **

If this is a bug fix, make sure your description includes "fixes #xxxx",
or
     "closes #xxxx" or "resolves #xxxx"

     Please provide the following information:
-->

#### Why I did it
Fix memory consumption with zebra during the show ipv6 route commands in
scaled setups


##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Ported fix FRRouting/frr#16093 as patch 

#### How to verify it
Run vtysh -c 'show ipv6 route json' in scaled setup and check zebra
memory consumption.

<!--
If PR needs to be backported, then the PR must be tested against the
base branch and the earliest backport release branch and provide tested
image version on these two branches. For example, if the PR is requested
for master, 202211 and 202012, then the requester needs to provide test
results on master and 202012.
-->

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

<!--
Ensure to add label/tag for the feature raised. example - PR#2174 under
sonic-utilities repo. where, Generic Config and Update feature has been
labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on
https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants