@rjarry
Collaborator

@rjarry rjarry commented Dec 3, 2025

Since commit e954453, deleting interfaces or nexthops causes a call to control_output_poll() in order to drain the packets that may reference the deleted object.

On shutdown, the modules finalization order causes control_output to be finalized before the iface module. When the iface module is finalized, all remaining interfaces are deleted and control_output_poll() accesses the ring that was already "freed".

ERROR: AddressSanitizer: heap-use-after-free on address 0x00016d19d608
at pc 0x00000045901c bp 0xffffca318bd0 sp 0xffffca318be8
READ of size 4 at 0x00016d19d608 thread T0
    #0 0x000000459018 in rte_ring_dequeue_bulk_elem ../subprojects/dpdk/lib/ring/rte_ring_elem.h:375
    #1 0x000000459018 in rte_ring_dequeue_elem ../subprojects/dpdk/lib/ring/rte_ring_elem.h:471
    #2 0x000000459018 in rte_ring_dequeue ../subprojects/dpdk/lib/ring/rte_ring.h:496
    #3 0x000000459018 in control_output_poll ../modules/infra/control/control_output.c:33
    #4 0x0000004648a0 in event_handler ../modules/infra/control/control_output.c:76
    #5 0x000000416974 in gr_event_push ../main/event.c:26
    #6 0x00000046a930 in iface_destroy ../modules/infra/control/iface.c:424
    #7 0x00000046b138 in iface_fini ../modules/infra/control/iface.c:462
    #8 0x00000041d63c in modules_fini ../main/module.c:112
    #9 0x00000041b5a4 in main ../main/main.c:326

Ensure both iface and nexthop are finalized before control_output.

Fixes: e954453
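
For readers who want to see the hazard in isolation, here is a minimal sketch in plain C. It is not grout code: the `demo_*` names and `struct demo_ring` are hypothetical stand-ins, and the pointer is reset to NULL on finalization only so that the stale call can be observed safely instead of triggering a real use-after-free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical stand-in for the control output ring (not an rte_ring).
struct demo_ring {
	int pending;
};

static struct demo_ring *demo_ctrlout_ring;

// Allocate the ring when the owning module is initialized.
static void demo_control_output_init(void) {
	demo_ctrlout_ring = calloc(1, sizeof(*demo_ctrlout_ring));
	assert(demo_ctrlout_ring != NULL);
}

// Free the ring when the owning module is finalized.
// Assumption: the pointer is cleared here only to make misuse detectable.
static void demo_control_output_fini(void) {
	free(demo_ctrlout_ring);
	demo_ctrlout_ring = NULL;
}

// Drain the ring. Returns false if the ring no longer exists, which is
// the situation the sanitizer reported as a heap-use-after-free.
static bool demo_control_output_poll(void) {
	if (demo_ctrlout_ring == NULL)
		return false;
	demo_ctrlout_ring->pending = 0;
	return true;
}

// Deleting the remaining interfaces on shutdown ends up polling the ring.
static bool demo_iface_fini(void) {
	return demo_control_output_poll();
}
```

Calling `demo_iface_fini()` before `demo_control_output_fini()` returns true; calling it afterwards returns false, which corresponds to the ordering that crashed.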

@rjarry rjarry force-pushed the control-output-fini branch from 7236ec3 to 0f9b358 Compare December 3, 2025 21:20
@coderabbitai

coderabbitai bot commented Dec 3, 2025

📝 Walkthrough

Adds a string.h include and changes module_is_child to copy depends_on into a 512-byte local buffer, tokenize on commas with strtok, and fnmatch each token against m->name; asserts input fits the buffer. No public APIs changed. Module dependency declarations updated: control_output "*" → "graph"; graph removed .depends_on = "iface"; iface "*route" → "*route,control_output"; nexthop "rcu" → "rcu,control_output". Test scripts: conditional cleanup in smoke/_init.sh and new smoke/shutdown_test.sh. Critical issue: fixed 512-byte buffer + assert may crash or overflow on long dependency strings.


📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e1deea and 25cea4c.

📒 Files selected for processing (7)
  • main/module.c (2 hunks)
  • modules/infra/control/control_output.c (1 hunks)
  • modules/infra/control/graph.c (0 hunks)
  • modules/infra/control/iface.c (1 hunks)
  • modules/infra/control/nexthop.c (1 hunks)
  • smoke/_init.sh (1 hunks)
  • smoke/shutdown_test.sh (1 hunks)
💤 Files with no reviewable changes (1)
  • modules/infra/control/graph.c
🚧 Files skipped from review as they are similar to previous changes (1)
  • smoke/_init.sh
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{c,h}

⚙️ CodeRabbit configuration file

**/*.{c,h}: - gr_vec_*() functions cannot fail. No need to check their return value.

  • gr_vec_free(x) always sets x = NULL. There is no risk of double free.
  • ec_node_*() functions consume their ec_node arguments. No leaks on error.
  • rte_node->ctx is an uint8_t array of size 16, not a pointer.
  • Never suggest to replace assert() with graceful error checking.
  • We compile with -std=gnu2x. Unnamed parameters in function definitions are valid.

Files:

  • modules/infra/control/iface.c
  • main/module.c
  • modules/infra/control/nexthop.c
  • modules/infra/control/control_output.c
**/*.sh

⚙️ CodeRabbit configuration file

**/*.sh: - Don't bother about unquoted shell variables.

Files:

  • smoke/shutdown_test.sh
🧠 Learnings (6)
📚 Learning: 2025-09-22T09:21:51.749Z
Learnt from: rjarry
Repo: DPDK/grout PR: 312
File: modules/ip/control/address.c:102-110
Timestamp: 2025-09-22T09:21:51.749Z
Learning: The rib4_insert() function in modules/ip/control/route.c takes ownership of the nexthop reference by calling nexthop_incref(nh) at the beginning and properly calls nexthop_decref(nh) on any failure paths via the fail: label, so callers don't need to manually decrement the reference count when rib4_insert fails.

Applied to files:

  • modules/infra/control/nexthop.c
📚 Learning: 2025-09-25T07:52:17.403Z
Learnt from: rjarry
Repo: DPDK/grout PR: 312
File: frr/rt_grout.c:355-361
Timestamp: 2025-09-25T07:52:17.403Z
Learning: The nexthop_new() function in FRR's zebra codebase cannot fail and return NULL - it will abort() the process if memory allocation fails, so null checks after calling nexthop_new() are unnecessary.

Applied to files:

  • modules/infra/control/nexthop.c
📚 Learning: 2025-09-22T09:21:51.749Z
Learnt from: rjarry
Repo: DPDK/grout PR: 312
File: modules/ip/control/address.c:102-110
Timestamp: 2025-09-22T09:21:51.749Z
Learning: The rib4_insert() function in the IP module takes ownership of the nexthop reference and automatically calls nexthop_decref(nh) on failure paths, so callers don't need to manually decrement the reference count when rib4_insert fails.

Applied to files:

  • modules/infra/control/nexthop.c
📚 Learning: 2025-11-05T13:55:26.189Z
Learnt from: maxime-leroy
Repo: DPDK/grout PR: 372
File: smoke/cross_vrf_forward_test.sh:18-18
Timestamp: 2025-11-05T13:55:26.189Z
Learning: In the DPDK/grout codebase, VRF interfaces (named gr-vrf<id>) are automatically created when an interface is added to a non-existing VRF using port_add. The VRF creation is handled automatically by the event system in vrf_netlink.c, so no explicit VRF interface creation commands are needed in test scripts.

Applied to files:

  • smoke/shutdown_test.sh
📚 Learning: 2025-09-05T07:06:51.554Z
Learnt from: rjarry
Repo: DPDK/grout PR: 294
File: modules/policy/cli/meson.build:4-8
Timestamp: 2025-09-05T07:06:51.554Z
Learning: The grout project uses a modular CLI build system where individual modules contribute their CLI sources to a global cli_src variable via `cli_src += files(...)`, and the top-level meson.build file builds the final grcli executable using all collected CLI sources. The grcli executable is defined in the top-level meson.build at lines 122-127.

Applied to files:

  • smoke/shutdown_test.sh
📚 Learning: 2025-10-13T08:37:56.819Z
Learnt from: abhiramnarayana
Repo: DPDK/grout PR: 337
File: docs/grcli-dnat44.1.md:42-56
Timestamp: 2025-10-13T08:37:56.819Z
Learning: In the grout project's man pages, the EXAMPLES sections omit the `grcli` prefix from commands for brevity and readability, showing commands as if executed within a grcli context (e.g., `address show` instead of `grcli address show`). The full command structure with `grcli` is documented in the SYNOPSIS section.

Applied to files:

  • smoke/shutdown_test.sh
🧬 Code graph analysis (1)
smoke/shutdown_test.sh (1)
smoke/_init.sh (1)
  • fail (98-101)
🪛 Shellcheck (0.11.0)
smoke/shutdown_test.sh

[warning] 5-5: Quote this to prevent word splitting.

(SC2046)


[warning] 34-34: grout_pid is referenced but not assigned.

(SC2154)


[warning] 38-38: tmp is referenced but not assigned.

(SC2154)

🔇 Additional comments (5)
modules/infra/control/nexthop.c (1)

694-699: Dependency ordering fix looks correct.

Adding control_output to the dependency list ensures control_output is finalized after nexthop, preventing the use-after-free when nexthop_decref triggers control_output_poll during shutdown.

modules/infra/control/control_output.c (1)

130-135: Correct fix for finalization ordering.

Replacing the wildcard "*" with explicit "graph" dependency ensures control_output is no longer prematurely finalized. Modules that now explicitly depend on control_output (iface, nexthop) will be finalized first, preventing access to the freed ctrlout_ring.
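
The effect of the dependency change on ordering can be sketched as follows. This is a simplified illustration, not the code in main/module.c: `struct module`, `visit` and `init_order` are hypothetical, each module carries a single glob pattern instead of a comma-separated list, and the sketch assumes that modules are initialized after their dependencies and finalized in the reverse order.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical, simplified module descriptor (one glob pattern only).
struct module {
	const char *name;
	const char *depends_on; // glob pattern matched against names, or NULL
	bool visited;
};

// Depth-first visit: every module matched by m->depends_on is appended
// to `order` before m itself. The visited flag also breaks cycles.
static void visit(struct module *mods, size_t n, struct module *m,
		  struct module **order, size_t *len) {
	if (m->visited)
		return;
	m->visited = true;
	for (size_t i = 0; i < n; i++) {
		if (&mods[i] != m && m->depends_on != NULL
		    && fnmatch(m->depends_on, mods[i].name, 0) == 0)
			visit(mods, n, &mods[i], order, len);
	}
	assert(*len < n);
	order[(*len)++] = m;
}

// Fill `order` with the initialization order and return its length.
// Finalization would walk the same array from the last entry to the first.
static size_t init_order(struct module *mods, size_t n, struct module **order) {
	size_t len = 0;
	for (size_t i = 0; i < n; i++)
		visit(mods, n, &mods[i], order, &len);
	return len;
}
```

With iface depending on "control_output" and control_output depending on "graph", the initialization order is graph, control_output, iface, so the reverse walk finalizes iface first. With the previous declarations (control_output "*", graph "iface"), control_output lands last in the array and is therefore finalized first.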

modules/infra/control/iface.c (1)

470-475: Dependency addition prevents use-after-free.

Adding control_output ensures the ring is still valid when iface_destroy triggers GR_EVENT_IFACE_REMOVE, which drains packets that may reference the deleted interface.

main/module.c (1)

62-80: Comma-separated dependency parsing looks correct.

The implementation safely copies to a local buffer (guarded by assert), tokenizes on commas, and matches each dependency token against the module name. This enables modules to declare multiple dependencies like "rcu,control_output".
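
A rough sketch of the parsing described above, for reference. The helper name `depends_on_matches` is hypothetical and this is not the actual body of module_is_child; it only mirrors the steps listed in the walkthrough (local 512-byte copy guarded by an assert, strtok on commas, fnmatch per token).

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: returns true when `name` matches at least one of
// the comma-separated glob patterns in `depends_on`.
static bool depends_on_matches(const char *depends_on, const char *name) {
	char buf[512];
	char *tok;

	if (depends_on == NULL)
		return false;

	// strtok() writes into its input, so work on a local copy of the
	// constant string. The assert guards the fixed-size buffer.
	assert(strlen(depends_on) < sizeof(buf));
	memcpy(buf, depends_on, strlen(depends_on) + 1);

	for (tok = strtok(buf, ","); tok != NULL; tok = strtok(NULL, ",")) {
		// fnmatch() returns 0 when the pattern matches the name.
		if (fnmatch(tok, name, 0) == 0)
			return true;
	}
	return false;
}
```

For example, `depends_on_matches("rcu,control_output", "control_output")` is true, while `depends_on_matches("rcu,control_output", "graph")` is false. Note that strtok() keeps static state and is not reentrant, which is acceptable only if nothing else tokenizes concurrently.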

smoke/shutdown_test.sh (1)

1-38: Test properly exercises the shutdown fix.

The test creates interfaces and nexthops across VRFs, then triggers a graceful shutdown via SIGINT. The grout_pid and tmp variables are defined in the sourced _init.sh script. The cleanup of grcli commands after shutdown prevents attempts to connect to a stopped grout instance.


@maxime-leroy
Collaborator

Explain ?

@rjarry
Collaborator Author

rjarry commented Dec 3, 2025

Explain ?

See top level description

@maxime-leroy
Collaborator

Retag 14.0 ?

@rjarry
Collaborator Author

rjarry commented Dec 3, 2025

No, I will tag a 0.14.1 once this is merged.

@rjarry rjarry force-pushed the control-output-fini branch from 0f9b358 to c8d5852 Compare December 4, 2025 08:18
@christophefontaine
Collaborator

Tested-By: Christophe Fontaine [email protected]

@grout-bot grout-bot force-pushed the control-output-fini branch from c8d5852 to 3d37a22 Compare December 4, 2025 08:38
@grout-bot grout-bot force-pushed the control-output-fini branch from 3d37a22 to 9b85f3c Compare December 4, 2025 08:55
@rjarry rjarry force-pushed the control-output-fini branch from 9b85f3c to 3e1deea Compare December 4, 2025 09:27
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
smoke/shutdown_test.sh (1)

5-5: Quote the command substitution.

To prevent potential word splitting, apply this fix:

-. $(dirname $0)/_init.sh
+. "$(dirname $0)"/_init.sh
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3d37a22 and 3e1deea.

📒 Files selected for processing (7)
  • main/module.c (2 hunks)
  • modules/infra/control/control_output.c (1 hunks)
  • modules/infra/control/graph.c (0 hunks)
  • modules/infra/control/iface.c (1 hunks)
  • modules/infra/control/nexthop.c (1 hunks)
  • smoke/_init.sh (1 hunks)
  • smoke/shutdown_test.sh (1 hunks)
💤 Files with no reviewable changes (1)
  • modules/infra/control/graph.c
🚧 Files skipped from review as they are similar to previous changes (2)
  • modules/infra/control/nexthop.c
  • modules/infra/control/iface.c
🧰 Additional context used
📓 Path-based instructions (2)
**/*.sh

⚙️ CodeRabbit configuration file

**/*.sh: - Don't bother about unquoted shell variables.

Files:

  • smoke/_init.sh
  • smoke/shutdown_test.sh
**/*.{c,h}

⚙️ CodeRabbit configuration file

**/*.{c,h}: - gr_vec_*() functions cannot fail. No need to check their return value.

  • gr_vec_free(x) always sets x = NULL. There is no risk of double free.
  • ec_node_*() functions consume their ec_node arguments. No leaks on error.
  • rte_node->ctx is an uint8_t array of size 16, not a pointer.
  • Never suggest to replace assert() with graceful error checking.
  • We compile with -std=gnu2x. Unnamed parameters in function definitions are valid.

Files:

  • main/module.c
  • modules/infra/control/control_output.c
🧠 Learnings (2)
📚 Learning: 2025-11-05T13:55:26.189Z
Learnt from: maxime-leroy
Repo: DPDK/grout PR: 372
File: smoke/cross_vrf_forward_test.sh:18-18
Timestamp: 2025-11-05T13:55:26.189Z
Learning: In the DPDK/grout codebase, VRF interfaces (named gr-vrf<id>) are automatically created when an interface is added to a non-existing VRF using port_add. The VRF creation is handled automatically by the event system in vrf_netlink.c, so no explicit VRF interface creation commands are needed in test scripts.

Applied to files:

  • smoke/_init.sh
  • smoke/shutdown_test.sh
📚 Learning: 2025-10-13T08:37:56.819Z
Learnt from: abhiramnarayana
Repo: DPDK/grout PR: 337
File: docs/grcli-dnat44.1.md:42-56
Timestamp: 2025-10-13T08:37:56.819Z
Learning: In the grout project's man pages, the EXAMPLES sections omit the `grcli` prefix from commands for brevity and readability, showing commands as if executed within a grcli context (e.g., `address show` instead of `grcli address show`). The full command structure with `grcli` is documented in the SYNOPSIS section.

Applied to files:

  • smoke/shutdown_test.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: build-and-tests (gcc-13, none, debugoptimized, ubuntu-24.04, -Dfrr=enabled, false)
  • GitHub Check: build-and-tests (gcc-14, address, debug, ubuntu-24.04, -Dfrr=enabled, true)
  • GitHub Check: build-and-tests (clang-18, none, debugoptimized, ubuntu-24.04, false)
  • GitHub Check: build-and-tests (clang-16, none, debugoptimized, ubuntu-24.04, false)
  • GitHub Check: build-and-tests (clang-15, none, debugoptimized, ubuntu-22.04, false)
  • GitHub Check: deb
  • GitHub Check: rpm
🔇 Additional comments (4)
modules/infra/control/control_output.c (1)

130-135: Dependency change correctly fixes the finalization order.

The switch from "*" to "graph" combined with iface/nexthop now depending on control_output ensures that interfaces and nexthops are finalized before ctrlout_ring is freed, preventing the use-after-free.

main/module.c (1)

62-80: Implementation correctly enables comma-separated dependencies.

The tokenization logic properly splits depends_on on commas and checks each token via fnmatch. The local buffer copy avoids modifying the original constant string. This enables the multi-dependency declarations needed for the fix (e.g., "*route,control_output").

smoke/_init.sh (1)

45-58: Good defensive handling of socket availability during cleanup.

Wrapping interface deletion in a socket connectivity check ensures cleanup proceeds gracefully whether grout is still running or has crashed, aligning with the PR's shutdown robustness goal.

smoke/shutdown_test.sh (1)

1-38: Test logic is sound for the graceful shutdown scenario.

The script correctly sets up state, sends a shutdown signal to grout, and strips grcli commands from cleanup to avoid post-shutdown invocations. However, note that this test assumes grout was started by _init.sh (i.e., run_grout=true). If run against an external grout instance where run_grout=false, the test will fail because grout_pid won't be defined at line 34. This may be acceptable by design, but consider whether an explicit guard or skip condition would improve robustness.

This reverts commit 2f97bb1.

Signed-off-by: Robin Jarry <[email protected]>
Tested-by: Christophe Fontaine <[email protected]>
Reviewed-by: Maxime Leroy <[email protected]>

Allow specifying multiple patterns separated by commas in the
gr_module.depends_on string.

Signed-off-by: Robin Jarry <[email protected]>
Tested-by: Christophe Fontaine <[email protected]>
Reviewed-by: Maxime Leroy <[email protected]>

Since commit e954453 ("control-output: prevent use-after-free on
object deletion"), deleting interfaces or nexthops causes a call to
control_output_poll() in order to drain the packets that may reference
the deleted object.

On shutdown, the modules finalization order causes control_output to be
finalized before the iface module. When the iface module is finalized,
all remaining interfaces are deleted and control_output_poll() accesses
the ring that was already "freed".

ERROR: AddressSanitizer: heap-use-after-free on address 0x00016d19d608
at pc 0x00000045901c bp 0xffffca318bd0 sp 0xffffca318be8
READ of size 4 at 0x00016d19d608 thread T0
    #0 0x000000459018 in rte_ring_dequeue_bulk_elem ../subprojects/dpdk/lib/ring/rte_ring_elem.h:375
    #1 0x000000459018 in rte_ring_dequeue_elem ../subprojects/dpdk/lib/ring/rte_ring_elem.h:471
    #2 0x000000459018 in rte_ring_dequeue ../subprojects/dpdk/lib/ring/rte_ring.h:496
    #3 0x000000459018 in control_output_poll ../modules/infra/control/control_output.c:33
    #4 0x0000004648a0 in event_handler ../modules/infra/control/control_output.c:76
    #5 0x000000416974 in gr_event_push ../main/event.c:26
    #6 0x00000046a930 in iface_destroy ../modules/infra/control/iface.c:424
    #7 0x00000046b138 in iface_fini ../modules/infra/control/iface.c:462
    #8 0x00000041d63c in modules_fini ../main/module.c:112
    #9 0x00000041b5a4 in main ../main/main.c:326

Ensure both iface and nexthop are finalized before control_output.

Fixes: e954453 ("control-output: prevent use-after-free on object deletion")
Signed-off-by: Robin Jarry <[email protected]>
Tested-by: Christophe Fontaine <[email protected]>
Reviewed-by: Maxime Leroy <[email protected]>

Configure a bunch of stuff and shutdown grout without clearing the stuff
first. Ensure grout shuts down without issues.

In the cleanup() function, only delete interfaces if grout is still
listening to the api socket. This avoids spurious error messages during
cleanup.

Signed-off-by: Robin Jarry <[email protected]>
Reviewed-by: Maxime Leroy <[email protected]>
@grout-bot grout-bot force-pushed the control-output-fini branch from 3e1deea to 25cea4c Compare December 4, 2025 09:59
@maxime-leroy maxime-leroy merged commit a5face0 into DPDK:main Dec 4, 2025
6 checks passed
@rjarry rjarry deleted the control-output-fini branch December 4, 2025 10:12