Skip to content

Mgmt frontend problems in zebra#21252

Open
donaldsharp wants to merge 5 commits intoFRRouting:masterfrom
donaldsharp:mgmt_frontend_problems_in_zebra
Open

Mgmt frontend problems in zebra#21252
donaldsharp wants to merge 5 commits intoFRRouting:masterfrom
donaldsharp:mgmt_frontend_problems_in_zebra

Conversation

@donaldsharp
Copy link
Member

See individual commits for more detail:

Move ip import-table, zebra work-queue, zebra zapi-packets, zebra dplane limit and allow-external-route-update to their appropriate place in the mgmt frontend.

Add tests to show things are working better.

@frrbot frrbot bot added tests Topotests, make check, etc zebra labels Mar 18, 2026
@greptile-apps
Copy link

greptile-apps bot commented Mar 18, 2026

Greptile Summary

This PR migrates five zebra configuration commands — ip/ipv6 import-table, zebra work-queue, zebra zapi-packets, zebra dplane limit, and allow-external-route-update — from legacy non-YANG DEFUN/DEFPY handlers in zebra_vty.c into the northbound management frontend (zebra_cli.c / zebra_nb_config.c). Concurrently, the YANG model converts import-kernel-table from a singleton container to a keyed list, enabling multiple AFI/SAFI combinations to coexist. The YANG default values for workqueue-hold-timer (10 ms), zapi-packets (1000), and dplane-queue-limit (200) are consistent with the corresponding C constants (ZEBRA_RIB_PROCESS_HOLD_TIME, ZEBRA_ZAPI_PACKETS_TO_PROCESS, DPLANE_DEFAULT_MAX_QUEUED), so no command semantics are preserved. Topotests are extended with show running verification across all three new code paths.

Key findings from the review:

  • Potential bug in zebra_import_kernel_table_route_map_destroy: when only the route-map leaf is deleted (e.g., via mgmtd), the callback calls zebra_import_kernel_table_apply(dnode->parent, true), but dnode->parent in FRR's northbound diff tree still carries the old (pre-deletion) route-map leaf. This means zebra_import_table is re-called with the route-map intact rather than removing it. The CLI path (no ip/ipv6 import-table) destroys the whole list entry and is not affected, but YANG-level leaf-delete operations would silently fail to clear the route-map.
  • Redundant callbacks: creating a new import-kernel-table list entry with both distance and route-map triggers three independent NB_EV_APPLY callbacks (create + distance modify + route-map modify), each calling zebra_import_table. This is safe due to idempotency but inefficient.
  • Minor CLI asymmetry: IPv4 uses two separate commands for add/remove while IPv6 unifies them under a single [no] command — functionally equivalent but inconsistent.

Confidence Score: 3/5

  • Safe to merge for CLI-driven workflows, but the route-map leaf destroy path via YANG/mgmtd may silently leave stale route-map filters in place.
  • The core CLI migration is well-structured and the YANG/C constant consistency is verified. However, zebra_import_kernel_table_route_map_destroy has a likely logic bug where calling zebra_import_kernel_table_apply with add=true on the diff dnode will re-apply the import table with the old route-map still visible, failing to clear it. This doesn't affect normal CLI usage (which destroys the whole list entry) but would affect YANG-level operations. The redundant callback calls are a robustness concern but not correctness-breaking for the CLI path.
  • zebra/zebra_nb_config.c — specifically the zebra_import_kernel_table_route_map_destroy callback needs attention.

Important Files Changed

Filename Overview
zebra/zebra_nb_config.c Implements previously-stubbed northbound callbacks for workqueue timer, ZAPI packets, dplane queue limit, and import-kernel-table. The create/destroy/modify logic is now functional. One potential bug: zebra_import_kernel_table_route_map_destroy calls apply(dnode, true) which may re-apply with the stale (pre-deletion) route-map still visible via dnode->parent, preventing proper route-map removal via YANG-level operations.
zebra/zebra_cli.c Adds YANG-backed CLI definitions for allow-external-route-update, zebra dplane limit, zebra zapi-packets, zebra work-queue, and ip/ipv6 import-table, along with their corresponding cli_show write functions. Constants (default distance 15, YANG defaults) are consistent with the C-side definitions. Minor asymmetry between IPv4 and IPv6 no import-table command shapes.
yang/frr-zebra.yang Correctly converts import-kernel-table from a container (single entry) to a list keyed by afi-safi and table-id, enabling multiple AFI/SAFI combinations. Adds an appropriate must constraint restricting afi-safi to IPv4/IPv6 unicast/multicast. No issues found.
zebra/zebra_nb.c Updates callback registration from the old per-leaf (table-id modify/destroy) to the new list-entry level (create/destroy) to align with the YANG schema change. Correct and minimal.
zebra/zebra_nb.h Function declarations updated to match the new create/destroy callback signatures. No issues.
zebra/zebra_vty.c Removes all non-YANG CLI implementations that have been migrated to zebra_cli.c. The zebra_ip_config write function is correctly emptied since import-table config output is now handled by the YANG cli_show callbacks.
tests/topotests/mgmt_config/test_config.py Adds expect_show_running helper to reduce boilerplate, extends cleanup_config to remove new config items, refactors existing show-running checks, and adds three new test functions for import-table (with and without route-map) and mgmt frontend smoke tests.
tests/topotests/zebra_rib/test_zebra_import.py Adds check_show_running helper and verifies show running output after import-table add/remove operations, including route-map tests. A missing blank line before test_zebra_mrib_import is a minor PEP 8 nit.
tests/topotests/zebra_ipv6_import_table/test_zebra_ipv6_import_table.py Mirrors the IPv4 test enhancements for IPv6 import-table, including check_show_running and route-map test coverage. Same minor missing blank line before test_zebra_mrib_import.

Sequence Diagram

sequenceDiagram
    participant User
    participant VTY
    participant NB as Northbound (zebra_cli.c)
    participant Config as zebra_nb_config.c
    participant Zebra as zebra_import_table()

    User->>VTY: ip import-table 10 distance 15 route-map FOO
    VTY->>NB: nb_cli_enqueue_change(NB_OP_CREATE, list-entry)
    VTY->>NB: nb_cli_enqueue_change(NB_OP_MODIFY, distance=15)
    VTY->>NB: nb_cli_enqueue_change(NB_OP_MODIFY, route-map=FOO)
    VTY->>NB: nb_cli_apply_changes()
    NB->>Config: zebra_import_kernel_table_create (NB_EV_APPLY)
    Config->>Zebra: zebra_import_table(AFI_IP, SAFI_UNICAST, 10, 15, FOO, add=true)
    NB->>Config: zebra_import_kernel_table_distance_modify (NB_EV_APPLY)
    Config->>Zebra: zebra_import_table(AFI_IP, SAFI_UNICAST, 10, 15, FOO, add=true)
    NB->>Config: zebra_import_kernel_table_route_map_modify (NB_EV_APPLY)
    Config->>Zebra: zebra_import_table(AFI_IP, SAFI_UNICAST, 10, 15, FOO, add=true)

    Note over NB,Zebra: zebra_import_table called 3× (same args)

    User->>VTY: no ip import-table 10
    VTY->>NB: nb_cli_enqueue_change(NB_OP_DESTROY, list-entry)
    VTY->>NB: nb_cli_apply_changes()
    NB->>Config: zebra_import_kernel_table_destroy (NB_EV_APPLY)
    Config->>Zebra: zebra_import_table(AFI_IP, SAFI_UNICAST, 10, 0, NULL, add=false)
Loading

Comments Outside Diff (2)

  1. zebra/zebra_nb_config.c, line 831-862 (link)

    P2 Redundant zebra_import_table calls per transaction

    When a user runs ip import-table 10 distance 15 route-map FOO, three changes are enqueued: NB_OP_CREATE (list entry), NB_OP_MODIFY (distance), and NB_OP_MODIFY (route-map). All three are applied within a single nb_cli_apply_changes transaction, so three separate NB_EV_APPLY callbacks fire:

    1. zebra_import_kernel_table_create → calls zebra_import_kernel_table_apply
    2. zebra_import_kernel_table_distance_modify → calls zebra_import_kernel_table_apply
    3. zebra_import_kernel_table_route_map_modify → calls zebra_import_kernel_table_apply

    Each call invokes zebra_import_table with the same final parameters, resulting in the kernel table being imported three times. Since zebra_import_table is effectively idempotent this is safe, but it is wasteful and could cause transient intermediate states if the distance or route-map child callbacks see incomplete data before create fully applies.

    Consider skipping the zebra_import_table call in create (letting only the leaf callbacks drive the apply) or, conversely, skipping it in the leaf modify/destroy callbacks when the parent is being simultaneously created.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: zebra/zebra_nb_config.c
    Line: 831-862
    
    Comment:
    **Redundant `zebra_import_table` calls per transaction**
    
    When a user runs `ip import-table 10 distance 15 route-map FOO`, three changes are enqueued: `NB_OP_CREATE` (list entry), `NB_OP_MODIFY` (distance), and `NB_OP_MODIFY` (route-map). All three are applied within a single `nb_cli_apply_changes` transaction, so three separate NB_EV_APPLY callbacks fire:
    
    1. `zebra_import_kernel_table_create` → calls `zebra_import_kernel_table_apply`
    2. `zebra_import_kernel_table_distance_modify` → calls `zebra_import_kernel_table_apply`
    3. `zebra_import_kernel_table_route_map_modify` → calls `zebra_import_kernel_table_apply`
    
    Each call invokes `zebra_import_table` with the same final parameters, resulting in the kernel table being imported three times. Since `zebra_import_table` is effectively idempotent this is safe, but it is wasteful and could cause transient intermediate states if the `distance` or `route-map` child callbacks see incomplete data before `create` fully applies.
    
    Consider skipping the `zebra_import_table` call in `create` (letting only the leaf callbacks drive the apply) or, conversely, skipping it in the leaf modify/destroy callbacks when the parent is being simultaneously created.
    
    How can I resolve this? If you propose a fix, please make it concise.
  2. zebra/zebra_cli.c, line 595-624 (link)

    P2 Asymmetry between IPv4 and IPv6 no import-table command definitions

    The IPv4 side uses two separate DEFPY_YANG macros — ip_zebra_import_table_distance_cmd (positive form only) and no_ip_zebra_import_table_cmd (negative form only). The IPv6 side unifies both into a single [no] ipv6 import-table command via ipv6_zebra_import_table_distance_cmd.

    This asymmetry is purely cosmetic (both paths call NB_OP_DESTROY on the same list entry xpath), but it means the IPv4 positive command does not accept [no], while the IPv6 one does. If the intent is to follow the same user-facing pattern as IPv6, the IPv4 side could be unified similarly.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: zebra/zebra_cli.c
    Line: 595-624
    
    Comment:
    **Asymmetry between IPv4 and IPv6 `no import-table` command definitions**
    
    The IPv4 side uses two separate `DEFPY_YANG` macros — `ip_zebra_import_table_distance_cmd` (positive form only) and `no_ip_zebra_import_table_cmd` (negative form only). The IPv6 side unifies both into a single `[no] ipv6 import-table` command via `ipv6_zebra_import_table_distance_cmd`.
    
    This asymmetry is purely cosmetic (both paths call `NB_OP_DESTROY` on the same list entry xpath), but it means the IPv4 positive command does not accept `[no]`, while the IPv6 one does. If the intent is to follow the same user-facing pattern as IPv6, the IPv4 side could be unified similarly.
    
    How can I resolve this? If you propose a fix, please make it concise.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix All With AI
This is a comment left during a code review.
Path: zebra/zebra_nb_config.c
Line: 239-251

Comment:
**`route_map_destroy` re-applies with stale route-map data**

In FRR's northbound framework, `args->dnode` for a destroy callback points to the **old** (pre-deletion) diff node. Therefore `args->dnode->parent` is the list entry as it existed before the change, meaning the route-map leaf is still present as a child. As a result, `yang_dnode_exists(dnode, "route-map")` returns `true` and the old route-map name is read, so `zebra_import_kernel_table_apply(dnode, true)` re-applies the import table *with the route-map still in effect* instead of removing it.

This path is not exercised by the CLI (`no ip import-table` / `no ipv6 import-table` always destroys the entire list entry via `NB_OP_DESTROY` on the entry xpath, triggering `zebra_import_kernel_table_destroy`), but it is reachable via direct YANG/mgmtd operations that delete only the route-map leaf while keeping the list entry.

The callback needs to call `zebra_import_table` explicitly with `rmap = NULL` to apply the updated state without the route-map:

```c
int zebra_import_kernel_table_route_map_destroy(struct nb_cb_destroy_args *args)
{
	const struct lyd_node *dnode =
		(const struct lyd_node *)args->dnode->parent;
	const char *afi_safi = yang_dnode_get_string(dnode, "afi-safi");
	uint32_t table_id = yang_dnode_get_uint32(dnode, "table-id");
	uint32_t distance = yang_dnode_get_uint32(dnode, "distance");
	afi_t afi;
	safi_t safi;

	if (args->event != NB_EV_APPLY)
		return NB_OK;

	yang_afi_safi_identity2value(afi_safi, &afi, &safi);

	/* Re-apply without the route-map since it is being removed */
	if (zebra_import_table(afi, safi, VRF_DEFAULT, table_id, distance,
			       NULL, true) < 0)
		return NB_ERR;

	return NB_OK;
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: zebra/zebra_nb_config.c
Line: 831-862

Comment:
**Redundant `zebra_import_table` calls per transaction**

When a user runs `ip import-table 10 distance 15 route-map FOO`, three changes are enqueued: `NB_OP_CREATE` (list entry), `NB_OP_MODIFY` (distance), and `NB_OP_MODIFY` (route-map). All three are applied within a single `nb_cli_apply_changes` transaction, so three separate NB_EV_APPLY callbacks fire:

1. `zebra_import_kernel_table_create` → calls `zebra_import_kernel_table_apply`
2. `zebra_import_kernel_table_distance_modify` → calls `zebra_import_kernel_table_apply`
3. `zebra_import_kernel_table_route_map_modify` → calls `zebra_import_kernel_table_apply`

Each call invokes `zebra_import_table` with the same final parameters, resulting in the kernel table being imported three times. Since `zebra_import_table` is effectively idempotent this is safe, but it is wasteful and could cause transient intermediate states if the `distance` or `route-map` child callbacks see incomplete data before `create` fully applies.

Consider skipping the `zebra_import_table` call in `create` (letting only the leaf callbacks drive the apply) or, conversely, skipping it in the leaf modify/destroy callbacks when the parent is being simultaneously created.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: zebra/zebra_cli.c
Line: 595-624

Comment:
**Asymmetry between IPv4 and IPv6 `no import-table` command definitions**

The IPv4 side uses two separate `DEFPY_YANG` macros — `ip_zebra_import_table_distance_cmd` (positive form only) and `no_ip_zebra_import_table_cmd` (negative form only). The IPv6 side unifies both into a single `[no] ipv6 import-table` command via `ipv6_zebra_import_table_distance_cmd`.

This asymmetry is purely cosmetic (both paths call `NB_OP_DESTROY` on the same list entry xpath), but it means the IPv4 positive command does not accept `[no]`, while the IPv6 one does. If the intent is to follow the same user-facing pattern as IPv6, the IPv4 side could be unified similarly.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "tests: Modify tests ..."

Comment on lines 239 to 251
int zebra_import_kernel_table_route_map_destroy(struct nb_cb_destroy_args *args)
{
switch (args->event) {
case NB_EV_VALIDATE:
case NB_EV_PREPARE:
case NB_EV_ABORT:
case NB_EV_APPLY:
/* TODO: implement me. */
break;
}
const struct lyd_node *dnode =
(const struct lyd_node *)args->dnode->parent;

if (args->event != NB_EV_APPLY)
return NB_OK;

if (zebra_import_kernel_table_apply(dnode, true) < 0)
return NB_ERR;

return NB_OK;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 route_map_destroy re-applies with stale route-map data

In FRR's northbound framework, args->dnode for a destroy callback points to the old (pre-deletion) diff node. Therefore args->dnode->parent is the list entry as it existed before the change, meaning the route-map leaf is still present as a child. As a result, yang_dnode_exists(dnode, "route-map") returns true and the old route-map name is read, so zebra_import_kernel_table_apply(dnode, true) re-applies the import table with the route-map still in effect instead of removing it.

This path is not exercised by the CLI (no ip import-table / no ipv6 import-table always destroys the entire list entry via NB_OP_DESTROY on the entry xpath, triggering zebra_import_kernel_table_destroy), but it is reachable via direct YANG/mgmtd operations that delete only the route-map leaf while keeping the list entry.

The callback needs to call zebra_import_table explicitly with rmap = NULL to apply the updated state without the route-map:

int zebra_import_kernel_table_route_map_destroy(struct nb_cb_destroy_args *args)
{
	const struct lyd_node *dnode =
		(const struct lyd_node *)args->dnode->parent;
	const char *afi_safi = yang_dnode_get_string(dnode, "afi-safi");
	uint32_t table_id = yang_dnode_get_uint32(dnode, "table-id");
	uint32_t distance = yang_dnode_get_uint32(dnode, "distance");
	afi_t afi;
	safi_t safi;

	if (args->event != NB_EV_APPLY)
		return NB_OK;

	yang_afi_safi_identity2value(afi_safi, &afi, &safi);

	/* Re-apply without the route-map since it is being removed */
	if (zebra_import_table(afi, safi, VRF_DEFAULT, table_id, distance,
			       NULL, true) < 0)
		return NB_ERR;

	return NB_OK;
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: zebra/zebra_nb_config.c
Line: 239-251

Comment:
**`route_map_destroy` re-applies with stale route-map data**

In FRR's northbound framework, `args->dnode` for a destroy callback points to the **old** (pre-deletion) diff node. Therefore `args->dnode->parent` is the list entry as it existed before the change, meaning the route-map leaf is still present as a child. As a result, `yang_dnode_exists(dnode, "route-map")` returns `true` and the old route-map name is read, so `zebra_import_kernel_table_apply(dnode, true)` re-applies the import table *with the route-map still in effect* instead of removing it.

This path is not exercised by the CLI (`no ip import-table` / `no ipv6 import-table` always destroys the entire list entry via `NB_OP_DESTROY` on the entry xpath, triggering `zebra_import_kernel_table_destroy`), but it is reachable via direct YANG/mgmtd operations that delete only the route-map leaf while keeping the list entry.

The callback needs to call `zebra_import_table` explicitly with `rmap = NULL` to apply the updated state without the route-map:

```c
int zebra_import_kernel_table_route_map_destroy(struct nb_cb_destroy_args *args)
{
	const struct lyd_node *dnode =
		(const struct lyd_node *)args->dnode->parent;
	const char *afi_safi = yang_dnode_get_string(dnode, "afi-safi");
	uint32_t table_id = yang_dnode_get_uint32(dnode, "table-id");
	uint32_t distance = yang_dnode_get_uint32(dnode, "distance");
	afi_t afi;
	safi_t safi;

	if (args->event != NB_EV_APPLY)
		return NB_OK;

	yang_afi_safi_identity2value(afi_safi, &afi, &safi);

	/* Re-apply without the route-map since it is being removed */
	if (zebra_import_table(afi, safi, VRF_DEFAULT, table_id, distance,
			       NULL, true) < 0)
		return NB_ERR;

	return NB_OK;
}
```

How can I resolve this? If you propose a fix, please make it concise.

@donaldsharp
Copy link
Member Author

@choppsv1 -> I need some advice here on what you think is best to do. For the ip import-table.. changes it's like 3-4 lines and the ai wants me to change the approach. When I do that to deduplicate it's several hundred lines of code. Searching the code base the only thing similiar to this problem is the router advertisement cli.

@choppsv1
Copy link
Contributor

@choppsv1 -> I need some advice here on what you think is best to do. For the ip import-table.. changes it's like 3-4 lines and the ai wants me to change the approach. When I do that to deduplicate it's several hundred lines of code. Searching the code base the only thing similiar to this problem is the router advertisement cli.

I'm going through the changes now, then I'll check out greptile's suggestions again.

@choppsv1 choppsv1 self-requested a review March 19, 2026 15:47
@choppsv1
Copy link
Contributor

choppsv1 commented Mar 19, 2026

I think the simple solution to the multiple calls to zebra_import_table (for list node create and for leaf-node modify), is to just have zebra_import_table check if it's already done the work and do-nothing if so. This seems to be easy enough to do since it tracks this info:

int zebra_import_table(afi_t afi, safi_t safi, vrf_id_t vrf_id, uint32_t table_id,
		       uint32_t distance, const char *rmap_name, bool add)
	...
                // ... add case
		zebra_import_table_used[afi][safi][table_id] = 1;
		zebra_import_table_distance[afi][safi][table_id] = distance;
	} else {
                // ... !add case
		zebra_import_table_used[afi][safi][table_id] = 0;
		zebra_import_table_distance[afi][safi][table_id] = ZEBRA_TABLE_DISTANCE_DEFAULT;

@github-actions
Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Route the `zebra dplane limit` CLI through the mgmt-fronted
zebra NB path and implement the existing dplane queue limit
NB apply callback so the command keeps its behavior after the move.

Signed-off-by: Donald Sharp <[email protected]>
Move the `zebra zapi-packets` command to fully use the mgmt
frontend.

Signed-off-by: Donald Sharp <[email protected]>
Reroute the `zebra work-queue` commands to actually be
on the mgmt frontend side instead of the zebra side.

Signed-off-by: Donald Sharp <[email protected]>
The `ip import-table ...` commands were not on the mgmt
front end side.  Move them to it.

Signed-off-by: Donald Sharp <[email protected]>
Better test:

a) ip import table
b) allow-external....
c) packet read in values for zebra

Signed-off-by: Donald Sharp <[email protected]>
@donaldsharp donaldsharp force-pushed the mgmt_frontend_problems_in_zebra branch from f60bc5b to 242e105 Compare March 20, 2026 18:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

master size/XL tests Topotests, make check, etc zebra

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants