@rjarry (Collaborator) commented Oct 21, 2025

  • forward: ensure trace is recorded for ttl exceeded packets
  • loopback: determine eth domain in eth_input node

Summary by CodeRabbit

  • Refactor
    • Streamlined IP forwarding and input processing to reduce redundant operations: unified packet enqueue path to a single decision and preserved tracing, and cached per-packet domain values to avoid repeated lookups, improving efficiency and simplifying control flow.

coderabbitai bot commented Oct 21, 2025

📝 Walkthrough

Refactors in two IP datapath files. In ip_forward.c, a local edge variable is introduced to select the enqueue destination (TTL_EXCEEDED or OUTPUT) and all enqueues are routed through a single rte_node_enqueue_x1(graph, node, edge, mbuf) path after a common trace/next label. In ip_input.c, a local domain variable caches e->domain per mbuf and subsequent logic and the switch use domain instead of repeated e->domain dereferences.

Pre-merge checks

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "loopback: fix domain detection" accurately describes the domain detection changes to the eth_input node shown in the ip_input.c modifications. However, the changeset also includes significant modifications to ip_forward.c to ensure TTL-exceeded packets are properly traced, which the title does not capture. The title is specific and clear, not vague or misleading, but it addresses only one of the two main objectives outlined in the PR description, making it partially related to the full scope of the changeset. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8648af1 and 912d36e.

📒 Files selected for processing (2)
  • modules/ip/datapath/ip_forward.c (1 hunks)
  • modules/ip/datapath/ip_input.c (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • modules/ip/datapath/ip_input.c
  • modules/ip/datapath/ip_forward.c

@coderabbitai bot requested a review from maxime-leroy October 21, 2025 13:05

@aharivel (Collaborator) left a comment

LGTM - this is a good refactoring with safer domain detection

Comment on lines 104 to 106
-	if (unlikely(eth_in->domain == ETH_DOMAIN_LOOPBACK))
-		goto next;
-
-	if (unlikely(rte_is_multicast_ether_addr(&eth->dst_addr))) {
+	if (unlikely(eth_in->iface->type == GR_IFACE_TYPE_LOOPBACK)) {
+		eth_in->domain = ETH_DOMAIN_LOOPBACK;
+	} else if (unlikely(rte_is_multicast_ether_addr(&eth->dst_addr))) {
Collaborator

I'm not sure that moving the code actually fixes anything?

Collaborator Author

According to @zeeke, this does not fix the original issue, which was that packets coming from bgpd in gr-loop0 are dropped:

--------- 12:48:56.581331143 cpu 2 ---------
control_input:
loopback_input:
eth_input: 36:17:d0:da:c7:97 > 36:17:d0:da:c7:97 type=IP(0x0800) iface=gr-loop0
ip_input: 192.168.2.1 > 192.168.2.2 ttl=1 proto=TCP(6)
ip_error_ttl_exceeded:
icmp_output:
ip_output: 192.168.2.1 > 192.168.2.1 ttl=64 proto=ICMP(1)
eth_output: aa:c1:ab:ec:80:1e > aa:c1:ab:ec:80:1e type=IP(0x0800) iface=p1
port_output:
port_tx-p1q1:

I don't understand how this can happen. My suspicion was that eth_in->domain was somehow overwritten. I prefer keeping the writing of domain localized in eth_input, but the real bug is elsewhere.

Collaborator

Well, don't fix it if it isn't broken.
+1 on the other commit though.

Collaborator Author

I really prefer moving the assignment into eth_input even if it isn't broken. Otherwise, you are checking the value of eth_in->domain without any guarantee that it has actually been initialized by the parent node (which could be port_rx). Still, that does not fix the issue that @zeeke reported.
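
To illustrate the point about not trusting a value left by the parent node, here is a minimal standalone sketch. It is not the grout code: `struct eth_in_meta`, `classify_domain()` and the enum values are hypothetical names, and only the loopback and multicast branches are modeled. The node always writes the domain itself, so whatever was in the field beforehand is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the interface type and the eth domain. */
enum iface_type { IFACE_PORT, IFACE_LOOPBACK };
enum eth_domain { DOMAIN_UNKNOWN, DOMAIN_LOOPBACK, DOMAIN_MULTICAST, DOMAIN_OTHER };

/* Hypothetical per-packet metadata as seen by the input node. */
struct eth_in_meta {
	enum iface_type iface_type;
	uint8_t dst[6];         /* destination MAC address */
	enum eth_domain domain; /* may hold a stale value on entry */
};

/* Derive the domain from data the node can check itself (interface type,
 * destination address) and overwrite the field unconditionally. */
static void classify_domain(struct eth_in_meta *in) {
	assert(in != NULL);
	if (in->iface_type == IFACE_LOOPBACK)
		in->domain = DOMAIN_LOOPBACK;
	else if (in->dst[0] & 0x01) /* group bit set: multicast or broadcast */
		in->domain = DOMAIN_MULTICAST;
	else
		in->domain = DOMAIN_OTHER;
}
```

With this shape, a packet received on a loopback interface is classified as loopback regardless of what the previous node stored in `domain`.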

Collaborator

In that case, I'd rather fix the eth_domain in port RX and ensure all input data is set before entering eth_input.
If we want to rework that part (which isn't related to the original bug), we should perform the VLAN demux and set the eth domain before entering eth_input.
That way, we'd be ready for L2 bridge support, as well as tunneled Ethernet (VXLAN or EVPN).

@rjarry rjarry marked this pull request as draft October 23, 2025 11:11
@rjarry rjarry marked this pull request as ready for review October 24, 2025 12:59
@maxime-leroy (Collaborator)

tested-by: Maxime Leroy

@maxime-leroy (Collaborator)

acked-by: Maxime Leroy

The gr_mbuf_trace_add() call was only reached for packets that passed
the TTL check since packets with TTL <= 1 were enqueued and the loop
continued immediately, skipping the trace recording.

Refactor the code to use a common code path with a goto label where
both TTL exceeded and normally forwarded packets are traced before
being enqueued to their respective edges.
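
The control flow described above can be sketched in isolation as follows. This is a simplified model, not the actual ip_forward.c code: `struct pkt`, `forward_burst()` and the edge names are hypothetical, and the trace and enqueue steps are reduced to writing two fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical edge identifiers standing in for the graph node edges. */
enum edge { EDGE_OUTPUT = 0, EDGE_TTL_EXCEEDED = 1 };

/* Hypothetical packet model: the last two fields record what happened. */
struct pkt {
	uint8_t ttl;
	int traced;     /* set to 1 when the trace step ran */
	enum edge edge; /* edge the packet was enqueued to */
};

static void forward_burst(struct pkt *pkts, size_t n) {
	assert(pkts != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct pkt *p = &pkts[i];
		enum edge edge;

		if (p->ttl <= 1) {
			/* Do not enqueue here: only pick the edge and jump
			 * to the common tail so the trace is still recorded. */
			edge = EDGE_TTL_EXCEEDED;
			goto next;
		}
		p->ttl--;
		edge = EDGE_OUTPUT;
next:
		p->traced = 1;  /* stands in for the trace recording */
		p->edge = edge; /* stands in for the single enqueue call */
	}
}
```

Because both branches fall through the same label, it is not possible to enqueue a packet without passing the trace step first.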

Signed-off-by: Robin Jarry <[email protected]>
Tested-by: Maxime Leroy <[email protected]>
Acked-by: Maxime Leroy <[email protected]>

In commit e5570d2 ("policy: add stateful dynamic source nat
support"), the assignment ip_output_mbuf_data(mbuf)->nh = nh was moved
earlier in the processing path. This assignment overwrites the mbuf
metadata that e points to, since eth_input_mbuf_data and
ip_output_mbuf_data share the same memory area.

As a result, accessing e->domain after this assignment reads garbage
data, causing incorrect routing decisions when checking whether a packet
should be sent to ip_output directly versus being forwarded.

Cache the domain value in a local variable before the mbuf metadata is
overwritten to ensure the correct routing decision is made.
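
The aliasing problem can be reproduced with a small standalone model. This is a sketch under assumptions: the real metadata layouts are not shown in this conversation, so `union mbuf_priv` and both structs are hypothetical, and the next hop is reduced to a 32-bit value so that it overlaps the domain field exactly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical domain values. */
enum { DOMAIN_LOCAL = 1, DOMAIN_LOOPBACK = 2 };

/* Two hypothetical per-packet metadata views sharing the same storage,
 * the way the eth_input and ip_output data are said to overlap. */
struct eth_in_data { uint32_t domain; };
struct ip_out_data { uint32_t nh; };

union mbuf_priv {
	struct eth_in_data eth_in;
	struct ip_out_data ip_out;
};

static_assert(sizeof(union mbuf_priv) == sizeof(uint32_t),
	"both views alias the same 4 bytes");

/* Buggy order: the domain is read after the next hop overwrote it. */
static uint32_t domain_read_late(union mbuf_priv *priv, uint32_t nh) {
	priv->ip_out.nh = nh;
	return priv->eth_in.domain; /* now returns the bytes of nh */
}

/* Fixed order: cache the domain in a local before the overwrite. */
static uint32_t domain_read_cached(union mbuf_priv *priv, uint32_t nh) {
	uint32_t domain = priv->eth_in.domain;
	priv->ip_out.nh = nh;
	return domain;
}
```

In this model the late read returns whatever was stored as the next hop, while the cached read still returns the domain that was set on input.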

Fixes: e5570d2 ("policy: add stateful dynamic source nat support")
Signed-off-by: Robin Jarry <[email protected]>
Tested-by: Maxime Leroy <[email protected]>
Acked-by: Maxime Leroy <[email protected]>
@maxime-leroy maxime-leroy merged commit 7eb49b4 into DPDK:main Oct 24, 2025
7 checks passed
@rjarry rjarry deleted the loopback-ttl branch October 24, 2025 13:41

4 participants