Skip to content
4 changes: 1 addition & 3 deletions neighsyncd/neighsync.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -60,9 +60,7 @@ void NeighSync::onMsg(int nlmsg_type, struct nl_object *obj)
key+= ":";

nl_addr2str(rtnl_neigh_get_dst(neigh), ipStr, MAX_ADDR_SIZE);
/* Ignore IPv6 link-local addresses as neighbors */
if (family == IPV6_NAME && IN6_IS_ADDR_LINKLOCAL(nl_addr_get_binary_addr(rtnl_neigh_get_dst(neigh))))
return;

/* Ignore IPv6 multicast link-local addresses as neighbors */
if (family == IPV6_NAME && IN6_IS_ADDR_MC_LINKLOCAL(nl_addr_get_binary_addr(rtnl_neigh_get_dst(neigh))))
return;
Expand Down
85 changes: 85 additions & 0 deletions orchagent/routeorch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,91 @@ RouteOrch::RouteOrch(DBConnector *db, string tableName, NeighOrch *neighOrch) :
m_syncdRoutes[v6_default_ip_prefix] = IpAddresses();

SWSS_LOG_NOTICE("Create IPv6 default route with packet action drop");

/* All the interfaces have the same MAC address and hence the same
* auto-generated link-local ipv6 address with eui64 interface-id.
* Hence add a single /128 route entry for the link-local interface
* address pointing to the CPU port.
*/
IpPrefix linklocal_prefix = getLinkLocalEui64Addr();

addLinkLocalRouteToMe(gVirtualRouterId, linklocal_prefix);
Copy link
Copy Markdown
Contributor

@nikos-github nikos-github Jan 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also add the fe80::/10. Probably with addSubnetRoute after you check it does the right thing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Why do we need the fe80::/10 route? Is it to handle all packets destined to any link-local address(es) belonging to us.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikos-github, I am just thinking what are the other cases where we might need packets with DIP matching fe80::/10 link-local subnet prefix to come to CPU. I don't think we would need any other link-local packets other than fe80.../128 to come to CPU. Am I missing something here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the following deployment scenario. Leafs are provisioned but the link local is unknown. For servers to work irrespective of the leaf they are attached to, they can be configured with a default gateway in the fe80 range - eg fe80::1. Traffic hitting the switch will be dropped since fe80::/10 is not in the hw.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Verified that the fe80::/10 subnet route to CPU handles both the scenarios that this PR is raised for (pinging to link-local address and link-local address as nexthop).
Now, we don't need the fe80.../128 route to CPU to be added to hardware since fe80::/10 covers that too. Retained in the code, the function that returns our own interface link-local address with eui64 interface id, for later reference.
@prsunny Since we are no longer adding the fe80.../128 now with this change, i am not currently adding any VS test for the same. Hope that is fine.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.
@lguohan @prsunny, In SONiC, I see that, by default the CoPP policy traps applied for ip2me routes rate limits the matching packets to 600 pps. Is this expected?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, the subnet and ip2me are trapped to the same queue which is currently 600pps. To me, this is expected as we have separate traps for those control packets destined to me, which leaves this trap for packets like ICMP.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the control packets like BGP, LACP, LLDP... and the ARP,NDP packets are going to the same queue (4). And the ARP, NDP packets are rate limited to 600pps.
Since, subnet and ip2me are trapped to the same queue and are rate limited to 600 pps, currently there does not seem to be any difference in both the routes.
Anyways, we might change the queue mapping between the subnet and ip2me traps in future. So, it is good to have the /128 route too in the hardware for the link-local address (like in case of global interface addresses). @nikos-github @prsunny , do you agree?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine to have both for reasons of functionality in case the underlying SAI implementation changes on how packets are rate limited for the subnet and the host entries, and for consistency purposes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nikos-github Retained /128 ip2me route too. Thanks.


/* Add fe80::/10 subnet route to forward all link-local packets
* destined to us, to CPU */
IpPrefix default_link_local_prefix("fe80::/10");

addLinkLocalRouteToMe(gVirtualRouterId, default_link_local_prefix);

/* TODO: Add the link-local fe80::/10 route to cpu in every VRF created from
* vrforch::addOperation. */
}

std::string RouteOrch::getLinkLocalEui64Addr(void)
{
SWSS_LOG_ENTER();

string ip_prefix;
const uint8_t *gmac = gMacAddress.getMac();

uint8_t eui64_interface_id[EUI64_INTF_ID_LEN];
char ipv6_ll_addr[INET6_ADDRSTRLEN] = {0};

/* Link-local IPv6 address autogenerated by kernel with eui64 interface-id
* derived from the MAC address of the host interface.
*/
eui64_interface_id[0] = gmac[0] ^ 0x02;
eui64_interface_id[1] = gmac[1];
eui64_interface_id[2] = gmac[2];
eui64_interface_id[3] = 0xff;
eui64_interface_id[4] = 0xfe;
eui64_interface_id[5] = gmac[3];
eui64_interface_id[6] = gmac[4];
eui64_interface_id[7] = gmac[5];

snprintf(ipv6_ll_addr, INET6_ADDRSTRLEN, "fe80::%02x%02x:%02x%02x:%02x%02x:%02x%02x",
eui64_interface_id[0], eui64_interface_id[1], eui64_interface_id[2],
eui64_interface_id[3], eui64_interface_id[4], eui64_interface_id[5],
eui64_interface_id[6], eui64_interface_id[7]);

ip_prefix = string(ipv6_ll_addr);

return ip_prefix;
}

void RouteOrch::addLinkLocalRouteToMe(sai_object_id_t vrf_id, IpPrefix linklocal_prefix)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is identical with addIp2MeRoute (I think it is), please use addIp2MeRoute.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is identical to addIp2MeRoute() function but that is a member function of IntfsOrch class. and we would need an IntfsOrch object instance to call it. Or else, make addIp2MeRoute() function a public static function of IntfOrch class. Or using an unsafe reinterpret_cast<> of 'this' pointer from RouteOrch to IntfsOrch to access addIp2MeRoute().

{
sai_route_entry_t unicast_route_entry;
unicast_route_entry.switch_id = gSwitchId;
unicast_route_entry.vr_id = vrf_id;
copy(unicast_route_entry.destination, linklocal_prefix);
subnet(unicast_route_entry.destination, unicast_route_entry.destination);

sai_attribute_t attr;
vector<sai_attribute_t> attrs;

attr.id = SAI_ROUTE_ENTRY_ATTR_PACKET_ACTION;
attr.value.s32 = SAI_PACKET_ACTION_FORWARD;
attrs.push_back(attr);

Port cpu_port;
gPortsOrch->getCpuPort(cpu_port);

attr.id = SAI_ROUTE_ENTRY_ATTR_NEXT_HOP_ID;
attr.value.oid = cpu_port.m_port_id;
attrs.push_back(attr);

sai_status_t status = sai_route_api->create_route_entry(&unicast_route_entry, (uint32_t)attrs.size(), attrs.data());
if (status != SAI_STATUS_SUCCESS)
{
SWSS_LOG_ERROR("Failed to create link local ipv6 route %s to cpu, rv:%d",
linklocal_prefix.getIp().to_string().c_str(), status);
throw runtime_error("Failed to create link local ipv6 route to cpu.");
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw runtime_error here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed the comment.


gCrmOrch->incCrmResUsedCounter(CrmResourceType::CRM_IPV6_ROUTE);

SWSS_LOG_NOTICE("Created link local ipv6 route %s to cpu", linklocal_prefix.to_string().c_str());
}

bool RouteOrch::hasNextHopGroup(const IpAddresses& ipAddresses) const
Expand Down
6 changes: 6 additions & 0 deletions orchagent/routeorch.h
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,9 @@
/* Maximum next hop group number */
#define NHGRP_MAX_SIZE 128

/* Length of the Interface Id value in EUI64 format */
#define EUI64_INTF_ID_LEN 8

typedef std::map<IpAddress, sai_object_id_t> NextHopGroupMembers;

struct NextHopGroupEntry
Expand Down Expand Up @@ -84,6 +87,9 @@ class RouteOrch : public Orch, public Subject
bool addRoute(IpPrefix, IpAddresses);
bool removeRoute(IpPrefix);

std::string getLinkLocalEui64Addr(void);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this function called?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now called in routeorch initializing code. Please check the latest code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jipanyang Could you please review the latest changes?

void addLinkLocalRouteToMe(sai_object_id_t vrf_id, IpPrefix linklocal_prefix);

void doTask(Consumer& consumer);
};

Expand Down