Merged
46 changes: 35 additions & 11 deletions doc/dash/dash-sonic-hld.md
@@ -1,6 +1,6 @@
# SONiC-DASH HLD
## High Level Design Document
### Rev 2.2
### Rev 2.4

# Table of Contents

@@ -51,7 +51,9 @@
| 2.0 | 04/08/2024 | Prince Sunny | Schema updates for PL, PL-NSG, metering |
| 2.1 | 08/22/2024 | Mukesh M Velayudhan | Add local Region ID field in appliance |
| 2.2 | 08/28/2024 | Lawrence Lee | Route table `routing_type` restrictions, delete op behavior |
| 2.3 | 11/7/2024 | Kumaresh Perumal | Update DASH_PA_VALIDATION_TABLE |
| 2.3 | 11/07/2024 | Kumaresh Perumal | Update DASH_PA_VALIDATION_TABLE |
| 2.4 | 02/05/2025 | Prince Sunny | Update DASH_TUNNEL, FNIC, minor clarifications |


# About this Manual
This document provides more detailed design of DASH APIs, DASH orchestration agent, Config and APP DB Schemas and other SONiC buildimage changes required to bring up SONiC image on an appliance card. General DASH HLD can be found at [dash_hld](https://github.com/sonic-net/DASH/tree/main/documentation/general/dash-high-level-design.md).
@@ -69,6 +71,7 @@ This document provides more detailed design of DASH APIs, DASH orchestration age
| vPORT | VM's NIC. Eni, Vnic, VPort are used interchangeably |
| ST | Service Tunnel |
| PL | Private Link |
| FNIC | Floating NIC |

# 1 Requirements Overview

@@ -89,6 +92,7 @@ At a high level the following should be supported:
- Telemetry and Monitoring
- Private Link
- Private Link NSG
- Express Route GW Bypass
Contributor

can we have a config example for express route gw bypass ?

Contributor Author

Will add as a next iteration (another PR)

Contributor Author

added an example


Phase 2
- Service Tunnel
@@ -185,6 +189,7 @@ DASH Sonic implementation is targeted for appliance scenarios and must handles m
13. During a bulk operation, if any part/subset of API fails, implementation shall return *error* for the entire API. Sonic implementation shall validate the entire API as pre-checks before applying and return accordingly.
14. Implementation must have flexible memory allocation for ENI and must not reserve max scale (e.g., 100k routes) during initial create. This allows oversubscription.
15. Implementation must not have silent failures for APIs, e.g., accepting an API call from the controller, returning success, and then failing in the backend. This is orthogonal to the idempotency of APIs described above for ADD and Delete operations. The intent is to ensure that the SDN controller and the SONiC implementation stay in sync.
16. An ENI can be modeled as an FNIC or a regular VM only at create time; the mode cannot be changed afterward.
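
Requirement 13 (validate the whole bulk API before applying any part of it) can be sketched as a two-phase apply. This is a minimal Python illustration, not the SONiC implementation: `apply_bulk`, `has_routing_type`, and the in-memory `table` dictionary are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, List, Tuple

Entry = Tuple[str, dict]  # (key, fields); hypothetical shape for this sketch


def apply_bulk(table: Dict[str, dict],
               entries: List[Entry],
               validate_entry: Callable[[str, dict], bool]) -> bool:
    """Apply all entries or none.

    Phase 1 validates every entry without touching the table.
    Phase 2 writes only if the whole batch passed.
    """
    for key, fields in entries:
        if not validate_entry(key, fields):
            return False  # error for the entire API call; nothing was applied
    for key, fields in entries:
        table[key] = dict(fields)
    return True


def has_routing_type(key: str, fields: dict) -> bool:
    # Hypothetical pre-check: every entry must carry a routing_type field.
    return "routing_type" in fields
```

Because the table is only written after every entry has passed, a failing batch leaves no partial state behind, which also avoids the silent-failure case in requirement 15.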

## 1.7 ACL requirements

@@ -312,8 +317,8 @@ Reference Yang model for DASH Vnet is [here](https://github.com/sonic-net/sonic-
```
"DEVICE_METADATA": {
"localhost": {
"subtype": "Appliance",
"type": "SonicHost",
"subtype": "SmartSwitch",
"type": "SonicDpu",
"switch_type": "dpu",
"sub_role": "None"
}
@@ -368,6 +373,8 @@ DASH_ENI_TABLE:{{eni}}
"v4_meter_policy_id": {{string}} (OPTIONAL)
"v6_meter_policy_id": {{string}} (OPTIONAL)
"disable_fast_path_icmp_flow_redirection": {{bool}} (OPTIONAL)
"floating_nic_mode": {{enabled/disabled}} (OPTIONAL)

this was supposed to be "nic_mode" and was supposed to be an enum.
"floatingnic" was supposed to be one of the options, not enabled/disabled.

Contributor Author

addressed to have it as enum

"trusted_vni": {{vni list}} (OPTIONAL)
```
```
key = DASH_ENI_TABLE:eni ; ENI MAC as key
@@ -382,6 +389,8 @@ pl_underlay_sip = Underlay SIP (ST GW VIP) to be used for all private l
v4_meter_policy_id = IPv4 meter policy ID
v6_meter_policy_id = IPv6 meter policy ID
disable_fast_path_icmp_flow_redirection = Disable handling fast path ICMP flow redirection packets
floating_nic_mode = floating nic mode enabled or disabled. Default is disabled
trusted_vni = list of trusted VNIs for this ENI, comma separated, with "-" for a range (both ends inclusive). MSEE VNIs can be added here
Contributor

Should we rename to trusted_vnis to make it clear that we can have multiple?

```
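
The `trusted_vni` format described above (comma-separated values, with "-" marking an inclusive range) could be parsed along the lines of the following Python sketch. `parse_vni_list` is a hypothetical helper name, and the 24-bit bound check is an assumption based on the VXLAN VNI width; the HLD does not specify how the list is validated.

```python
from typing import Set


def parse_vni_list(text: str) -> Set[int]:
    """Expand a string such as "100,200-203" into a set of VNIs."""
    vnis: Set[int] = set()
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue
        if "-" in token:
            low_s, high_s = token.split("-", 1)
            low, high = int(low_s), int(high_s)
            if low > high:
                raise ValueError(f"invalid VNI range: {token}")
            vnis.update(range(low, high + 1))  # both ends inclusive
        else:
            vnis.add(int(token))
    for vni in vnis:
        # Assumption: VNIs are 24-bit values, as in VXLAN.
        if not 0 <= vni < 2 ** 24:
            raise ValueError(f"VNI out of range: {vni}")
    return vnis
```

For example, `parse_vni_list("100,200-203")` yields the five VNIs 100, 200, 201, 202, and 203.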

### 3.2.4 TAG
@@ -482,6 +491,7 @@ DASH_ROUTING_APPLIANCE_TABLE:{{appliance_id}}:
"addresses": {{list of addresses}}
"encap_type": {{encap type}}
"vni": {{vni}}
"region_id": {{local region id}}
Contributor

Region ID is already added to the APPLIANCE table. This is the wrong table: the deprecated Routing Appliance table.

Contributor

Yes. Prince, the routing appliance table is not the appliance table.

Contributor Author

Thanks for catching. addressed

```

```
@@ -490,6 +500,7 @@ key = DASH_ROUTING_APPLIANCE_TABLE:appliance_id; Used for P
addresses = list of addresses used for ECMP across appliances
encap_type = encap type depends on the action_type - {vxlan, nvgre}
vni = vni value associated with the corresponding action.
region_id = local region id
```

### 3.2.8 APPLIANCE
@@ -499,6 +510,8 @@ DASH_APPLIANCE_TABLE:{{appliance_id}}
"sip": {{ip_address}}
"vm_vni": {{vni}}
"local_region_id": {{region_id}}
"outbound_direction_lookup": {{dst_mac/src_mac}}
Contributor (@mukeshmv, Feb 7, 2025)

do we still need this attribute if we have Floating NIC mode ?

Contributor Author

Yes, it is needed when FNIC mode is disabled and the lookup attribute has to change. Basically, this aligns with the SAI model.

Contributor

Do we have a VNI table? There will be multiple VNIs needing this.

Contributor Author

We don't have a VNI table currently.

"trusted_vni": {{vni list}} (OPTIONAL)
```

```
@@ -507,6 +520,8 @@ key = DASH_APPLIANCE_TABLE:id ; attributes specific for the
sip = source ip address, to be used in encap
vm_vni = VM VNI that is used for setting direction. Also used for inbound encap to VM
local_region_id = Region where this appliance is located
outbound_direction_lookup = dst_mac or src_mac; default is src_mac. Set to dst_mac to override the default
trusted_vni = list of global trusted VNIs, comma separated, with "-" for a range (both ends inclusive).
Contributor

trusted_vnis?

Contributor Author

IMO, the schema does it both ways and there is no defined convention.

Contributor Author

changed to vnis

```

### 3.2.9 ROUTE LPM TABLE - OUTBOUND
@@ -542,6 +557,7 @@ DASH_ROUTE_TABLE:{{group_id}}:{{prefix}}
"metering_policy_en": {{bool}} (OPTIONAL) (OBSOLETED)
"metering_class_or": {{uint32}} (OPTIONAL)
"metering_class_and": {{uint32}} (OPTIONAL)
"tunnel": {{string}} (OPTIONAL)
```

```
@@ -550,7 +566,7 @@ key = DASH_ROUTE_TABLE:group_id:prefix ; Route route table
action_type = routing_type ; reference to routing type (DEPRECATED)
routing_type = routing_type ; replacement for the deprecated `action_type` field. Must be one of {vnet, vnet_direct, direct, servicetunnel, drop}.
vnet = vnet name ; destination vnet name if routing_type is {vnet, vnet_direct}, a vnet other than eni's vnet means vnet peering
appliance = appliance id ; appliance id if routing_type is {appliance}
appliance = appliance id ; appliance id if routing_type is {appliance} (DEPRECATED, Use tunnel attribute)
overlay_ip = ip_address ; overlay_ip to look up if routing_type is {vnet_direct}, use dst ip from packet if not specified
overlay_sip_prefix = ip_prefix ; overlay ipv6 src ip if routing_type is {servicetunnel}, transform last 32 bits from packet (src ip)
overlay_dip_prefix = ip_prefix ; overlay ipv6 dst ip if routing_type is {servicetunnel}, transform last 32 bits from packet (dst ip)
@@ -559,6 +575,7 @@ underlay_dip = ip_address ; underlay ipv4 dst ip to o
metering_policy_en = bool ; Metering policy lookup enable (optional), default = false (OBSOLETED). If aggregated or/and bits is 0, metering policy is applied
metering_class_or = uint32 ; Metering class-id 'or' bits
metering_class_and = uint32 ; Metering class-id 'and' bits
tunnel = string ; Nexthop tunnel for ECMP or single nexthop, routing_type is {direct}
```
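
The field rules listed above for DASH_ROUTE_TABLE could be checked with a small validator such as the following Python sketch. It rests on stated assumptions: `check_route` is a hypothetical helper, the `tunnel` value is assumed to name an existing tunnel entry, and treating `tunnel` as valid only with `routing_type` `direct` is one reading of the field description, not a confirmed restriction.

```python
from typing import Dict, List, Set

# Taken from the routing_type description in the schema above.
ALLOWED_ROUTING_TYPES = {"vnet", "vnet_direct", "direct", "servicetunnel", "drop"}


def check_route(fields: Dict[str, str], known_tunnels: Set[str]) -> List[str]:
    """Return a list of problems found in one DASH_ROUTE_TABLE entry."""
    problems: List[str] = []
    routing_type = fields.get("routing_type")
    if routing_type not in ALLOWED_ROUTING_TYPES:
        problems.append(f"unsupported routing_type: {routing_type}")
    tunnel = fields.get("tunnel")
    if tunnel is not None:
        # Assumption: tunnel is only meaningful for routing_type direct.
        if routing_type != "direct":
            problems.append("tunnel is only described for routing_type direct")
        # Assumption: the tunnel string refers to an existing tunnel name.
        if tunnel not in known_tunnels:
            problems.append(f"unknown tunnel: {tunnel}")
    if "appliance" in fields:
        problems.append("appliance is deprecated; use tunnel")
    return problems
```

An empty list means the entry passed every check in this sketch; it does not imply that the entry is accepted by orchagent.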

### 3.2.10 ROUTE RULE TABLE - INBOUND
@@ -672,14 +689,12 @@ DASH_PA_VALIDATION_TABLE:{{vni}}
```
key = DASH_PA_VALIDATION_TABLE:vni; ENI and VNI as key;
; field = value
addresses = list of addresses used for validating underlay source ip of incoming packets.
addresses = list of prefixes used for validating underlay source ip of incoming packets.
Contributor

Should we rename the field to prefixes as well?

Contributor Author

done

```
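
Validating an underlay source IP against a list of prefixes, as the `addresses` field above describes, can be illustrated with Python's standard `ipaddress` module. `pa_is_valid` is a hypothetical helper used only for this sketch; the actual check is performed in the dataplane, not in Python.

```python
import ipaddress
from typing import Iterable


def pa_is_valid(src_ip: str, prefixes: Iterable[str]) -> bool:
    """Return True if the underlay source IP falls inside any configured prefix."""
    addr = ipaddress.ip_address(src_ip)
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix, strict=False)
        # An IPv4 address never matches an IPv6 prefix and vice versa.
        if addr.version == net.version and addr in net:
            return True
    return False
```

Both IPv4 and IPv6 prefixes can appear in the same list; the version check keeps mixed lists from producing false matches.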

DASH_PA_VALIDATION_TABLE is used only for PL outbound direction. PA address can be either IPV4 or IPV6.

Total PAs per MSEE would be 64 and if there are 64 MSEEs per region(based on 400G DPU), there would be 4K PA_VALIDATION entries.
DASH_PA_VALIDATION_TABLE is used only for additional PA validation. The PA prefix can be either IPv4 or IPv6. The table is used for fastpath or other explicit PA validation cases.

For more scale numbers, please refer to the [doc](https://github.com/sonic-net/DASH/blob/main/documentation/express-route-service/express-route-gateway-bypass.md)
The expected maximum is 4K PA_VALIDATION entries. For more scale numbers, please refer to the [doc](https://github.com/sonic-net/DASH/blob/main/documentation/express-route-service/express-route-gateway-bypass.md)
Contributor

Maybe better to mention this in scaling requirements too, if missed.

Contributor

Or simply move it there, to avoid inconsistency in the future.

Contributor Author

added to scale section and removed from here


### 3.2.14 DASH tunnel table

@@ -696,10 +711,17 @@ key = DASH_TUNNEL_TABLE:tunnel_name; tunnel name used for r
; field = value
endpoints = list of addresses for ecmp tunnel
encap_type = vxlan or nvgre
vni = vni value for encap
vni = vni value for encap, create only attribute
Contributor

should we mention create only attribute for encap_type as well ?

Contributor Author

addressed

metering_class_or = uint32
```

DASH_TUNNEL_TABLE shall have one or more endpoints. Encap type and VNI are create-only attributes; changing the encap requires deleting the tunnel object and creating a new one.
A single endpoint is treated as a single nexthop; multiple comma-separated endpoints shall be treated as an ECMP nexthop.

For a single endpoint, the implementation shall simply create a sai_dash_tunnel object with ```SAI_DASH_TUNNEL_ATTR_DIP=endpoint IP``` and ```SAI_DASH_TUNNEL_ATTR_MAX_MEMBER_SIZE=1```.

For ECMP, the implementation shall create ```sai_dash_tunnel_member``` and ```sai_dash_tunnel_next_hop``` objects with an appropriate ```SAI_DASH_TUNNEL_ATTR_MAX_MEMBER_SIZE```.
Contributor

Please mention that though members can be updated, at any point the number of members cannot exceed the count provided at Dash tunnel create

Contributor Author

addressed
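
The DASH_TUNNEL_TABLE rules discussed in this thread (one or more endpoints, create-only `encap_type` and `vni`, and a member count that may not grow past the size given at create) can be modeled with a short Python sketch. `DashTunnel` and `parse_endpoints` are hypothetical names, no SAI calls are made, and setting the maximum member size to the endpoint count at create time is an assumption; the HLD only asks for an "appropriate" value.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DashTunnel:
    """In-memory model of one DASH_TUNNEL_TABLE entry (hypothetical)."""
    endpoints: List[str]
    encap_type: str
    vni: int
    max_member_size: int = field(init=False)

    def __post_init__(self) -> None:
        if not self.endpoints:
            raise ValueError("a tunnel needs at least one endpoint")
        # Assumption: the size is fixed at create time from the endpoint count.
        self.max_member_size = len(self.endpoints)

    @property
    def is_ecmp(self) -> bool:
        # One endpoint is a single nexthop; more than one is ECMP.
        return len(self.endpoints) > 1

    def update(self, endpoints: List[str], encap_type: str, vni: int) -> None:
        if (encap_type, vni) != (self.encap_type, self.vni):
            raise ValueError("encap_type and vni are create-only; "
                             "delete and re-create the tunnel")
        if not endpoints or len(endpoints) > self.max_member_size:
            raise ValueError("endpoint count must be between 1 and "
                             f"{self.max_member_size}")
        self.endpoints = list(endpoints)


def parse_endpoints(value: str) -> List[str]:
    """Split the comma-separated endpoints field into a list."""
    return [ip.strip() for ip in value.split(",") if ip.strip()]
```

In this model, shrinking a two-member tunnel to one endpoint is allowed, while growing it to three or changing the VNI raises an error, mirroring the delete-and-recreate rule.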


### 3.2.15 DASH orchagent (Overlay)

| APP_DB Table | Key | Field | SAI Attributes/*objects* | Comment |
@@ -988,6 +1010,8 @@ SONiC for DASH shall have a lite swss initialization without the heavy-lift of e
| | SAI_SWITCH_ATTR_TYPE |
| | SAI_SWITCH_ATTR_VXLAN_DEFAULT_PORT |
| | SAI_SWITCH_ATTR_VXLAN_DEFAULT_ROUTER_MAC |
| | SAI_SWITCH_TUNNEL_ATTR_VXLAN_UDP_SPORT |
| | SAI_SWITCH_TUNNEL_ATTR_VXLAN_UDP_SPORT_MASK |

### 3.3.5 Underlay Routing
DASH Appliance shall establish a BGP session with the connected Peer and advertise the prefixes (VIP PA). In turn, the Peer (e.g., a network device or SmartSwitch) shall advertise a default route to the appliance. With two Peers connected, the appliance shall have a route with gateways towards both Peers and perform ECMP routing. Orchagent installs the route, resolves the neighbor (GW) MAC, and programs the underlay route/nexthop and neighbor.