EVPN VXLAN multihoming: split-horizon for P2MP tunnels #2286

Open
james-jra wants to merge 1 commit into opencomputeproject:master from james-jra:jra/evpn-multihoming-p2mp

@james-jra

This change adds a new bridge port type to enable split-horizon filtering for P2MP VXLAN tunnels. This provides parity with the existing P2P tunnel model without requiring vendor-specific ACL logic in the NOS.

The existing spec for EVPN VXLAN multihoming (SAI-Proposal-EVPN-Multihoming.md) provides an example of configuring split-horizon that targets only P2P tunnels. Vendors using only P2MP tunnels have no equivalent path within the current API.

Split-horizon requires the ability to attach isolation groups at per-remote-VTEP granularity. With P2P tunnels this falls out naturally since each remote VTEP already has its own tunnel object and bridge port. With P2MP tunnels there's a single shared tunnel object, so we need something else to hang the isolation group on.

This change introduces a new bridge port type, SAI_BRIDGE_PORT_TYPE_TUNNEL_TERM_PEER, with an associated attribute SAI_BRIDGE_PORT_ATTR_TUNNEL_TERM_PEER_IP. This bridge port represents decapsulated tunnel-terminated traffic from a specific remote VTEP. It is explicitly rx-only and cannot be used for encap/tx.

A dedicated bridge port type makes it clear that these objects exist solely to demultiplex inbound decapsulated traffic by source VTEP, giving the NOS a per-peer object to attach isolation groups to. This avoids any ambiguity with the existing SAI_BRIDGE_PORT_TYPE_TUNNEL, which is used for encap (via FDB entries, L2MC groups, next hops) and may still be used for decap in scenarios that do not require per-remote-VTEP demux.

The flow for configuring split-horizon with P2MP tunnels then mirrors the P2P case:

  1. Create a P2MP tunnel object.
  2. Create a TUNNEL_TERM_PEER bridge port per remote VTEP, specifying the peer IP.
  3. Attach an isolation group to any peer bridge port whose remote VTEP shares an Ethernet Segment with the local device.
  4. Add the relevant local access ports as members of that isolation group.
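The intended effect of the four steps above can be illustrated with a small toy model. This is a sketch only: it is plain Python, not the SAI API, and every class and method name in it (`P2mpTunnelModel`, `PeerBridgePort`, `flood_targets`, and so on) is hypothetical. It also assumes that traffic from a VTEP with no peer bridge port, or with no isolation group attached, is flooded to all access ports; the proposal text does not specify that case.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address
from typing import Optional


@dataclass
class IsolationGroup:
    """Local access ports that must not receive traffic from the owning ingress."""
    members: set = field(default_factory=set)


@dataclass
class PeerBridgePort:
    """Hypothetical stand-in for a TUNNEL_TERM_PEER bridge port (rx-only)."""
    peer_ip: object  # IPv4Address or IPv6Address of the remote VTEP
    isolation_group: Optional[IsolationGroup] = None


class P2mpTunnelModel:
    """One shared P2MP tunnel with per-remote-VTEP peer bridge ports."""

    def __init__(self, access_ports):
        self.access_ports = set(access_ports)
        self.peers = {}  # remote VTEP IP -> PeerBridgePort

    def add_peer(self, peer_ip):
        port = PeerBridgePort(ip_address(peer_ip))
        self.peers[port.peer_ip] = port
        return port

    def flood_targets(self, outer_src_ip):
        """Access ports eligible for decapsulated BUM traffic from outer_src_ip."""
        peer = self.peers.get(ip_address(outer_src_ip))
        # Assumption: no peer port or no isolation group means no filtering.
        if peer is None or peer.isolation_group is None:
            return set(self.access_ports)
        return self.access_ports - peer.isolation_group.members


# Step 1: one P2MP tunnel, with three local access ports (names are illustrative).
tunnel = P2mpTunnelModel(["Ethernet0", "Ethernet4", "Ethernet8"])
# Step 2: one peer bridge port per remote VTEP.
es_peer = tunnel.add_peer("10.0.0.2")   # shares an Ethernet Segment with us
other_peer = tunnel.add_peer("10.0.0.3")  # no shared Ethernet Segment
# Steps 3 and 4: attach an isolation group containing the shared-ES access port.
es_peer.isolation_group = IsolationGroup({"Ethernet0"})
```

In this model, traffic decapsulated from 10.0.0.2 is never flooded to Ethernet0, while traffic from 10.0.0.3 still reaches all three access ports.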

Alternative approach - User-managed ACLs

The alternative is to push this into NOS-managed ACLs. This approach was proposed in EVPN Multi Home Support #2084. It uses ACLs that match on SAI_ACL_ENTRY_FIELD_TUNNEL_TERMINATED and SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP to identify traffic from an ES peer, then uses SAI_ACL_ENTRY_ATTR_ACTION_SET_ISOLATION_GROUP to block forwarding to the access ports of any ES shared with that peer.

ACL support varies widely between vendors, with differing stage support for fields, field/action combinations, and resource constraints. Requiring the NOS to manage ACLs pushes this vendor-specific complexity into the NOS. For example, the match fields here (source IP, tunnel terminated) are properties of the ingress/decap path, while the action (set isolation group) is an egress filtering operation. A vendor whose hardware doesn't support both at the same pipeline stage would require the NOS to use two ACL tables: one ingress ACL to identify ES-peer traffic and stamp metadata (SAI_ACL_ENTRY_ATTR_ACTION_SET_ACL_META_DATA), and one egress ACL to match the metadata and apply the isolation group. The NOS then becomes responsible for metadata value allocation as well as ACL table/entry and isolation group management. Other vendors may have different restrictions, resulting in further vendor-specific code in the NOS.
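To make the bookkeeping burden of the two-table variant concrete, here is a toy model of it. It is a sketch in plain Python, not SAI calls: the class `TwoStageAclModel` and its methods are hypothetical, and reserving metadata value 0 for "not stamped" is an assumption made for the example (real metadata ranges are vendor-defined).

```python
from ipaddress import ip_address
from itertools import count


class TwoStageAclModel:
    """Ingress table stamps metadata; egress table maps metadata to blocked ports."""

    def __init__(self):
        # Assumption: 0 means "no metadata", so allocation starts at 1.
        self._next_meta = count(1)
        self.ingress_rules = {}  # ES-peer source IP -> metadata value
        self.egress_rules = {}   # metadata value -> blocked access ports

    def add_es_peer(self, peer_ip, shared_es_ports):
        """NOS-side work: allocate metadata and program both tables."""
        meta = next(self._next_meta)
        self.ingress_rules[ip_address(peer_ip)] = meta
        self.egress_rules[meta] = set(shared_es_ports)
        return meta

    def ingress(self, outer_src_ip, tunnel_terminated):
        """Ingress stage: stamp metadata on tunnel-terminated ES-peer traffic."""
        if not tunnel_terminated:
            return 0
        return self.ingress_rules.get(ip_address(outer_src_ip), 0)

    def egress_allowed(self, meta, out_port):
        """Egress stage: drop if the stamped metadata blocks this port."""
        return out_port not in self.egress_rules.get(meta, set())


acl = TwoStageAclModel()
meta = acl.add_es_peer("10.0.0.2", ["Ethernet0"])
stamped = acl.ingress("10.0.0.2", tunnel_terminated=True)
```

Even in this reduced form, the caller has to keep the metadata allocator and the two rule tables consistent for every ES peer, which is the state the proposed bridge port type would leave inside the SAI implementation.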

This approach is also inconsistent with the level of abstraction used for other multihoming scenarios. For non-designated-forwarder filtering, for example, the existing API has attributes like SAI_BRIDGE_PORT_ATTR_TUNNEL_TERM_BUM_TX_DROP, which assume the SAI implementation internally knows whether traffic arriving at an egress bridge port was tunnel-terminated. Requiring the NOS to separately manage ACLs that detect and mark tunnel-terminated traffic sits at a different level of abstraction.

The proposed bridge port type allows the vendor to encapsulate any pipeline-specific logic within their SAI implementation and gives the NOS a uniform model that closely mirrors the P2P split-horizon flow.

Signed-off-by: James Andrew <jaandrew@nvidia.com>
@james-jra james-jra force-pushed the jra/evpn-multihoming-p2mp branch from 0f606b3 to 0e18a20 Compare May 28, 2026 08:41