EVPN VXLAN multihoming: split-horizon for P2MP tunnels #2286
Open
james-jra wants to merge 1 commit into
Signed-off-by: James Andrew <jaandrew@nvidia.com>
This change adds a new bridge port type to enable split-horizon filtering for P2MP VXLAN tunnels. This provides parity with the existing P2P tunnel model without requiring vendor-specific ACL logic in the NOS.
The existing spec for EVPN VXLAN multihoming (SAI-Proposal-EVPN-Multihoming.md) provides an example for configuring split-horizon that targets only P2P tunnels. Vendors using only P2MP tunnels have no equivalent path within the current API.
Split-horizon requires the ability to attach isolation groups at per-remote-VTEP granularity. With P2P tunnels this falls out naturally since each remote VTEP already has its own tunnel object and bridge port. With P2MP tunnels there's a single shared tunnel object, so we need something else to hang the isolation group on.
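As a rough illustration of what per-remote-VTEP granularity means here, the sketch below models a table with one entry per remote VTEP, each carrying its own isolation group. This is a toy model, not SAI API: `peer_entry_t`, `peer_table_add`, and `may_forward` are hypothetical names, and the isolation group is simplified to a bitmask of local access ports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not SAI API: one entry per remote VTEP. */
#define MAX_PEERS 8

typedef struct {
    uint32_t peer_ip;        /* remote VTEP source IP (IPv4, host order) */
    uint32_t blocked_ports;  /* isolation group as a bitmask of access ports */
} peer_entry_t;

typedef struct {
    peer_entry_t entries[MAX_PEERS];
    size_t count;
} peer_table_t;

/* Register a remote VTEP together with the access ports it must not reach.
 * Returns false if the table is full. */
static bool peer_table_add(peer_table_t *t, uint32_t peer_ip,
                           uint32_t blocked_ports)
{
    assert(t != NULL);
    if (t->count >= MAX_PEERS) {
        return false;
    }
    t->entries[t->count].peer_ip = peer_ip;
    t->entries[t->count].blocked_ports = blocked_ports;
    t->count++;
    return true;
}

/* Decide whether decapsulated traffic from src_vtep may egress access
 * port `port` (0..31). A source VTEP with no entry has no isolation
 * group attached, so nothing is filtered for it. */
static bool may_forward(const peer_table_t *t, uint32_t src_vtep,
                        unsigned port)
{
    assert(t != NULL && port < 32);
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].peer_ip == src_vtep) {
            return (t->entries[i].blocked_ports & (UINT32_C(1) << port)) == 0;
        }
    }
    return true;
}
```

In this model the lookup key is the source VTEP, not the tunnel, which is the property a single shared P2MP tunnel object cannot provide on its own.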
This change introduces a new bridge port type, `SAI_BRIDGE_PORT_TYPE_TUNNEL_TERM_PEER`, with an associated attribute `SAI_BRIDGE_PORT_ATTR_TUNNEL_TERM_PEER_IP`. This bridge port represents decapsulated tunnel-terminated traffic from a specific remote VTEP. It is explicitly rx-only and cannot be used for encap/tx.
A dedicated bridge port type makes it clear that these objects exist solely to demultiplex inbound decapsulated traffic by source VTEP, giving the NOS a per-peer object to attach isolation groups to. This avoids any ambiguity with the existing `SAI_BRIDGE_PORT_TYPE_TUNNEL`, which is used for encap (via FDB entries, L2MC groups, next hops), and may still be used for decap in scenarios which do not require per-remote-VTEP demux.
The flow for configuring split-horizon with P2MP tunnels then mirrors the P2P case:
- Create a `TUNNEL_TERM_PEER` bridge port per remote VTEP, specifying the peer IP.
Alternative approach - User-managed ACLs
The alternative is to push this into NOS-managed ACLs. This approach was proposed in EVPN Multi Home Support #2084, and uses ACLs to match on `SAI_ACL_ENTRY_FIELD_TUNNEL_TERMINATED` and `SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP` to identify traffic from an ES peer, then uses `SAI_ACL_ENTRY_ATTR_ACTION_SET_ISOLATION_GROUP` to block forwarding to access ports representing any ES shared with that ES peer.
ACL support varies widely between vendors, with differing stage support for fields, field/action combinations, and resource constraints. Requiring the NOS to manage ACLs pushes this vendor-specific complexity into the NOS. For example: the match fields here (source IP, tunnel terminated) are properties of the ingress/decap path, while the action (set isolation group) is an egress filtering operation. A vendor whose hardware doesn't support both at the same pipeline stage would require the NOS to use two ACL tables: one ingress ACL to identify ES-peer traffic and stamp metadata (`SAI_ACL_ENTRY_ATTR_ACTION_SET_ACL_META_DATA`), and one egress ACL to match the metadata and apply the isolation group. The NOS becomes responsible for metadata value allocation as well as ACL table/entry and isolation group management. Other vendors may have different restrictions, resulting in vendor-specific code in the NOS.
This approach is also inconsistent with the level of abstraction of other multi-homing scenarios. Specifically for non-designated forwarder, the existing API has attributes like `SAI_BRIDGE_PORT_ATTR_TUNNEL_TERM_BUM_TX_DROP` which assume the SAI implementation internally understands whether traffic arriving at an egress bridge port was tunnel-terminated. Requiring the NOS to separately manage ACLs that detect and mark tunnel-terminated traffic operates at a different level of abstraction.
The proposed bridge port type allows the vendor to encapsulate any pipeline-specific logic within their SAI implementation and gives the NOS a uniform model that closely mirrors the P2P split-horizon flow.
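To make the two-table ACL alternative more concrete, here is a minimal sketch of the bookkeeping the NOS would own in that scenario: an ingress rule that stamps metadata on tunnel-terminated traffic from an ES peer, a paired egress rule that maps the metadata to an isolation group, and a metadata allocator. It is a toy model only. None of the types or functions below (`two_stage_acl_t`, `acl_add_es_peer`, and so on) are SAI API, and the isolation group is simplified to a bitmask of access ports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RULES 8
#define META_NONE 0u   /* "no metadata stamped" in this model */

/* Ingress stage: match tunnel-terminated traffic by source IP, stamp metadata. */
typedef struct { uint32_t src_ip; uint32_t meta; } ingress_rule_t;
/* Egress stage: match metadata, apply an isolation group (blocked-port bitmask). */
typedef struct { uint32_t meta; uint32_t blocked_ports; } egress_rule_t;

typedef struct {
    ingress_rule_t ingress[MAX_RULES];
    egress_rule_t  egress[MAX_RULES];
    size_t   count;
    uint32_t next_meta;   /* metadata allocator state owned by the NOS */
} two_stage_acl_t;

static void acl_init(two_stage_acl_t *a)
{
    a->count = 0;
    a->next_meta = 1;     /* 0 is reserved for META_NONE */
}

/* Program both stages for one ES peer. Returns the allocated metadata
 * value, or META_NONE if the tables are full. */
static uint32_t acl_add_es_peer(two_stage_acl_t *a, uint32_t src_ip,
                                uint32_t blocked_ports)
{
    if (a->count >= MAX_RULES) {
        return META_NONE;
    }
    uint32_t meta = a->next_meta++;
    a->ingress[a->count] = (ingress_rule_t){ src_ip, meta };
    a->egress[a->count]  = (egress_rule_t){ meta, blocked_ports };
    a->count++;
    return meta;
}

/* Ingress lookup: only tunnel-terminated packets from a known ES peer
 * receive metadata. */
static uint32_t ingress_classify(const two_stage_acl_t *a,
                                 bool tunnel_terminated, uint32_t src_ip)
{
    if (!tunnel_terminated) {
        return META_NONE;
    }
    for (size_t i = 0; i < a->count; i++) {
        if (a->ingress[i].src_ip == src_ip) {
            return a->ingress[i].meta;
        }
    }
    return META_NONE;
}

/* Egress lookup: block the packet if its metadata maps to an isolation
 * group containing `port` (0..31). */
static bool egress_permit(const two_stage_acl_t *a, uint32_t meta,
                          unsigned port)
{
    assert(port < 32);
    for (size_t i = 0; i < a->count; i++) {
        if (a->egress[i].meta == meta) {
            return (a->egress[i].blocked_ports & (UINT32_C(1) << port)) == 0;
        }
    }
    return true;
}
```

Even in this reduced form, the caller has to keep the ingress rule, the egress rule, and the metadata value consistent for every ES peer, which is the state the proposed bridge port type would leave inside the SAI implementation.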