.. SPDX-License-Identifier: GPL-2.0

=================
Ethernet Bridging
=================

Introduction
============

The IEEE 802.1Q-2022 (Bridges and Bridged Networks) standard defines the
operation of bridges in computer networks. A bridge, in the context of this
standard, is a device that connects two or more network segments and operates
at the data link layer (Layer 2) of the OSI (Open Systems Interconnection)
model. The purpose of a bridge is to filter and forward frames between
different segments based on the destination MAC (Media Access Control) address.

Bridge kAPI
===========

Here are some core structures of the bridge code. Note that the kAPI is
*unstable* and can be changed at any time.

.. kernel-doc:: net/bridge/br_private.h
   :identifiers: net_bridge_vlan

Bridge uAPI
===========

The modern Linux bridge uAPI is accessed via the Netlink interface. The files
that define the bridge and bridge port netlink attributes are listed below.

Bridge netlink attributes
-------------------------

.. kernel-doc:: include/uapi/linux/if_link.h
   :doc: Bridge enum definition

Bridge port netlink attributes
------------------------------

.. kernel-doc:: include/uapi/linux/if_link.h
   :doc: Bridge port enum definition

Bridge sysfs
------------

The sysfs interface is deprecated and should not be extended if new
options are added.

STP
===

The STP (Spanning Tree Protocol) implementation in the Linux bridge driver
is a critical feature that helps prevent loops and broadcast storms in
Ethernet networks by identifying and disabling redundant links. In a Linux
bridge context, STP is crucial for network stability and availability.

STP is a Layer 2 protocol that operates at the Data Link Layer of the OSI
model. It was originally developed as IEEE 802.1D and has since evolved into
multiple versions, including Rapid Spanning Tree Protocol (RSTP) and
`Multiple Spanning Tree Protocol (MSTP)
<https://lore.kernel.org/netdev/20220316150857.2442916-1-tobias@waldekranz.com/>`_.

IEEE 802.1D-2004 removed the original Spanning Tree Protocol, instead
incorporating the Rapid Spanning Tree Protocol (RSTP). By 2014, all the
functionality defined by IEEE 802.1D had been incorporated into either
IEEE 802.1Q (Bridges and Bridged Networks) or IEEE 802.1AC (MAC Service
Definition). 802.1D was officially withdrawn in 2022.

Bridge Ports and STP States
---------------------------

In the context of STP, bridge ports can be in one of the following states:
  * Blocking: The port is disabled for data traffic and only listens for
    BPDUs (Bridge Protocol Data Units) from other devices to determine the
    network topology.
  * Listening: The port begins to participate in the STP process and listens
    for BPDUs.
  * Learning: The port continues to listen for BPDUs and begins to learn MAC
    addresses from incoming frames but does not forward data frames.
  * Forwarding: The port is fully operational and forwards both BPDUs and
    data frames.
  * Disabled: The port is administratively disabled and does not participate
    in the STP process. Data frame forwarding is also disabled.

Root Bridge and Convergence
---------------------------

In the context of networking and Ethernet bridging in Linux, the root bridge
is a designated switch in a bridged network that serves as a reference point
for the spanning tree algorithm to create a loop-free topology.

Here is how STP works and how the root bridge is chosen:
  1. Bridge Priority: Each bridge running a spanning tree protocol has a
     configurable Bridge Priority value. The lower the value, the higher the
     priority. By default, the Bridge Priority is set to a standard value
     (e.g., 32768).
  2. Bridge ID: The Bridge ID is composed of two components: Bridge Priority
     and the MAC address of the bridge. It uniquely identifies each bridge
     in the network. The Bridge ID is used to compare the priorities of
     different bridges.
  3. Bridge Election: When the network starts, all bridges initially assume
     that they are the root bridge. They start advertising Bridge Protocol
     Data Units (BPDU) to their neighbors, containing their Bridge ID and
     other information.
  4. BPDU Comparison: Bridges exchange BPDUs to determine the root bridge.
     Each bridge examines the received BPDUs, including the Bridge Priority
     and Bridge ID, to determine if it should adjust its own priorities.
     The bridge with the lowest Bridge ID will become the root bridge.
  5. Root Bridge Announcement: Once the root bridge is determined, it sends
     BPDUs with information about the root bridge to all other bridges in the
     network. This information is used by other bridges to calculate the
     shortest path to the root bridge and, in doing so, create a loop-free
     topology.
  6. Forwarding Ports: After the root bridge is selected and the spanning tree
     topology is established, each bridge determines which of its ports should
     be in the forwarding state (used for data traffic) and which should be in
     the blocking state (used to prevent loops). The root bridge's ports are
     all in the forwarding state, while other bridges have some ports in the
     blocking state to avoid loops.
  7. Root Ports: After the root bridge is selected and the spanning tree
     topology is established, each non-root bridge processes incoming
     BPDUs and determines which of its ports provides the shortest path to the
     root bridge based on the information in the received BPDUs. This port is
     designated as the root port and is placed in the Forwarding state,
     allowing it to actively forward network traffic.
  8. Designated ports: A designated port is the port through which the non-root
     bridge will forward traffic towards the designated segment. Designated ports
     are placed in the Forwarding state. All other ports on the non-root
     bridge that are not designated for specific segments are placed in the
     Blocking state to prevent network loops.

STP ensures network convergence by calculating the shortest path and disabling
redundant links. When network topology changes occur (e.g., a link failure),
STP recalculates the network topology to restore connectivity while avoiding loops.

Proper configuration of STP parameters, such as the bridge priority, can
influence network performance, path selection and which bridge becomes the
Root Bridge.

User space STP helper
---------------------

The user space STP helper *bridge-stp* is a program that controls whether
user mode spanning tree is used. ``/sbin/bridge-stp <bridge> <start|stop>`` is
called by the kernel when STP is enabled/disabled on a bridge
(via ``brctl stp <bridge> <on|off>`` or
``ip link set <bridge> type bridge stp_state <0|1>``). The kernel enables
user_stp mode if that command returns 0, or enables kernel_stp mode if that
command returns any other value.

VLAN
====

A LAN (Local Area Network) is a network that covers a small geographic area,
typically within a single building or a campus. LANs are used to connect
computers, servers, printers, and other networked devices within a localized
area. LANs can be wired (using Ethernet cables) or wireless (using Wi-Fi).

A VLAN (Virtual Local Area Network) is a logical segmentation of a physical
network into multiple isolated broadcast domains. VLANs are used to divide
a single physical LAN into multiple virtual LANs, allowing different groups of
devices to communicate as if they were on separate physical networks.
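
To make the idea of isolated broadcast domains concrete, here is a minimal
Python sketch of a toy model, not the kernel implementation. The port names,
the VLAN IDs, and the ``flood_ports`` helper are all hypothetical. A broadcast
frame received in one VLAN is delivered only to the other ports that are
members of the same VLAN.

```python
# Hypothetical port-to-VLAN membership; port names and VLAN IDs are made up.
PORT_VLANS = {
    "swp1": {10},
    "swp2": {10},
    "swp3": {20},
    "swp4": {10, 20},  # a port that is a member of both VLANs
}


def flood_ports(ingress_port, vlan_id, port_vlans=PORT_VLANS):
    """Return the ports that receive a broadcast frame which arrived on
    ingress_port in vlan_id: every other member of the same VLAN."""
    if vlan_id not in port_vlans.get(ingress_port, set()):
        return []  # ingress port is not a member of this VLAN: drop
    return sorted(
        port
        for port, vlans in port_vlans.items()
        if port != ingress_port and vlan_id in vlans
    )


print(flood_ports("swp1", 10))  # ['swp2', 'swp4']
print(flood_ports("swp3", 20))  # ['swp4']
print(flood_ports("swp3", 10))  # []
```

Hosts behind ``swp1`` and ``swp3`` share the same toy switch but never see
each other's broadcasts, as if they were attached to separate physical
networks.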

Typically there are two VLAN implementations, IEEE 802.1Q and IEEE 802.1ad
(also known as QinQ). IEEE 802.1Q is a standard for VLAN tagging in Ethernet
networks. It allows network administrators to create logical VLANs on a
physical network and tag Ethernet frames with VLAN information; such frames
are called *VLAN-tagged frames*. IEEE 802.1ad, commonly known as QinQ or Double
VLAN, is an extension of the IEEE 802.1Q standard. QinQ allows for the
stacking of multiple VLAN tags within a single Ethernet frame. The Linux
bridge supports both the IEEE 802.1Q and `802.1ad
<https://lore.kernel.org/netdev/1402401565-15423-1-git-send-email-makita.toshiaki@lab.ntt.co.jp/>`_
protocols for VLAN tagging.

`VLAN filtering <https://lore.kernel.org/netdev/1360792820-14116-1-git-send-email-vyasevic@redhat.com/>`_
on a bridge is disabled by default. Once VLAN filtering is enabled, the bridge
forwards frames to the appropriate destinations based on their
destination MAC address and VLAN tag (both must match).

Multicast
=========

The Linux bridge driver has multicast support allowing it to process Internet
Group Management Protocol (IGMP) or Multicast Listener Discovery (MLD)
messages, and to efficiently forward multicast data packets. The bridge
driver supports IGMPv2/IGMPv3 and MLDv1/MLDv2.

Multicast snooping
------------------

Multicast snooping is a networking technology that allows network switches
to intelligently manage multicast traffic within a local area network (LAN).

The switch maintains a multicast group table, which records the association
between multicast group addresses and the ports where hosts have joined these
groups. The group table is dynamically updated based on the IGMP/MLD messages
received. With the multicast group information gathered through snooping, the
switch optimizes the forwarding of multicast traffic. Instead of blindly
broadcasting the multicast traffic to all ports, it sends the multicast
traffic based on the destination MAC address only to ports which have
subscribed to the respective destination multicast group.

When created, Linux bridge devices have multicast snooping enabled by
default. The bridge maintains a Multicast forwarding database (MDB) which
keeps track of port and group relationships.

IGMPv3/MLDv2 EHT support
------------------------

The Linux bridge supports IGMPv3/MLDv2 EHT (Explicit Host Tracking), which
was added by `474ddb37fa3a ("net: bridge: multicast: add EHT allow/block handling")
<https://lore.kernel.org/netdev/20210120145203.1109140-1-razor@blackwall.org/>`_.

Explicit host tracking enables the device to keep track of each
individual host that is joined to a particular group or channel. The main
benefit of explicit host tracking in IGMP is to allow minimal leave
latencies when a host leaves a multicast group or channel.

The length of time between a host wanting to leave and a device stopping
traffic forwarding is called the IGMP leave latency. A device configured
with IGMPv3 or MLDv2 and explicit tracking can immediately stop forwarding
traffic if the last host to request to receive traffic from the device
indicates that it no longer wants to receive traffic. The leave latency
is thus bound only by the packet transmission latencies in the multiaccess
network and the processing time in the device.

Other multicast features
------------------------

The Linux bridge also supports `per-VLAN multicast snooping
<https://lore.kernel.org/netdev/20210719170637.435541-1-razor@blackwall.org/>`_,
which is disabled by default but can be enabled, and `Multicast Router Discovery
<https://lore.kernel.org/netdev/20190121062628.2710-1-linus.luessing@c0d3.blue/>`_,
which helps identify the location of multicast routers.
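
The port and group bookkeeping behind multicast snooping can be illustrated
with a small Python sketch. This is a toy model under stated assumptions, not
the bridge driver's data structures: the ``ToyMdb`` class, its method names,
the port names, and the group address are hypothetical, and the handling of
unknown groups, router ports, and membership timers is left out on purpose.

```python
from collections import defaultdict


class ToyMdb:
    """Toy multicast database: group address -> set of subscribed ports."""

    def __init__(self):
        self.groups = defaultdict(set)

    def join(self, port, group):
        # Models a snooped IGMP/MLD membership report for group on port.
        self.groups[group].add(port)

    def leave(self, port, group):
        # Models a snooped leave; drop the group once no port is subscribed.
        self.groups[group].discard(port)
        if not self.groups[group]:
            del self.groups[group]

    def egress_ports(self, ingress_port, group):
        # Forward only to subscribed ports, never back out of the ingress port.
        return sorted(self.groups.get(group, set()) - {ingress_port})


mdb = ToyMdb()
mdb.join("swp2", "239.1.1.1")
mdb.join("swp3", "239.1.1.1")
print(mdb.egress_ports("swp1", "239.1.1.1"))  # ['swp2', 'swp3']

mdb.leave("swp3", "239.1.1.1")
print(mdb.egress_ports("swp1", "239.1.1.1"))  # ['swp2']
```

The table is keyed by group so that the forwarding decision is a single
lookup; a real implementation additionally has to decide what to do with
traffic for groups that nobody has joined.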

Switchdev
=========

Linux Bridge Switchdev is a feature in the Linux kernel that extends the
capabilities of the traditional Linux bridge to work more efficiently with
hardware switches that support switchdev. With Linux Bridge Switchdev, certain
networking functions like forwarding, filtering, and learning of Ethernet
frames can be offloaded to a hardware switch. This offloading reduces the
burden on the Linux kernel and CPU, leading to improved network performance
and lower latency.

To use Linux Bridge Switchdev, you need hardware switches that support the
switchdev interface. This means that the switch hardware needs to have the
necessary drivers and functionality to work in conjunction with the Linux
kernel.

Please see the :ref:`switchdev` document for more details.

Netfilter
=========

The bridge netfilter module is a legacy feature that allows filtering bridged
packets with iptables and ip6tables. Its use is discouraged. Users should
consider using nftables for packet filtering.

The older ebtables tool is more feature-limited compared to nftables but,
just like nftables, it does not need this module to function.

The br_netfilter module intercepts packets entering the bridge, performs
minimal sanity tests on IPv4 and IPv6 packets and then pretends that
these packets are being routed, not bridged. br_netfilter then calls
the ip and ipv6 netfilter hooks from the bridge layer, i.e. ip(6)tables
rulesets will also see these packets.

br_netfilter is also the reason for the iptables *physdev* match:
This match is the only way to reliably tell routed and bridged packets
apart in an iptables ruleset.

Note that ebtables and nftables will work fine without the br_netfilter module.
iptables/ip6tables/arptables do not work for bridged traffic because they
plug into the routing stack. nftables rules in ip/ip6/inet/arp families won't
see traffic that is forwarded by a bridge either, but that's very much how it
should be.

Historically the feature set of ebtables was very limited (it still is), so
this module was added to pretend packets are routed and invoke the ipv4/ipv6
netfilter hooks from the bridge so users had access to the more feature-rich
iptables matching capabilities (including conntrack). nftables doesn't have
this limitation; pretty much all features work regardless of the protocol family.

So, br_netfilter is only needed if users, for some reason, need to use
ip(6)tables to filter packets forwarded by the bridge, or NAT bridged
traffic. For pure link layer filtering, this module isn't needed.

Other Features
==============

The Linux bridge also supports `IEEE 802.11 Proxy ARP
<https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=958501163ddd6ea22a98f94fa0e7ce6d4734e5c4>`_,
`Media Redundancy Protocol (MRP)
<https://lore.kernel.org/netdev/20200426132208.3232-1-horatiu.vultur@microchip.com/>`_,
`Media Redundancy Protocol (MRP) LC mode
<https://lore.kernel.org/r/20201124082525.273820-1-horatiu.vultur@microchip.com>`_,
`IEEE 802.1X port authentication
<https://lore.kernel.org/netdev/20220218155148.2329797-1-schultz.hans+netdev@gmail.com/>`_,
and `MAC Authentication Bypass (MAB)
<https://lore.kernel.org/netdev/20221101193922.2125323-2-idosch@nvidia.com/>`_.

FAQ
===

What does a bridge do?
----------------------

A bridge transparently forwards traffic between multiple network interfaces.
In plain English this means that a bridge connects two or more physical
Ethernet networks to form one larger (logical) Ethernet network.

Is it L3 protocol independent?
------------------------------

Yes. The bridge sees all frames, but it *uses* only L2 headers/information.
As such, the bridging functionality is protocol independent, and there should
be no trouble forwarding IPX, NetBEUI, IP, IPv6, etc.

Contact Info
============

The code is currently maintained by Roopa Prabhu <roopa@nvidia.com> and
Nikolay Aleksandrov <razor@blackwall.org>. Bridge bugs and enhancements
are discussed on the linux-netdev mailing list netdev@vger.kernel.org and
on bridge@lists.linux-foundation.org.

The list is open to anyone interested: http://vger.kernel.org/vger-lists.html#netdev

External Links
==============

The old documentation for Linux bridging is at:
https://wiki.linuxfoundation.org/networking/bridge