.. SPDX-License-Identifier: GPL-2.0+

=====================================
Meta Platforms Host Network Interface
=====================================

Firmware Versions
-----------------

fbnic has three components stored on the flash which are provided in one PLDM
image:

1. fw - The control firmware used to view and modify firmware settings, request
   firmware actions, and retrieve firmware counters outside of the data path.
   This is the firmware which fbnic_fw.c interacts with.
2. bootloader - The firmware which validates firmware security and controls
   basic operations including loading and updating the firmware. This is also
   known as the cmrt firmware.
3. undi - This is the UEFI driver which is based on the Linux driver.

fbnic stores two copies of these three components on flash. This allows fbnic
to fall back to an older version of firmware automatically in case firmware
fails to boot. Version information for both is provided as running and stored.
The undi is only provided in stored as it is not actively running once the Linux
driver takes over.

devlink dev info provides version information for all three components. In
addition to the version the hg commit hash of the build is included as a
separate entry.

Configuration
-------------

Ringparams (ethtool -g / -G)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

fbnic has two submission (host -> device) rings for every completion
(device -> host) ring. The three ring objects together form a single
"queue" as used by higher layer software (an Rx, or a Tx queue).

For Rx the two submission rings are used to pass empty pages to the NIC.
Ring 0 is the Header Page Queue (HPQ), NIC will use its pages to place
L2-L4 headers (or full frames if frame is not header-data split).
Ring 1 is the Payload Page Queue (PPQ) and is used for packet payloads.
The completion ring is used to receive packet notifications / metadata.
ethtool ``rx`` ringparam maps to the size of the completion ring,
``rx-mini`` to the HPQ, and ``rx-jumbo`` to the PPQ.

For Tx both submission rings can be used to submit packets; the completion
ring carries notifications for both. fbnic uses one of the submission
rings for normal traffic from the stack and the second one for XDP frames.
ethtool ``tx`` ringparam controls both the size of the submission rings
and the completion ring.

Every single entry on the HPQ and PPQ (``rx-mini``, ``rx-jumbo``)
corresponds to 4kB of allocated memory, while entries on the remaining
rings are in units of descriptors (8B). The ideal ratio of submission
and completion ring sizes will depend on the workload, as for small packets
multiple packets will fit into a single page.

Upgrading Firmware
------------------

fbnic supports updating firmware using signed PLDM images with devlink dev
flash. PLDM images are written into the flash. Flashing does not interrupt
the operation of the device.

On host boot the latest UEFI driver is always used; no explicit activation
is required. Firmware activation is required to run new control firmware. cmrt
firmware can only be activated by power cycling the NIC.

Health reporters
----------------

fw reporter
~~~~~~~~~~~

The ``fw`` health reporter tracks FW crashes. Dumping the reporter will
show the core dump of the most recent FW crash or, if no FW crash has
happened since the last power cycle, a snapshot of the FW memory. Diagnose
callback shows FW uptime based on the most recently received heartbeat message
(the crashes are detected by checking if uptime goes down).

otp reporter
~~~~~~~~~~~~

OTP memory ("fuses") is used for secure boot and anti-rollback
protection. The OTP memory is ECC protected; ECC errors indicate
either a manufacturing defect or a part deteriorating with age.
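
The crash detection rule used by the ``fw`` reporter (uptime going down
between heartbeats) can be illustrated with a minimal Python sketch. This is
not the driver's implementation: the function name, the sample values and the
unit of the uptime readings are all hypothetical, and the sketch only shows
the comparison itself.

```python
def count_fw_restarts(uptimes):
    """Count apparent FW restarts in a sequence of heartbeat uptimes.

    A restart is inferred whenever a reported uptime is lower than the
    one reported by the previous heartbeat. `uptimes` is a hypothetical
    list of readings in arrival order; the unit does not matter.
    """
    restarts = 0
    previous = None
    for uptime in uptimes:
        if previous is not None and uptime < previous:
            restarts += 1
        previous = uptime
    return restarts


if __name__ == "__main__":
    # Hypothetical readings: uptime drops twice (30 -> 5 and 25 -> 2).
    samples = [10, 20, 30, 5, 15, 25, 2]
    print(count_fw_restarts(samples))
```

A restart that happens between two heartbeats and is followed by a longer
uptime than the last one seen would go unnoticed by this simple comparison,
so the sketch is only as precise as the heartbeat interval allows.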

Statistics
----------

TX MAC Interface
~~~~~~~~~~~~~~~~

 - ``ptp_illegal_req``: packets sent to the NIC with PTP request bit set but routed to BMC/FW
 - ``ptp_good_ts``: packets successfully routed to MAC with PTP request bit set
 - ``ptp_bad_ts``: packets destined for MAC with PTP request bit set but aborted because of some error (e.g., DMA read error)

TX Extension (TEI) Interface (TTI)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 - ``tti_cm_drop``: control messages dropped at the TX Extension (TEI) Interface because of credit starvation
 - ``tti_frame_drop``: packets dropped at the TX Extension (TEI) Interface because of credit starvation
 - ``tti_tbi_drop``: packets dropped at the TX BMC Interface (TBI) because of credit starvation

RXB (RX Buffer) Enqueue
~~~~~~~~~~~~~~~~~~~~~~~

 - ``rxb_integrity_err[i]``: frames enqueued with integrity errors (e.g., multi-bit ECC errors) on RXB input i
 - ``rxb_mac_err[i]``: frames enqueued with MAC end-of-frame errors (e.g., bad FCS) on RXB input i
 - ``rxb_parser_err[i]``: frames experienced RPC parser errors
 - ``rxb_frm_err[i]``: frames experienced signaling errors (e.g., missing end-of-packet/start-of-packet) on RXB input i
 - ``rxb_drbo[i]_frames``: frames received at RXB input i
 - ``rxb_drbo[i]_bytes``: bytes received at RXB input i

RXB (RX Buffer) FIFO
~~~~~~~~~~~~~~~~~~~~

 - ``rxb_fifo[i]_drop``: transitions into the drop state on RXB pool i
 - ``rxb_fifo[i]_dropped_frames``: frames dropped on RXB pool i
 - ``rxb_fifo[i]_ecn``: transitions into the ECN mark state on RXB pool i
 - ``rxb_fifo[i]_level``: current occupancy of RXB pool i

RXB (RX Buffer) Dequeue
~~~~~~~~~~~~~~~~~~~~~~~

 - ``rxb_intf[i]_frames``: frames sent to the output i
 - ``rxb_intf[i]_bytes``: bytes sent to the output i
 - ``rxb_pbuf[i]_frames``: frames sent to output i from the perspective of internal packet buffer
 - ``rxb_pbuf[i]_bytes``: bytes sent to output i from the perspective of internal packet buffer

RPC (Rx parser)
~~~~~~~~~~~~~~~

 - ``rpc_unkn_etype``: frames containing unknown EtherType
 - ``rpc_unkn_ext_hdr``: frames containing unknown IPv6 extension header
 - ``rpc_ipv4_frag``: frames containing IPv4 fragment
 - ``rpc_ipv6_frag``: frames containing IPv6 fragment
 - ``rpc_ipv4_esp``: frames with IPv4 ESP encapsulation
 - ``rpc_ipv6_esp``: frames with IPv6 ESP encapsulation
 - ``rpc_tcp_opt_err``: frames which encountered TCP option parsing error
 - ``rpc_out_of_hdr_err``: frames where header was larger than parsable region
 - ``ovr_size_err``: oversized frames

Hardware Queues
~~~~~~~~~~~~~~~

1. RX DMA Engine:

   - ``rde_[i]_pkt_err``: packets with MAC EOP, RPC parser, RXB truncation, or RDE frame truncation errors. These errors are flagged in the packet metadata because of cut-through support but the actual drop happens once PCIE/RDE is reached.
   - ``rde_[i]_pkt_cq_drop``: packets dropped because RCQ is full
   - ``rde_[i]_pkt_bdq_drop``: packets dropped because HPQ or PPQ ran out of host buffer

PCIe
~~~~

The fbnic driver exposes PCIe hardware performance statistics through debugfs
(``pcie_stats``). These statistics provide insights into PCIe transaction
behavior and potential performance bottlenecks.

1. PCIe Transaction Counters:

   These counters track PCIe transaction activity:

   - ``pcie_ob_rd_tlp``: Outbound read Transaction Layer Packets count
   - ``pcie_ob_rd_dword``: DWORDs transferred in outbound read transactions
   - ``pcie_ob_wr_tlp``: Outbound write Transaction Layer Packets count
   - ``pcie_ob_wr_dword``: DWORDs transferred in outbound write transactions
   - ``pcie_ob_cpl_tlp``: Outbound completion TLP count
   - ``pcie_ob_cpl_dword``: DWORDs transferred in outbound completion TLPs

2. PCIe Resource Monitoring:

   These counters indicate PCIe resource exhaustion events:

   - ``pcie_ob_rd_no_tag``: Read requests dropped due to tag unavailability
   - ``pcie_ob_rd_no_cpl_cred``: Read requests dropped due to completion
     credit exhaustion
   - ``pcie_ob_rd_no_np_cred``: Read requests dropped due to non-posted
     credit exhaustion

XDP Length Error
~~~~~~~~~~~~~~~~

For XDP programs without frags support, fbnic tries to make sure that MTU fits
into a single buffer. If an oversized frame is received and gets fragmented,
it is dropped and the following netlink counters are updated:

 - ``rx-length``: number of frames dropped due to lack of fragmentation
   support in the attached XDP program
 - ``rx-errors``: total number of packets with errors received on the interface
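
When investigating such drops it can help to compare two snapshots of these
counters. The Python sketch below is only an illustration: the function name
and the sample values are hypothetical, how the snapshots are collected is
left to the reader, and it assumes (as the text above suggests) that every
frame counted in ``rx-length`` is also counted in ``rx-errors``.

```python
def length_drop_share(before, after):
    """Return the fraction of new rx-errors that were rx-length drops.

    `before` and `after` are hypothetical dicts holding the two counter
    values at two points in time. Returns None when no new errors were
    recorded between the snapshots.
    """
    new_length = after["rx-length"] - before["rx-length"]
    new_errors = after["rx-errors"] - before["rx-errors"]
    if new_errors <= 0:
        return None
    return new_length / new_errors


if __name__ == "__main__":
    # Hypothetical snapshots taken some time apart.
    earlier = {"rx-length": 10, "rx-errors": 40}
    later = {"rx-length": 25, "rx-errors": 100}
    print(length_drop_share(earlier, later))
```

A share close to 1 would suggest that most new receive errors come from
oversized frames hitting an XDP program without frags support rather than
from other error sources.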