.. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
.. include:: <isonum.txt>

================
Ethtool counters
================

:Copyright: |copy| 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Contents
========

- `Overview`_
- `Groups`_
- `Types`_
- `Descriptions`_

Overview
========

There are several counter groups based on where the counter is being counted. In
addition, each group of counters may have different counter types.

These counter groups are based on the component of the networking setup,
illustrated below, that they describe::

                                                  ----------------------------------------
                                                  |                                      |
    ----------------------------------------    ---------------------------------------- |
    |              Hypervisor              |    |                  VM                  | |
    |                                      |    |                                      | |
    | -------------------  --------------- |    | -------------------  --------------- | |
    | | Ethernet driver |  | RDMA driver | |    | | Ethernet driver |  | RDMA driver | | |
    | -------------------  --------------- |    | -------------------  --------------- | |
    |           |                 |        |    |           |                 |        | |
    |           -------------------        |    |           -------------------        | |
    |                   |                  |    |                   |                  |--
    ----------------------------------------    ----------------------------------------
                        |                                           |
            -------------               -----------------------------
            |                           |
         ------                      ------ ------ ------         ------      ------      ------
    -----| PF |----------------------| VF |-| VF |-| VF |-----  --| PF |--- --| PF |--- --| PF |---
    |    ------                      ------ ------ ------    |  | ------  | | ------  | | ------  |
    |                                                        |  |         | |         | |         |
    |                                                        |  |         | |         | |         |
    |                                                        |  |         | |         | |         |
    | eSwitch                                                |  | eSwitch | | eSwitch | | eSwitch |
    ----------------------------------------------------------  ----------- ----------- -----------
            -------------------------------------------------------------------------------
            |                                                                             |
            |                                                                             |
            |                            Uplink (no counters)                             |
            -------------------------------------------------------------------------------
                    ---------------------------------------------------------------
                    |                                                             |
                    |                                                             |
                    |                     MPFS (no counters)                      |
                    ---------------------------------------------------------------
                                                   |
                                                   |
                                                   | Port

Groups
======

Ring
  Software counters populated by the driver stack.

Netdev
  An aggregation of software ring counters.

vPort counters
  Traffic counters and drops due to steering or no buffers. May indicate issues
  with the NIC. These counters include Ethernet traffic counters (including Raw
  Ethernet) and RDMA/RoCE traffic counters.

Physical port counters
  Counters that collect statistics about the PFs and VFs. May indicate issues
  with the NIC, link, or network. This measuring point holds information on
  standardized counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and
  additional counters like flow control, FEC and more. Physical port counters
  are not exposed to virtual machines.

Priority Port Counters
  A set of the physical port counters, per priority per port.

Types
=====

Counters are divided into three types.

Traffic Informative Counters
  Counters which count traffic. These counters can be used for load estimation
  or for general debug.

Traffic Acceleration Counters
  Counters which count traffic that was accelerated by the Mellanox driver or by
  hardware. The counters are an additional layer to the informative counter set,
  and the same traffic is counted in both informative and acceleration counters.

.. [#accel] Traffic acceleration counter.

Error Counters
  An increment of these counters might indicate a problem. Each of these
  counters has an explanation and a corrective action.

Statistics can be fetched via the `ip link` or `ethtool` commands. `ethtool`
provides more detailed information::

    ip -s link show <if-name>
    ethtool -S <if-name>

Descriptions
============

XSK, PTP, and QoS counters that are similar to counters defined previously will
not be separately listed.
For example, `ptp_tx[i]_packets` will not be
explicitly documented since `tx[i]_packets` describes the behavior of both
counters, except `ptp_tx[i]_packets` is only counted when precision time
protocol is used.

Ring / Netdev Counter
----------------------------
The following counters are available per ring or software port.

These counters provide information on the amount of traffic that was accelerated
by the NIC. The counters count the accelerated traffic in addition to the
standard counters that count it (i.e. accelerated traffic is counted twice).

The counter names in the table below refer to both ring and port counters. The
notation for ring counters includes the [i] index without the brackets. The
notation for port counters doesn't include the [i]. A counter name
`rx[i]_packets` will be printed as `rx0_packets` for ring 0 and `rx_packets` for
the software port.

.. flat-table:: Ring / Software Port Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx[i]_packets`
     - The number of packets received on ring i.
     - Informative

   * - `rx[i]_bytes`
     - The number of bytes received on ring i.
     - Informative

   * - `tx[i]_packets`
     - The number of packets transmitted on ring i.
     - Informative

   * - `tx[i]_bytes`
     - The number of bytes transmitted on ring i.
     - Informative

   * - `tx[i]_recover`
     - The number of times the SQ was recovered.
     - Error

   * - `tx[i]_cqes`
     - The number of CQE events issued on the SQ of ring i.
     - Informative

   * - `tx[i]_cqe_err`
     - The number of error CQEs encountered on the SQ for ring i.
     - Error

   * - `tx[i]_tso_packets`
     - The number of TSO packets transmitted on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_tso_bytes`
     - The number of TSO bytes transmitted on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_tso_inner_packets`
     - The number of TSO packets that are indicated to carry internal
       encapsulation transmitted on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_tso_inner_bytes`
     - The number of TSO bytes that are indicated to carry internal
       encapsulation transmitted on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_gro_packets`
     - The number of received packets processed using hardware-accelerated GRO
       on ring i. Only true GRO packets are counted, that is, packets that are
       in an SKB with a GRO count > 1.
     - Acceleration

   * - `rx[i]_gro_bytes`
     - The number of received bytes processed using hardware-accelerated GRO on
       ring i. Only true GRO packets are counted, that is, packets that are in
       an SKB with a GRO count > 1.
     - Acceleration

   * - `rx[i]_gro_skbs`
     - The number of GRO SKBs constructed from hardware-accelerated GRO. Only SKBs
       with a GRO count > 1 are counted.
     - Informative

   * - `rx[i]_gro_large_hds`
     - Number of received packets using hardware-accelerated GRO that have large
       headers that require additional memory to be allocated.
     - Informative

   * - `rx[i]_hds_nodata_packets`
     - Number of header-only packets in header/data split mode [#accel]_.
     - Informative

   * - `rx[i]_hds_nodata_bytes`
     - Number of bytes for header-only packets in header/data split mode
       [#accel]_.
     - Informative

   * - `rx[i]_hds_nosplit_packets`
     - Number of packets that were not split in header/data split mode. A
       packet will not get split when the hardware does not support splitting
       its protocol. An example of such a protocol is ICMPv4/v6. Currently
       TCP and UDP with IPv4/IPv6 are supported for header/data split
       [#accel]_.
     - Informative

   * - `rx[i]_hds_nosplit_bytes`
     - Number of bytes for packets that were not split in header/data split
       mode. A packet will not get split when the hardware does not support
       splitting its protocol. An example of such a protocol is ICMPv4/v6.
       Currently TCP and UDP with IPv4/IPv6 are supported for header/data split
       [#accel]_.
     - Informative

   * - `rx[i]_lro_packets`
     - The number of LRO packets received on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_lro_bytes`
     - The number of LRO bytes received on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_ecn_mark`
     - The number of received packets where the ECN mark was turned on.
     - Informative

   * - `rx_oversize_pkts_buffer`
     - The number of received packets dropped because their length, on arrival
       at the RQ, exceeded the software buffer size allocated by the device for
       incoming traffic. It might imply that the device MTU is larger than the
       software buffer size.
     - Error

   * - `rx_oversize_pkts_sw_drop`
     - Number of received packets dropped in software because the CQE data is
       larger than the MTU size.
     - Error

   * - `rx[i]_csum_unnecessary`
     - Packets received with a `CHECKSUM_UNNECESSARY` on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_unnecessary_inner`
     - Packets received with inner encapsulation with a `CHECKSUM_UNNECESSARY`
       on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_none`
     - Packets received with a `CHECKSUM_NONE` on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_complete`
     - Packets received with a `CHECKSUM_COMPLETE` on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_complete_tail`
     - Number of received packets that had checksum calculation computed,
       potentially needed padding, and were able to do so with
       `CHECKSUM_PARTIAL`.
     - Informative

   * - `rx[i]_csum_complete_tail_slow`
     - Number of received packets that need padding larger than eight bytes for
       the checksum.
     - Informative

   * - `tx[i]_csum_partial`
     - Packets transmitted with a `CHECKSUM_PARTIAL` on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_csum_partial_inner`
     - Packets transmitted with inner encapsulation with a `CHECKSUM_PARTIAL` on
       ring i [#accel]_.
     - Acceleration

   * - `tx[i]_csum_none`
     - Packets transmitted with no hardware checksum acceleration on ring i.
     - Informative

   * - `tx[i]_stopped` / `tx_queue_stopped` [#ring_global]_
     - Events where the SQ was full on ring i. If this counter increases, check
       the amount of buffers allocated for transmission.
     - Informative

   * - `tx[i]_wake` / `tx_queue_wake` [#ring_global]_
     - Events where the SQ was full and has become not full on ring i.
     - Informative

   * - `tx[i]_dropped` / `tx_queue_dropped` [#ring_global]_
     - Packets transmitted that were dropped due to DMA mapping failure on
       ring i. If this counter increases, check the amount of buffers
       allocated for transmission.
     - Error

   * - `tx[i]_nop`
     - The number of NOP WQEs (empty WQEs) inserted into the SQ (related to
       ring i) on reaching the end of the cyclic buffer. When approaching the
       end of the cyclic buffer, the driver may add these empty WQEs to avoid
       handling a state where a WQE starts at the end of the queue and ends at
       the beginning of the queue. This is a normal condition.
     - Informative

   * - `tx[i]_timestamps`
     - Transmitted packets that were hardware timestamped at the device's DMA
       layer.
     - Informative

   * - `tx[i]_added_vlan_packets`
     - The number of packets sent where VLAN tag insertion was offloaded to the
       hardware.
     - Acceleration

   * - `rx[i]_removed_vlan_packets`
     - The number of packets received where VLAN tag stripping was offloaded to
       the hardware.
     - Acceleration

   * - `rx[i]_wqe_err`
     - The number of wrong opcodes received on ring i.
     - Error

   * - `rx[i]_mpwqe_frag`
     - The number of WQEs for which a compound page could not be allocated, so
       that fragmented MPWQEs (Multi Packet WQEs) were used on ring i. If this
       counter rises, it may suggest that there is not enough memory for large
       pages and that the driver allocated fragmented pages instead. This is
       not an abnormal condition.
     - Informative

   * - `rx[i]_mpwqe_filler_cqes`
     - The number of filler CQE events that were issued on ring i.
     - Informative

   * - `rx[i]_mpwqe_filler_strides`
     - The number of strides consumed by filler CQEs on ring i.
     - Informative

   * - `tx[i]_mpwqe_blks`
     - The number of send blocks processed from Multi-Packet WQEs (mpwqe).
     - Informative

   * - `tx[i]_mpwqe_pkts`
     - The number of send packets processed from Multi-Packet WQEs (mpwqe).
     - Informative

   * - `rx[i]_cqe_compress_blks`
     - The number of receive blocks with CQE compression on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_cqe_compress_pkts`
     - The number of receive packets with CQE compression on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_arfs_add`
     - The number of aRFS flow rules added to the device for direct RQ steering
       on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_arfs_request_in`
     - Number of flow rules that have been requested to move into ring i for
       direct RQ steering [#accel]_.
     - Acceleration

   * - `rx[i]_arfs_request_out`
     - Number of flow rules that have been requested to move out of ring i
       [#accel]_.
     - Acceleration

   * - `rx[i]_arfs_expired`
     - Number of flow rules that have expired and been removed [#accel]_.
     - Acceleration

   * - `rx[i]_arfs_err`
     - Number of flow rules that failed to be added to the flow table.
     - Error

   * - `rx[i]_recover`
     - The number of times the RQ was recovered.
     - Error

   * - `tx[i]_xmit_more`
     - The number of packets sent with `xmit_more` indication set on the skbuff
       (no doorbell).
     - Acceleration

   * - `ch[i]_poll`
     - The number of invocations of NAPI poll of channel i.
     - Informative

   * - `ch[i]_arm`
     - The number of times the NAPI poll function completed and armed the
       completion queues on channel i.
     - Informative

   * - `ch[i]_aff_change`
     - The number of times the NAPI poll function explicitly stopped execution
       on a CPU due to a change in affinity, on channel i.
     - Informative

   * - `ch[i]_events`
     - The number of hard interrupt events on the completion queues of channel i.
     - Informative

   * - `ch[i]_eq_rearm`
     - The number of times the EQ was recovered.
     - Error

   * - `ch[i]_force_irq`
     - Number of times NAPI is triggered by XSK wakeups by posting a NOP to
       ICOSQ.
     - Acceleration

   * - `rx[i]_congst_umr`
     - The number of times an outstanding UMR request is delayed due to
       congestion, on ring i.
     - Informative

   * - `rx_pp_alloc_fast`
     - Number of successful fast path allocations.
     - Informative

   * - `rx_pp_alloc_slow`
     - Number of slow path order-0 allocations.
     - Informative

   * - `rx_pp_alloc_slow_high_order`
     - Number of slow path high order allocations.
     - Informative

   * - `rx_pp_alloc_empty`
     - Counter is incremented when the ptr ring is empty, so a slow path
       allocation was forced.
     - Informative

   * - `rx_pp_alloc_refill`
     - Counter is incremented when an allocation triggered a refill of the
       cache.
     - Informative

   * - `rx_pp_alloc_waive`
     - Counter is incremented when pages obtained from the ptr ring cannot be
       added to the cache due to a NUMA mismatch.
     - Informative

   * - `rx_pp_recycle_cached`
     - Counter is incremented when recycling placed a page in the page pool
       cache.
     - Informative

   * - `rx_pp_recycle_cache_full`
     - Counter is incremented when the page pool cache was full.
     - Informative

   * - `rx_pp_recycle_ring`
     - Counter is incremented when a page is placed into the ptr ring.
     - Informative

   * - `rx_pp_recycle_ring_full`
     - Counter is incremented when a page is released from the page pool
       because the ptr ring was full.
     - Informative

   * - `rx_pp_recycle_released_ref`
     - Counter is incremented when a page is released (and not recycled)
       because refcnt > 1.
     - Informative

   * - `rx[i]_xsk_buff_alloc_err`
     - The number of times allocating an skb or XSK buffer failed in the XSK RQ
       context.
     - Error

   * - `rx[i]_xdp_tx_xmit`
     - The number of packets forwarded back to the port due to XDP program
       `XDP_TX` action (bouncing). These packets are not counted by other
       software counters. These packets are counted by physical port and vPort
       counters.
     - Informative

   * - `rx[i]_xdp_tx_mpwqe`
     - Number of multi-packet WQEs transmitted by the netdev and `XDP_TX`-ed by
       the netdev during the RQ context.
     - Acceleration

   * - `rx[i]_xdp_tx_inlnw`
     - Number of WQE data segments transmitted where the data could be inlined
       in the WQE and then `XDP_TX`-ed during the RQ context.
     - Acceleration

   * - `rx[i]_xdp_tx_nops`
     - Number of NOP WQEBBs (WQE building blocks) posted to the XDP SQ.
     - Acceleration

   * - `rx[i]_xdp_tx_full`
     - The number of packets that should have been forwarded back to the port
       due to `XDP_TX` action but were dropped due to a full TX queue.
       These packets are not counted by other software counters. These packets
       are counted by physical port and vPort counters. You may open more RX
       queues and spread RX traffic over all queues and/or increase the RX ring
       size.
     - Error

   * - `rx[i]_xdp_tx_err`
     - The number of times an `XDP_TX` error, such as frame too long or frame
       too short, occurred on the `XDP_TX` ring of the RX ring.
     - Error

   * - `rx[i]_xdp_tx_cqes` / `rx_xdp_tx_cqe` [#ring_global]_
     - The number of completions received on the CQ of the `XDP_TX` ring.
     - Informative

   * - `rx[i]_xdp_drop`
     - The number of packets dropped due to XDP program `XDP_DROP` action. These
       packets are not counted by other software counters. These packets are
       counted by physical port and vPort counters.
     - Informative

   * - `rx[i]_xdp_redirect`
     - The number of times an XDP redirect action was triggered on ring i.
     - Acceleration

   * - `tx[i]_xdp_xmit`
     - The number of packets redirected to the interface (due to XDP redirect).
       These packets are not counted by other software counters. These packets
       are counted by physical port and vPort counters.
     - Informative

   * - `tx[i]_xdp_full`
     - The number of packets redirected to the interface (due to XDP redirect)
       but dropped due to a full TX queue. These packets are not counted by
       other software counters. You may enlarge the TX queues.
     - Informative

   * - `tx[i]_xdp_mpwqe`
     - Number of multi-packet WQEs offloaded onto the NIC that were
       `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xdp_inlnw`
     - Number of WQE data segments where the data could be inlined in the WQE
       where the data segments were `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xdp_nops`
     - Number of NOP WQEBBs (WQE building blocks) posted to the SQ that were
       `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xdp_err`
     - The number of packets redirected to the interface (due to XDP redirect)
       but dropped due to an error such as frame too long or frame too short.
     - Error

   * - `tx[i]_xdp_cqes`
     - The number of completions received for packets redirected to the
       interface (due to XDP redirect) on the CQ.
     - Informative

   * - `tx[i]_xsk_xmit`
     - The number of packets transmitted using XSK zerocopy functionality.
     - Acceleration

   * - `tx[i]_xsk_mpwqe`
     - Number of multi-packet WQEs offloaded onto the NIC that were
       `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xsk_inlnw`
     - Number of WQE data segments where the data could be inlined in the WQE
       that are transmitted using XSK zerocopy.
     - Acceleration

   * - `tx[i]_xsk_full`
     - Number of times the doorbell is rung in XSK zerocopy mode when the SQ is
       full.
     - Error

   * - `tx[i]_xsk_err`
     - Number of errors that occurred in XSK zerocopy mode, such as the data
       size being larger than the MTU size.
     - Error

   * - `tx[i]_xsk_cqes`
     - Number of CQEs processed in XSK zerocopy mode.
     - Acceleration

   * - `tx_tls_ctx`
     - Number of TLS TX HW offload contexts added to device for encryption.
     - Acceleration

   * - `tx_tls_del`
     - Number of TLS TX HW offload contexts removed from device (connection
       closed).
     - Acceleration

   * - `tx_tls_pool_alloc`
     - Number of times a unit of work is successfully allocated in the TLS HW
       offload pool.
     - Acceleration

   * - `tx_tls_pool_free`
     - Number of times a unit of work is freed in the TLS HW offload pool.
     - Acceleration

   * - `rx_tls_ctx`
     - Number of TLS RX HW offload contexts added to device for decryption.
     - Acceleration

   * - `rx_tls_del`
     - Number of TLS RX HW offload contexts deleted from device (connection has
       finished).
     - Acceleration

   * - `rx[i]_tls_decrypted_packets`
     - Number of successfully decrypted RX packets which were part of a TLS
       stream.
     - Acceleration

   * - `rx[i]_tls_decrypted_bytes`
     - Number of TLS payload bytes in RX packets which were successfully
       decrypted.
     - Acceleration

   * - `rx[i]_tls_resync_req_pkt`
     - Number of received TLS packets with a resync request.
     - Acceleration

   * - `rx[i]_tls_resync_req_start`
     - Number of times the TLS async resync request was started.
     - Acceleration

   * - `rx[i]_tls_resync_req_end`
     - Number of times the TLS async resync request properly ended with
       providing the HW tracked tcp-seq.
     - Acceleration

   * - `rx[i]_tls_resync_req_skip`
     - Number of times the TLS async resync request procedure was started but
       not properly ended.
     - Error

   * - `rx[i]_tls_resync_res_ok`
     - Number of times the TLS resync response call to the driver was
       successfully handled.
     - Acceleration

   * - `rx[i]_tls_resync_res_retry`
     - Number of times the TLS resync response call to the driver was
       reattempted when ICOSQ is full.
     - Error

   * - `rx[i]_tls_resync_res_skip`
     - Number of times the TLS resync response call to the driver was terminated
       unsuccessfully.
     - Error

   * - `rx[i]_tls_err`
     - Number of times when CQE TLS offload was problematic.
     - Error

   * - `tx[i]_tls_encrypted_packets`
     - The number of send packets that are TLS encrypted by the kernel.
     - Acceleration

   * - `tx[i]_tls_encrypted_bytes`
     - The number of send bytes that are TLS encrypted by the kernel.
     - Acceleration

   * - `tx[i]_tls_ooo`
     - Number of times out of order TLS SQE fragments were handled on ring i.
     - Acceleration

   * - `tx[i]_tls_dump_packets`
     - Number of TLS decrypted packets copied over from NIC over DMA.
     - Acceleration

   * - `tx[i]_tls_dump_bytes`
     - Number of TLS decrypted bytes copied over from NIC over DMA.
     - Acceleration

   * - `tx[i]_tls_resync_bytes`
     - Number of TLS bytes requested to be resynchronized in order to be
       decrypted.
     - Acceleration

   * - `tx[i]_tls_skip_no_sync_data`
     - Number of TLS send data that can safely be skipped / do not need to be
       decrypted.
     - Acceleration

   * - `tx[i]_tls_drop_no_sync_data`
     - Number of TLS send data that were dropped due to retransmission of TLS
       data.
     - Acceleration

   * - `ptp_cq[i]_abort`
     - Number of times a CQE has to be skipped in precision time protocol due to
       a skew between the port timestamp and CQE timestamp being greater than
       128 seconds.
     - Error

   * - `ptp_cq[i]_abort_abs_diff_ns`
     - Accumulation of time differences between the port timestamp and CQE
       timestamp when the difference is greater than 128 seconds in precision
       time protocol.
     - Error

   * - `ptp_cq[i]_late_cqe`
     - Number of times a CQE has been delivered on the PTP timestamping CQ when
       the CQE was not expected since a certain amount of time had elapsed where
       the device typically ensures not posting the CQE.
     - Error

   * - `ptp_cq[i]_lost_cqe`
     - Number of times a CQE is expected to not be delivered on the PTP
       timestamping CQE by the device due to a time delta elapsing. If such a
       CQE is somehow delivered, `ptp_cq[i]_late_cqe` is incremented.
     - Error

.. [#ring_global] The corresponding ring and global counters do not share the
                  same name (i.e. do not follow the common naming scheme).

vPort Counters
--------------
Counters on the NIC port that is connected to an eSwitch.

.. flat-table:: vPort Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_vport_unicast_packets`
     - Unicast packets received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_unicast_bytes`
     - Unicast bytes received, steered to a port including Raw Ethernet QP/DPDK
       traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_unicast_packets`
     - Unicast packets transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_unicast_bytes`
     - Unicast bytes transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_multicast_packets`
     - Multicast packets received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_multicast_bytes`
     - Multicast bytes received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_multicast_packets`
     - Multicast packets transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_multicast_bytes`
     - Multicast bytes transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_broadcast_packets`
     - Broadcast packets received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_broadcast_bytes`
     - Broadcast bytes received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_broadcast_packets`
     - Broadcast packets transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_broadcast_bytes`
     - Broadcast bytes transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_rdma_unicast_packets`
     - RDMA unicast packets received, steered to a port (counters count
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_unicast_bytes`
     - RDMA unicast bytes received, steered to a port (counters count
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_unicast_packets`
     - RDMA unicast packets transmitted, steered from a port (counters count
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_unicast_bytes`
     - RDMA unicast bytes transmitted, steered from a port (counters count
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_multicast_packets`
     - RDMA multicast packets received, steered to a port (counters count
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_multicast_bytes`
     - RDMA multicast bytes received, steered to a port (counters count
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_multicast_packets`
     - RDMA multicast packets transmitted, steered from a port (counters count
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_multicast_bytes`
     - RDMA multicast bytes transmitted, steered from a port (counters count
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `vport_loopback_packets`
     - Unicast, multicast and broadcast packets that were looped back (received
       and transmitted), IB/Eth [#accel]_.
     - Acceleration

   * - `vport_loopback_bytes`
     - Unicast, multicast and broadcast bytes that were looped back (received
       and transmitted), IB/Eth [#accel]_.
     - Acceleration

   * - `rx_steer_missed_packets`
     - Number of packets that were received by the NIC but were discarded
       because they did not match any flow in the NIC flow table.
     - Error

   * - `rx_packets`
     - Representor only: packets received that were handled by the hypervisor.
     - Informative

   * - `rx_bytes`
     - Representor only: bytes received that were handled by the hypervisor.
     - Informative

   * - `tx_packets`
     - Representor only: packets transmitted that were handled by the
       hypervisor.
     - Informative

   * - `tx_bytes`
     - Representor only: bytes transmitted that were handled by the hypervisor.
     - Informative

   * - `dev_internal_queue_oob`
     - The number of dropped packets due to lack of receive WQEs for an internal
       device RQ.
     - Error

Physical Port Counters
----------------------
The physical port counters are the counters on the external port connecting the
adapter to the network. This measuring point holds information on standardized
counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and additional counters
like flow control, FEC and more.

.. flat-table:: Physical Port Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_packets_phy`
     - The number of packets received on the physical port. This counter doesn't
       include packets that were discarded due to FCS, frame size and similar
       errors.
     - Informative

   * - `tx_packets_phy`
     - The number of packets transmitted on the physical port.
     - Informative

   * - `rx_bytes_phy`
     - The number of bytes received on the physical port, including Ethernet
       header and FCS.
     - Informative

   * - `tx_bytes_phy`
     - The number of bytes transmitted on the physical port.
     - Informative

   * - `rx_multicast_phy`
     - The number of multicast packets received on the physical port.
     - Informative

   * - `tx_multicast_phy`
     - The number of multicast packets transmitted on the physical port.
     - Informative

   * - `rx_broadcast_phy`
     - The number of broadcast packets received on the physical port.
     - Informative

   * - `tx_broadcast_phy`
     - The number of broadcast packets transmitted on the physical port.
     - Informative

   * - `rx_crc_errors_phy`
     - The number of dropped received packets due to FCS (Frame Check Sequence)
       error on the physical port. If this counter increases at a high rate,
       check the link quality using the `rx_symbol_err_phy` and
       `rx_corrected_bits_phy` counters below.
     - Error

   * - `rx_in_range_len_errors_phy`
     - The number of received packets dropped due to length/type errors on a
       physical port.
     - Error

   * - `rx_out_of_range_len_phy`
     - The number of received packets dropped due to length greater than allowed
       on a physical port. If this counter is increasing, it implies that the
       peer connected to the adapter has a larger MTU configured. Using the same
       MTU configuration on both sides should resolve this issue.
     - Error

   * - `rx_oversize_pkts_phy`
     - The number of received packets dropped because their length exceeds the
       MTU size on a physical port. If this counter is increasing, it implies
       that the peer connected to the adapter has a larger MTU configured. Using
       the same MTU configuration on both sides should resolve this issue.
     - Error

   * - `rx_symbol_err_phy`
     - The number of received packets dropped due to physical coding errors
       (symbol errors) on a physical port.
     - Error

   * - `rx_mac_control_phy`
     - The number of MAC control packets received on the physical port.
     - Informative

   * - `tx_mac_control_phy`
     - The number of MAC control packets transmitted on the physical port.
     - Informative

   * - `rx_pause_ctrl_phy`
     - The number of link layer pause packets received on a physical port. If
       this counter is increasing, it implies that the network is congested and
       cannot absorb the traffic coming from the adapter.
     - Informative

   * - `tx_pause_ctrl_phy`
     - The number of link layer pause packets transmitted on a physical port. If
       this counter is increasing, it implies that the NIC is congested and
       cannot absorb the traffic coming from the network.
     - Informative

   * - `rx_unsupported_op_phy`
     - The number of MAC control packets received with unsupported opcode on a
       physical port.
     - Error

   * - `rx_discards_phy`
     - The number of received packets dropped due to lack of buffers on a
       physical port. If this counter is increasing, it implies that the adapter
       is congested and cannot absorb the traffic coming from the network.
     - Error

   * - `tx_discards_phy`
     - The number of packets that were discarded on transmission even though
       no errors were detected. The drop might occur due to the link being
       down, head of line drop, pause from the network, etc.
     - Error

   * - `tx_errors_phy`
     - The number of transmitted packets dropped because their length exceeds
       the MTU size on a physical port.
     - Error

   * - `rx_undersize_pkts_phy`
     - The number of received packets dropped due to length which is shorter
       than 64 bytes on a physical port. If this counter is increasing, it
       implies that the peer connected to the adapter has a non-standard MTU
       configured or that a malformed packet arrived.
     - Error

   * - `rx_fragments_phy`
     - The number of received packets dropped due to a length which is shorter
       than 64 bytes and has FCS error on a physical port.
       If this counter is increasing, it implies that the peer connected to the
       adapter has a non-standard MTU configured.
     - Error

   * - `rx_jabbers_phy`
     - The number of received packets dropped due to a length which is longer
       than 64 bytes and had FCS error on a physical port.
     - Error

   * - `rx_64_bytes_phy`
     - The number of packets received on the physical port with size of 64 bytes.
     - Informative

   * - `rx_65_to_127_bytes_phy`
     - The number of packets received on the physical port with size of 65 to
       127 bytes.
     - Informative

   * - `rx_128_to_255_bytes_phy`
     - The number of packets received on the physical port with size of 128 to
       255 bytes.
     - Informative

   * - `rx_256_to_511_bytes_phy`
     - The number of packets received on the physical port with size of 256 to
       511 bytes.
     - Informative

   * - `rx_512_to_1023_bytes_phy`
     - The number of packets received on the physical port with size of 512 to
       1023 bytes.
     - Informative

   * - `rx_1024_to_1518_bytes_phy`
     - The number of packets received on the physical port with size of 1024 to
       1518 bytes.
     - Informative

   * - `rx_1519_to_2047_bytes_phy`
     - The number of packets received on the physical port with size of 1519 to
       2047 bytes.
     - Informative

   * - `rx_2048_to_4095_bytes_phy`
     - The number of packets received on the physical port with size of 2048 to
       4095 bytes.
     - Informative

   * - `rx_4096_to_8191_bytes_phy`
     - The number of packets received on the physical port with size of 4096 to
       8191 bytes.
     - Informative

   * - `rx_8192_to_10239_bytes_phy`
     - The number of packets received on the physical port with size of 8192 to
       10239 bytes.
     - Informative

   * - `link_down_events_phy`
     - The number of times the link operative state changed to down.
       If this counter is increasing, it may indicate port flapping. You may
       need to replace the cable/transceiver.
     - Error

   * - `rx_out_of_buffer`
     - The number of times the receive queue had no software buffers allocated
       for the adapter's incoming traffic.
     - Error

   * - `module_bus_stuck`
     - The number of times that a short-wire on the module's I\ :sup:`2`\C bus
       (data or clock) was detected. You may need to replace the
       cable/transceiver.
     - Error

   * - `module_high_temp`
     - The number of times that the module temperature was too high. If this
       issue persists, you may need to check the ambient temperature or replace
       the cable/transceiver module.
     - Error

   * - `module_bad_shorted`
     - The number of times that the module cables were shorted. You may need to
       replace the cable/transceiver module.
     - Error

   * - `module_unplug`
     - The number of times that the module was ejected.
     - Informative

   * - `rx_buffer_passed_thres_phy`
     - The number of events where the port receive buffer was over 85% full.
     - Informative

   * - `tx_pause_storm_warning_events`
     - The number of times the device was sending pauses for a long period of
       time.
     - Informative

   * - `tx_pause_storm_error_events`
     - The number of times the device was sending pauses for a long period of
       time, reaching the timeout and disabling transmission of pause frames.
       During the period in which pause frames were disabled, packets could
       have been dropped.
     - Error

   * - `rx[i]_buff_alloc_err`
     - Failed to allocate a buffer for a received packet (or SKB) on ring i.
     - Error

   * - `rx_bits_phy`
     - This counter provides information on the total amount of traffic that
       could have been received and can be used as a guideline to measure the
       ratio of errored traffic in `rx_pcs_symbol_err_phy` and
       `rx_corrected_bits_phy`.
     - Informative

   * - `rx_pcs_symbol_err_phy`
     - The number of symbol errors that were not corrected by the FEC
       correction algorithm, or that occurred while no FEC algorithm was active
       on this interface. If this counter is increasing, it implies that the
       link between the NIC and the network is suffering from high BER, and
       that traffic is lost. You may need to replace the cable/transceiver. The
       error rate is the number of `rx_pcs_symbol_err_phy` divided by the
       number of `rx_bits_phy` over a specific time frame.
     - Error

   * - `rx_corrected_bits_phy`
     - The number of corrected bits on this port according to active FEC
       (RS/FC). If this counter is increasing, it implies that the link between
       the NIC and the network is suffering from high BER. The corrected bit
       rate is the number of `rx_corrected_bits_phy` divided by the number of
       `rx_bits_phy` over a specific time frame.
     - Error

   * - `rx_err_lane_[l]_phy`
     - The number of physical raw errors on lane index l. The counter counts
       errors before FEC corrections. If this counter is increasing, it implies
       that the link between the NIC and the network is suffering from high
       BER, and that traffic might be lost. You may need to replace the
       cable/transceiver. Check this counter together with
       `rx_corrected_bits_phy`.
     - Error

   * - `rx_global_pause`
     - The number of pause packets received on the physical port. If this
       counter is increasing, it implies that the network is congested and
       cannot absorb the traffic coming from the adapter. Note: This counter is
       only enabled when global pause mode is enabled.
     - Informative

   * - `rx_global_pause_duration`
     - The duration of pause received (in microseconds) on the physical port.
       The counter represents the time the port did not send any traffic.
       If this counter is increasing, it implies that the network is congested
       and cannot absorb the traffic coming from the adapter. Note: This counter
       is only enabled when global pause mode is enabled.
     - Informative

   * - `tx_global_pause`
     - The number of pause packets transmitted on a physical port. If this
       counter is increasing, it implies that the adapter is congested and
       cannot absorb the traffic coming from the network. Note: This counter is
       only enabled when global pause mode is enabled.
     - Informative

   * - `tx_global_pause_duration`
     - The duration of pause transmitted (in microseconds) on the physical
       port. Note: This counter is only enabled when global pause mode is
       enabled.
     - Informative

   * - `rx_global_pause_transition`
     - The number of times a transition from Xoff to Xon on the physical port
       has occurred. Note: This counter is only enabled when global pause mode
       is enabled.
     - Informative

   * - `rx_if_down_packets`
     - The number of received packets that were dropped due to interface down.
     - Informative

Priority Port Counters
----------------------
The following counters are physical port counters that are counted per L2
priority (0-7).

**Note:** `p` in the counter name represents the priority.

.. flat-table:: Priority Port Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_prio[p]_bytes`
     - The number of bytes received with priority p on the physical port.
     - Informative

   * - `rx_prio[p]_packets`
     - The number of packets received with priority p on the physical port.
     - Informative

   * - `tx_prio[p]_bytes`
     - The number of bytes transmitted on priority p on the physical port.
     - Informative

   * - `tx_prio[p]_packets`
     - The number of packets transmitted on priority p on the physical port.
     - Informative

   * - `rx_prio[p]_pause`
     - The number of pause packets received with priority p on a physical port.
       If this counter is increasing, it implies that the network is congested
       and cannot absorb the traffic coming from the adapter. Note: This counter
       is available only if PFC was enabled on priority p.
     - Informative

   * - `rx_prio[p]_pause_duration`
     - The duration of pause received (in microseconds) on priority p on the
       physical port. The counter represents the time the port did not send any
       traffic on this priority. If this counter is increasing, it implies that
       the network is congested and cannot absorb the traffic coming from the
       adapter. Note: This counter is available only if PFC was enabled on
       priority p.
     - Informative

   * - `rx_prio[p]_pause_transition`
     - The number of times a transition from Xoff to Xon on priority p on the
       physical port has occurred. Note: This counter is available only if PFC
       was enabled on priority p.
     - Informative

   * - `tx_prio[p]_pause`
     - The number of pause packets transmitted on priority p on a physical port.
       If this counter is increasing, it implies that the adapter is congested
       and cannot absorb the traffic coming from the network. Note: This counter
       is available only if PFC was enabled on priority p.
     - Informative

   * - `tx_prio[p]_pause_duration`
     - The duration of pause transmitted (in microseconds) on priority p on the
       physical port. Note: This counter is available only if PFC was enabled on
       priority p.
     - Informative

   * - `rx_prio[p]_buf_discard`
     - The number of packets discarded by the device due to lack of per-host
       receive buffers.
     - Informative

   * - `rx_prio[p]_cong_discard`
     - The number of packets discarded by the device due to per-host congestion.
     - Informative

   * - `rx_prio[p]_marked`
     - The number of packets ECN-marked by the device due to per-host
       congestion.
     - Informative

   * - `rx_prio[p]_discards`
     - The number of packets discarded by the device due to lack of receive
       buffers.
     - Informative

Device Counters
---------------
.. flat-table:: Device Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_pci_signal_integrity`
     - Counts physical layer PCIe signal integrity errors: the number of
       transitions to recovery due to framing errors and CRC (dlp and tlp). If
       this counter is rising, try moving the adapter card to a different slot
       to rule out a bad PCI slot. Validate that you are running with the latest
       firmware available and latest server BIOS version.
     - Error

   * - `tx_pci_signal_integrity`
     - Counts physical layer PCIe signal integrity errors: the number of
       transitions to recovery initiated by the other side (moving to recovery
       due to getting TS/EIEOS). If this counter is rising, try moving the
       adapter card to a different slot to rule out a bad PCI slot. Validate
       that you are running with the latest firmware available and latest server
       BIOS version.
     - Error

   * - `outbound_pci_buffer_overflow`
     - The number of packets dropped due to PCI buffer overflow. If this counter
       is rising at a high rate, it might indicate that the receive traffic
       rate for a host exceeds what the PCIe bus can carry, causing congestion.
     - Informative

   * - `outbound_pci_stalled_rd`
     - The percentage (in the range 0...100) of time within the last second that
       the NIC had outbound non-posted read requests but could not perform the
       operation due to insufficient posted credits.
     - Informative

   * - `outbound_pci_stalled_wr`
     - The percentage (in the range 0...100) of time within the last second that
       the NIC had outbound posted write requests but could not perform the
       operation due to insufficient posted credits.
     - Informative

   * - `outbound_pci_stalled_rd_events`
     - The number of seconds where `outbound_pci_stalled_rd` was above 30%.
     - Informative

   * - `outbound_pci_stalled_wr_events`
     - The number of seconds where `outbound_pci_stalled_wr` was above 30%.
     - Informative

   * - `dev_out_of_buffer`
     - The number of times the device-owned queue did not have enough buffers
       allocated.
     - Error
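
The physical port table describes two ratios: `rx_pcs_symbol_err_phy` divided
by `rx_bits_phy`, and `rx_corrected_bits_phy` divided by `rx_bits_phy`, each
taken over a specific time frame. The following is a minimal sketch of that
arithmetic, not part of the driver or of ethtool. It assumes two snapshots of
`name: value` text as printed by ``ethtool -S``; the helper names
(`parse_ethtool_stats`, `link_error_rates`) and the sample values are
hypothetical and exist only for illustration.

```python
def parse_ethtool_stats(text):
    """Parse `name: value` lines (assumed `ethtool -S` layout) into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "NIC statistics:" header and any non-numeric line.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def link_error_rates(before, after):
    """Return the two ratios over the interval between two snapshots."""
    def delta(counter):
        return after.get(counter, 0) - before.get(counter, 0)

    bits = delta("rx_bits_phy")
    if bits <= 0:
        return None  # no reference traffic in this interval
    return {
        "symbol_error_rate": delta("rx_pcs_symbol_err_phy") / bits,
        "corrected_bit_rate": delta("rx_corrected_bits_phy") / bits,
    }


# Made-up snapshots taken at the start and end of a time frame.
SAMPLE_T0 = """NIC statistics:
     rx_bits_phy: 1000000
     rx_pcs_symbol_err_phy: 0
     rx_corrected_bits_phy: 10
"""
SAMPLE_T1 = """NIC statistics:
     rx_bits_phy: 3000000
     rx_pcs_symbol_err_phy: 2
     rx_corrected_bits_phy: 30
"""

rates = link_error_rates(parse_ethtool_stats(SAMPLE_T0),
                         parse_ethtool_stats(SAMPLE_T1))
print(rates)
```

Using deltas between two snapshots rather than the absolute values keeps the
ratio tied to a specific time frame, as the counter descriptions require.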