.. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
.. include:: <isonum.txt>

================
Ethtool counters
================

:Copyright: |copy| 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Contents
========

- `Overview`_
- `Groups`_
- `Types`_
- `Descriptions`_

Overview
========

Counters are organized into groups according to where in the system they are
counted. In addition, each group may contain counters of different types.

Each counter group corresponds to one component of the networking setup
illustrated below::

                                                  ----------------------------------------
                                                  |                                      |
    ----------------------------------------    ---------------------------------------- |
    |              Hypervisor              |    |                  VM                  | |
    |                                      |    |                                      | |
    | -------------------  --------------- |    | -------------------  --------------- | |
    | | Ethernet driver |  | RDMA driver | |    | | Ethernet driver |  | RDMA driver | | |
    | -------------------  --------------- |    | -------------------  --------------- | |
    |           |                 |        |    |           |                 |        | |
    |           -------------------        |    |           -------------------        | |
    |                   |                  |    |                   |                  |--
    ----------------------------------------    ----------------------------------------
                        |                                           |
            -------------               -----------------------------
            |                           |
         ------                      ------ ------ ------         ------      ------      ------
    -----| PF |----------------------| VF |-| VF |-| VF |-----  --| PF |--- --| PF |--- --| PF |---
    |    ------                      ------ ------ ------    |  | ------  | | ------  | | ------  |
    |                                                        |  |         | |         | |         |
    |                                                        |  |         | |         | |         |
    |                                                        |  |         | |         | |         |
    | eSwitch                                                |  | eSwitch | | eSwitch | | eSwitch |
    ----------------------------------------------------------  ----------- ----------- -----------
               -------------------------------------------------------------------------------
               |                                                                             |
               |                                                                             |
               | Uplink (no counters)                                                        |
               -------------------------------------------------------------------------------
                       ---------------------------------------------------------------
                       |                                                             |
                       |                                                             |
                       | MPFS (no counters)                                          |
                       ---------------------------------------------------------------
                                                     |
                                                     |
                                                     | Port

64Groups
65======
66
67Ring
68  Software counters populated by the driver stack.
69
70Netdev
71  An aggregation of software ring counters.
72
vPort counters
  Traffic counters and drops due to steering or no buffers. May indicate issues
  with the NIC. These counters include Ethernet traffic counters (including Raw
  Ethernet) and RDMA/RoCE traffic counters.

Physical port counters
  Counters that collect statistics about the PFs and VFs. May indicate issues
  with the NIC, link, or network. This measuring point holds information on
  standardized counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and
  additional counters like flow control, FEC and more. Physical port counters
  are not exposed to virtual machines.
84
85Priority Port Counters
86  A set of the physical port counters, per priority per port.
87
88Types
89=====
90
91Counters are divided into three types.
92
93Traffic Informative Counters
94  Counters which count traffic. These counters can be used for load estimation
95  or for general debug.
96
Traffic Acceleration Counters
  Counters which count traffic that was accelerated by the Mellanox driver or
  by hardware. These counters are an additional layer on top of the informative
  counter set, so the same traffic is counted in both informative and
  acceleration counters.

.. [#accel] Traffic acceleration counter.

Error Counters
  An increment of these counters might indicate a problem. Each of these
  counters has an explanation and a corrective action.
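
Because it is an increment that signals a problem, the change between two
readings is more telling than the absolute value. The following is a minimal
sketch, assuming two snapshots are already available as dictionaries. The helper
name `increased_errors` and the selection of watched counters are illustrative
and not part of any tool.

.. code-block:: python

   # A few counters documented as type "Error" in the tables of this document
   # (software port names, i.e. without the ring index).
   WATCHED = ("tx_cqe_err", "rx_wqe_err", "rx_oversize_pkts_sw_drop")


   def increased_errors(before, after, watched=WATCHED):
       """Return {name: delta} for watched counters that grew between snapshots."""
       deltas = {}
       for name in watched:
           delta = after.get(name, 0) - before.get(name, 0)
           if delta > 0:
               deltas[name] = delta
       return deltas


   # Hypothetical snapshots taken some seconds apart.
   before = {"tx_cqe_err": 0, "rx_wqe_err": 3, "rx_packets": 1000}
   after = {"tx_cqe_err": 2, "rx_wqe_err": 3, "rx_packets": 5000}
   print(increased_errors(before, after))  # {'tx_cqe_err': 2}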
107
Statistics can be fetched via the `ip link` or `ethtool` commands. `ethtool`
provides more detailed information::

    ip -s link show <if-name>
    ethtool -S <if-name>
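
The `ethtool -S` output is a list of `name: value` lines, which makes it easy to
post-process. The following is a minimal sketch, not part of the driver or of
ethtool, that parses such output into a dictionary. The function name
`parse_ethtool_stats` and the sample values are illustrative assumptions.

.. code-block:: python

   def parse_ethtool_stats(text):
       """Parse the text printed by `ethtool -S <if-name>` into {name: int}."""
       stats = {}
       for line in text.splitlines():
           name, sep, value = line.strip().rpartition(":")
           if not sep:
               continue
           value = value.strip()
           # Skip the "NIC statistics:" header and any non-numeric line.
           if not value.isdigit():
               continue
           stats[name.strip()] = int(value)
       return stats


   # Hypothetical sample output; real values depend on the device and traffic.
   SAMPLE = """\
   NIC statistics:
        rx_packets: 1200
        rx_bytes: 96000
        tx0_packets: 40
   """

   stats = parse_ethtool_stats(SAMPLE)
   print(stats["rx_packets"])  # 1200

On a live system the same function could be fed the standard output of
`ethtool -S <if-name>`, captured for example with `subprocess.run`.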
113
114Descriptions
115============
116
117XSK, PTP, and QoS counters that are similar to counters defined previously will
118not be separately listed. For example, `ptp_tx[i]_packets` will not be
119explicitly documented since `tx[i]_packets` describes the behavior of both
120counters, except `ptp_tx[i]_packets` is only counted when precision time
121protocol is used.
122
Ring / Netdev Counter
---------------------
The following counters are available per ring or software port.

These counters provide information on the amount of traffic that was accelerated
by the NIC. They count the accelerated traffic in addition to the standard
counters which also count it (i.e. accelerated traffic is counted twice).

The counter names in the table below refer to both ring and port counters. The
notation for ring counters includes the [i] index without the brackets. The
notation for port counters doesn't include the [i]. A counter name
`rx[i]_packets` will be printed as `rx0_packets` for ring 0 and `rx_packets` for
the software port.
135the software port.
136
137.. flat-table:: Ring / Software Port Counter Table
138   :widths: 2 3 1
139
140   * - Counter
141     - Description
142     - Type
143
144   * - `rx[i]_packets`
145     - The number of packets received on ring i.
146     - Informative
147
148   * - `rx[i]_bytes`
149     - The number of bytes received on ring i.
150     - Informative
151
152   * - `tx[i]_packets`
153     - The number of packets transmitted on ring i.
154     - Informative
155
156   * - `tx[i]_bytes`
157     - The number of bytes transmitted on ring i.
158     - Informative
159
160   * - `tx[i]_recover`
161     - The number of times the SQ was recovered.
162     - Error
163
164   * - `tx[i]_cqes`
     - The number of CQE events issued on the SQ of ring i.
166     - Informative
167
168   * - `tx[i]_cqe_err`
169     - The number of error CQEs encountered on the SQ for ring i.
170     - Error
171
172   * - `tx[i]_tso_packets`
173     - The number of TSO packets transmitted on ring i [#accel]_.
174     - Acceleration
175
176   * - `tx[i]_tso_bytes`
177     - The number of TSO bytes transmitted on ring i [#accel]_.
178     - Acceleration
179
180   * - `tx[i]_tso_inner_packets`
     - The number of TSO packets indicated to carry inner encapsulation that
       were transmitted on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_tso_inner_bytes`
     - The number of TSO bytes indicated to carry inner encapsulation that
       were transmitted on ring i [#accel]_.
188     - Acceleration
189
190   * - `rx[i]_gro_packets`
     - The number of packets received on ring i that were processed using
       hardware-accelerated GRO. Only true GRO packets are counted, that is,
       packets that are in an SKB with a GRO count > 1.
     - Acceleration

   * - `rx[i]_gro_bytes`
     - The number of bytes received on ring i that were processed using
       hardware-accelerated GRO. Only true GRO packets are counted, that is,
       packets that are in an SKB with a GRO count > 1.
200     - Acceleration
201
202   * - `rx[i]_gro_skbs`
203     - The number of GRO SKBs constructed from hardware-accelerated GRO. Only SKBs
204       with a GRO count > 1 are counted.
205     - Informative
206
207   * - `rx[i]_gro_large_hds`
208     - Number of receive packets using hardware-accelerated GRO that have large
209       headers that require additional memory to be allocated.
210     - Informative
211
212   * - `rx[i]_hds_nodata_packets`
213     - Number of header only packets in header/data split mode [#accel]_.
214     - Informative
215
216   * - `rx[i]_hds_nodata_bytes`
217     - Number of bytes for header only packets in header/data split mode
218       [#accel]_.
219     - Informative
220
221   * - `rx[i]_lro_packets`
222     - The number of LRO packets received on ring i [#accel]_.
223     - Acceleration
224
225   * - `rx[i]_lro_bytes`
226     - The number of LRO bytes received on ring i [#accel]_.
227     - Acceleration
228
229   * - `rx[i]_ecn_mark`
230     - The number of received packets where the ECN mark was turned on.
231     - Informative
232
233   * - `rx_oversize_pkts_buffer`
     - The number of received packets dropped because their length, on arrival
       at the RQ, exceeded the software buffer size allocated by the device for
       incoming traffic. It might imply that the device MTU is larger than the
       software buffer size.
238     - Error
239
240   * - `rx_oversize_pkts_sw_drop`
241     - Number of received packets dropped in software because the CQE data is
242       larger than the MTU size.
243     - Error
244
245   * - `rx[i]_csum_unnecessary`
246     - Packets received with a `CHECKSUM_UNNECESSARY` on ring i [#accel]_.
247     - Acceleration
248
249   * - `rx[i]_csum_unnecessary_inner`
250     - Packets received with inner encapsulation with a `CHECKSUM_UNNECESSARY`
251       on ring i [#accel]_.
252     - Acceleration
253
254   * - `rx[i]_csum_none`
255     - Packets received with a `CHECKSUM_NONE` on ring i [#accel]_.
256     - Acceleration
257
258   * - `rx[i]_csum_complete`
259     - Packets received with a `CHECKSUM_COMPLETE` on ring i [#accel]_.
260     - Acceleration
261
262   * - `rx[i]_csum_complete_tail`
263     - Number of received packets that had checksum calculation computed,
264       potentially needed padding, and were able to do so with
265       `CHECKSUM_PARTIAL`.
266     - Informative
267
268   * - `rx[i]_csum_complete_tail_slow`
269     - Number of received packets that need padding larger than eight bytes for
270       the checksum.
271     - Informative
272
273   * - `tx[i]_csum_partial`
274     - Packets transmitted with a `CHECKSUM_PARTIAL` on ring i [#accel]_.
275     - Acceleration
276
277   * - `tx[i]_csum_partial_inner`
278     - Packets transmitted with inner encapsulation with a `CHECKSUM_PARTIAL` on
279       ring i [#accel]_.
280     - Acceleration
281
282   * - `tx[i]_csum_none`
283     - Packets transmitted with no hardware checksum acceleration on ring i.
284     - Informative
285
286   * - `tx[i]_stopped` / `tx_queue_stopped` [#ring_global]_
287     - Events where SQ was full on ring i. If this counter is increased, check
288       the amount of buffers allocated for transmission.
289     - Informative
290
291   * - `tx[i]_wake` / `tx_queue_wake` [#ring_global]_
292     - Events where SQ was full and has become not full on ring i.
293     - Informative
294
295   * - `tx[i]_dropped` / `tx_queue_dropped` [#ring_global]_
296     - Packets transmitted that were dropped due to DMA mapping failure on
297       ring i. If this counter is increased, check the amount of buffers
298       allocated for transmission.
299     - Error
300
301   * - `tx[i]_nop`
     - The number of NOP WQEs (empty WQEs) inserted into the SQ (related to
       ring i) on reaching the end of the cyclic buffer. When nearing the end
       of the cyclic buffer, the driver may add these empty WQEs to avoid
       handling a state where a WQE starts at the end of the queue and ends
       at the beginning of the queue. This is a normal condition.
307     - Informative
308
309   * - `tx[i]_timestamps`
310     - Transmitted packets that were hardware timestamped at the device's DMA
311       layer.
312     - Informative
313
314   * - `tx[i]_added_vlan_packets`
315     - The number of packets sent where vlan tag insertion was offloaded to the
316       hardware.
317     - Acceleration
318
319   * - `rx[i]_removed_vlan_packets`
320     - The number of packets received where vlan tag stripping was offloaded to
321       the hardware.
322     - Acceleration
323
324   * - `rx[i]_wqe_err`
325     - The number of wrong opcodes received on ring i.
326     - Error
327
328   * - `rx[i]_mpwqe_frag`
     - The number of WQEs that failed to allocate a compound page, so that
       fragmented MPWQEs (Multi-Packet WQEs) were used on ring i. If this
       counter rises, it may suggest that there is not enough memory for large
       pages, so the driver allocated fragmented pages. This is not an abnormal
       condition.
334     - Informative
335
336   * - `rx[i]_mpwqe_filler_cqes`
     - The number of filler CQE events that were issued on ring i.
338     - Informative
339
340   * - `rx[i]_mpwqe_filler_strides`
341     - The number of strides consumed by filler CQEs on ring i.
342     - Informative
343
344   * - `tx[i]_mpwqe_blks`
345     - The number of send blocks processed from Multi-Packet WQEs (mpwqe).
346     - Informative
347
348   * - `tx[i]_mpwqe_pkts`
349     - The number of send packets processed from Multi-Packet WQEs (mpwqe).
350     - Informative
351
352   * - `rx[i]_cqe_compress_blks`
353     - The number of receive blocks with CQE compression on ring i [#accel]_.
354     - Acceleration
355
356   * - `rx[i]_cqe_compress_pkts`
357     - The number of receive packets with CQE compression on ring i [#accel]_.
358     - Acceleration
359
360   * - `rx[i]_arfs_add`
361     - The number of aRFS flow rules added to the device for direct RQ steering
362       on ring i [#accel]_.
363     - Acceleration
364
365   * - `rx[i]_arfs_request_in`
366     - Number of flow rules that have been requested to move into ring i for
367       direct RQ steering [#accel]_.
368     - Acceleration
369
370   * - `rx[i]_arfs_request_out`
371     - Number of flow rules that have been requested to move out of ring i [#accel]_.
372     - Acceleration
373
374   * - `rx[i]_arfs_expired`
     - Number of flow rules that have expired and been removed [#accel]_.
376     - Acceleration
377
378   * - `rx[i]_arfs_err`
379     - Number of flow rules that failed to be added to the flow table.
380     - Error
381
382   * - `rx[i]_recover`
383     - The number of times the RQ was recovered.
384     - Error
385
386   * - `tx[i]_xmit_more`
387     - The number of packets sent with `xmit_more` indication set on the skbuff
388       (no doorbell).
389     - Acceleration
390
391   * - `ch[i]_poll`
392     - The number of invocations of NAPI poll of channel i.
393     - Informative
394
395   * - `ch[i]_arm`
396     - The number of times the NAPI poll function completed and armed the
397       completion queues on channel i.
398     - Informative
399
400   * - `ch[i]_aff_change`
401     - The number of times the NAPI poll function explicitly stopped execution
402       on a CPU due to a change in affinity, on channel i.
403     - Informative
404
405   * - `ch[i]_events`
406     - The number of hard interrupt events on the completion queues of channel i.
407     - Informative
408
409   * - `ch[i]_eq_rearm`
410     - The number of times the EQ was recovered.
411     - Error
412
413   * - `ch[i]_force_irq`
414     - Number of times NAPI is triggered by XSK wakeups by posting a NOP to
415       ICOSQ.
416     - Acceleration
417
418   * - `rx[i]_congst_umr`
419     - The number of times an outstanding UMR request is delayed due to
420       congestion, on ring i.
421     - Informative
422
423   * - `rx_pp_alloc_fast`
424     - Number of successful fast path allocations.
425     - Informative
426
427   * - `rx_pp_alloc_slow`
428     - Number of slow path order-0 allocations.
429     - Informative
430
431   * - `rx_pp_alloc_slow_high_order`
432     - Number of slow path high order allocations.
433     - Informative
434
435   * - `rx_pp_alloc_empty`
436     - Counter is incremented when ptr ring is empty, so a slow path allocation
437       was forced.
438     - Informative
439
440   * - `rx_pp_alloc_refill`
     - Counter is incremented when an allocation triggered a refill of the
       cache.
     - Informative

   * - `rx_pp_alloc_waive`
     - Counter is incremented when pages obtained from the ptr ring cannot be
       added to the cache due to a NUMA mismatch.
448     - Informative
449
450   * - `rx_pp_recycle_cached`
     - Counter is incremented when recycling placed a page in the page pool
       cache.
     - Informative

   * - `rx_pp_recycle_cache_full`
     - Counter is incremented when the page pool cache was full.
     - Informative

   * - `rx_pp_recycle_ring`
     - Counter is incremented when a page is placed into the ptr ring.
     - Informative

   * - `rx_pp_recycle_ring_full`
     - Counter is incremented when a page is released from the page pool
       because the ptr ring was full.
     - Informative

   * - `rx_pp_recycle_released_ref`
     - Counter is incremented when a page is released (and not recycled)
       because refcnt > 1.
470     - Informative
471
472   * - `rx[i]_xsk_buff_alloc_err`
473     - The number of times allocating an skb or XSK buffer failed in the XSK RQ
474       context.
475     - Error
476
477   * - `rx[i]_xdp_tx_xmit`
     - The number of packets forwarded back to the port due to the XDP program
       `XDP_TX` action (bouncing). These packets are not counted by other
       software counters. These packets are counted by physical port and vPort
       counters.
482     - Informative
483
484   * - `rx[i]_xdp_tx_mpwqe`
485     - Number of multi-packet WQEs transmitted by the netdev and `XDP_TX`-ed by
486       the netdev during the RQ context.
487     - Acceleration
488
489   * - `rx[i]_xdp_tx_inlnw`
490     - Number of WQE data segments transmitted where the data could be inlined
491       in the WQE and then `XDP_TX`-ed during the RQ context.
492     - Acceleration
493
494   * - `rx[i]_xdp_tx_nops`
     - Number of NOP WQEBBs (WQE building blocks) posted to the XDP SQ.
496     - Acceleration
497
498   * - `rx[i]_xdp_tx_full`
     - The number of packets that should have been forwarded back to the port
       due to the `XDP_TX` action but were dropped due to a full TX queue. These
       packets are not counted by other software counters. These packets are
       counted by physical port and vPort counters. Consider opening more RX
       queues and spreading RX traffic over all queues, and/or increasing the
       RX ring size.
504     - Error
505
506   * - `rx[i]_xdp_tx_err`
     - The number of times an `XDP_TX` error, such as frame too long or frame
       too short, occurred on the `XDP_TX` ring of the RX ring.
509     - Error
510
511   * - `rx[i]_xdp_tx_cqes` / `rx_xdp_tx_cqe` [#ring_global]_
512     - The number of completions received on the CQ of the `XDP_TX` ring.
513     - Informative
514
515   * - `rx[i]_xdp_drop`
     - The number of packets dropped due to the XDP program `XDP_DROP` action.
       These packets are not counted by other software counters. These packets
       are counted by physical port and vPort counters.
519     - Informative
520
521   * - `rx[i]_xdp_redirect`
522     - The number of times an XDP redirect action was triggered on ring i.
523     - Acceleration
524
525   * - `tx[i]_xdp_xmit`
     - The number of packets redirected to the interface (due to XDP redirect).
       These packets are not counted by other software counters. These packets
       are counted by physical port and vPort counters.
     - Informative

   * - `tx[i]_xdp_full`
     - The number of packets redirected to the interface (due to XDP redirect)
       that were dropped due to a full TX queue. These packets are not counted
       by other software counters. Consider enlarging the TX queues.
535     - Informative
536
537   * - `tx[i]_xdp_mpwqe`
538     - Number of multi-packet WQEs offloaded onto the NIC that were
539       `XDP_REDIRECT`-ed from other netdevs.
540     - Acceleration
541
542   * - `tx[i]_xdp_inlnw`
543     - Number of WQE data segments where the data could be inlined in the WQE
544       where the data segments were `XDP_REDIRECT`-ed from other netdevs.
545     - Acceleration
546
547   * - `tx[i]_xdp_nops`
548     - Number of NOP WQEBBs (WQE building blocks) posted to the SQ that were
549       `XDP_REDIRECT`-ed from other netdevs.
550     - Acceleration
551
552   * - `tx[i]_xdp_err`
     - The number of packets redirected to the interface (due to XDP redirect)
       that were dropped due to an error such as frame too long or frame too
       short.
     - Error

   * - `tx[i]_xdp_cqes`
     - The number of completions received on the CQ for packets redirected to
       the interface (due to XDP redirect).
560     - Informative
561
562   * - `tx[i]_xsk_xmit`
563     - The number of packets transmitted using XSK zerocopy functionality.
564     - Acceleration
565
566   * - `tx[i]_xsk_mpwqe`
     - Number of multi-packet WQEs offloaded onto the NIC that were
       transmitted using XSK zerocopy.
569     - Acceleration
570
571   * - `tx[i]_xsk_inlnw`
572     - Number of WQE data segments where the data could be inlined in the WQE
573       that are transmitted using XSK zerocopy.
574     - Acceleration
575
576   * - `tx[i]_xsk_full`
     - Number of times the doorbell is rung in XSK zerocopy mode when the SQ is
       full.
578     - Error
579
580   * - `tx[i]_xsk_err`
581     - Number of errors that occurred in XSK zerocopy mode such as if the data
582       size is larger than the MTU size.
583     - Error
584
585   * - `tx[i]_xsk_cqes`
586     - Number of CQEs processed in XSK zerocopy mode.
587     - Acceleration
588
589   * - `tx_tls_ctx`
590     - Number of TLS TX HW offload contexts added to device for encryption.
591     - Acceleration
592
593   * - `tx_tls_del`
594     - Number of TLS TX HW offload contexts removed from device (connection
595       closed).
596     - Acceleration
597
598   * - `tx_tls_pool_alloc`
599     - Number of times a unit of work is successfully allocated in the TLS HW
600       offload pool.
601     - Acceleration
602
603   * - `tx_tls_pool_free`
604     - Number of times a unit of work is freed in the TLS HW offload pool.
605     - Acceleration
606
607   * - `rx_tls_ctx`
608     - Number of TLS RX HW offload contexts added to device for decryption.
609     - Acceleration
610
611   * - `rx_tls_del`
612     - Number of TLS RX HW offload contexts deleted from device (connection has
613       finished).
614     - Acceleration
615
616   * - `rx[i]_tls_decrypted_packets`
617     - Number of successfully decrypted RX packets which were part of a TLS
618       stream.
619     - Acceleration
620
621   * - `rx[i]_tls_decrypted_bytes`
622     - Number of TLS payload bytes in RX packets which were successfully
623       decrypted.
624     - Acceleration
625
626   * - `rx[i]_tls_resync_req_pkt`
627     - Number of received TLS packets with a resync request.
628     - Acceleration
629
630   * - `rx[i]_tls_resync_req_start`
631     - Number of times the TLS async resync request was started.
632     - Acceleration
633
634   * - `rx[i]_tls_resync_req_end`
635     - Number of times the TLS async resync request properly ended with
636       providing the HW tracked tcp-seq.
637     - Acceleration
638
639   * - `rx[i]_tls_resync_req_skip`
640     - Number of times the TLS async resync request procedure was started but
641       not properly ended.
642     - Error
643
644   * - `rx[i]_tls_resync_res_ok`
645     - Number of times the TLS resync response call to the driver was
646       successfully handled.
647     - Acceleration
648
649   * - `rx[i]_tls_resync_res_retry`
650     - Number of times the TLS resync response call to the driver was
651       reattempted when ICOSQ is full.
652     - Error
653
654   * - `rx[i]_tls_resync_res_skip`
655     - Number of times the TLS resync response call to the driver was terminated
656       unsuccessfully.
657     - Error
658
659   * - `rx[i]_tls_err`
660     - Number of times when CQE TLS offload was problematic.
661     - Error
662
663   * - `tx[i]_tls_encrypted_packets`
664     - The number of send packets that are TLS encrypted by the kernel.
665     - Acceleration
666
667   * - `tx[i]_tls_encrypted_bytes`
668     - The number of send bytes that are TLS encrypted by the kernel.
669     - Acceleration
670
671   * - `tx[i]_tls_ooo`
672     - Number of times out of order TLS SQE fragments were handled on ring i.
673     - Acceleration
674
675   * - `tx[i]_tls_dump_packets`
676     - Number of TLS decrypted packets copied over from NIC over DMA.
677     - Acceleration
678
679   * - `tx[i]_tls_dump_bytes`
680     - Number of TLS decrypted bytes copied over from NIC over DMA.
681     - Acceleration
682
683   * - `tx[i]_tls_resync_bytes`
684     - Number of TLS bytes requested to be resynchronized in order to be
685       decrypted.
686     - Acceleration
687
688   * - `tx[i]_tls_skip_no_sync_data`
689     - Number of TLS send data that can safely be skipped / do not need to be
690       decrypted.
691     - Acceleration
692
693   * - `tx[i]_tls_drop_no_sync_data`
694     - Number of TLS send data that were dropped due to retransmission of TLS
695       data.
696     - Acceleration
697
698   * - `ptp_cq[i]_abort`
699     - Number of times a CQE has to be skipped in precision time protocol due to
700       a skew between the port timestamp and CQE timestamp being greater than
701       128 seconds.
702     - Error
703
704   * - `ptp_cq[i]_abort_abs_diff_ns`
705     - Accumulation of time differences between the port timestamp and CQE
706       timestamp when the difference is greater than 128 seconds in precision
707       time protocol.
708     - Error
709
710   * - `ptp_cq[i]_late_cqe`
711     - Number of times a CQE has been delivered on the PTP timestamping CQ when
712       the CQE was not expected since a certain amount of time had elapsed where
713       the device typically ensures not posting the CQE.
714     - Error
715
716   * - `ptp_cq[i]_lost_cqe`
717     - Number of times a CQE is expected to not be delivered on the PTP
718       timestamping CQE by the device due to a time delta elapsing. If such a
719       CQE is somehow delivered, `ptp_cq[i]_late_cqe` is incremented.
720     - Error
721
722.. [#ring_global] The corresponding ring and global counters do not share the
723                  same name (i.e. do not follow the common naming scheme).
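
The exceptions noted in the footnote above can be captured in a small lookup
table. The following is a minimal sketch with a hypothetical helper,
`global_name`. The mapping lists only the pairs given in the table above, and
every other name is assumed to follow the common scheme.

.. code-block:: python

   # Ring counter template -> global counter name, for the pairs that do not
   # follow the common naming scheme (taken from the table above).
   RING_TO_GLOBAL = {
       "tx[i]_stopped": "tx_queue_stopped",
       "tx[i]_wake": "tx_queue_wake",
       "tx[i]_dropped": "tx_queue_dropped",
       "rx[i]_xdp_tx_cqes": "rx_xdp_tx_cqe",
   }


   def global_name(template):
       """Map a ring counter template to its global counter name."""
       if template in RING_TO_GLOBAL:
           return RING_TO_GLOBAL[template]
       # Common scheme: drop the "[i]" ring index.
       return template.replace("[i]", "", 1)


   print(global_name("tx[i]_stopped"))  # tx_queue_stopped
   print(global_name("rx[i]_packets"))  # rx_packets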
724
725vPort Counters
726--------------
Counters on the NIC port that is connected to an eSwitch.
728
729.. flat-table:: vPort Counter Table
730   :widths: 2 3 1
731
732   * - Counter
733     - Description
734     - Type
735
736   * - `rx_vport_unicast_packets`
737     - Unicast packets received, steered to a port including Raw Ethernet
738       QP/DPDK traffic, excluding RDMA traffic.
739     - Informative
740
741   * - `rx_vport_unicast_bytes`
742     - Unicast bytes received, steered to a port including Raw Ethernet QP/DPDK
743       traffic, excluding RDMA traffic.
744     - Informative
745
746   * - `tx_vport_unicast_packets`
747     - Unicast packets transmitted, steered from a port including Raw Ethernet
748       QP/DPDK traffic, excluding RDMA traffic.
749     - Informative
750
751   * - `tx_vport_unicast_bytes`
752     - Unicast bytes transmitted, steered from a port including Raw Ethernet
753       QP/DPDK traffic, excluding RDMA traffic.
754     - Informative
755
756   * - `rx_vport_multicast_packets`
757     - Multicast packets received, steered to a port including Raw Ethernet
758       QP/DPDK traffic, excluding RDMA traffic.
759     - Informative
760
761   * - `rx_vport_multicast_bytes`
762     - Multicast bytes received, steered to a port including Raw Ethernet
763       QP/DPDK traffic, excluding RDMA traffic.
764     - Informative
765
766   * - `tx_vport_multicast_packets`
767     - Multicast packets transmitted, steered from a port including Raw Ethernet
768       QP/DPDK traffic, excluding RDMA traffic.
769     - Informative
770
771   * - `tx_vport_multicast_bytes`
772     - Multicast bytes transmitted, steered from a port including Raw Ethernet
773       QP/DPDK traffic, excluding RDMA traffic.
774     - Informative
775
776   * - `rx_vport_broadcast_packets`
777     - Broadcast packets received, steered to a port including Raw Ethernet
778       QP/DPDK traffic, excluding RDMA traffic.
779     - Informative
780
781   * - `rx_vport_broadcast_bytes`
782     - Broadcast bytes received, steered to a port including Raw Ethernet
783       QP/DPDK traffic, excluding RDMA traffic.
784     - Informative
785
786   * - `tx_vport_broadcast_packets`
787     - Broadcast packets transmitted, steered from a port including Raw Ethernet
788       QP/DPDK traffic, excluding RDMA traffic.
789     - Informative
790
791   * - `tx_vport_broadcast_bytes`
792     - Broadcast bytes transmitted, steered from a port including Raw Ethernet
793       QP/DPDK traffic, excluding RDMA traffic.
794     - Informative
795
   * - `rx_vport_rdma_unicast_packets`
     - RDMA unicast packets received, steered to a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_unicast_bytes`
     - RDMA unicast bytes received, steered to a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_unicast_packets`
     - RDMA unicast packets transmitted, steered from a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_unicast_bytes`
     - RDMA unicast bytes transmitted, steered from a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_multicast_packets`
     - RDMA multicast packets received, steered to a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_multicast_bytes`
     - RDMA multicast bytes received, steered to a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_multicast_packets`
     - RDMA multicast packets transmitted, steered from a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_multicast_bytes`
     - RDMA multicast bytes transmitted, steered from a port (counter counts
       RoCE/UD/RC traffic) [#accel]_.
834     - Acceleration
835
836   * - `vport_loopback_packets`
     - Unicast, multicast and broadcast packets that were looped back (received
       and transmitted), IB/Eth [#accel]_.
     - Acceleration

   * - `vport_loopback_bytes`
     - Unicast, multicast and broadcast bytes that were looped back (received
       and transmitted), IB/Eth [#accel]_.
844     - Acceleration
845
   * - `rx_steer_missed_packets`
     - The number of packets that were received by the NIC but discarded
       because they did not match any flow in the NIC flow table.
     - Error

   * - `rx_packets`
     - Representor only: packets received that were handled by the hypervisor.
     - Informative

   * - `rx_bytes`
     - Representor only: bytes received that were handled by the hypervisor.
     - Informative

   * - `tx_packets`
     - Representor only: packets transmitted that were handled by the
       hypervisor.
     - Informative

   * - `tx_bytes`
     - Representor only: bytes transmitted that were handled by the hypervisor.
     - Informative

   * - `dev_internal_queue_oob`
     - The number of packets dropped due to a lack of receive WQEs for an
       internal device RQ.
     - Error

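As a minimal sketch of how the counters in this table might be collected for
monitoring, the Python snippet below parses text in the ``name: value`` layout
printed by ``ethtool -S <interface>`` and keeps only the vPort counters. The
function name ``parse_ethtool_stats`` and the sample values are made up for
illustration; they are not part of the driver or of ethtool.

.. code-block:: python

   def parse_ethtool_stats(text):
       """Parse `ethtool -S`-style text into a {counter_name: int} dict."""
       stats = {}
       for line in text.splitlines():
           name, sep, value = line.strip().rpartition(":")
           value = value.strip()
           # Skip lines without a numeric value, e.g. the "NIC statistics:"
           # header line.
           if not sep or not value.isdigit():
               continue
           stats[name.strip()] = int(value)
       return stats

   # Hypothetical sample output; the values are invented.
   SAMPLE = """\
   NIC statistics:
        rx_vport_rdma_unicast_packets: 120
        rx_vport_rdma_unicast_bytes: 98304
        vport_loopback_packets: 7
        rx_steer_missed_packets: 0
   """

   stats = parse_ethtool_stats(SAMPLE)
   vport_stats = {k: v for k, v in stats.items() if "vport" in k}
   print(vport_stats)

On a live system the text could instead come from
``subprocess.run(["ethtool", "-S", ifname], capture_output=True, text=True).stdout``,
where ``ifname`` is a placeholder for the interface name.
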
Physical Port Counters
----------------------
The physical port counters are the counters on the external port connecting the
adapter to the network. This measuring point holds information on standardized
counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and additional counters
like flow control, FEC and more.

.. flat-table:: Physical Port Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_packets_phy`
     - The number of packets received on the physical port. This counter doesn’t
       include packets that were discarded due to FCS, frame size and similar
       errors.
     - Informative

   * - `tx_packets_phy`
     - The number of packets transmitted on the physical port.
     - Informative

   * - `rx_bytes_phy`
     - The number of bytes received on the physical port, including Ethernet
       header and FCS.
     - Informative

   * - `tx_bytes_phy`
     - The number of bytes transmitted on the physical port.
     - Informative

   * - `rx_multicast_phy`
     - The number of multicast packets received on the physical port.
     - Informative

   * - `tx_multicast_phy`
     - The number of multicast packets transmitted on the physical port.
     - Informative

   * - `rx_broadcast_phy`
     - The number of broadcast packets received on the physical port.
     - Informative

   * - `tx_broadcast_phy`
     - The number of broadcast packets transmitted on the physical port.
     - Informative

   * - `rx_crc_errors_phy`
     - The number of received packets dropped due to an FCS (Frame Check
       Sequence) error on the physical port. If this counter is increasing at a
       high rate, check the link quality using the `rx_symbol_err_phy` and
       `rx_corrected_bits_phy` counters below.
     - Error

   * - `rx_in_range_len_errors_phy`
     - The number of received packets dropped due to length/type errors on a
       physical port.
     - Error

   * - `rx_out_of_range_len_phy`
     - The number of received packets dropped due to a length greater than
       allowed on a physical port. If this counter is increasing, it implies
       that the peer connected to the adapter has a larger MTU configured. Using
       the same MTU configuration on both sides should resolve this issue.
     - Error

   * - `rx_oversize_pkts_phy`
     - The number of received packets dropped due to a length that exceeds the
       MTU size on a physical port. If this counter is increasing, it implies
       that the peer connected to the adapter has a larger MTU configured. Using
       the same MTU configuration on both sides should resolve this issue.
     - Error

   * - `rx_symbol_err_phy`
     - The number of received packets dropped due to physical coding errors
       (symbol errors) on a physical port.
     - Error

   * - `rx_mac_control_phy`
     - The number of MAC control packets received on the physical port.
     - Informative

   * - `tx_mac_control_phy`
     - The number of MAC control packets transmitted on the physical port.
     - Informative

   * - `rx_pause_ctrl_phy`
     - The number of link layer pause packets received on a physical port. If
       this counter is increasing, it implies that the network is congested and
       cannot absorb the traffic coming from the adapter.
     - Informative

   * - `tx_pause_ctrl_phy`
     - The number of link layer pause packets transmitted on a physical port. If
       this counter is increasing, it implies that the NIC is congested and
       cannot absorb the traffic coming from the network.
     - Informative

   * - `rx_unsupported_op_phy`
     - The number of MAC control packets received with an unsupported opcode on
       a physical port.
     - Error

   * - `rx_discards_phy`
     - The number of received packets dropped due to lack of buffers on a
       physical port. If this counter is increasing, it implies that the adapter
       is congested and cannot absorb the traffic coming from the network.
     - Error

   * - `tx_discards_phy`
     - The number of packets that were discarded on transmission even though no
       errors were detected. The drop might occur due to the link being in a
       down state, head-of-line drop, pause from the network, etc.
     - Error

   * - `tx_errors_phy`
     - The number of transmitted packets dropped due to a length that exceeds
       the MTU size on a physical port.
     - Error

   * - `rx_undersize_pkts_phy`
     - The number of received packets dropped due to a length shorter than 64
       bytes on a physical port. If this counter is increasing, it implies that
       the peer connected to the adapter has a non-standard MTU configured or
       that a malformed packet arrived.
     - Error

   * - `rx_fragments_phy`
     - The number of received packets dropped due to a length shorter than 64
       bytes combined with an FCS error on a physical port. If this counter is
       increasing, it implies that the peer connected to the adapter has a
       non-standard MTU configured.
     - Error

   * - `rx_jabbers_phy`
     - The number of received packets dropped due to a length longer than 64
       bytes combined with an FCS error on a physical port.
     - Error

   * - `rx_64_bytes_phy`
     - The number of packets received on the physical port with size of 64 bytes.
     - Informative

   * - `rx_65_to_127_bytes_phy`
     - The number of packets received on the physical port with size of 65 to
       127 bytes.
     - Informative

   * - `rx_128_to_255_bytes_phy`
     - The number of packets received on the physical port with size of 128 to
       255 bytes.
     - Informative

   * - `rx_256_to_511_bytes_phy`
     - The number of packets received on the physical port with size of 256 to
       511 bytes.
     - Informative

   * - `rx_512_to_1023_bytes_phy`
     - The number of packets received on the physical port with size of 512 to
       1023 bytes.
     - Informative

   * - `rx_1024_to_1518_bytes_phy`
     - The number of packets received on the physical port with size of 1024 to
       1518 bytes.
     - Informative

   * - `rx_1519_to_2047_bytes_phy`
     - The number of packets received on the physical port with size of 1519 to
       2047 bytes.
     - Informative

   * - `rx_2048_to_4095_bytes_phy`
     - The number of packets received on the physical port with size of 2048 to
       4095 bytes.
     - Informative

   * - `rx_4096_to_8191_bytes_phy`
     - The number of packets received on the physical port with size of 4096 to
       8191 bytes.
     - Informative

   * - `rx_8192_to_10239_bytes_phy`
     - The number of packets received on the physical port with size of 8192 to
       10239 bytes.
     - Informative

   * - `link_down_events_phy`
     - The number of times the link operational state changed to down. If this
       counter is increasing, it may indicate port flapping. You may need to
       replace the cable/transceiver.
     - Error

   * - `rx_out_of_buffer`
     - The number of times the receive queue had no software buffers allocated
       for the adapter's incoming traffic.
     - Error

   * - `module_bus_stuck`
     - The number of times a short-wire condition was detected on the module's
       I\ :sup:`2`\C bus (data or clock). You may need to replace the
       cable/transceiver.
     - Error

   * - `module_high_temp`
     - The number of times that the module temperature was too high. If this
       issue persists, you may need to check the ambient temperature or replace
       the cable/transceiver module.
     - Error

   * - `module_bad_shorted`
     - The number of times that the module cables were shorted. You may need to
       replace the cable/transceiver module.
     - Error

   * - `module_unplug`
     - The number of times that the module was ejected.
     - Informative

   * - `rx_buffer_passed_thres_phy`
     - The number of events where the port receive buffer was over 85% full.
     - Informative

   * - `tx_pause_storm_warning_events`
     - The number of times the device was sending pauses for a long period of
       time.
     - Informative

   * - `tx_pause_storm_error_events`
     - The number of times the device was sending pauses for a long period of
       time, reaching the timeout and disabling transmission of pause frames.
       During the period when pause frames were disabled, drops could have
       occurred.
     - Error

   * - `rx[i]_buff_alloc_err`
     - The number of failures to allocate a buffer for a received packet (or
       SKB) on ring i.
     - Error

   * - `rx_bits_phy`
     - This counter provides information on the total amount of traffic that
       could have been received and can be used as a guideline to measure the
       ratio of errored traffic in `rx_pcs_symbol_err_phy` and
       `rx_corrected_bits_phy`.
     - Informative

   * - `rx_pcs_symbol_err_phy`
     - This counter counts the number of symbol errors that were not corrected
       by the FEC algorithm, or that occurred while no FEC algorithm was active
       on this interface. If this counter is increasing, it implies that the
       link between the NIC and the network is suffering from high BER, and that
       traffic is lost. You may need to replace the cable/transceiver. The error
       rate is the number of `rx_pcs_symbol_err_phy` divided by the number of
       `rx_bits_phy` over a specific time frame.
     - Error

   * - `rx_corrected_bits_phy`
     - The number of corrected bits on this port according to active FEC
       (RS/FC). If this counter is increasing, it implies that the link between
       the NIC and the network is suffering from high BER. The corrected bit
       rate is the number of `rx_corrected_bits_phy` divided by the number of
       `rx_bits_phy` over a specific time frame.
     - Error

   * - `rx_err_lane_[l]_phy`
     - This counter counts the number of physical raw errors per lane, where l
       is the lane index. The counter counts errors before FEC corrections. If
       this counter is increasing, it implies that the link between the NIC and
       the network is suffering from high BER, and that traffic might be lost.
       You may need to replace the cable/transceiver. Check this counter
       together with `rx_corrected_bits_phy`.
     - Error

   * - `rx_global_pause`
     - The number of pause packets received on the physical port. If this
       counter is increasing, it implies that the network is congested and
       cannot absorb the traffic coming from the adapter. Note: This counter is
       only enabled when global pause mode is enabled.
     - Informative

   * - `rx_global_pause_duration`
     - The duration of pause received (in microseconds) on the physical port.
       The counter represents the time the port did not send any traffic. If
       this counter is increasing, it implies that the network is congested and
       cannot absorb the traffic coming from the adapter. Note: This counter is
       only enabled when global pause mode is enabled.
     - Informative

   * - `tx_global_pause`
     - The number of pause packets transmitted on a physical port. If this
       counter is increasing, it implies that the adapter is congested and
       cannot absorb the traffic coming from the network. Note: This counter is
       only enabled when global pause mode is enabled.
     - Informative

   * - `tx_global_pause_duration`
     - The duration of pause transmitted (in microseconds) on the physical port.
       Note: This counter is only enabled when global pause mode is enabled.
     - Informative

   * - `rx_global_pause_transition`
     - The number of times a transition from Xoff to Xon on the physical port
       has occurred. Note: This counter is only enabled when global pause mode
       is enabled.
     - Informative

   * - `rx_if_down_packets`
     - The number of received packets that were dropped due to interface down.
     - Informative

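The ratios described for `rx_pcs_symbol_err_phy` and `rx_corrected_bits_phy`
in the table above can be sketched as follows. The example assumes two counter
snapshots taken some time apart and stored as dictionaries; the helper name
``ratio_over_interval`` and all sample values are hypothetical.

.. code-block:: python

   def ratio_over_interval(before, after, numerator,
                           denominator="rx_bits_phy"):
       """Return delta(numerator) / delta(denominator) between snapshots.

       Returns None when the denominator did not advance, because no ratio
       can be computed for that time frame.
       """
       d_num = after[numerator] - before[numerator]
       d_den = after[denominator] - before[denominator]
       if d_den <= 0:
           return None
       return d_num / d_den

   # Invented snapshots of the three counters, taken some time apart.
   before = {
       "rx_bits_phy": 1_000_000_000,
       "rx_pcs_symbol_err_phy": 10,
       "rx_corrected_bits_phy": 2_000,
   }
   after = {
       "rx_bits_phy": 3_000_000_000,
       "rx_pcs_symbol_err_phy": 12,
       "rx_corrected_bits_phy": 6_000,
   }

   symbol_error_rate = ratio_over_interval(
       before, after, "rx_pcs_symbol_err_phy")
   corrected_bit_rate = ratio_over_interval(
       before, after, "rx_corrected_bits_phy")
   print(symbol_error_rate, corrected_bit_rate)

Working on deltas rather than absolute values keeps the ratio tied to the
chosen time frame instead of to everything counted since the counters started.
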
Priority Port Counters
----------------------
The following counters are physical port counters that are counted per L2
priority (0-7).

**Note:** `p` in the counter name represents the priority.

.. flat-table:: Priority Port Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_prio[p]_bytes`
     - The number of bytes received with priority p on the physical port.
     - Informative

   * - `rx_prio[p]_packets`
     - The number of packets received with priority p on the physical port.
     - Informative

   * - `tx_prio[p]_bytes`
     - The number of bytes transmitted on priority p on the physical port.
     - Informative

   * - `tx_prio[p]_packets`
     - The number of packets transmitted on priority p on the physical port.
     - Informative

   * - `rx_prio[p]_pause`
     - The number of pause packets received with priority p on a physical port.
       If this counter is increasing, it implies that the network is congested
       and cannot absorb the traffic coming from the adapter. Note: This counter
       is available only if PFC was enabled on priority p.
     - Informative

   * - `rx_prio[p]_pause_duration`
     - The duration of pause received (in microseconds) on priority p on the
       physical port. The counter represents the time the port did not send any
       traffic on this priority. If this counter is increasing, it implies that
       the network is congested and cannot absorb the traffic coming from the
       adapter. Note: This counter is available only if PFC was enabled on
       priority p.
     - Informative

   * - `rx_prio[p]_pause_transition`
     - The number of times a transition from Xoff to Xon on priority p on the
       physical port has occurred. Note: This counter is available only if PFC
       was enabled on priority p.
     - Informative

   * - `tx_prio[p]_pause`
     - The number of pause packets transmitted on priority p on a physical port.
       If this counter is increasing, it implies that the adapter is congested
       and cannot absorb the traffic coming from the network. Note: This counter
       is available only if PFC was enabled on priority p.
     - Informative

   * - `tx_prio[p]_pause_duration`
     - The duration of pause transmitted (in microseconds) on priority p on the
       physical port. Note: This counter is available only if PFC was enabled on
       priority p.
     - Informative

   * - `rx_prio[p]_buf_discard`
     - The number of packets discarded by the device due to a lack of per-host
       receive buffers.
     - Informative

   * - `rx_prio[p]_cong_discard`
     - The number of packets discarded by the device due to per-host congestion.
     - Informative

   * - `rx_prio[p]_marked`
     - The number of packets ECN-marked by the device due to per-host
       congestion.
     - Informative

   * - `rx_prio[p]_discards`
     - The number of packets discarded by the device due to a lack of receive
       buffers.
     - Informative

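Because each entry in this table stands for eight counters, one per priority,
a monitoring script usually has to expand the ``[p]`` placeholder. The sketch
below does that and sums one counter across priorities. It assumes that the
priority digit simply replaces ``[p]`` in the reported name (for example
``rx_prio0_bytes``); the helper names and sample values are illustrative only.

.. code-block:: python

   PRIORITIES = range(8)  # L2 priorities 0-7

   def expand_prio_counter(template):
       """Expand e.g. 'rx_prio[p]_pause' into one name per priority."""
       if "[p]" not in template:
           raise ValueError(f"no [p] placeholder in {template!r}")
       return [template.replace("[p]", str(p)) for p in PRIORITIES]

   def sum_over_priorities(stats, template):
       """Sum a per-priority counter, treating absent priorities as zero."""
       names = expand_prio_counter(template)
       return sum(stats.get(name, 0) for name in names)

   names = expand_prio_counter("rx_prio[p]_pause")
   # Invented sample: only two priorities report a pause counter.
   stats = {"rx_prio0_pause": 3, "rx_prio3_pause": 5}
   total_pause = sum_over_priorities(stats, "rx_prio[p]_pause")
   print(names[0], names[-1], total_pause)

Missing names are treated as zero because, as noted in the table, the pause
counters are available only for priorities on which PFC was enabled.
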
Device Counters
---------------
.. flat-table:: Device Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_pci_signal_integrity`
     - Counts physical layer PCIe signal integrity errors: the number of
       transitions to recovery due to framing errors and CRC (DLLP and TLP)
       errors. If this counter is rising, try moving the adapter card to a
       different slot to rule out a bad PCI slot. Validate that you are running
       with the latest firmware available and latest server BIOS version.
     - Error

   * - `tx_pci_signal_integrity`
     - Counts physical layer PCIe signal integrity errors: the number of
       transitions to recovery initiated by the other side (moving to recovery
       due to getting TS/EIEOS). If this counter is rising, try moving the
       adapter card to a different slot to rule out a bad PCI slot. Validate
       that you are running with the latest firmware available and latest server
       BIOS version.
     - Error

   * - `outbound_pci_buffer_overflow`
     - The number of packets dropped due to PCI buffer overflow. If this counter
       is rising at a high rate, it might indicate that the receive traffic rate
       for a host exceeds what the PCIe bus can carry, so congestion occurs.
     - Informative

   * - `outbound_pci_stalled_rd`
     - The percentage (in the range 0...100) of time within the last second that
       the NIC had outbound non-posted read requests but could not perform the
       operation due to insufficient posted credits.
     - Informative

   * - `outbound_pci_stalled_wr`
     - The percentage (in the range 0...100) of time within the last second that
       the NIC had outbound posted write requests but could not perform the
       operation due to insufficient posted credits.
     - Informative

   * - `outbound_pci_stalled_rd_events`
     - The number of seconds where `outbound_pci_stalled_rd` was above 30%.
     - Informative

   * - `outbound_pci_stalled_wr_events`
     - The number of seconds where `outbound_pci_stalled_wr` was above 30%.
     - Informative

   * - `dev_out_of_buffer`
     - The number of times the device-owned queue did not have enough buffers
       allocated.
     - Error

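The relationship between `outbound_pci_stalled_rd` and
`outbound_pci_stalled_rd_events` can be illustrated with a small sketch. It
assumes a list of once-per-second samples of the percentage counter and counts
the seconds that exceeded 30%, mirroring the description in the table. This
only illustrates the definition and is not how the device computes the
counter; the function name and sample values are hypothetical.

.. code-block:: python

   STALL_THRESHOLD_PERCENT = 30

   def count_stall_events(samples, threshold=STALL_THRESHOLD_PERCENT):
       """Count one-second samples whose stall percentage exceeds threshold."""
       for sample in samples:
           if not 0 <= sample <= 100:
               raise ValueError(f"percentage out of range: {sample}")
       return sum(1 for sample in samples if sample > threshold)

   # Invented per-second readings of outbound_pci_stalled_rd.
   stalled_rd_samples = [0, 12, 31, 30, 85, 4]
   events = count_stall_events(stalled_rd_samples)
   print(events)

A reading of exactly 30 is not counted, since the table describes the event
counters as covering seconds that were above 30%.
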