.. SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
.. include:: <isonum.txt>

================
Ethtool counters
================

:Copyright: |copy| 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Contents
========

- `Overview`_
- `Groups`_
- `Types`_
- `Descriptions`_

Overview
========

Counters are organized into groups according to where in the stack they are
counted. In addition, each group may contain counters of different types.

Each counter group describes one component of the networking setup illustrated
below::

                                                  ----------------------------------------
                                                  |                                      |
    ----------------------------------------    ---------------------------------------- |
    |              Hypervisor              |    |                  VM                  | |
    |                                      |    |                                      | |
    | -------------------  --------------- |    | -------------------  --------------- | |
    | | Ethernet driver |  | RDMA driver | |    | | Ethernet driver |  | RDMA driver | | |
    | -------------------  --------------- |    | -------------------  --------------- | |
    |           |                 |        |    |           |                 |        | |
    |           -------------------        |    |           -------------------        | |
    |                   |                  |    |                   |                  |--
    ----------------------------------------    ----------------------------------------
                        |                                           |
            -------------               -----------------------------
            |                           |
         ------                      ------ ------ ------         ------      ------      ------
    -----| PF |----------------------| VF |-| VF |-| VF |-----  --| PF |--- --| PF |--- --| PF |---
    |    ------                      ------ ------ ------    |  | ------  | | ------  | | ------  |
    |                                                        |  |         | |         | |         |
    |                                                        |  |         | |         | |         |
    |                                                        |  |         | |         | |         |
    | eSwitch                                                |  | eSwitch | | eSwitch | | eSwitch |
    ----------------------------------------------------------  ----------- ----------- -----------
               -------------------------------------------------------------------------------
               |                                                                             |
               |                                                                             |
               | Uplink (no counters)                                                        |
               -------------------------------------------------------------------------------
                       ---------------------------------------------------------------
                       |                                                             |
                       |                                                             |
                       | MPFS (no counters)                                          |
                       ---------------------------------------------------------------
                                                     |
                                                     |
                                                     | Port

Groups
======

Ring
  Software counters populated by the driver stack.

Netdev
  An aggregation of software ring counters.

vPort counters
  Traffic counters and counters of drops caused by steering or lack of buffers.
  May indicate issues with the NIC. These counters include Ethernet traffic
  counters (including Raw Ethernet) and RDMA/RoCE traffic counters.

Physical port counters
  Counters that collect statistics about the PFs and VFs. May indicate issues
  with the NIC, link, or network. This measuring point holds information on
  standardized counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and
  additional counters like flow control, FEC and more. Physical port counters
  are not exposed to virtual machines.

Priority Port Counters
  A set of the physical port counters, per priority per port.

Types
=====

Counters are divided into three types.

Traffic Informative Counters
  Counters which count traffic. These counters can be used for load estimation
  or for general debug.

Traffic Acceleration Counters
  Counters which count traffic that was accelerated by the Mellanox driver or by
  hardware. These counters are an additional layer on top of the informative
  counter set, and the same traffic is counted in both informative and
  acceleration counters.

.. [#accel] Traffic acceleration counter.

Error Counters
  An increment of any of these counters might indicate a problem. Each of these
  counters has an explanation and a corrective action.
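
As a rough illustration of how error counters might be watched, the sketch
below compares two counter snapshots and reports the watched counters that grew
between them. The function name `incremented_counters`, the sample values, and
the choice of watched names are assumptions made for this example. They are not
part of the driver or of any tool.

.. code-block:: python

    def incremented_counters(before, after, names):
        """Return {name: delta} for each watched counter that grew."""
        deltas = {}
        for name in names:
            # A counter missing from a snapshot is treated as zero.
            delta = after.get(name, 0) - before.get(name, 0)
            if delta > 0:
                deltas[name] = delta
        return deltas


    # Hypothetical snapshots taken some time apart.
    before = {"tx0_cqe_err": 0, "rx0_wqe_err": 2, "rx0_packets": 1000}
    after = {"tx0_cqe_err": 0, "rx0_wqe_err": 5, "rx0_packets": 4000}

    # Names of counters documented with type "Error"; supplied by the caller.
    watched = ["tx0_cqe_err", "rx0_wqe_err"]

    print(incremented_counters(before, after, watched))

Only the delta between snapshots is inspected here, since a non-zero absolute
value may simply reflect an event that happened long ago.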

Statistics can be fetched via the `ip link` or `ethtool` commands. `ethtool`
provides more detailed information::

    ip -s link show <if-name>
    ethtool -S <if-name>
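
For scripted processing, the statistics text can be turned into a dictionary.
The sketch below assumes the usual `name: value` line layout printed by
`ethtool -S`. The function name `parse_ethtool_stats` and the sample text are
made up for this example.

.. code-block:: python

    def parse_ethtool_stats(text):
        """Parse "name: value" lines into a {name: int} dictionary."""
        stats = {}
        for line in text.splitlines():
            # Split on the last colon so the value is always the final field.
            name, sep, value = line.strip().rpartition(":")
            value = value.strip()
            # Skip the heading line and anything without an integer value.
            if sep and name and value.isdigit():
                stats[name.strip()] = int(value)
        return stats


    # Hypothetical output, shortened to a few counters.
    sample = """NIC statistics:
         rx_packets: 1500
         rx_bytes: 2250000
         tx_packets: 900
         rx0_packets: 1500
    """

    print(parse_ethtool_stats(sample))

On a live system the text could instead be captured from the `ethtool -S
<if-name>` command shown above, for example with Python's `subprocess.run`.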

Descriptions
============

XSK, PTP, and QoS counters that are similar to counters defined previously will
not be separately listed. For example, `ptp_tx[i]_packets` will not be
explicitly documented since `tx[i]_packets` describes the behavior of both
counters, except that `ptp_tx[i]_packets` is only counted when Precision Time
Protocol (PTP) is used.

Ring / Netdev Counter
----------------------------
The following counters are available per ring or software port.

These counters provide information on the amount of traffic that was accelerated
by the NIC. The acceleration counters count the accelerated traffic in addition
to the standard counters that also count it (i.e. accelerated traffic is counted
twice).
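
Because accelerated traffic appears in both counter sets, an acceleration
counter is better read as a share of its informative counterpart than as
additional traffic. The sketch below shows this with hypothetical values. The
function name `accelerated_share` is an assumption of this example, and the
sketch treats the two counters as directly comparable, which the driver may not
guarantee exactly (header bytes, for instance, might be accounted differently).

.. code-block:: python

    def accelerated_share(stats, accel_name, base_name):
        """Return the fraction of base_name traffic also counted by accel_name."""
        base = stats.get(base_name, 0)
        if base == 0:
            return 0.0  # no traffic yet, avoid dividing by zero
        return stats.get(accel_name, 0) / base


    # Hypothetical software port values.
    stats = {"tx_bytes": 1000000, "tx_tso_bytes": 250000}

    # tx_tso_bytes is already included in tx_bytes, so the total stays
    # tx_bytes; adding the two counters would count the TSO traffic twice.
    total_tx_bytes = stats["tx_bytes"]
    share = accelerated_share(stats, "tx_tso_bytes", "tx_bytes")

    print(total_tx_bytes, share)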

The counter names in the table below refer to both ring and port counters. The
notation for ring counters includes the [i] index without the brackets. The
notation for port counters doesn't include the [i]. A counter name
`rx[i]_packets` will be printed as `rx0_packets` for ring 0 and `rx_packets` for
the software port.
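
The naming rule can be sketched in a few lines. The helper names `ring_name`,
`port_name` and `sum_rings` are invented for this example, and the sample values
are hypothetical. Comparing the ring sum with the port counter relies on the
netdev group being an aggregation of the ring counters. On a live system the
two may differ briefly, or the port counter may include queues not shown here.

.. code-block:: python

    import re


    def ring_name(template, ring):
        """Expand a documented name such as "rx[i]_packets" for one ring."""
        return template.replace("[i]", str(ring))


    def port_name(template):
        """Expand a documented name for the software port (no ring index)."""
        return template.replace("[i]", "")


    def sum_rings(stats, template):
        """Sum the per-ring values that match a documented counter name."""
        prefix, _, suffix = template.partition("[i]")
        # Require at least one digit so the port counter itself is excluded.
        pattern = re.compile(re.escape(prefix) + r"\d+" + re.escape(suffix))
        return sum(value for name, value in stats.items()
                   if pattern.fullmatch(name))


    # Hypothetical values for two rings and the software port.
    stats = {"rx0_packets": 700, "rx1_packets": 800, "rx_packets": 1500}

    print(ring_name("rx[i]_packets", 0))
    print(port_name("rx[i]_packets"))
    print(sum_rings(stats, "rx[i]_packets"))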

.. flat-table:: Ring / Software Port Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx[i]_packets`
     - The number of packets received on ring i.
     - Informative

   * - `rx[i]_bytes`
     - The number of bytes received on ring i.
     - Informative

   * - `tx[i]_packets`
     - The number of packets transmitted on ring i.
     - Informative

   * - `tx[i]_bytes`
     - The number of bytes transmitted on ring i.
     - Informative

   * - `tx[i]_recover`
     - The number of times the SQ was recovered.
     - Error

   * - `tx[i]_cqes`
     - The number of CQE events issued on the SQ of ring i.
     - Informative

   * - `tx[i]_cqe_err`
     - The number of error CQEs encountered on the SQ for ring i.
     - Error

   * - `tx[i]_tso_packets`
     - The number of TSO packets transmitted on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_tso_bytes`
     - The number of TSO bytes transmitted on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_tso_inner_packets`
     - The number of TSO packets which are indicated to carry internal
       encapsulation transmitted on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_tso_inner_bytes`
     - The number of TSO bytes which are indicated to carry internal
       encapsulation transmitted on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_gro_packets`
     - Number of received packets processed using hardware-accelerated GRO. The
       number of hardware GRO offloaded packets received on ring i.
     - Acceleration

   * - `rx[i]_gro_bytes`
     - Number of received bytes processed using hardware-accelerated GRO. The
       number of hardware GRO offloaded bytes received on ring i.
     - Acceleration

   * - `rx[i]_gro_skbs`
     - The number of receive SKBs constructed while performing
       hardware-accelerated GRO.
     - Informative

   * - `rx[i]_gro_match_packets`
     - Number of received packets processed using hardware-accelerated GRO that
       met the flow table match criteria.
     - Informative

   * - `rx[i]_gro_large_hds`
     - Number of received packets using hardware-accelerated GRO that have large
       headers that require additional memory to be allocated.
     - Informative

   * - `rx[i]_lro_packets`
     - The number of LRO packets received on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_lro_bytes`
     - The number of LRO bytes received on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_ecn_mark`
     - The number of received packets where the ECN mark was turned on.
     - Informative

   * - `rx_oversize_pkts_buffer`
     - The number of received packets dropped because they arrived at the RQ
       with a length exceeding the software buffer size allocated by the device
       for incoming traffic. It might imply that the device MTU is larger than
       the software buffer size.
     - Error

   * - `rx_oversize_pkts_sw_drop`
     - Number of received packets dropped in software because the CQE data is
       larger than the MTU size.
     - Error

   * - `rx[i]_csum_unnecessary`
     - Packets received with a `CHECKSUM_UNNECESSARY` on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_unnecessary_inner`
     - Packets received with inner encapsulation with a `CHECKSUM_UNNECESSARY`
       on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_none`
     - Packets received with a `CHECKSUM_NONE` on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_complete`
     - Packets received with a `CHECKSUM_COMPLETE` on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_csum_complete_tail`
     - Number of received packets that had checksum calculation computed,
       potentially needed padding, and were able to do so with
       `CHECKSUM_PARTIAL`.
     - Informative

   * - `rx[i]_csum_complete_tail_slow`
     - Number of received packets that need padding larger than eight bytes for
       the checksum.
     - Informative

   * - `tx[i]_csum_partial`
     - Packets transmitted with a `CHECKSUM_PARTIAL` on ring i [#accel]_.
     - Acceleration

   * - `tx[i]_csum_partial_inner`
     - Packets transmitted with inner encapsulation with a `CHECKSUM_PARTIAL` on
       ring i [#accel]_.
     - Acceleration

   * - `tx[i]_csum_none`
     - Packets transmitted with no hardware checksum acceleration on ring i.
     - Informative

   * - `tx[i]_stopped` / `tx_queue_stopped` [#ring_global]_
     - Events where SQ was full on ring i. If this counter increases, check
       the amount of buffers allocated for transmission.
     - Informative

   * - `tx[i]_wake` / `tx_queue_wake` [#ring_global]_
     - Events where SQ was full and has become not full on ring i.
     - Informative

   * - `tx[i]_dropped` / `tx_queue_dropped` [#ring_global]_
     - Packets transmitted that were dropped due to DMA mapping failure on
       ring i. If this counter increases, check the amount of buffers
       allocated for transmission.
     - Error

   * - `tx[i]_nop`
     - The number of NOP WQEs (empty WQEs) inserted into the SQ (related to
       ring i) upon reaching the end of the cyclic buffer. When nearing the end
       of the cyclic buffer, the driver may add these empty WQEs to avoid
       handling a state where a WQE starts at the end of the queue and ends at
       the beginning of the queue. This is a normal condition.
     - Informative

   * - `tx[i]_timestamps`
     - Transmitted packets that were hardware timestamped at the device's DMA
       layer.
     - Informative

   * - `tx[i]_added_vlan_packets`
     - The number of packets sent where VLAN tag insertion was offloaded to the
       hardware.
     - Acceleration

   * - `rx[i]_removed_vlan_packets`
     - The number of packets received where VLAN tag stripping was offloaded to
       the hardware.
     - Acceleration

   * - `rx[i]_wqe_err`
     - The number of wrong opcodes received on ring i.
     - Error

   * - `rx[i]_mpwqe_frag`
     - The number of WQEs that failed to allocate a compound page and hence
       fragmented MPWQEs (Multi Packet WQEs) were used on ring i. If this
       counter rises, it may suggest that there is not enough memory for large
       pages, so the driver allocated fragmented pages. This is not an abnormal
       condition.
     - Informative

   * - `rx[i]_mpwqe_filler_cqes`
     - The number of filler CQE events that were issued on ring i.
     - Informative

   * - `rx[i]_mpwqe_filler_strides`
     - The number of strides consumed by filler CQEs on ring i.
     - Informative

   * - `tx[i]_mpwqe_blks`
     - The number of send blocks processed from Multi-Packet WQEs (mpwqe).
     - Informative

   * - `tx[i]_mpwqe_pkts`
     - The number of send packets processed from Multi-Packet WQEs (mpwqe).
     - Informative

   * - `rx[i]_cqe_compress_blks`
     - The number of receive blocks with CQE compression on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_cqe_compress_pkts`
     - The number of receive packets with CQE compression on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_arfs_add`
     - The number of aRFS flow rules added to the device for direct RQ steering
       on ring i [#accel]_.
     - Acceleration

   * - `rx[i]_arfs_request_in`
     - Number of flow rules that have been requested to move into ring i for
       direct RQ steering [#accel]_.
     - Acceleration

   * - `rx[i]_arfs_request_out`
     - Number of flow rules that have been requested to move out of ring i
       [#accel]_.
     - Acceleration

   * - `rx[i]_arfs_expired`
     - Number of flow rules that have expired and been removed [#accel]_.
     - Acceleration

   * - `rx[i]_arfs_err`
     - Number of flow rules that failed to be added to the flow table.
     - Error

   * - `rx[i]_recover`
     - The number of times the RQ was recovered.
     - Error

   * - `tx[i]_xmit_more`
     - The number of packets sent with `xmit_more` indication set on the skbuff
       (no doorbell).
     - Acceleration

   * - `ch[i]_poll`
     - The number of invocations of NAPI poll of channel i.
     - Informative

   * - `ch[i]_arm`
     - The number of times the NAPI poll function completed and armed the
       completion queues on channel i.
     - Informative

   * - `ch[i]_aff_change`
     - The number of times the NAPI poll function explicitly stopped execution
       on a CPU due to a change in affinity, on channel i.
     - Informative

   * - `ch[i]_events`
     - The number of hard interrupt events on the completion queues of channel i.
     - Informative

   * - `ch[i]_eq_rearm`
     - The number of times the EQ was recovered.
     - Error

   * - `ch[i]_force_irq`
     - Number of times NAPI is triggered by XSK wakeups by posting a NOP to
       ICOSQ.
     - Acceleration

   * - `rx[i]_congst_umr`
     - The number of times an outstanding UMR request is delayed due to
       congestion, on ring i.
     - Informative

   * - `rx_pp_alloc_fast`
     - Number of successful fast path allocations.
     - Informative

   * - `rx_pp_alloc_slow`
     - Number of slow path order-0 allocations.
     - Informative

   * - `rx_pp_alloc_slow_high_order`
     - Number of slow path high order allocations.
     - Informative

   * - `rx_pp_alloc_empty`
     - Counter is incremented when the ptr ring is empty, so a slow path
       allocation was forced.
     - Informative

   * - `rx_pp_alloc_refill`
     - Counter is incremented when an allocation triggered a refill of the
       cache.
     - Informative

   * - `rx_pp_alloc_waive`
     - Counter is incremented when pages obtained from the ptr ring cannot be
       added to the cache due to a NUMA mismatch.
     - Informative

   * - `rx_pp_recycle_cached`
     - Counter is incremented when recycling placed a page in the page pool
       cache.
     - Informative

   * - `rx_pp_recycle_cache_full`
     - Counter is incremented when the page pool cache was full.
     - Informative

   * - `rx_pp_recycle_ring`
     - Counter is incremented when a page is placed into the ptr ring.
     - Informative

   * - `rx_pp_recycle_ring_full`
     - Counter is incremented when a page is released from the page pool because
       the ptr ring was full.
     - Informative

   * - `rx_pp_recycle_released_ref`
     - Counter is incremented when a page is released (and not recycled) because
       refcnt > 1.
     - Informative

   * - `rx[i]_xsk_buff_alloc_err`
     - The number of times allocating an skb or XSK buffer failed in the XSK RQ
       context.
     - Error

   * - `rx[i]_xdp_tx_xmit`
     - The number of packets forwarded back to the port due to XDP program
       `XDP_TX` action (bouncing). These packets are not counted by other
       software counters. These packets are counted by physical port and vPort
       counters.
     - Informative

   * - `rx[i]_xdp_tx_mpwqe`
     - Number of multi-packet WQEs transmitted by the netdev and `XDP_TX`-ed by
       the netdev during the RQ context.
     - Acceleration

   * - `rx[i]_xdp_tx_inlnw`
     - Number of WQE data segments transmitted where the data could be inlined
       in the WQE and then `XDP_TX`-ed during the RQ context.
     - Acceleration

   * - `rx[i]_xdp_tx_nops`
     - Number of NOP WQEBBs (WQE building blocks) posted to the XDP SQ.
     - Acceleration

   * - `rx[i]_xdp_tx_full`
     - The number of packets that should have been forwarded back to the port
       due to `XDP_TX` action but were dropped due to a full tx queue. These
       packets are not counted by other software counters. These packets are
       counted by physical port and vPort counters. You may open more rx queues
       and spread rx traffic over all queues and/or increase the rx ring size.
     - Error

   * - `rx[i]_xdp_tx_err`
     - The number of times an `XDP_TX` error such as frame too long or frame
       too short occurred on the `XDP_TX` ring of the RX ring.
     - Error

   * - `rx[i]_xdp_tx_cqes` / `rx_xdp_tx_cqe` [#ring_global]_
     - The number of completions received on the CQ of the `XDP_TX` ring.
     - Informative

   * - `rx[i]_xdp_drop`
     - The number of packets dropped due to XDP program `XDP_DROP` action. These
       packets are not counted by other software counters. These packets are
       counted by physical port and vPort counters.
     - Informative

   * - `rx[i]_xdp_redirect`
     - The number of times an XDP redirect action was triggered on ring i.
     - Acceleration

   * - `tx[i]_xdp_xmit`
     - The number of packets redirected to the interface (due to XDP redirect).
       These packets are not counted by other software counters. These packets
       are counted by physical port and vPort counters.
     - Informative

   * - `tx[i]_xdp_full`
     - The number of packets redirected to the interface (due to XDP redirect)
       but dropped due to a full tx queue. These packets are not counted by
       other software counters. You may enlarge the tx queues.
     - Informative

   * - `tx[i]_xdp_mpwqe`
     - Number of multi-packet WQEs offloaded onto the NIC that were
       `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xdp_inlnw`
     - Number of WQE data segments where the data could be inlined in the WQE
       where the data segments were `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xdp_nops`
     - Number of NOP WQEBBs (WQE building blocks) posted to the SQ that were
       `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xdp_err`
     - The number of packets redirected to the interface (due to XDP redirect)
       but dropped due to an error such as frame too long or frame too short.
     - Error

   * - `tx[i]_xdp_cqes`
     - The number of completions received for packets redirected to the
       interface (due to XDP redirect) on the CQ.
     - Informative

   * - `tx[i]_xsk_xmit`
     - The number of packets transmitted using XSK zerocopy functionality.
     - Acceleration

   * - `tx[i]_xsk_mpwqe`
     - Number of multi-packet WQEs offloaded onto the NIC that were
       `XDP_REDIRECT`-ed from other netdevs.
     - Acceleration

   * - `tx[i]_xsk_inlnw`
     - Number of WQE data segments where the data could be inlined in the WQE
       that are transmitted using XSK zerocopy.
     - Acceleration

   * - `tx[i]_xsk_full`
     - Number of times the doorbell is rung in XSK zerocopy mode when the SQ is
       full.
     - Error

   * - `tx[i]_xsk_err`
     - Number of errors that occurred in XSK zerocopy mode such as if the data
       size is larger than the MTU size.
     - Error

   * - `tx[i]_xsk_cqes`
     - Number of CQEs processed in XSK zerocopy mode.
     - Acceleration

   * - `tx_tls_ctx`
     - Number of TLS TX HW offload contexts added to device for encryption.
     - Acceleration

   * - `tx_tls_del`
     - Number of TLS TX HW offload contexts removed from device (connection
       closed).
     - Acceleration

   * - `tx_tls_pool_alloc`
     - Number of times a unit of work is successfully allocated in the TLS HW
       offload pool.
     - Acceleration

   * - `tx_tls_pool_free`
     - Number of times a unit of work is freed in the TLS HW offload pool.
     - Acceleration

   * - `rx_tls_ctx`
     - Number of TLS RX HW offload contexts added to device for decryption.
     - Acceleration

   * - `rx_tls_del`
     - Number of TLS RX HW offload contexts deleted from device (connection has
       finished).
     - Acceleration

   * - `rx[i]_tls_decrypted_packets`
     - Number of successfully decrypted RX packets which were part of a TLS
       stream.
     - Acceleration

   * - `rx[i]_tls_decrypted_bytes`
     - Number of TLS payload bytes in RX packets which were successfully
       decrypted.
     - Acceleration

   * - `rx[i]_tls_resync_req_pkt`
     - Number of received TLS packets with a resync request.
     - Acceleration

   * - `rx[i]_tls_resync_req_start`
     - Number of times the TLS async resync request was started.
     - Acceleration

   * - `rx[i]_tls_resync_req_end`
     - Number of times the TLS async resync request properly ended with
       providing the HW tracked tcp-seq.
     - Acceleration

   * - `rx[i]_tls_resync_req_skip`
     - Number of times the TLS async resync request procedure was started but
       not properly ended.
     - Error

   * - `rx[i]_tls_resync_res_ok`
     - Number of times the TLS resync response call to the driver was
       successfully handled.
     - Acceleration

   * - `rx[i]_tls_resync_res_retry`
     - Number of times the TLS resync response call to the driver was
       reattempted when ICOSQ is full.
     - Error

   * - `rx[i]_tls_resync_res_skip`
     - Number of times the TLS resync response call to the driver was terminated
       unsuccessfully.
     - Error

   * - `rx[i]_tls_err`
     - Number of times when CQE TLS offload was problematic.
     - Error

   * - `tx[i]_tls_encrypted_packets`
     - The number of send packets that are TLS encrypted by the kernel.
     - Acceleration

   * - `tx[i]_tls_encrypted_bytes`
     - The number of send bytes that are TLS encrypted by the kernel.
     - Acceleration

   * - `tx[i]_tls_ooo`
     - Number of times out of order TLS SQE fragments were handled on ring i.
     - Acceleration

   * - `tx[i]_tls_dump_packets`
     - Number of TLS decrypted packets copied over from NIC over DMA.
     - Acceleration

   * - `tx[i]_tls_dump_bytes`
     - Number of TLS decrypted bytes copied over from NIC over DMA.
     - Acceleration

   * - `tx[i]_tls_resync_bytes`
     - Number of TLS bytes requested to be resynchronized in order to be
       decrypted.
     - Acceleration

   * - `tx[i]_tls_skip_no_sync_data`
     - Number of TLS send data that can safely be skipped / do not need to be
       decrypted.
     - Acceleration

   * - `tx[i]_tls_drop_no_sync_data`
     - Number of TLS send data that were dropped due to retransmission of TLS
       data.
     - Acceleration

   * - `ptp_cq[i]_abort`
     - Number of times a CQE has to be skipped in precision time protocol due to
       a skew between the port timestamp and CQE timestamp being greater than
       128 seconds.
     - Error

   * - `ptp_cq[i]_abort_abs_diff_ns`
     - Accumulation of time differences between the port timestamp and CQE
       timestamp when the difference is greater than 128 seconds in precision
       time protocol.
     - Error

   * - `ptp_cq[i]_late_cqe`
     - Number of times a CQE has been delivered on the PTP timestamping CQ when
       the CQE was not expected since a certain amount of time had elapsed where
       the device typically ensures not posting the CQE.
     - Error

   * - `ptp_cq[i]_lost_cqe`
     - Number of times a CQE is expected to not be delivered on the PTP
       timestamping CQE by the device due to a time delta elapsing. If such a
       CQE is somehow delivered, `ptp_cq[i]_late_cqe` is incremented.
     - Error

.. [#ring_global] The corresponding ring and global counters do not share the
                  same name (i.e. do not follow the common naming scheme).
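
A script that derives the global counter name from a ring name has to treat
these rows as exceptions. The sketch below lists the pairs marked with
[#ring_global]_ in the table above and falls back to the common scheme (dropping
the `[i]` index) for everything else. The names `IRREGULAR_GLOBAL_NAMES` and
`global_name` are invented for this example.

.. code-block:: python

    # Ring template -> global counter name, taken from the table rows that
    # carry the ring_global footnote.
    IRREGULAR_GLOBAL_NAMES = {
        "tx[i]_stopped": "tx_queue_stopped",
        "tx[i]_wake": "tx_queue_wake",
        "tx[i]_dropped": "tx_queue_dropped",
        "rx[i]_xdp_tx_cqes": "rx_xdp_tx_cqe",
    }


    def global_name(template):
        """Return the global counter name for a documented ring template."""
        if template in IRREGULAR_GLOBAL_NAMES:
            return IRREGULAR_GLOBAL_NAMES[template]
        # Common scheme: the global name is the template without the index.
        return template.replace("[i]", "")


    print(global_name("tx[i]_stopped"))
    print(global_name("rx[i]_packets"))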

vPort Counters
--------------
Counters on the NIC port that is connected to an eSwitch.

.. flat-table:: vPort Counter Table
   :widths: 2 3 1

   * - Counter
     - Description
     - Type

   * - `rx_vport_unicast_packets`
     - Unicast packets received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_unicast_bytes`
     - Unicast bytes received, steered to a port including Raw Ethernet QP/DPDK
       traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_unicast_packets`
     - Unicast packets transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_unicast_bytes`
     - Unicast bytes transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_multicast_packets`
     - Multicast packets received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_multicast_bytes`
     - Multicast bytes received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_multicast_packets`
     - Multicast packets transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_multicast_bytes`
     - Multicast bytes transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_broadcast_packets`
     - Broadcast packets received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_broadcast_bytes`
     - Broadcast bytes received, steered to a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_broadcast_packets`
     - Broadcast packets transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `tx_vport_broadcast_bytes`
     - Broadcast bytes transmitted, steered from a port including Raw Ethernet
       QP/DPDK traffic, excluding RDMA traffic.
     - Informative

   * - `rx_vport_rdma_unicast_packets`
     - RDMA unicast packets received, steered to a port (the counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_unicast_bytes`
     - RDMA unicast bytes received, steered to a port (the counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_unicast_packets`
     - RDMA unicast packets transmitted, steered from a port (the counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_unicast_bytes`
     - RDMA unicast bytes transmitted, steered from a port (the counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_multicast_packets`
     - RDMA multicast packets received, steered to a port (the counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `rx_vport_rdma_multicast_bytes`
     - RDMA multicast bytes received, steered to a port (the counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_multicast_packets`
     - RDMA multicast packets transmitted, steered from a port (the counter
       counts RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `tx_vport_rdma_multicast_bytes`
     - RDMA multicast bytes transmitted, steered from a port (the counter counts
       RoCE/UD/RC traffic) [#accel]_.
     - Acceleration

   * - `vport_loopback_packets`
     - Unicast, multicast and broadcast packets that were looped back (received
       and transmitted), IB/Eth [#accel]_.
     - Acceleration

   * - `vport_loopback_bytes`
     - Unicast, multicast and broadcast bytes that were looped back (received
       and transmitted), IB/Eth [#accel]_.
     - Acceleration

   * - `rx_steer_missed_packets`
     - Number of packets that were received by the NIC but were discarded
       because they did not match any flow in the NIC flow table.
     - Error

   * - `rx_packets`
     - Representor only: packets received, that were handled by the hypervisor.
     - Informative

   * - `rx_bytes`
     - Representor only: bytes received, that were handled by the hypervisor.
     - Informative

   * - `tx_packets`
     - Representor only: packets transmitted, that were handled by the
       hypervisor.
     - Informative

   * - `tx_bytes`
     - Representor only: bytes transmitted, that were handled by the hypervisor.
     - Informative

   * - `dev_internal_queue_oob`
     - The number of dropped packets due to lack of receive WQEs for an internal
       device RQ.
     - Error
866
867Physical Port Counters
868----------------------
869The physical port counters are the counters on the external port connecting the
870adapter to the network. This measuring point holds information on standardized
871counters like IEEE 802.3, RFC 2863, RFC 2819, RFC 3635 and additional counters
872like flow control, FEC and more.
873
874.. flat-table:: Physical Port Counter Table
875   :widths: 2 3 1
876
877   * - Counter
878     - Description
879     - Type
880
881   * - `rx_packets_phy`
882     - The number of packets received on the physical port. This counter doesn’t
883       include packets that were discarded due to FCS, frame size and similar
884       errors.
885     - Informative
886
887   * - `tx_packets_phy`
888     - The number of packets transmitted on the physical port.
889     - Informative
890
891   * - `rx_bytes_phy`
892     - The number of bytes received on the physical port, including Ethernet
893       header and FCS.
894     - Informative
895
896   * - `tx_bytes_phy`
897     - The number of bytes transmitted on the physical port.
898     - Informative
899
900   * - `rx_multicast_phy`
901     - The number of multicast packets received on the physical port.
902     - Informative
903
904   * - `tx_multicast_phy`
905     - The number of multicast packets transmitted on the physical port.
906     - Informative
907
908   * - `rx_broadcast_phy`
909     - The number of broadcast packets received on the physical port.
910     - Informative
911
912   * - `tx_broadcast_phy`
913     - The number of broadcast packets transmitted on the physical port.
914     - Informative
915
916   * - `rx_crc_errors_phy`
917     - The number of dropped received packets due to FCS (Frame Check Sequence)
918       error on the physical port. If this counter increases at a high rate,
919       check the link quality using the `rx_symbol_err_phy` and
920       `rx_corrected_bits_phy` counters below.
921     - Error
922
923   * - `rx_in_range_len_errors_phy`
924     - The number of received packets dropped due to length/type errors on a
925       physical port.
926     - Error
927
928   * - `rx_out_of_range_len_phy`
929     - The number of received packets dropped due to length greater than allowed
930       on a physical port. If this counter is increasing, it implies that the
931       peer connected to the adapter has a larger MTU configured. Using
932       the same MTU configuration on both ends should resolve this issue.
933     - Error
934
935   * - `rx_oversize_pkts_phy`
936     - The number of received packets dropped due to a length that exceeds the
937       MTU size on a physical port. If this counter is increasing, it implies
938       that the peer connected to the adapter has a larger MTU configured.
939       Using the same MTU configuration on both ends should resolve this issue.
940     - Error
941
942   * - `rx_symbol_err_phy`
943     - The number of received packets dropped due to physical coding errors
944       (symbol errors) on a physical port.
945     - Error
946
947   * - `rx_mac_control_phy`
948     - The number of MAC control packets received on the physical port.
949     - Informative
950
951   * - `tx_mac_control_phy`
952     - The number of MAC control packets transmitted on the physical port.
953     - Informative
954
955   * - `rx_pause_ctrl_phy`
956     - The number of link layer pause packets received on a physical port. If
957       this counter is increasing, it implies that the network is congested and
958       cannot absorb the traffic coming from the adapter.
959     - Informative
960
961   * - `tx_pause_ctrl_phy`
962     - The number of link layer pause packets transmitted on a physical port. If
963       this counter is increasing, it implies that the NIC is congested and
964       cannot absorb the traffic coming from the network.
965     - Informative
966
967   * - `rx_unsupported_op_phy`
968     - The number of MAC control packets received with unsupported opcode on a
969       physical port.
970     - Error
971
972   * - `rx_discards_phy`
973     - The number of received packets dropped due to lack of buffers on a
974       physical port. If this counter is increasing, it implies that the adapter
975       is congested and cannot absorb the traffic coming from the network.
976     - Error
977
978   * - `tx_discards_phy`
979     - The number of packets which were discarded on transmission, even though
980       no errors were detected. The drop might occur due to the link being
981       down, head-of-line drop, pause from the network, etc.
982     - Error
983
984   * - `tx_errors_phy`
985     - The number of transmitted packets dropped due to a length that exceeds
986       the MTU size on a physical port.
987     - Error
988
989   * - `rx_undersize_pkts_phy`
990     - The number of received packets dropped due to length which is shorter
991       than 64 bytes on a physical port. If this counter is increasing, it
992       implies that the peer connected to the adapter has a non-standard MTU
993       configured or that a malformed packet has arrived.
994     - Error
995
996   * - `rx_fragments_phy`
997     - The number of received packets dropped due to a length which is shorter
998       than 64 bytes and has FCS error on a physical port. If this counter is
999       increasing, it implies that the peer connected to the adapter has a
1000       non-standard MTU configured.
1001     - Error
1002
1003   * - `rx_jabbers_phy`
1004     - The number of received packets dropped due to a length which is longer
1005       than 64 bytes and had an FCS error on a physical port.
1006     - Error
1007
1008   * - `rx_64_bytes_phy`
1009     - The number of packets received on the physical port with size of 64 bytes.
1010     - Informative
1011
1012   * - `rx_65_to_127_bytes_phy`
1013     - The number of packets received on the physical port with size of 65 to
1014       127 bytes.
1015     - Informative
1016
1017   * - `rx_128_to_255_bytes_phy`
1018     - The number of packets received on the physical port with size of 128 to
1019       255 bytes.
1020     - Informative
1021
1022   * - `rx_256_to_511_bytes_phy`
1023     - The number of packets received on the physical port with size of 256 to
1024       511 bytes.
1025     - Informative
1026
1027   * - `rx_512_to_1023_bytes_phy`
1028     - The number of packets received on the physical port with size of 512 to
1029       1023 bytes.
1030     - Informative
1031
1032   * - `rx_1024_to_1518_bytes_phy`
1033     - The number of packets received on the physical port with size of 1024 to
1034       1518 bytes.
1035     - Informative
1036
1037   * - `rx_1519_to_2047_bytes_phy`
1038     - The number of packets received on the physical port with size of 1519 to
1039       2047 bytes.
1040     - Informative
1041
1042   * - `rx_2048_to_4095_bytes_phy`
1043     - The number of packets received on the physical port with size of 2048 to
1044       4095 bytes.
1045     - Informative
1046
1047   * - `rx_4096_to_8191_bytes_phy`
1048     - The number of packets received on the physical port with size of 4096 to
1049       8191 bytes.
1050     - Informative
1051
1052   * - `rx_8192_to_10239_bytes_phy`
1053     - The number of packets received on the physical port with size of 8192 to
1054       10239 bytes.
1055     - Informative
1056
1057   * - `link_down_events_phy`
1058     - The number of times the link operative state changed to down. If this
1059       counter is increasing, it may indicate port flapping. You may need to
1060       replace the cable/transceiver.
1061     - Error
1062
1063   * - `rx_out_of_buffer`
1064     - Number of times receive queue had no software buffers allocated for the
1065       adapter's incoming traffic.
1066     - Error
1067
1068   * - `module_bus_stuck`
1069     - The number of times a short-wire on the module's I\ :sup:`2`\C bus (data
1070       or clock) was detected. You may need to replace the cable/transceiver.
1071     - Error
1072
1073   * - `module_high_temp`
1074     - The number of times that the module temperature was too high. If this
1075       issue persists, you may need to check the ambient temperature or replace
1076       the cable/transceiver module.
1077     - Error
1078
1079   * - `module_bad_shorted`
1080     - The number of times that the module cables were shorted. You may need to
1081       replace the cable/transceiver module.
1082     - Error
1083
1084   * - `module_unplug`
1085     - The number of times the module was ejected.
1086     - Informative
1087
1088   * - `rx_buffer_passed_thres_phy`
1089     - The number of events where the port receive buffer was over 85% full.
1090     - Informative
1091
1092   * - `tx_pause_storm_warning_events`
1093     - The number of times the device was sending pauses for a long period of
1094       time.
1095     - Informative
1096
1097   * - `tx_pause_storm_error_events`
1098     - The number of times the device was sending pauses for a long period of
1099       time, reaching a timeout and disabling transmission of pause frames.
1100       During the period when pause frames were disabled, drops could have
1101       occurred.
1102     - Error
1103
1104   * - `rx[i]_buff_alloc_err`
1105     - Failed to allocate a buffer for a received packet (or SKB) on ring i.
1106     - Error
1107
1108   * - `rx_bits_phy`
1109     - This counter provides information on the total amount of traffic that
1110       could have been received and can be used as a guideline to measure the
1111       ratio of errored traffic in `rx_pcs_symbol_err_phy` and
1112       `rx_corrected_bits_phy`.
1113     - Informative
1114
1115   * - `rx_pcs_symbol_err_phy`
1116     - This counter counts the number of symbol errors that were not corrected
1117       by the FEC correction algorithm, or that occurred while no FEC algorithm
1118       was active on this interface. If this counter is increasing, it implies
1119       that the link between the NIC and the network is suffering from high BER,
1120       and that traffic is lost. You may need to replace the cable/transceiver.
1121       The error rate is the number of `rx_pcs_symbol_err_phy` divided by the
1122       number of `rx_bits_phy` over a specific time frame.
1123     - Error
1124
1125   * - `rx_corrected_bits_phy`
1126     - The number of corrected bits on this port according to active FEC
1127       (RS/FC). If this counter is increasing, it implies that the link between
1128       the NIC and the network is suffering from high BER. The corrected bit
1129       rate is the number of `rx_corrected_bits_phy` divided by the number of
1130       `rx_bits_phy` over a specific time frame.
1131     - Error
1132
1133   * - `rx_err_lane_[l]_phy`
1134     - This counter counts the number of raw physical errors on lane index l.
1135       The counter counts errors before FEC corrections. If this counter is
1136       increasing, it implies that the link between the NIC and the network is
1137       suffering from high BER, and that traffic might be lost. You may need to
1138       replace the cable/transceiver. Check this counter together with
1139       `rx_corrected_bits_phy`.
1140     - Error
1141
1142   * - `rx_global_pause`
1143     - The number of pause packets received on the physical port. If this
1144       counter is increasing, it implies that the network is congested and
1145       cannot absorb the traffic coming from the adapter. Note: This counter is
1146       only enabled when global pause mode is enabled.
1147     - Informative
1148
1149   * - `rx_global_pause_duration`
1150     - The duration of pause received (in microseconds) on the physical port.
1151       The counter represents the time the port did not send any traffic. If
1152       this counter is increasing, it implies that the network is congested and
1153       cannot absorb the traffic coming from the adapter. Note: This counter is
1154       only enabled when global pause mode is enabled.
1155     - Informative
1156
1157   * - `tx_global_pause`
1158     - The number of pause packets transmitted on a physical port. If this
1159       counter is increasing, it implies that the adapter is congested and
1160       cannot absorb the traffic coming from the network. Note: This counter is
1161       only enabled when global pause mode is enabled.
1162     - Informative
1163
1164   * - `tx_global_pause_duration`
1165     - The duration of pause transmitted (in microseconds) on the physical port.
1166       Note: This counter is only enabled when global pause mode is enabled.
1167     - Informative
1168
1169   * - `rx_global_pause_transition`
1170     - The number of times a transition from Xoff to Xon on the physical port
1171       has occurred. Note: This counter is only enabled when global pause mode
1172       is enabled.
1173     - Informative
1174
1175   * - `rx_if_down_packets`
1176     - The number of received packets that were dropped due to interface down.
1177     - Informative
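The error-rate calculations described for `rx_pcs_symbol_err_phy` and
`rx_corrected_bits_phy` can be sketched in a few lines of Python. This is a
minimal, illustrative sketch rather than a supported tool: it assumes the
counters were captured twice with ``ethtool -S <interface>`` in the usual
``name: value`` layout, the helper names `parse_ethtool_stats` and
`rate_over_interval` are hypothetical, and the sample values are made up.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` style text into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        if not sep:
            continue  # no "name: value" pair on this line
        try:
            stats[name.strip()] = int(value)
        except ValueError:
            # Skips the "NIC statistics:" header and non-numeric values.
            continue
    return stats


def rate_over_interval(before, after, error_counter,
                       total_counter="rx_bits_phy"):
    """Return delta(error_counter) / delta(total_counter) between two samples.

    Returns None when no bits were counted in the interval.
    """
    errors = after[error_counter] - before[error_counter]
    total = after[total_counter] - before[total_counter]
    if total <= 0:
        return None
    return errors / total


# Made-up samples standing in for two `ethtool -S` captures.
SAMPLE_T0 = """NIC statistics:
     rx_bits_phy: 1000000
     rx_pcs_symbol_err_phy: 2
     rx_corrected_bits_phy: 10
"""
SAMPLE_T1 = """NIC statistics:
     rx_bits_phy: 3000000
     rx_pcs_symbol_err_phy: 4
     rx_corrected_bits_phy: 30
"""

t0 = parse_ethtool_stats(SAMPLE_T0)
t1 = parse_ethtool_stats(SAMPLE_T1)
print("symbol error rate:", rate_over_interval(t0, t1, "rx_pcs_symbol_err_phy"))
print("corrected bit rate:", rate_over_interval(t0, t1, "rx_corrected_bits_phy"))
```

Dividing the deltas between two samples, rather than the raw totals, keeps the
result tied to the chosen time frame.
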
1178
1179Priority Port Counters
1180----------------------
1181The following counters are physical port counters that are counted per L2
1182priority (0-7).
1183
1184**Note:** `p` in the counter name represents the priority.
1185
1186.. flat-table:: Priority Port Counter Table
1187   :widths: 2 3 1
1188
1189   * - Counter
1190     - Description
1191     - Type
1192
1193   * - `rx_prio[p]_bytes`
1194     - The number of bytes received with priority p on the physical port.
1195     - Informative
1196
1197   * - `rx_prio[p]_packets`
1198     - The number of packets received with priority p on the physical port.
1199     - Informative
1200
1201   * - `tx_prio[p]_bytes`
1202     - The number of bytes transmitted on priority p on the physical port.
1203     - Informative
1204
1205   * - `tx_prio[p]_packets`
1206     - The number of packets transmitted on priority p on the physical port.
1207     - Informative
1208
1209   * - `rx_prio[p]_pause`
1210     - The number of pause packets received with priority p on a physical port.
1211       If this counter is increasing, it implies that the network is congested
1212       and cannot absorb the traffic coming from the adapter. Note: This counter
1213       is available only if PFC was enabled on priority p.
1214     - Informative
1215
1216   * - `rx_prio[p]_pause_duration`
1217     - The duration of pause received (in microseconds) on priority p on the
1218       physical port. The counter represents the time the port did not send any
1219       traffic on this priority. If this counter is increasing, it implies that
1220       the network is congested and cannot absorb the traffic coming from the
1221       adapter. Note: This counter is available only if PFC was enabled on
1222       priority p.
1223     - Informative
1224
1225   * - `rx_prio[p]_pause_transition`
1226     - The number of times a transition from Xoff to Xon on priority p on the
1227       physical port has occurred. Note: This counter is available only if PFC
1228       was enabled on priority p.
1229     - Informative
1230
1231   * - `tx_prio[p]_pause`
1232     - The number of pause packets transmitted on priority p on a physical port.
1233       If this counter is increasing, it implies that the adapter is congested
1234       and cannot absorb the traffic coming from the network. Note: This counter
1235       is available only if PFC was enabled on priority p.
1236     - Informative
1237
1238   * - `tx_prio[p]_pause_duration`
1239     - The duration of pause transmitted (in microseconds) on priority p on the
1240       physical port. Note: This counter is available only if PFC was enabled on
1241       priority p.
1242     - Informative
1243
1244   * - `rx_prio[p]_buf_discard`
1245     - The number of packets discarded by device due to lack of per host receive
1246       buffers.
1247     - Informative
1248
1249   * - `rx_prio[p]_cong_discard`
1250     - The number of packets discarded by device due to per host congestion.
1251     - Informative
1252
1253   * - `rx_prio[p]_marked`
1254     - The number of packets ECN-marked by device due to per host congestion.
1255     - Informative
1256
1257   * - `rx_prio[p]_discards`
1258     - The number of packets discarded by device due to lack of receive buffers.
1259     - Informative
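As an illustration of the `[p]` naming and of the pause-duration counters in
the table above, the following sketch expands `[p]` into concrete counter names
and estimates what share of a sampling interval each priority spent paused. It
is a sketch under stated assumptions: two counter snapshots are already
available as dictionaries (for example parsed from ``ethtool -S``), the helper
names `prio_counter` and `paused_fraction` are hypothetical, and the sample
values are made up.

```python
PRIORITIES = range(8)  # L2 priorities 0-7


def prio_counter(template, prio):
    """Expand e.g. 'rx_prio[p]_pause_duration' to 'rx_prio3_pause_duration'."""
    return template.replace("[p]", str(prio))


def paused_fraction(before, after, interval_sec):
    """Per-priority share of the interval covered by received pause time.

    Uses rx_prio[p]_pause_duration, which is reported in microseconds.
    """
    result = {}
    for prio in PRIORITIES:
        name = prio_counter("rx_prio[p]_pause_duration", prio)
        if name not in before or name not in after:
            # The counter is only available when PFC is enabled on this priority.
            continue
        delta_usec = after[name] - before[name]
        result[prio] = delta_usec / (interval_sec * 1_000_000)
    return result


# Made-up snapshots taken one second apart, with PFC enabled on priority 3 only.
snapshot_t0 = {"rx_prio3_pause_duration": 1000}
snapshot_t1 = {"rx_prio3_pause_duration": 251000}
print(paused_fraction(snapshot_t0, snapshot_t1, interval_sec=1.0))
```
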
1260
1261Device Counters
1262---------------
1263.. flat-table:: Device Counter Table
1264   :widths: 2 3 1
1265
1266   * - Counter
1267     - Description
1268     - Type
1269
1270   * - `rx_pci_signal_integrity`
1271     - Counts physical layer PCIe signal integrity errors, the number of
1272       transitions to recovery due to Framing errors and CRC (dlp and tlp). If
1273       this counter is rising, try moving the adapter card to a different slot
1274       to rule out a bad PCI slot. Validate that you are running with the latest
1275       firmware available and latest server BIOS version.
1276     - Error
1277
1278   * - `tx_pci_signal_integrity`
1279     - Counts physical layer PCIe signal integrity errors, the number of
1280       transitions to recovery initiated by the other side (moving to recovery
1281       due to getting TS/EIEOS). If this counter is rising, try moving the
1282       adapter card to a different slot to rule out a bad PCI slot. Validate
1283       that you are running with the latest firmware available and latest server
1284       BIOS version.
1285     - Error
1286
1287   * - `outbound_pci_buffer_overflow`
1288     - The number of packets dropped due to PCI buffer overflow. If this counter
1289       is rising at a high rate, it might indicate that the receive traffic
1290       rate for a host exceeds the PCIe bus capacity, causing congestion.
1291     - Informative
1292
1293   * - `outbound_pci_stalled_rd`
1294     - The percentage (in the range 0...100) of time within the last second that
1295       the NIC had outbound non-posted read requests but could not perform the
1296       operation due to insufficient posted credits.
1297     - Informative
1298
1299   * - `outbound_pci_stalled_wr`
1300     - The percentage (in the range 0...100) of time within the last second that
1301       the NIC had outbound posted write requests but could not perform the
1302       operation due to insufficient posted credits.
1303     - Informative
1304
1305   * - `outbound_pci_stalled_rd_events`
1306     - The number of seconds where `outbound_pci_stalled_rd` was above 30%.
1307     - Informative
1308
1309   * - `outbound_pci_stalled_wr_events`
1310     - The number of seconds where `outbound_pci_stalled_wr` was above 30%.
1311     - Informative
1312
1313   * - `dev_out_of_buffer`
1314     - The number of times the device-owned queue did not have enough buffers
1315       allocated.
1316     - Error
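Because `outbound_pci_stalled_rd_events` and `outbound_pci_stalled_wr_events`
count seconds in which the corresponding stall percentage was above 30%, the
difference between two samples divided by the elapsed time gives a rough share
of the interval spent heavily stalled. The sketch below assumes two counter
snapshots are already available as dictionaries; the helper name
`stalled_seconds_share` and the sample values are hypothetical.

```python
def stalled_seconds_share(before, after, elapsed_sec, direction="rd"):
    """Share of elapsed seconds in which the stall percentage exceeded 30%.

    direction is "rd" or "wr", selecting outbound_pci_stalled_<direction>_events.
    """
    if direction not in ("rd", "wr"):
        raise ValueError("direction must be 'rd' or 'wr'")
    if elapsed_sec <= 0:
        raise ValueError("elapsed_sec must be positive")
    name = f"outbound_pci_stalled_{direction}_events"
    return (after[name] - before[name]) / elapsed_sec


# Made-up snapshots taken 60 seconds apart.
snapshot_t0 = {"outbound_pci_stalled_rd_events": 5}
snapshot_t1 = {"outbound_pci_stalled_rd_events": 20}
print(stalled_seconds_share(snapshot_t0, snapshot_t1, elapsed_sec=60))
```
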
1317