.. SPDX-License-Identifier: GPL-2.0

Diagnostic Concept for Investigating Twisted Pair Ethernet Variants at OSI Layer 1
==================================================================================
5
6Introduction
7------------
8
9This documentation is designed for two primary audiences:
10
1. **Users and System Administrators**: For those dealing with real-world
   Ethernet issues, this guide provides a practical, step-by-step
   troubleshooting flow to help identify and resolve common problems in Twisted
   Pair Ethernet at OSI Layer 1. If you're facing unstable links, speed drops,
   or mysterious network issues, jump right into the step-by-step guide and
   follow it through to find your solution.

2. **Kernel Developers**: For developers working with network drivers and PHY
   support, this documentation outlines the diagnostic process and highlights
   areas where the Linux kernel’s diagnostic interfaces could be extended or
   improved. By understanding the diagnostic flow, developers can better
   prioritize future enhancements.
23
24Step-by-Step Diagnostic Guide from Linux (General Ethernet)
25-----------------------------------------------------------
26
27This diagnostic guide covers common Ethernet troubleshooting scenarios,
28focusing on **link stability and detection** across different Ethernet
29environments, including **Single-Pair Ethernet (SPE)** and **Multi-Pair
30Ethernet (MPE)**, as well as power delivery technologies like **PoDL** (Power
31over Data Line) and **PoE** (Clause 33 PSE).
32
33The guide is designed to help users diagnose physical layer (Layer 1) issues on
34systems running **Linux kernel version 6.11 or newer**, utilizing **ethtool
35version 6.10 or later** and **iproute2 version 6.4.0 or later**.
36
37In this guide, we assume that users may have **limited or no access to the link
38partner** and will focus on diagnosing issues locally.
39
40Diagnostic Scenarios
41~~~~~~~~~~~~~~~~~~~~
42
43- **Link is up and stable, but no data transfer**: If the link is stable but
44  there are issues with data transmission, refer to the **OSI Layer 2
45  Troubleshooting Guide**.
46
47- **Link is unstable**: Link resets, speed drops, or other fluctuations
48  indicate potential issues at the hardware or physical layer.
49
50- **No link detected**: The interface is up, but no link is established.
51
52Verify Interface Status
53~~~~~~~~~~~~~~~~~~~~~~~
54
Begin by verifying the status of the Ethernet interface to check if it is
administratively up. `ethtool` provides information on the link and PHY
status, but it does not show the **administrative state** of the interface.
To check this, use the `ip` command, which lists the interface state flags
within the angle brackets `"<>"` in its output.
60
61For example, in the output `<NO-CARRIER,BROADCAST,MULTICAST,UP>`, the important
62keywords are:
63
64- **UP**: The interface is in the administrative "UP" state.
65- **NO-CARRIER**: The interface is administratively up, but no physical link is
66  detected.
67
68If the output shows `<BROADCAST,MULTICAST>`, this indicates the interface is in
69the administrative "DOWN" state.
70
71- **Command:** `ip link show dev <interface>`
72
73- **Expected Output:**
74
75  .. code-block:: bash
76
77     4: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 ...
78        link/ether 88:14:2b:00:96:f2 brd ff:ff:ff:ff:ff:ff
79
80- **Interpreting the Output:**
81
82  - **Administrative UP State**:
83
84    - If the output contains **"UP"**, the interface is administratively up,
85      and the system is trying to establish a physical link.
86
87    - If you also see **"NO-CARRIER"**, it means the physical link has not been
88      detected, indicating potential Layer 1 issues like a cable fault,
89      misconfiguration, or no connection at the link partner. In this case,
90      proceed to the **Inspect Link Status and PHY Configuration** section.
91
92  - **Administrative DOWN State**:
93
94    - If the output lacks **"UP"** and shows only states like
95      **"<BROADCAST,MULTICAST>"**, it means the interface is administratively
96      down. In this case, bring the interface up using the following command:
97
98      .. code-block:: bash
99
100         ip link set dev <interface> up
101
102- **Next Steps**:
103
104  - If the interface is **administratively up** but shows **NO-CARRIER**,
105    proceed to the **Inspect Link Status and PHY Configuration** section to
106    troubleshoot potential physical layer issues.
107
  - If the interface was **administratively down** and you have brought it up,
    **repeat this verification step** to confirm the new state of the
    interface before proceeding.
111
112  - **If the interface is up and the link is detected**:
113
114    - If the output shows **"UP"** and there is **no `NO-CARRIER`**, the
115      interface is administratively up, and the physical link has been
116      successfully established. If everything is working as expected, the Layer
117      1 diagnostics are complete, and no further action is needed.
118
119    - If the interface is up and the link is detected but **no data is being
120      transferred**, the issue is likely beyond Layer 1, and you should proceed
121      with diagnosing the higher layers of the OSI model. This may involve
122      checking Layer 2 configurations (such as VLANs or MAC address issues),
123      Layer 3 settings (like IP addresses, routing, or ARP), or Layer 4 and
124      above (firewalls, services, etc.).
125
126    - If the **link is unstable** or **frequently resetting or dropping**, this
127      may indicate a physical layer issue such as a faulty cable, interference,
128      or power delivery problems. In this case, proceed with the next step in
129      this guide.
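
The flag interpretation above can also be scripted. The following is a minimal
sketch, assuming the `ip link show` output format shown in this section. The
function name `classify_link_state` and the labels it prints are illustrative
and are not part of `iproute2`:

.. code-block:: bash

   # Classify `ip link show dev <interface>` output read from stdin.
   # Prints one of: admin-down, no-carrier, carrier-ok (illustrative labels).
   classify_link_state() {
       # Take the comma-separated flag list between "<" and ">" on line 1.
       flags=$(sed -n '1s/.*<\([^>]*\)>.*/\1/p')
       # Wrap the list in commas so that only whole flags match; this keeps
       # LOWER_UP from being mistaken for UP.
       case ",$flags," in
           *,UP,*) ;;
           *) echo "admin-down"; return ;;
       esac
       case ",$flags," in
           *,NO-CARRIER,*) echo "no-carrier" ;;
           *) echo "carrier-ok" ;;
       esac
   }

   # Usage on a real system:
   #   ip link show dev eth0 | classify_link_state

A result of `admin-down` corresponds to the "bring the interface up" case,
`no-carrier` to the case that leads to the next section, and `carrier-ok` to
an established link.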
130
131Inspect Link Status and PHY Configuration
132~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
133
134Use `ethtool -I` to check the link status, PHY configuration, supported link
135modes, and additional statistics such as the **Link Down Events** counter. This
136step is essential for diagnosing Layer 1 problems such as speed mismatches,
137duplex issues, and link instability.
138
139For both **Single-Pair Ethernet (SPE)** and **Multi-Pair Ethernet (MPE)**
140devices, you will use this step to gather key details about the link. **SPE**
141links generally support a single speed and mode without autonegotiation (with
142the exception of **10BaseT1L**), while **MPE** devices typically support
143multiple link modes and autonegotiation.
144
145- **Command:** `ethtool -I <interface>`
146
147- **Example Output for SPE Interface (Non-autonegotiation)**:
148
149  .. code-block:: bash
150
151     Settings for spe4:
152         Supported ports: [ TP ]
153         Supported link modes:   100baseT1/Full
154         Supported pause frame use: No
155         Supports auto-negotiation: No
156         Supported FEC modes: Not reported
157         Advertised link modes: Not applicable
158         Advertised pause frame use: No
159         Advertised auto-negotiation: No
160         Advertised FEC modes: Not reported
161         Speed: 100Mb/s
162         Duplex: Full
163         Auto-negotiation: off
164         master-slave cfg: forced slave
165         master-slave status: slave
166         Port: Twisted Pair
167         PHYAD: 6
168         Transceiver: external
169         MDI-X: Unknown
170         Supports Wake-on: d
171         Wake-on: d
172         Link detected: yes
173         SQI: 7/7
174         Link Down Events: 2
175
176- **Example Output for MPE Interface (Autonegotiation)**:
177
178  .. code-block:: bash
179
180     Settings for eth1:
181         Supported ports: [ TP    MII ]
182         Supported link modes:   10baseT/Half 10baseT/Full
183                                 100baseT/Half 100baseT/Full
184         Supported pause frame use: Symmetric Receive-only
185         Supports auto-negotiation: Yes
186         Supported FEC modes: Not reported
187         Advertised link modes:  10baseT/Half 10baseT/Full
188                                 100baseT/Half 100baseT/Full
189         Advertised pause frame use: Symmetric Receive-only
190         Advertised auto-negotiation: Yes
191         Advertised FEC modes: Not reported
192         Link partner advertised link modes:  10baseT/Half 10baseT/Full
193                                              100baseT/Half 100baseT/Full
194         Link partner advertised pause frame use: Symmetric Receive-only
195         Link partner advertised auto-negotiation: Yes
196         Link partner advertised FEC modes: Not reported
197         Speed: 100Mb/s
198         Duplex: Full
199         Auto-negotiation: on
200         Port: Twisted Pair
201         PHYAD: 10
202         Transceiver: internal
203         MDI-X: Unknown
204         Supports Wake-on: pg
205         Wake-on: p
206         Link detected: yes
207         Link Down Events: 1
208
209- **Next Steps**:
210
211  - Record the output provided by `ethtool`, particularly noting the
212    **master-slave status**, **speed**, **duplex**, and other relevant fields.
213    This information will be useful for further analysis or troubleshooting.
214    Once the **ethtool** output has been collected and stored, move on to the
215    next diagnostic step.
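
To make the recorded output easier to compare later, the fields named above
can be extracted while the full output is stored. This is a sketch only: the
helper name `summarize_ethtool` and the file name in the usage comment are
made up for illustration, and the field list is taken from the example
outputs in this section:

.. code-block:: bash

   # Print only the fields this guide refers to later; input is the
   # `ethtool -I <interface>` output on stdin.
   summarize_ethtool() {
       grep -E '^[[:space:]]*(Speed|Duplex|Auto-negotiation|master-slave (cfg|status)|Link detected|SQI|Link Down Events):' \
           | sed 's/^[[:space:]]*//'
   }

   # Usage on a real system (stores the full output, prints the summary):
   #   ethtool -I eth0 | tee ethtool-eth0.txt | summarize_ethtool

Fields that the driver does not report (for example **SQI** on most MPE
PHYs) are simply absent from the summary.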
216
217Check Power Delivery (PoDL or PoE)
218~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
219
220If it is known that **PoDL** or **PoE** is **not implemented** on the system,
221or the **PSE** (Power Sourcing Equipment) is managed by proprietary user-space
222software or external tools, you can skip this step. In such cases, verify power
223delivery through alternative methods, such as checking hardware indicators
224(LEDs), using multimeters, or consulting vendor-specific software for
225monitoring power status.
226
227If **PoDL** or **PoE** is implemented and managed directly by Linux, follow
228these steps to ensure power is being delivered correctly:
229
230- **Command:** `ethtool --show-pse <interface>`
231
232- **Expected Output Examples**:
233
234  1. **PSE Not Supported**:
235
236     If no PSE is attached or the interface does not support PSE, the following
237     output is expected:
238
239     .. code-block:: bash
240
241        netlink error: No PSE is attached
242        netlink error: Operation not supported
243
244  2. **PoDL (Single-Pair Ethernet)**:
245
246     When PoDL is implemented, you might see the following attributes:
247
248     .. code-block:: bash
249
250        PSE attributes for eth1:
251        PoDL PSE Admin State: enabled
252        PoDL PSE Power Detection Status: delivering power
253
254  3. **PoE (Clause 33 PSE)**:
255
256     For standard PoE, the output may look like this:
257
258     .. code-block:: bash
259
260        PSE attributes for eth1:
261        Clause 33 PSE Admin State: enabled
262        Clause 33 PSE Power Detection Status: delivering power
263        Clause 33 PSE Available Power Limit: 18000
264
265- **Adjust Power Limit (if needed)**:
266
267  - Sometimes, the available power limit may not be sufficient for the link
268    partner. You can increase the power limit as needed.
269
270  - **Command:** `ethtool --set-pse <interface> c33-pse-avail-pw-limit <limit>`
271
272    Example:
273
274    .. code-block:: bash
275
276      ethtool --set-pse eth1 c33-pse-avail-pw-limit 18000
277      ethtool --show-pse eth1
278
279    **Expected Output** after adjusting the power limit:
280
281    .. code-block:: bash
282
283      Clause 33 PSE Available Power Limit: 18000
284
285
286- **Next Steps**:
287
288  - **PoE or PoDL Not Used**: If **PoE** or **PoDL** is not implemented or used
289    on the system, proceed to the next diagnostic step, as power delivery is
290    not relevant for this setup.
291
292  - **PoE or PoDL Controlled Externally**: If **PoE** or **PoDL** is used but
293    is not managed by the Linux kernel's **PSE-PD** framework (i.e., it is
294    controlled by proprietary user-space software or external tools), this part
295    is out of scope for this documentation. Please consult vendor-specific
296    documentation or external tools for monitoring and managing power delivery.
297
298  - **PSE Admin State Disabled**:
299
300    - If the `PSE Admin State:` is **disabled**, enable it by running one of
301      the following commands:
302
303      .. code-block:: bash
304
305         ethtool --set-pse <devname> podl-pse-admin-control enable
306
      or, for Clause 33 PSE (PoE):

      .. code-block:: bash

         ethtool --set-pse <devname> c33-pse-admin-control enable
310
311    - After enabling the PSE Admin State, return to the start of the **Check
312      Power Delivery (PoDL or PoE)** step to recheck the power delivery status.
313
314  - **Power Not Delivered**: If the `Power Detection Status` shows something
315    other than "delivering power" (e.g., `over current`), troubleshoot the
316    **PSE**. Check for potential issues such as a short circuit in the cable,
317    insufficient power delivery, or a fault in the PSE itself.
318
319  - **Power Delivered but No Link**: If power is being delivered but no link is
320    established, proceed with further diagnostics by performing **Cable
321    Diagnostics** or reviewing the **Inspect Link Status and PHY
322    Configuration** steps to identify any underlying issues with the physical
323    link or settings.
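
The decision steps above can be condensed into a small check. This is a
sketch under the assumption that `ethtool --show-pse` prints the attribute
names exactly as in the examples in this section; the function name
`pse_status` and the labels it prints are illustrative:

.. code-block:: bash

   # Read `ethtool --show-pse <interface>` output on stdin and map it to the
   # cases discussed in this section (labels are illustrative).
   pse_status() {
       awk '
           /PSE Admin State:/            { sub(/.*: */, ""); admin = $0 }
           /PSE Power Detection Status:/ { sub(/.*: */, ""); det = $0 }
           END {
               if (admin == "")                    print "no-pse-info"
               else if (admin != "enabled")        print "admin-disabled"
               else if (det == "delivering power") print "delivering"
               else                                print "not-delivering: " det
           }'
   }

   # Usage on a real system:
   #   ethtool --show-pse eth1 2>&1 | pse_status

The pattern matches both the `PoDL PSE ...` and the `Clause 33 PSE ...`
attribute names. Error messages such as `No PSE is attached` contain neither
attribute and therefore end up as `no-pse-info`.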
324
325Cable Diagnostics
326~~~~~~~~~~~~~~~~~
327
328Use `ethtool` to test for physical layer issues such as cable faults. The test
329results can vary depending on the cable's condition, the technology in use, and
330the state of the link partner. The results from the cable test will help in
331diagnosing issues like open circuits, shorts, impedance mismatches, and
332noise-related problems.
333
334- **Command:** `ethtool --cable-test <interface>`
335
336The following are the typical outputs for **Single-Pair Ethernet (SPE)** and
337**Multi-Pair Ethernet (MPE)**:
338
- **For Single-Pair Ethernet (SPE)**:

  - **Expected Output (SPE)**:
341
342  .. code-block:: bash
343
344    Cable test completed for device eth1.
345    Pair A, fault length: 25.00m
346    Pair A code Open Circuit
347
348  This indicates an open circuit or cable fault at the reported distance, but
349  results can be influenced by the link partner's state. Refer to the
350  **"Troubleshooting Based on Cable Test Results"** section for further
351  interpretation of these results.
352
- **For Multi-Pair Ethernet (MPE)**:

  - **Expected Output (MPE)**:
355
356  .. code-block:: bash
357
358    Cable test completed for device eth0.
359    Pair A code OK
360    Pair B code OK
361    Pair C code Open Circuit
362
363  Here, Pair C is reported as having an open circuit, while Pairs A and B are
364  functioning correctly. However, if autonegotiation is in use on Pairs A and
365  B, the cable test may be disrupted. Refer to the **"Troubleshooting Based on
366  Cable Test Results"** section for a detailed explanation of these issues and
367  how to resolve them.
368
369For detailed descriptions of the different possible cable test results, please
370refer to the **"Troubleshooting Based on Cable Test Results"** section.
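
When several interfaces or repeated test runs have to be compared, it can
help to reduce the report to the pairs that are not **OK**. The following
sketch assumes the line formats shown in the two examples above (`Pair X code
...` and `Pair X, fault length: ...`); the function name `faulty_pairs` is
illustrative:

.. code-block:: bash

   # Read `ethtool --cable-test <interface>` output on stdin and print one
   # line per pair whose result code is not "OK", with the fault length if
   # one was reported for that pair.
   faulty_pairs() {
       awk '
           /^[ \t]*Pair [A-D],? fault length:/ {
               p = $2; sub(/,/, "", p)      # "A," -> "A"
               len[p] = $NF
               next
           }
           /^[ \t]*Pair [A-D] code / {
               p = $2
               code = $0; sub(/.* code /, "", code)
               if (code != "OK") bad[p] = code
           }
           END {
               n = split("A B C D", pairs, " ")
               for (i = 1; i <= n; i++) {
                   p = pairs[i]
                   if (p in bad)
                       printf "Pair %s: %s%s\n", p, bad[p],
                              ((p in len) ? " at " len[p] : "")
               }
           }'
   }

   # Usage on a real system:
   #   ethtool --cable-test eth0 | faulty_pairs

An empty result only means that no pair reported a fault code. It does not
rule out the false results discussed in the next section.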
371
372Troubleshooting Based on Cable Test Results
373^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
374
375After running the cable test, the results can help identify specific issues in
376the physical connection. However, it is important to note that **cable testing
377results heavily depend on the capabilities and characteristics of both the
378local hardware and the link partner**. The accuracy and reliability of the
379results can vary significantly between different hardware implementations.
380
381In some cases, this can introduce **blind spots** in the current cable testing
382implementation, where certain results may not accurately reflect the actual
383physical state of the cable. For example:
384
385- An **Open Circuit** result might not only indicate a damaged or disconnected
386  cable but also occur if the cable is properly attached to a powered-down link
387  partner.
388
389- Some PHYs may report a **Short within Pair** if the link partner is in
390  **forced slave mode**, even though there is no actual short in the cable.
391
392To help users interpret the results more effectively, it could be beneficial to
393extend the **kernel UAPI** (User API) to provide additional context or
394**possible variants** of issues based on the hardware’s characteristics. Since
395these quirks are often hardware-specific, the **kernel driver** would be an
396ideal source of such information. By providing flags or hints related to
397potential false positives for each test result, users would have a better
398understanding of what to verify and where to investigate further.
399
400Until such improvements are made, users should be aware of these limitations
401and manually verify cable issues as needed. Physical inspections may help
402resolve uncertainties related to false positive results.
403
404The results can be one of the following:
405
406- **OK**:
407
408  - The cable is functioning correctly, and no issues were detected.
409
410  - **Next Steps**: If you are still experiencing issues, it might be related
411    to higher-layer problems, such as duplex mismatches or speed negotiation,
412    which are not physical-layer issues.
413
414  - **Special Case for `BaseT1` (1000/100/10BaseT1)**: In `BaseT1` systems, an
415    "OK" result typically also means that the link is up and likely in **slave
416    mode**, since cable tests usually only pass in this mode. For some
417    **10BaseT1L** PHYs, an "OK" result may occur even if the cable is too long
418    for the PHY's configured range (for example, when the range is configured
419    for short-distance mode).
420
421- **Open Circuit**:
422
423  - An **Open Circuit** result typically indicates that the cable is damaged or
424    disconnected at the reported fault length. Consider these possibilities:
425
426    - If the link partner is in **admin down** state or powered off, you might
427      still get an "Open Circuit" result even if the cable is functional.
428
429    - **Next Steps**: Inspect the cable at the fault length for visible damage
430      or loose connections. Verify the link partner is powered on and in the
431      correct mode.
432
433- **Short within Pair**:
434
435  - A **Short within Pair** indicates an unintended connection within the same
436    pair of wires, typically caused by physical damage to the cable.
437
438    - **Next Steps**: Replace or repair the cable and check for any physical
439      damage or improperly crimped connectors.
440
441- **Short to Another Pair**:
442
443  - A **Short to Another Pair** means the wires from different pairs are
444    shorted, which could occur due to physical damage or incorrect wiring.
445
446    - **Next Steps**: Replace or repair the damaged cable. Inspect the cable for
447      incorrect terminations or pinched wiring.
448
449- **Impedance Mismatch**:
450
451  - **Impedance Mismatch** indicates a reflection caused by an impedance
452    discontinuity in the cable. This can happen when a part of the cable has
453    abnormal impedance (e.g., when different cable types are spliced together
454    or when there is a defect in the cable).
455
456    - **Next Steps**: Check the cable quality and ensure consistent impedance
457      throughout its length. Replace any sections of the cable that do not meet
458      specifications.
459
460- **Noise**:
461
462  - **Noise** means that the Time Domain Reflectometry (TDR) test could not
463    complete due to excessive noise on the cable, which can be caused by
464    interference from electromagnetic sources.
465
466    - **Next Steps**: Identify and eliminate sources of electromagnetic
467      interference (EMI) near the cable. Consider using shielded cables or
468      rerouting the cable away from noise sources.
469
470- **Resolution Not Possible**:
471
472  - **Resolution Not Possible** means that the TDR test could not detect the
473    issue due to the resolution limitations of the test or because the fault is
474    beyond the distance that the test can measure.
475
476    - **Next Steps**: Inspect the cable manually if possible, or use alternative
477      diagnostic tools that can handle greater distances or higher resolution.
478
479- **Unknown**:
480
481  - An **Unknown** result may occur when the test cannot classify the fault or
482    when a specific issue is outside the scope of the tool's detection
483    capabilities.
484
485    - **Next Steps**: Re-run the test, verify the link partner's state, and inspect
486      the cable manually if necessary.
487
488Verify Link Partner PHY Configuration
489~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
490
491If the cable test passes but the link is still not functioning correctly, it’s
492essential to verify the configuration of the link partner’s PHY. Mismatches in
493speed, duplex settings, or master-slave roles can cause connection issues.
494
495Autonegotiation Mismatch
496^^^^^^^^^^^^^^^^^^^^^^^^
497
- If both link partners support autonegotiation, ensure that autonegotiation is
  enabled on both sides and that all supported link modes are advertised. A
  mismatch can lead to connectivity problems or suboptimal performance.
501
502- **Quick Fix:** Reset autonegotiation to the default settings, which will
503  advertise all default link modes:
504
505  .. code-block:: bash
506
507     ethtool -s <interface> autoneg on
508
509- **Command to check configuration:** `ethtool <interface>`
510
511- **Expected Output:** Ensure that both sides advertise compatible link modes.
512  If autonegotiation is off, verify that both link partners are configured for
513  the same speed and duplex.
514
  The following example shows a case where the local PHY advertises fewer link
  modes than it supports. This will reduce the number of overlapping link modes
  with the link partner. In the worst case, there will be no common link modes,
  and the link will not be established:
519
520  .. code-block:: bash
521
522     Settings for eth0:
523        Supported link modes:  1000baseT/Full, 100baseT/Full
524        Advertised link modes: 1000baseT/Full
525        Speed: 1000Mb/s
526        Duplex: Full
527        Auto-negotiation: on
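
A restricted advertisement like the one in this example can be spotted by
comparing the two lists. The sketch below assumes the `ethtool <interface>`
layout shown in this guide, including continuation lines without a colon; the
function name `unadvertised_modes` is illustrative. The result is only
meaningful when autonegotiation is enabled:

.. code-block:: bash

   # Read `ethtool <interface>` output on stdin and print every supported
   # link mode that is missing from the advertised link modes.
   unadvertised_modes() {
       awk '
           {
               if ($0 ~ /Supported link modes:/)       { key = "sup"; sub(/.*:/, "") }
               else if ($0 ~ /Advertised link modes:/) { key = "adv"; sub(/.*:/, "") }
               else if ($0 ~ /:/)                      { key = "" }
               # Lines without a colon continue the previous list.
               if (key == "") next
               gsub(/,/, " ")
               for (i = 1; i <= NF; i++) {
                   if (key == "sup") {
                       if (!($i in sup)) order[++n] = $i
                       sup[$i] = 1
                   } else {
                       adv[$i] = 1
                   }
               }
           }
           END {
               for (j = 1; j <= n; j++)
                   if (!(order[j] in adv)) print order[j]
           }'
   }

   # Usage on a real system:
   #   ethtool eth0 | unadvertised_modes

For the example output above, this prints `100baseT/Full`.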
528
529Combined Mode Mismatch (Autonegotiation on One Side, Forced on the Other)
530^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
531
532- One possible issue occurs when one side is using **autonegotiation** (as in
533  most modern systems), and the other side is set to a **forced link mode**
534  (e.g., older hardware with single-speed hubs). In such cases, modern PHYs
535  will attempt to detect the forced mode on the other side. If the link is
536  established, you may notice:
537
538  - **No or empty "Link partner advertised link modes"**.
539
540  - **"Link partner advertised auto-negotiation:"** will be **"no"** or not
541    present.
542
543- This type of detection does not always work reliably:
544
545  - Typically, the modern PHY will default to **Half Duplex**, even if the link
546    partner is actually configured for **Full Duplex**.
547
548  - Some PHYs may not work reliably if the link partner switches from one
549    forced mode to another. In this case, only a down/up cycle may help.
550
551- **Next Steps**: Set both sides to the same fixed speed and duplex mode to
552  avoid potential detection issues.
553
554  .. code-block:: bash
555
556     ethtool -s <interface> speed 1000 duplex full autoneg off
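
The symptoms listed above can be checked in one pass over the `ethtool`
output. This is a heuristic sketch, not a reliable detector: it assumes the
field names shown in this guide and treats a link partner entry of `Not
reported` the same as a missing one, which is an assumption about how an
empty list is printed. The function name `forced_partner_hint` is
illustrative:

.. code-block:: bash

   # Read `ethtool <interface>` output on stdin. Reports when the link is up
   # with local autonegotiation enabled but without any link modes
   # advertised by the link partner.
   forced_partner_hint() {
       awk '
           /^[ \t]*Auto-negotiation:/            { an = $NF }
           /^[ \t]*Duplex:/                      { duplex = $NF }
           /^[ \t]*Link detected:/               { link = $NF }
           /Link partner advertised link modes:/ { sub(/.*: */, ""); lp = $0 }
           END {
               if (link == "yes" && an == "on" && (lp == "" || lp == "Not reported"))
                   print "possible-forced-partner duplex=" duplex
               else
                   print "no-indication"
           }'
   }

   # Usage on a real system:
   #   ethtool eth0 | forced_partner_hint

If the hint is printed together with `duplex=Half`, compare the result with
the link partner's configured duplex mode, since a duplex mismatch is the
typical consequence described above.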
557
558Master/Slave Role Mismatch (BaseT1 and 1000BaseT PHYs)
559^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
560
561- In **BaseT1** systems (e.g., 1000BaseT1, 100BaseT1), link establishment
562  requires that one device is configured as **master** and the other as
563  **slave**. A mismatch in this master-slave configuration can prevent the link
564  from being established. However, **1000BaseT** also supports configurable
565  master/slave roles and can face similar issues.
566
567- **Role Preference in 1000BaseT**: The **1000BaseT** specification allows link
568  partners to negotiate master-slave roles or role preferences during
569  autonegotiation. Some PHYs have hardware limitations or bugs that prevent
570  them from functioning properly in certain roles. In such cases, drivers may
571  force these PHYs into a specific role (e.g., **forced master** or **forced
572  slave**) or try a weaker option by setting preferences. If both link partners
573  have the same issue and are forced into the same mode (e.g., both forced into
574  master mode), they will not be able to establish a link.
575
576- **Next Steps**: Ensure that one side is configured as **master** and the
577  other as **slave** to avoid this issue, particularly when hardware
578  limitations are involved, or try the weaker **preferred** option instead of
579  **forced**. Check for any driver-related restrictions or forced modes.
580
581- **Command to force master/slave mode**:
582
583  .. code-block:: bash
584
585     ethtool -s <interface> master-slave forced-master
586
587  or:
588
589  .. code-block:: bash
590
591     ethtool -s <interface> master-slave forced-master speed 1000 duplex full autoneg off
592
593
594- **Check the current master/slave status**:
595
596  .. code-block:: bash
597
598     ethtool <interface>
599
600  Example Output:
601
602  .. code-block:: bash
603
     master-slave cfg: forced master
     master-slave status: master
606
607- **Hardware Bugs and Driver Forcing**: If a known hardware issue forces the
608  PHY into a specific mode, it’s essential to check the driver source code or
609  hardware documentation for details. Ensure that the roles are compatible
610  across both link partners, and if both PHYs are forced into the same mode,
611  adjust one side accordingly to resolve the mismatch.
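
When the configuration of both link partners is known, the conflicting
combinations can be written down explicitly. The following sketch uses the
`master-slave` keywords accepted by `ethtool -s`; the function name
`roles_conflict` is illustrative, and it only covers the case described
above, where both sides are forced into the same role:

.. code-block:: bash

   # $1: local master-slave setting, $2: link partner master-slave setting.
   # Prints "conflict" if both sides are forced into the same role.
   roles_conflict() {
       case "$1/$2" in
           forced-master/forced-master|forced-slave/forced-slave)
               echo "conflict" ;;
           *)
               echo "ok" ;;
       esac
   }

   # Usage:
   #   roles_conflict forced-master forced-master
   #   roles_conflict forced-master preferred-master

Combinations that involve a **preferred** role are reported as `ok` because
the role can still be resolved during autonegotiation.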
612
613Monitor Link Resets and Speed Drops
614~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
615
616If the link is unstable, showing frequent resets or speed drops, this may
617indicate issues with the cable, PHY configuration, or environmental factors.
618While there is still no completely unified way in Linux to directly monitor
619downshift events or link speed changes via user space tools, both the Linux
620kernel logs and `ethtool` can provide valuable insights, especially if the
621driver supports reporting such events.
622
623- **Monitor Kernel Logs for Link Resets and Speed Drops**:
624
625  - The Linux kernel will print link status changes, including downshift
626    events, in the system logs. These messages typically include speed changes,
627    duplex mode, and downshifted link speed (if the driver supports it).
628
629  - **Command to monitor kernel logs in real-time:**
630
631    .. code-block:: bash
632
633      dmesg -w | grep "Link is Up\|Link is Down"
634
635  - Example Output (if a downshift occurs):
636
637    .. code-block:: bash
638
639      eth0: Link is Up - 100Mbps/Full (downshifted) - flow control rx/tx
640      eth0: Link is Down
641
642    This indicates that the link has been established but has downshifted from
643    a higher speed.
644
645  - **Note**: Not all drivers or PHYs support downshift reporting, so you may
646    not see this information for all devices.
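
  Counting the "Link is Down" messages per interface gives a rough picture of
  how often each link drops. The sketch below assumes that the interface name
  directly precedes the message, as in the example output above; the function
  name `count_link_downs` is illustrative:

  .. code-block:: bash

     # Read kernel log lines on stdin and print "<interface> <count>" for
     # every interface that logged at least one "Link is Down" message.
     count_link_downs() {
         awk '
             /Link is Down/ {
                 # The interface name is the token ending in ":" right
                 # before the word "Link".
                 for (i = 1; i < NF; i++) {
                     if ($(i + 1) == "Link") {
                         name = $i; sub(/:$/, "", name)
                         down[name]++
                         break
                     }
                 }
             }
             END { for (n in down) print n, down[n] }' | sort
     }

     # Usage on a real system:
     #   dmesg | count_link_downs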
647
648- **Monitor Link Down Events Using `ethtool`**:
649
650  - Starting with the latest kernel and `ethtool` versions, you can track
651    **Link Down Events** using the `ethtool -I` command. This will provide
652    counters for link drops, helping to diagnose link instability issues if
653    supported by the driver.
654
655  - **Command to monitor link down events:**
656
657    .. code-block:: bash
658
659      ethtool -I <interface>
660
661  - Example Output (if supported):
662
663    .. code-block:: bash
664
      Settings for eth1:
      Link Down Events: 5
667
668    This indicates that the link has dropped 5 times. Frequent link down events
669    may indicate cable or environmental issues that require further
670    investigation.
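
  Since the counter is cumulative, the change over a known interval is more
  telling than the absolute value. A minimal sketch, assuming the `Link Down
  Events:` line format shown above (the function name `link_down_events` and
  the 60 second interval are arbitrary choices for illustration):

  .. code-block:: bash

     # Extract the "Link Down Events" counter from `ethtool -I` output
     # read on stdin. Prints nothing if the driver does not report it.
     link_down_events() {
         sed -n 's/^[[:space:]]*Link Down Events:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
     }

     # Usage on a real system:
     #   before=$(ethtool -I eth0 | link_down_events)
     #   sleep 60
     #   after=$(ethtool -I eth0 | link_down_events)
     #   echo "link drops in 60s: $((after - before))"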
671
672- **Check Link Status and Speed**:
673
674  - Even though downshift counts or events are not easily tracked, you can
675    still use `ethtool` to manually check the current link speed and status.
676
677  - **Command:** `ethtool <interface>`
678
679  - **Expected Output:**
680
681    .. code-block:: bash
682
683      Speed: 1000Mb/s
684      Duplex: Full
685      Auto-negotiation: on
686      Link detected: yes
687
688    Any inconsistencies in the expected speed or duplex setting could indicate
689    an issue.
690
691- **Disable Energy-Efficient Ethernet (EEE) for Diagnostics**:
692
693  - **EEE** (Energy-Efficient Ethernet) can be a source of link instability due
694    to transitions in and out of low-power states. For diagnostic purposes, it
695    may be useful to **temporarily** disable EEE to determine if it is
696    contributing to link instability. This is **not a generic recommendation**
697    for disabling power management.
698
699  - **Next Steps**: Disable EEE and monitor if the link becomes stable. If
700    disabling EEE resolves the issue, report the bug so that the driver can be
701    fixed.
702
703  - **Command:**
704
705    .. code-block:: bash
706
707      ethtool --set-eee <interface> eee off
708
709  - **Important**: If disabling EEE resolves the instability, the issue should
710    be reported to the maintainers as a bug, and the driver should be corrected
711    to handle EEE properly without causing instability. Disabling EEE
712    permanently should not be seen as a solution.
713
714- **Monitor Error Counters**:
715
716  - While some NIC drivers and PHYs provide error counters, there is no unified
717    set of PHY-specific counters across all hardware. Additionally, not all
718    PHYs provide useful information related to errors like CRC errors, frame
719    drops, or link flaps. Therefore, this step is dependent on the specific
720    hardware and driver support.
721
722  - **Next Steps**: Use `ethtool -S <interface>` to check if your driver
723    provides useful error counters. In some cases, counters may provide
724    information about errors like link flaps or physical layer problems (e.g.,
725    excessive CRC errors), but results can vary significantly depending on the
726    PHY.
727
728  - **Command:** `ethtool -S <interface>`
729
730  - **Example Output (if supported)**:
731
732    .. code-block:: bash
733
734      rx_crc_errors: 123
735      tx_errors: 45
736      rx_frame_errors: 78
737
738  - **Note**: If no meaningful error counters are available or if counters are
739    not supported, you may need to rely on physical inspections (e.g., cable
740    condition) or kernel log messages (e.g., link up/down events) to further
741    diagnose the issue.
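
Because these counters are cumulative, two snapshots taken some time apart
show which errors are still occurring. The following sketch compares two
saved `ethtool -S` outputs and assumes the simple `name: value` layout from
the example above; lines with a different layout are ignored. The function
name `counter_deltas` and the file names are illustrative:

.. code-block:: bash

   # $1: file with an earlier `ethtool -S` snapshot, $2: a later one.
   # Prints "<counter>: +<delta>" for every counter that increased.
   counter_deltas() {
       awk -F': *' '
           NR == FNR { if (NF == 2) old[$1] = $2; next }
           NF == 2 && ($1 in old) && $2 + 0 > old[$1] + 0 {
               name = $1; sub(/^[ \t]+/, "", name)
               print name ": +" ($2 - old[$1])
           }' "$1" "$2"
   }

   # Usage on a real system:
   #   ethtool -S eth0 > stats-before.txt
   #   sleep 60
   #   ethtool -S eth0 > stats-after.txt
   #   counter_deltas stats-before.txt stats-after.txt

Counters that keep increasing while the link is otherwise idle are a
stronger hint at a physical layer problem than a large but static value.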
742
743When All Else Fails...
744~~~~~~~~~~~~~~~~~~~~~~
745
746So you've checked the cables, monitored the logs, disabled EEE, and still...
747nothing? Don’t worry, you’re not alone. Sometimes, Ethernet gremlins just don’t
748want to cooperate.
749
750But before you throw in the towel (or the Ethernet cable), take a deep breath.
751It’s always possible that:
752
1. Your PHY has a unique, undocumented personality.

2. The problem is lying dormant, waiting for just the right moment to magically
   resolve itself (hey, it happens!).

3. Or, it could be that the ultimate solution simply hasn’t been invented yet.
759
760If none of the above bring you comfort, there’s one final step: contribute! If
761you've uncovered new or unusual issues, or have creative diagnostic methods,
762feel free to share your findings and extend this documentation. Together, we
763can hunt down every elusive network issue - one twisted pair at a time.
764
765Remember: sometimes the solution is just a reboot away, but if not, it’s time to
766dig deeper - or report that bug!
767
768