.. SPDX-License-Identifier: GPL-2.0

========================================
Debugging advice for driver development
========================================

This document serves as a general starting point and lookup for debugging
device drivers.
While this guide focuses on debugging that requires re-compiling the
module/kernel, the :doc:`userspace debugging guide
</process/debugging/userspace_debugging_guide>` will guide
you through tools such as dynamic debug and ftrace, which are useful for
debugging issues and behavior.
For general debugging advice, see the :doc:`general advice document
</process/debugging/index>`.

.. contents::
    :depth: 3

The following sections show you the available tools.

printk() & friends
------------------

These are derivatives of printf() with varying destinations and support for
being dynamically turned on or off, or lack thereof.

Simple printk()
~~~~~~~~~~~~~~~

The classic approach. It can be used to great effect for quick and dirty
development of new modules or to extract arbitrary necessary data for
troubleshooting.
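
As a minimal sketch, assuming a hypothetical module called ``my_driver`` (the
module name, function names and message texts are made up for illustration),
a quick and dirty print at load and unload time could look like this::

  #include <linux/init.h>
  #include <linux/module.h>
  #include <linux/printk.h>

  /* Hypothetical init function: log that we got here and what we return. */
  static int __init my_driver_init(void)
  {
          int ret = 0;

          printk(KERN_INFO "my_driver: init called, ret=%d\n", ret);
          return ret;
  }

  /* Hypothetical exit function: log that the module is being removed. */
  static void __exit my_driver_exit(void)
  {
          printk(KERN_INFO "my_driver: exit called\n");
  }

  module_init(my_driver_init);
  module_exit(my_driver_exit);
  MODULE_LICENSE("GPL");

The messages end up in the kernel log, where they can be read with ``dmesg``.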

Prerequisite: ``CONFIG_PRINTK`` (usually enabled by default)

**Pros**:

- No need to learn anything, simple to use
- Easy to modify exactly to your needs (formatting of the data (See:
  :doc:`/core-api/printk-formats`), visibility in the log)
- Can cause delays in the execution of the code (beneficial to confirm whether
  timing is a factor)

**Cons**:

- Requires rebuilding the kernel/module
- Can cause delays in the execution of the code (which can cause issues to be
  not reproducible)

For the full documentation see :doc:`/core-api/printk-basics`.

Trace_printk
~~~~~~~~~~~~

Prerequisite: ``CONFIG_DYNAMIC_FTRACE`` & ``#include <linux/ftrace.h>``

It is a tiny bit less comfortable to use than printk(), because you will have
to read the messages from the trace file (See: :ref:`read_ftrace_log`)
instead of from the kernel log. It is very useful, however, when printk() adds
unwanted delays into the code execution, causing issues to be flaky or hidden.

If the processing of this still causes timing issues then you can try
trace_puts().

For the full documentation see trace_printk().

dev_dbg
~~~~~~~

A print statement that includes additional information about the device used
within the context and that can be targeted by
:ref:`process/debugging/userspace_debugging_guide:dynamic debug`.

**When is it appropriate to leave a debug print in the code?**

Permanent debug statements have to be useful for a developer to troubleshoot
driver misbehavior. Judging that is a bit more of an art than a science, but
some guidelines are in the :ref:`Coding style guidelines
<process/coding-style:13) printing kernel messages>`. In almost all cases the
debug statements shouldn't be upstreamed, as a working driver is supposed to be
silent.

Custom printk
~~~~~~~~~~~~~

Example::

  #define core_dbg(fmt, arg...) do { \
          if (core_debug) \
                  printk(KERN_DEBUG pr_fmt("core: " fmt), ## arg); \
          } while (0)

**When should you do this?**

It is better to just use pr_debug(), which can later be turned on/off with
dynamic debug. Additionally, a lot of drivers activate these prints via a
variable like ``core_debug`` set by a module parameter. However, module
parameters `are not recommended anymore
<https://lore.kernel.org/all/2024032757-surcharge-grime-d3dd@gregkh>`_.

Ftrace
------

Creating a custom Ftrace tracepoint
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A tracepoint adds a hook into your code that will be called and logged when the
tracepoint is enabled. This can be used, for example, to trace hitting a
conditional branch or to dump the internal state at specific points of the code
flow during a debugging session.

Here is a basic description of :ref:`how to implement new tracepoints
<trace/tracepoints:usage>`.

For the full event tracing documentation see :doc:`/trace/events`.

For the full Ftrace documentation see :doc:`/trace/ftrace`.

DebugFS
-------

Prerequisite: ``CONFIG_DEBUG_FS`` & ``#include <linux/debugfs.h>``

DebugFS differs from the other approaches of debugging, as it doesn't write
messages to the kernel log nor add traces to the code. Instead it allows the
developer to handle a set of files.
With these files you can store values of variables or make register/memory
dumps, or you can make the files writable and use them to modify
values/settings in the driver.
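
A rough sketch of both directions, assuming a hypothetical module called
``my_driver`` (the directory name, the file names and the variables
``last_status`` and ``debug_enabled`` are made up for illustration), could
look like this::

  #include <linux/debugfs.h>
  #include <linux/init.h>
  #include <linux/module.h>

  static struct dentry *my_dir;   /* hypothetical debugfs directory */
  static u32 last_status;         /* hypothetical value exposed read-only */
  static bool debug_enabled;      /* hypothetical setting, writable by root */

  static int __init my_driver_init(void)
  {
          my_dir = debugfs_create_dir("my_driver", NULL);

          /* Read-only file: a read shows last_status in hexadecimal. */
          debugfs_create_x32("last_status", 0444, my_dir, &last_status);

          /* Writable file: writing Y/N or 1/0 updates debug_enabled. */
          debugfs_create_bool("debug_enabled", 0644, my_dir, &debug_enabled);

          return 0;
  }

  static void __exit my_driver_exit(void)
  {
          /* Removes the directory and every file created inside it. */
          debugfs_remove_recursive(my_dir);
  }

  module_init(my_driver_init);
  module_exit(my_driver_exit);
  MODULE_LICENSE("GPL");

The sketch does not check the return value of debugfs_create_dir(), so that
the driver keeps working when DebugFS is unavailable.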

Possible use-cases among others:

- Store register values
- Keep track of variables
- Store errors
- Store settings
- Toggle a setting like debug on/off
- Error injection

This is especially useful when the size of a data dump would be hard to digest
as part of the general kernel log (for example when dumping raw bitstream data)
or when you are not interested in all the values all the time, but still want
the possibility to inspect them.

The general idea is:

- Create a directory during probe (``struct dentry *parent =
  debugfs_create_dir("my_driver", NULL);``)
- Create a file (``debugfs_create_u32("my_value", 0444, parent, &my_variable);``)

  - In this example the file is found in
    ``/sys/kernel/debug/my_driver/my_value`` (with read permissions for
    user/group/all)
  - Any read of the file will return the current contents of the variable
    ``my_variable``

- Clean up the directory when removing the device
  (``debugfs_remove_recursive(parent);``)

For the full documentation see :doc:`/filesystems/debugfs`.

KASAN, UBSAN, lockdep and other error checkers
----------------------------------------------

KASAN (Kernel Address Sanitizer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_KASAN``

KASAN is a dynamic memory error detector that helps to find use-after-free and
out-of-bounds bugs. It uses compile-time instrumentation to check every memory
access.

For the full documentation see :doc:`/dev-tools/kasan`.

UBSAN (Undefined Behavior Sanitizer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_UBSAN``

UBSAN relies on compiler instrumentation and runtime checks to detect undefined
behavior. It is designed to find a variety of issues, including signed integer
overflow, array index out of bounds, and more.

For the full documentation see :doc:`/dev-tools/ubsan`.

lockdep (Lock Dependency Validator)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_DEBUG_LOCKDEP``

lockdep is a runtime lock dependency validator that detects potential deadlocks
and other locking-related issues in the kernel.
It tracks lock acquisitions and releases, building a dependency graph that is
analyzed for potential deadlocks.
lockdep is especially useful for validating the correctness of lock ordering in
the kernel.

PSI (Pressure stall information tracking)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_PSI``

PSI is a measurement tool to identify excessive overcommits on hardware
resources that can cause performance disruptions or even OOM kills.

device coredump
---------------

Prerequisite: ``#include <linux/devcoredump.h>``

Provides the infrastructure for a driver to provide arbitrary data to userland.
It is most often used in conjunction with udev or a similar userland
application that listens for kernel uevents, which indicate that the dump is
ready. Udev has rules to copy that file somewhere for long-term storage and
analysis, as by default, the data for the dump is automatically cleaned up
after 5 minutes.
That data is analyzed with driver-specific tools or GDB.

You can find an example implementation at:
`drivers/media/platform/qcom/venus/core.c
<https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/platform/qcom/venus/core.c#L30>`__

**Copyright** ©2024 : Collabora