.. SPDX-License-Identifier: GPL-2.0

========================================
Debugging advice for driver development
========================================

This document serves as a general starting point and lookup for debugging
device drivers.
While this guide focuses on debugging that requires re-compiling the
module/kernel, the :doc:`userspace debugging guide
</process/debugging/userspace_debugging_guide>` covers tools such as dynamic
debug and ftrace, which are useful for debugging issues and behavior without
a rebuild.
For general debugging advice, see the :doc:`general advice document
</process/debugging/index>`.

.. contents::
    :depth: 3

The following sections show you the available tools.

printk() & friends
------------------

These are derivatives of printf() with varying destinations and with or
without support for being dynamically turned on or off.

Simple printk()
~~~~~~~~~~~~~~~

The classic approach. It can be used to great effect for quick and dirty
development of new modules or to extract arbitrary data needed for
troubleshooting.

Prerequisite: ``CONFIG_PRINTK`` (usually enabled by default)

**Pros**:

- No need to learn anything, simple to use
- Easy to modify exactly to your needs (formatting of the data, see
  :doc:`/core-api/printk-formats`, and visibility in the log)
- Can cause delays in the execution of the code (beneficial to confirm whether
  timing is a factor)

**Cons**:

- Requires rebuilding the kernel/module
- Can cause delays in the execution of the code (which can cause issues to be
  not reproducible)

For the full documentation see :doc:`/core-api/printk-basics`.
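
As a minimal sketch of such a quick debugging print (the function
``my_driver_start()`` and its arguments are hypothetical and only serve as an
illustration)::

  #include <linux/printk.h>
  #include <linux/types.h>

  /* Hypothetical helper, used only for illustration. */
  static void my_driver_start(u32 status, void *buf)
  {
          /* Log level prefix followed by the format string. */
          printk(KERN_INFO "my_driver: start, status=0x%08x buf=%p\n",
                 status, buf);

          /* pr_info() is a shorthand for printk() with KERN_INFO. */
          pr_info("my_driver: start done\n");
  }

The messages end up in the kernel log, where they can be read with ``dmesg``.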

Trace_printk
~~~~~~~~~~~~

Prerequisite: ``CONFIG_DYNAMIC_FTRACE`` & ``#include <linux/ftrace.h>``

It is a tiny bit less comfortable to use than printk(), because you will have
to read the messages from the trace file (see :ref:`read_ftrace_log`)
instead of from the kernel log. It is very useful, however, when printk() adds
unwanted delays into the code execution, causing issues to be flaky or hidden.

If the processing of the format string still causes timing issues, then you
can try trace_puts(), which writes a plain string.

For the full documentation see trace_printk().
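
A minimal sketch of both calls, assuming a hypothetical interrupt-path helper
named ``my_driver_handle_irq()``::

  #include <linux/ftrace.h>
  #include <linux/types.h>

  /* Hypothetical helper, used only for illustration. */
  static void my_driver_handle_irq(u32 status)
  {
          /* Written to the ftrace ring buffer instead of the kernel log. */
          trace_printk("irq status=0x%08x\n", status);

          /* Plain string, no format arguments to process. */
          trace_puts("irq handled\n");
  }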

dev_dbg
~~~~~~~

A print statement that can be targeted by
:ref:`process/debugging/userspace_debugging_guide:dynamic debug` and that
includes additional information about the device used within the context.
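
A minimal sketch, assuming a hypothetical helper ``my_driver_report()`` that
receives the ``struct device`` of the driver's device. The message only shows
up once it is enabled, for example through dynamic debug or by defining
``DEBUG`` at build time::

  #include <linux/device.h>

  /* Hypothetical helper, used only for illustration. */
  static void my_driver_report(struct device *dev, int ret)
  {
          /* The output is prefixed with the driver and device name. */
          dev_dbg(dev, "firmware load returned %d\n", ret);
  }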

**When is it appropriate to leave a debug print in the code?**

Permanent debug statements have to be useful for a developer to troubleshoot
driver misbehavior. Judging that is a bit more of an art than a science, but
some guidelines are in the :ref:`Coding style guidelines
<process/coding-style:13) printing kernel messages>`. In almost all cases the
debug statements shouldn't be upstreamed, as a working driver is supposed to be
silent.

Custom printk
~~~~~~~~~~~~~

Example::

  #define core_dbg(fmt, arg...) do { \
          if (core_debug) \
                  printk(KERN_DEBUG pr_fmt("core: " fmt), ## arg); \
  } while (0)

**When should you do this?**

It is better to just use a pr_debug(), which can later be turned on/off with
dynamic debug. Additionally, a lot of drivers activate these prints via a
variable like ``core_debug`` set by a module parameter. However, module
parameters `are not recommended anymore
<https://lore.kernel.org/all/2024032757-surcharge-grime-d3dd@gregkh>`_.
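
As a sketch of the pr_debug() alternative, the ``"core: "`` prefix of the
macro above can be provided through ``pr_fmt`` instead. The helper
``core_report_state()`` is hypothetical::

  /* Has to be defined before the first include to take effect. */
  #define pr_fmt(fmt) "core: " fmt

  #include <linux/printk.h>

  /* Hypothetical helper, used only for illustration. */
  static void core_report_state(int state)
  {
          /* Can be toggled at runtime with dynamic debug. */
          pr_debug("state changed to %d\n", state);
  }

This removes the need for a ``core_debug`` variable and the module parameter
that sets it.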

Ftrace
------

Creating a custom Ftrace tracepoint
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A tracepoint adds a hook into your code that will be called and logged when the
tracepoint is enabled. This can be used, for example, to trace hitting a
conditional branch or to dump the internal state at specific points of the code
flow during a debugging session.

Here is a basic description of :ref:`how to implement new tracepoints
<trace/tracepoints:usage>`.

For the full event tracing documentation see :doc:`/trace/events`.

For the full Ftrace documentation see :doc:`/trace/ftrace`.
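
As a rough sketch of what a custom trace event can look like: the event name
``my_driver_irq``, the file names and the traced ``status`` value are
hypothetical, and the build setup is assumed to add the driver directory to
the include path so that ``TRACE_INCLUDE_PATH .`` resolves::

  /* my_driver_trace.h (hypothetical header) */
  #undef TRACE_SYSTEM
  #define TRACE_SYSTEM my_driver

  #if !defined(_MY_DRIVER_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
  #define _MY_DRIVER_TRACE_H

  #include <linux/tracepoint.h>

  TRACE_EVENT(my_driver_irq,
          TP_PROTO(u32 status),
          TP_ARGS(status),
          TP_STRUCT__entry(
                  __field(u32, status)
          ),
          TP_fast_assign(
                  __entry->status = status;
          ),
          TP_printk("status=0x%08x", __entry->status)
  );

  #endif /* _MY_DRIVER_TRACE_H */

  /* This part has to stay outside of the include guard. */
  #undef TRACE_INCLUDE_PATH
  #define TRACE_INCLUDE_PATH .
  #undef TRACE_INCLUDE_FILE
  #define TRACE_INCLUDE_FILE my_driver_trace
  #include <trace/define_trace.h>

  /* my_driver.c (hypothetical source file) */
  #define CREATE_TRACE_POINTS
  #include "my_driver_trace.h"

  static void my_driver_handle_irq(u32 status)
  {
          /* Only logged while the event is enabled. */
          trace_my_driver_irq(status);
  }

``CREATE_TRACE_POINTS`` is defined in exactly one source file, all other users
only include the header.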

DebugFS
-------

Prerequisite: ``CONFIG_DEBUG_FS`` & ``#include <linux/debugfs.h>``

DebugFS differs from the other approaches of debugging, as it doesn't write
messages to the kernel log nor add traces to the code. Instead it allows the
developer to expose a set of files.
With these files you can either export the values of variables, make
register/memory dumps, or, if you make the files writable, modify
values/settings in the driver.

Possible use-cases among others:

- Store register values
- Keep track of variables
- Store errors
- Store settings
- Toggle a setting like debug on/off
- Error injection

This is especially useful when the size of a data dump would be hard to digest
as part of the general kernel log (for example when dumping raw bitstream data)
or when you are not interested in all the values all the time, but still want
the possibility to inspect them.

The general idea is:

- Create a directory during probe (``struct dentry *parent =
  debugfs_create_dir("my_driver", NULL);``)
- Create a file (``debugfs_create_u32("my_value", 0444, parent, &my_variable);``)

  - In this example the file is found in
    ``/sys/kernel/debug/my_driver/my_value`` (with read permissions for
    user/group/all)
  - any read of the file will return the current contents of the variable
    ``my_variable``

- Clean up the directory when removing the device
  (``debugfs_remove_recursive(parent);``)

For the full documentation see :doc:`/filesystems/debugfs`.
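
The steps above can be sketched as follows. The helper names and the static
variables are hypothetical, a real driver would typically keep this state in
its per-device structure::

  #include <linux/debugfs.h>
  #include <linux/types.h>

  /* Hypothetical driver state, used only for illustration. */
  static struct dentry *my_debugfs_dir;
  static u32 my_variable;

  /* Meant to be called from the probe function of the driver. */
  static void my_driver_debugfs_init(void)
  {
          my_debugfs_dir = debugfs_create_dir("my_driver", NULL);

          /* 0444 (octal): read-only for user, group and others. */
          debugfs_create_u32("my_value", 0444, my_debugfs_dir, &my_variable);
  }

  /* Meant to be called from the remove function of the driver. */
  static void my_driver_debugfs_exit(void)
  {
          debugfs_remove_recursive(my_debugfs_dir);
  }

Note that the file mode is an octal value, so it needs the leading ``0``.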

KASAN, UBSAN, lockdep and other error checkers
----------------------------------------------

KASAN (Kernel Address Sanitizer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_KASAN``

KASAN is a dynamic memory error detector that helps to find use-after-free and
out-of-bounds bugs. It uses compile-time instrumentation to check every memory
access.

For the full documentation see :doc:`/dev-tools/kasan`.

UBSAN (Undefined Behavior Sanitizer)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_UBSAN``

UBSAN relies on compiler instrumentation and runtime checks to detect undefined
behavior. It is designed to find a variety of issues, including signed integer
overflow, array index out of bounds, and more.

For the full documentation see :doc:`/dev-tools/ubsan`.

lockdep (Lock Dependency Validator)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_DEBUG_LOCKDEP``

lockdep is a runtime lock dependency validator that detects potential deadlocks
and other locking-related issues in the kernel.
It tracks lock acquisitions and releases, building a dependency graph that is
analyzed for potential deadlocks.
lockdep is especially useful for validating the correctness of lock ordering in
the kernel.
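
Besides the automatic tracking, a driver can also state its locking
assumptions explicitly with lockdep_assert_held(). The following is a minimal
sketch with a hypothetical ``struct my_device``; the assertion is only checked
when lockdep is enabled in the kernel configuration::

  #include <linux/lockdep.h>
  #include <linux/spinlock.h>
  #include <linux/types.h>

  /* Hypothetical driver state, used only for illustration. */
  struct my_device {
          spinlock_t lock;
          u32 pending;
  };

  /* Has to be called with mydev->lock held. */
  static void my_driver_update_locked(struct my_device *mydev, u32 val)
  {
          /* Warns if lockdep is active and the lock is not held. */
          lockdep_assert_held(&mydev->lock);
          mydev->pending = val;
  }

  static void my_driver_update(struct my_device *mydev, u32 val)
  {
          unsigned long flags;

          spin_lock_irqsave(&mydev->lock, flags);
          my_driver_update_locked(mydev, val);
          spin_unlock_irqrestore(&mydev->lock, flags);
  }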

PSI (Pressure stall information tracking)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Prerequisite: ``CONFIG_PSI``

PSI is a measurement tool to identify excessive overcommits on hardware
resources, which can cause performance disruptions or even OOM kills.

device coredump
---------------

Prerequisite: ``#include <linux/devcoredump.h>``

Provides the infrastructure for a driver to provide arbitrary data to userland.
It is most often used in conjunction with udev or a similar userland
application that listens for kernel uevents, which indicate that the dump is
ready. Udev has rules to copy that file somewhere for long-term storage and
analysis, as by default, the data for the dump is automatically cleaned up
after 5 minutes.
That data is analyzed with driver-specific tools or GDB.

You can find an example implementation at:
`drivers/media/platform/qcom/venus/core.c
<https://elixir.bootlin.com/linux/v6.11.6/source/drivers/media/platform/qcom/venus/core.c#L30>`__
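
Independent of that driver, a minimal sketch of creating a dump could look
like this. The helper ``my_driver_coredump()`` and its ``state``/``len``
arguments are hypothetical and stand for whatever data the driver wants to
hand to userland::

  #include <linux/devcoredump.h>
  #include <linux/device.h>
  #include <linux/gfp.h>
  #include <linux/string.h>
  #include <linux/vmalloc.h>

  /* Hypothetical helper, used only for illustration. */
  static void my_driver_coredump(struct device *dev, const void *state,
                                 size_t len)
  {
          void *dump = vmalloc(len);

          if (!dump)
                  return;

          memcpy(dump, state, len);

          /* The devcoredump core takes ownership of "dump" and frees it. */
          dev_coredumpv(dev, dump, len, GFP_KERNEL);
  }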

**Copyright** ©2024 : Collabora