.. Copyright 2020 DisplayLink (UK) Ltd.

===================
Userland interfaces
===================

The DRM core exports several interfaces to applications, generally
intended to be used through corresponding libdrm wrapper functions. In
addition, drivers export device-specific interfaces for use by userspace
drivers and device-aware applications through ioctls and sysfs files.

External interfaces include: memory mapping, context management, DMA
operations, AGP management, vblank control, fence management, memory
management, and output management.

Cover generic ioctls and sysfs layout here. We only need high-level
info, since man pages should cover the rest.

.. contents::

libdrm Device Lookup
====================

.. kernel-doc:: drivers/gpu/drm/drm_ioctl.c
   :doc: getunique and setversion story


.. _drm_primary_node:

Primary Nodes, DRM Master and Authentication
============================================

.. kernel-doc:: drivers/gpu/drm/drm_auth.c
   :doc: master and authentication

.. kernel-doc:: drivers/gpu/drm/drm_auth.c
   :export:

.. kernel-doc:: include/drm/drm_auth.h
   :internal:


.. _drm_leasing:

DRM Display Resource Leasing
============================

.. kernel-doc:: drivers/gpu/drm/drm_lease.c
   :doc: drm leasing

Open-Source Userspace Requirements
==================================

The DRM subsystem has stricter requirements than most other kernel subsystems on
what the userspace side for new uAPI needs to look like. This section explains
what exactly those requirements are, and why they exist.

The short summary is that any addition of DRM uAPI requires corresponding
open-sourced userspace patches, and those patches must be reviewed and ready for
merging into a suitable and canonical upstream project.

GFX devices (both display and render/GPU side) are really complex bits of
hardware, with userspace and kernel by necessity having to work together really
closely.  The interfaces, for rendering and modesetting, must be extremely wide
and flexible, and therefore it is almost always impossible to precisely define
them for every possible corner case. This in turn makes it practically
infeasible to differentiate between behaviour that's required by userspace, and
which must not be changed to avoid regressions, and behaviour which is only an
accidental artifact of the current implementation.

Without access to the full source code of all userspace users, it becomes
impossible to change the implementation details, since userspace could
depend upon the accidental behaviour of the current implementation in minute
details. And debugging such regressions without access to source code is pretty
much impossible. As a consequence this means:

- The Linux kernel's "no regression" policy holds in practice only for
  open-source userspace of the DRM subsystem. DRM developers are perfectly fine
  if closed-source blob drivers in userspace use the same uAPI as the open
  drivers, but they must do so in the exact same way as the open drivers.
  Creative (ab)use of the interfaces will lead, and in the past routinely has
  led, to breakage.

- Any new userspace interface must have an open-source implementation as
  demonstration vehicle.

The other reason for requiring open-source userspace is uAPI review. Since the
kernel and userspace parts of a GFX stack must work together so closely, code
review can only assess whether a new interface achieves its goals by looking at
both sides. Making sure that the interface indeed covers the use-case fully
leads to a few additional requirements:

- The open-source userspace must not be a toy/test application, but the real
  thing. Specifically it needs to handle all the usual error and corner cases.
  These are often the places where new uAPI falls apart and hence essential to
  assess the fitness of a proposed interface.

- The userspace side must be fully reviewed and tested to the standards of that
  userspace project. For Mesa, for example, this means piglit testcases and
  review on the mailing list. This is again to ensure that the new interface
  actually gets the job done.  The userspace-side reviewer should also provide
  an Acked-by on the kernel uAPI patch indicating that they believe the proposed
  uAPI is sound and sufficiently documented and validated for userspace's
  consumption.

- The userspace patches must be against the canonical upstream, not some vendor
  fork. This is to make sure that no one cheats on the review and testing
  requirements by doing a quick fork.

- The kernel patch can only be merged after all the above requirements are met,
  but it **must** be merged to either drm-next or drm-misc-next **before** the
  userspace patches land. uAPI always flows from the kernel; doing things the
  other way round risks divergence of the uAPI definitions and header files.

These are fairly steep requirements, but have grown out of years of shared
pain and experience with uAPI added hastily, and almost always regretted
just as fast. GFX devices change really fast, requiring a paradigm shift and
an entirely new set of uAPI interfaces every few years at least. Together with
the Linux kernel's guarantee to keep existing userspace running for 10+ years
this is already rather painful for the DRM subsystem, with multiple different
uAPIs for the same thing co-existing. If we add a few more complete mistakes
into the mix every year it would be entirely unmanageable.

The DRM subsystem has however no concern with independent closed-source
userspace implementations. To make that position official, the DRM uAPI headers
are covered by the MIT license.

.. _drm_render_node:

Render nodes
============

DRM core provides multiple character-devices for user-space to use.
Depending on which device is opened, user-space can perform a different
set of operations (mainly ioctls). The primary node is always created
and called card<num>. Additionally, a currently unused control node,
called controlD<num>, is also created. The primary node provides all
legacy operations and historically was the only interface used by
userspace. With KMS, the control node was introduced. However, the
planned KMS control interface has never been written and so the control
node stays unused to date.

With the increased use of offscreen renderers and GPGPU applications,
clients no longer require running compositors or graphics servers to
make use of a GPU. But the DRM API required unprivileged clients to
authenticate to a DRM-Master prior to getting GPU access. To avoid this
step and to grant clients GPU access without authenticating, render
nodes were introduced. Render nodes solely serve render clients, that
is, no modesetting or privileged ioctls can be issued on render nodes.
Only non-global rendering commands are allowed. If a driver supports
render nodes, it must advertise it via the DRIVER_RENDER DRM driver
capability. If not supported, the primary node must be used for render
clients together with the legacy drmAuth authentication procedure.

If a driver advertises render node support, DRM core will create a
separate render node called renderD<num>. There will be one render node
per device. No ioctls except PRIME-related ioctls will be allowed on
this node. In particular, GEM_OPEN will be explicitly prohibited. For a
complete list of driver-independent ioctls that can be used on render
nodes, see the ioctls marked DRM_RENDER_ALLOW in drm_ioctl.c. Render
nodes are designed to avoid the buffer leaks that occur if clients
guess the flink names or mmap offsets on the legacy interface.
In addition to this basic interface, drivers must mark their
driver-dependent render-only ioctls as DRM_RENDER_ALLOW so render
clients can use them. Driver authors must be careful not to allow any
privileged ioctls on render nodes.

With render nodes, user-space can now control access to the render node
via basic file-system access-modes. A running graphics server which
authenticates clients on the privileged primary/legacy node is no longer
required. Instead, a client can open the render node and is immediately
granted GPU access. Communication between clients (or servers) is done
via PRIME. FLINK from render node to legacy node is not supported. New
clients must not use the insecure FLINK interface.

Besides dropping all modeset/global ioctls, render nodes also drop the
DRM-Master concept. There is no reason to associate render clients with
a DRM-Master as they are independent of any graphics server. Besides,
they must work without any running master, anyway. Drivers must be able
to run without a master object if they support render nodes. If, on the
other hand, a driver requires shared state between clients which is
visible to user-space and accessible beyond open-file boundaries, it
cannot support render nodes.
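
As a minimal sketch of the node naming described in this section, the following
userspace helper classifies device node names. The function names are
hypothetical and not part of libdrm; only the ``card<num>``, ``controlD<num>``
and ``renderD<num>`` patterns come from the text above.

.. code-block:: python

    import re

    # Name patterns as described in this section.
    _NODE_PATTERNS = {
        "primary": re.compile(r"^card(\d+)$"),
        "control": re.compile(r"^controlD(\d+)$"),
        "render": re.compile(r"^renderD(\d+)$"),
    }

    def classify_node(name):
        """Return (node_type, number) for a DRM node name, or None."""
        for node_type, pattern in _NODE_PATTERNS.items():
            match = pattern.match(name)
            if match:
                return node_type, int(match.group(1))
        return None

    def render_nodes(names):
        """Return the sorted numbers of all render nodes in a name list."""
        found = [classify_node(n) for n in names]
        return sorted(num for kind, num in filter(None, found)
                      if kind == "render")

A render-only client could list ``/dev/dri``, pass the names to
``render_nodes()`` and open one of the results, relying on file-system
permissions instead of authentication.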

Device Hot-Unplug
=================

.. note::
   The following is the plan. Implementation is not there yet
   (2020 May).

Graphics devices (display and/or render) may be connected via USB (e.g.
display adapters or docking stations) or Thunderbolt (e.g. eGPU). An end
user is able to hot-unplug this kind of device while it is being
used, and expects that at the very least the machine does not crash. Any
damage from hot-unplugging a DRM device needs to be limited as much as
possible and userspace must be given the chance to handle it if it wants
to. Ideally, unplugging a DRM device still lets a desktop continue to
run, but that is going to need explicit support throughout the whole
graphics stack: from kernel and userspace drivers, through display
servers, via window system protocols, and in applications and libraries.

Other scenarios that should lead to the same are: unrecoverable GPU
crash, PCI device disappearing off the bus, or forced unbind of a driver
from the physical device.

In other words, from the userspace perspective everything needs to keep on
working more or less, until userspace stops using the disappeared DRM
device and closes it completely. Userspace will learn of the device
disappearance from the device removed uevent, ioctls returning ENODEV
(or driver-specific ioctls returning driver-specific things), or open()
returning ENXIO.
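
A minimal sketch of how a userspace component might recognise these error
codes follows. The helper name and its ``from_open`` parameter are
hypothetical; only the ENODEV/ENXIO distinction comes from the plan above, and
driver-specific ioctls that report the loss in their own way are not covered.

.. code-block:: python

    import errno

    def device_disappeared(exc, from_open=False):
        """Guess whether an OSError means the DRM device is gone.

        Per the plan above, ioctls report ENODEV and open() reports ENXIO.
        """
        if from_open:
            return exc.errno == errno.ENXIO
        return exc.errno == errno.ENODEV

A caller would typically wrap its ioctl and open paths in ``try``/``except
OSError`` and, once this returns true, stop using the device and close all
related file descriptors and mappings.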

Only after userspace has closed all relevant DRM device and dmabuf file
descriptors and removed all mmaps, the DRM driver can tear down its
instance for the device that no longer exists. If the same physical
device somehow comes back in the meantime, it shall be a new DRM
device.

Similar to PIDs, chardev minor numbers are not recycled immediately. A
new DRM device always picks the next free minor number compared to the
previous one allocated, and wraps around when minor numbers are
exhausted.
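
The allocation policy can be illustrated with a small model. This is only a
sketch of the behaviour described above, not the kernel's implementation; the
function name and the ``limit`` parameter are assumptions made for the example.

.. code-block:: python

    def next_minor(in_use, last, limit):
        """Return the next free minor after `last`, wrapping at `limit`.

        in_use: set of minors currently allocated
        last:   minor handed out most recently (-1 if none yet)
        limit:  size of the minor pool (example value, not the kernel's)
        """
        for step in range(1, limit + 1):
            candidate = (last + step) % limit
            if candidate not in in_use:
                return candidate
        raise RuntimeError("no free minor numbers")

With a pool of four minors, releasing minor 0 while minor 1 was the last one
allocated makes the next device get minor 2, not the freshly released 0. Minor
0 is reused only after the search wraps around.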

The goal raises at least the following requirements for the kernel and
drivers.

Requirements for KMS UAPI
-------------------------

- KMS connectors must change their status to disconnected.

- Legacy modesets and pageflips, and atomic commits, both real and
  TEST_ONLY, and any other ioctls either fail with ENODEV or fake
  success.

- Pending non-blocking KMS operations deliver the DRM events userspace
  is expecting. This applies also to ioctls that faked success.

- open() on a device node whose underlying device has disappeared will
  fail with ENXIO.

- Attempting to create a DRM lease on a disappeared DRM device will
  fail with ENODEV. Existing DRM leases remain and work as listed
  above.

Requirements for Render and Cross-Device UAPI
---------------------------------------------

- All GPU jobs that can no longer run must have their fences
  force-signalled to avoid inflicting hangs on userspace.
  The associated error code is ENODEV.

- Some userspace APIs already define what should happen when the device
  disappears (OpenGL, GL ES: `GL_KHR_robustness`_; `Vulkan`_:
  VK_ERROR_DEVICE_LOST; etc.). DRM drivers are free to implement this
  behaviour the way they see best, e.g. returning failures in
  driver-specific ioctls and handling those in userspace drivers, or
  rely on uevents, and so on.

- dmabufs which point to memory that has disappeared will either fail to
  import with ENODEV or continue to be successfully imported if they would
  have succeeded before the disappearance. See also about memory maps
  below for already imported dmabufs.

- Attempting to import a dmabuf to a disappeared device will either fail
  with ENODEV or succeed if it would have succeeded without the
  disappearance.

- open() on a device node whose underlying device has disappeared will
  fail with ENXIO.

.. _GL_KHR_robustness: https://www.khronos.org/registry/OpenGL/extensions/KHR/KHR_robustness.txt
.. _Vulkan: https://www.khronos.org/vulkan/

Requirements for Memory Maps
----------------------------

Memory maps have further requirements that apply to both existing maps
and maps created after the device has disappeared. If the underlying
memory disappears, the map is created or modified such that reads and
writes will still complete successfully but the result is undefined.
This applies to both userspace mmap()'d memory and memory pointed to by
dmabuf which might be mapped to other devices (cross-device dmabuf
imports).

Raising SIGBUS is not an option, because userspace cannot realistically
handle it. Signal handlers are global, which makes them extremely
difficult to use correctly from libraries like those that Mesa produces.
Signal handlers are not composable: you can't have different handlers
for GPU1 and GPU2 from different vendors, and a third handler for
mmapped regular files. Threads cause additional pain with signal
handling as well.

Device reset
============

The GPU stack is really complex and is prone to errors, from hardware bugs,
faulty applications and everything in between the many layers. Some errors
require resetting the device in order to make the device usable again. This
section describes the expectations for DRM and usermode drivers when a
device resets and how to propagate the reset status.

Device resets cannot be disabled without tainting the kernel, which can lead to
hanging the entire kernel through shrinkers/mmu_notifiers. The userspace role in
device resets is to propagate the message to the application and apply any
special policy for blocking guilty applications, if any. A corollary is that
debugging a hung GPU context requires hardware support to be able to preempt
such a GPU context while it's stopped.

Kernel Mode Driver
------------------

The KMD is responsible for checking if the device needs a reset, and for
performing it as needed. Usually a hang is detected when a job gets stuck
executing.

Propagation of errors to userspace has proven to be tricky since it goes in
the opposite direction of the usual flow of commands. Because of this,
vendor-independent error handling was added to the &dma_fence object; this way
drivers can add an error code to their fences before signaling them. See
function dma_fence_set_error() on how to do this and for examples of error
codes to use.

The DRM scheduler also allows setting error codes on all pending fences when
hardware submissions are restarted after a reset. Error codes are also
forwarded from the hardware fence to the scheduler fence to bubble up errors
to the higher levels of the stack and eventually userspace.

Fence errors can be queried by userspace through the generic SYNC_IOC_FILE_INFO
IOCTL as well as through driver specific interfaces.

In addition to setting fence errors, drivers should also keep track of resets
per context; the DRM scheduler provides the drm_sched_entity_error() function as
a helper for this use case. After a reset, KMD should reject new command
submissions for affected contexts.

User Mode Driver
----------------

After command submission, UMD should check if the submission was accepted or
rejected. After a reset, KMD should reject submissions, and UMD can issue an
ioctl to the KMD to check the reset status; this can be checked more often
if the UMD requires it. After detecting a reset, UMD will then proceed to report
it to the application using the appropriate API error code, as explained in the
section below about robustness.

Robustness
----------

The only way to try to keep a graphical API context working after a reset is if
it complies with the robustness aspects of the graphical API that it is using.

Graphical APIs provide ways for applications to deal with device resets.
However, there is no guarantee that the app will use such features correctly,
and a userspace that doesn't support robust interfaces (like a non-robust
OpenGL context or API without any robustness support like libva) leaves the
robustness handling entirely to the userspace driver. There is no strong
community consensus on what the userspace driver should do in that case,
since all reasonable approaches have some clear downsides.

OpenGL
~~~~~~

Apps using OpenGL should use the available robust interfaces, like the
extension ``GL_ARB_robustness`` (or ``GL_EXT_robustness`` for OpenGL ES). This
interface tells if a reset has happened, and if so, all the context state is
considered lost and the app proceeds by creating new ones. There's no consensus
on what to do if robustness is not in use.

Vulkan
~~~~~~

Apps using Vulkan should check for ``VK_ERROR_DEVICE_LOST`` for submissions.
This error code means, among other things, that a device reset has happened and
the app needs to recreate the contexts to keep going.

Reporting causes of resets
--------------------------

Apart from propagating the reset through the stack so apps can recover, it's
really useful for driver developers to learn more about what caused the reset in
the first place. For this, drivers can make use of devcoredump to store relevant
information about the reset and send a device wedged event with the ``none``
recovery method (as explained in the "Device Wedging" chapter) to notify
userspace, so this information can be collected and added to user bug reports.

Device Wedging
==============

Drivers can optionally make use of the device wedged event (implemented as
drm_dev_wedged_event() in the DRM subsystem), which notifies userspace of the
'wedged' (hung/unusable) state of the DRM device through a uevent. This is
useful especially in cases where the device is no longer operating as expected
and has become unrecoverable from driver context. The purpose of this
implementation is to provide drivers a generic way to recover the device with
the help of userspace intervention, without taking any drastic measures (like
resetting or re-enumerating the full bus, on which the underlying physical
device is sitting) in the driver.

A 'wedged' device is basically a device that is declared dead by the driver
after exhausting all possible attempts to recover it from driver context. The
uevent is the notification that is sent to userspace along with a hint about
what could possibly be attempted to recover the device from userspace and bring
it back to a usable state. Different drivers may have different ideas of a
'wedged' device depending on the hardware implementation of the underlying
physical device, and hence the vendor agnostic nature of the event. It is up to
the drivers to decide when they see the need for device recovery and how they
want to recover from the available methods.

Driver prerequisites
--------------------

The driver, before opting for recovery, needs to make sure that the 'wedged'
device doesn't harm the system as a whole by taking care of the prerequisites.
Necessary actions must include disabling DMA to system memory as well as any
communication channels with other devices. Further, the driver must ensure
that all dma_fences are signalled and any device state that the core kernel
might depend on is cleaned up. All existing mmaps should be invalidated and
page faults should be redirected to a dummy page. Once the event is sent, the
device must be kept in 'wedged' state until the recovery is performed. New
accesses to the device (IOCTLs) should be rejected, preferably with an error
code that resembles the type of failure the device has encountered. This will
signify the reason for wedging, which can be reported to the application if
needed.

Recovery
--------

The current implementation defines four recovery methods, out of which drivers
can use any one, multiple or none. Method(s) of choice will be sent in the
uevent environment as ``WEDGED=<method1>[,..,<methodN>]`` in order of less to
more side-effects. See the section `Vendor Specific Recovery`_
for ``WEDGED=vendor-specific``. If the driver is unsure about recovery or the
method is unknown, ``WEDGED=unknown`` will be sent instead.

Userspace consumers can parse this event and attempt recovery as per the
following expectations.

    =============== ========================================
    Recovery method Consumer expectations
    =============== ========================================
    none            optional telemetry collection
    rebind          unbind + bind driver
    bus-reset       unbind + bus reset/re-enumeration + bind
    vendor-specific vendor specific recovery method
    unknown         consumer policy
    =============== ========================================
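
Parsing the ``WEDGED`` value could look like the following sketch. It assumes
the uevent environment has already been read into a dictionary (for example by
a udev binding); the function names are hypothetical.

.. code-block:: python

    def parse_wedged(env):
        """Return the recovery methods listed in a uevent environment.

        The order sent by the driver is kept, which the text above describes
        as going from fewer to more side effects. An event without a WEDGED
        key yields an empty list.
        """
        value = env.get("WEDGED")
        if not value:
            return []
        return [m.strip() for m in value.split(",") if m.strip()]

    def pick_method(methods, supported):
        """Pick the first listed method that the consumer can carry out."""
        for method in methods:
            if method in supported:
                return method
        return None

Because the list is ordered by side effects, taking the first supported entry
selects the least intrusive recovery. A result of ``None`` (for instance for
``WEDGED=unknown``) leaves the decision to consumer policy.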

No Recovery
-----------

Here ``WEDGED=none`` signifies that no recovery is expected from the consumer
but it can still try to gather telemetry information (devcoredump, syslog) for
debugging purposes in order to root-cause the hang. This is useful because the
first hang is usually the most critical one which can result in consequential
hangs or complete wedging.

Vendor Specific Recovery
------------------------

When ``WEDGED=vendor-specific`` is sent, it indicates that the device requires
a recovery procedure specific to the hardware vendor and is not one of the
standardized approaches.

``WEDGED=vendor-specific`` may be used to indicate different cases within a
single vendor driver, each requiring a distinct recovery procedure.
In such scenarios, the vendor driver must provide comprehensive documentation
that describes each case, includes additional hints to identify the specific
case and outlines the corresponding recovery procedure. The documentation
includes:

Case - A list of all cases that send the ``WEDGED=vendor-specific`` recovery method.

Hints - Additional information to assist the userspace consumer in identifying and
differentiating between different cases. This can be exposed through sysfs, debugfs,
traces, dmesg etc.

Recovery Procedure - Clear instructions and guidance for recovering each case.
This may include userspace scripts or tools needed for the recovery procedure.

It is the responsibility of the admin/userspace consumer to identify the case and
verify additional identification hints before attempting a recovery procedure.

Example: If the device uses the Xe driver, then the userspace consumer should refer to
:ref:`Xe Device Wedging <xe-device-wedging>` for the detailed documentation.

Task information
----------------

The information about which application (if any) was involved in the device
wedging is useful for userspace if they want to notify the user about what
happened (e.g. the compositor displays a message to the user "The <task name>
caused a graphical error and the system recovered") or to implement policies
(e.g. the daemon may "ban" a task that keeps resetting the device). If the task
information is available, the uevent will carry it as ``PID=<pid>`` and
``TASK=<task name>``. Otherwise, ``PID`` and ``TASK`` will not appear in the
event string.

The reliability of this information is driver and hardware specific, and it
should be treated with caution regarding its precision. For the bigger picture
of what really happened, the devcoredump file provides much more detailed
information about the device state and about the event.
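
A "ban" policy of the kind mentioned above could be sketched as follows. The
class, its name and the threshold are purely illustrative assumptions; only
the ``PID`` and ``TASK`` keys come from the uevent description, and given the
reliability caveat a real policy would likely want further evidence.

.. code-block:: python

    from collections import Counter

    class OffenderTracker:
        """Count wedged events per task name and flag repeat offenders."""

        def __init__(self, threshold=3):  # arbitrary example threshold
            self.threshold = threshold
            self.counts = Counter()

        def record(self, env):
            """Record one uevent; return the task name once it hits the limit."""
            task = env.get("TASK")
            if task is None or "PID" not in env:
                return None  # the driver supplied no task information
            self.counts[task] += 1
            return task if self.counts[task] >= self.threshold else None

Events without task information are ignored, matching the case where ``PID``
and ``TASK`` are absent from the event string.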

Consumer prerequisites
----------------------

It is the responsibility of the consumer to make sure that the device or its
resources are not in use by any process before attempting recovery. With IOCTLs
erroring out, all device memory should be unmapped and file descriptors should
be closed to prevent leaks or undefined behaviour. The idea here is to clear the
device of all user context beforehand and set the stage for a clean recovery.

For the ``WEDGED=vendor-specific`` recovery method, it is the responsibility of
the consumer to check the driver documentation and the use case before
attempting a recovery.

Example - rebind
----------------

Udev rule::

    SUBSYSTEM=="drm", ENV{WEDGED}=="rebind", DEVPATH=="*/drm/card[0-9]", \
    RUN+="/path/to/rebind.sh $env{DEVPATH}"

Recovery script::

    #!/bin/sh

    # $1 is the DEVPATH of the DRM card node, passed in by the udev rule.
    DEVPATH=$(readlink -f "/sys/$1/device")
    DEVICE=$(basename "$DEVPATH")
    DRIVER=$(readlink -f "$DEVPATH/driver")

    # Unbind the driver from the device, then bind it again.
    echo -n "$DEVICE" > "$DRIVER/unbind"
    echo -n "$DEVICE" > "$DRIVER/bind"

Customization
-------------

Although basic recovery is possible with a simple script, consumers can define
custom policies around recovery. For example, if the driver supports multiple
recovery methods, consumers can opt for the suitable one depending on scenarios
like repeat offences or vendor specific failures. Consumers can also choose to
have the device available for debugging or telemetry collection and base their
recovery decision on the findings. This is useful especially when the driver is
unsure about recovery or the method is unknown.

.. _drm_driver_ioctl:

IOCTL Support on Device Nodes
=============================

.. kernel-doc:: drivers/gpu/drm/drm_ioctl.c
   :doc: driver specific ioctls

Recommended IOCTL Return Values
-------------------------------

In theory a driver's IOCTL callback is only allowed to return very few error
codes. In practice it's good to abuse a few more. This section documents common
practice within the DRM subsystem:

ENOENT:
        Strictly this should only be used when a file doesn't exist e.g. when
        calling the open() syscall. We reuse that to signal any kind of object
        lookup failure, e.g. for unknown GEM buffer object handles, unknown KMS
        object handles and similar cases.

ENOSPC:
        Some drivers use this to differentiate "out of kernel memory" from "out
        of VRAM". Sometimes also applies to other limited GPU resources used for
        rendering (e.g. when you have a special limited compression buffer).
        Sometimes resource allocation/reservation issues in command submission
        IOCTLs are also signalled through EDEADLK.

        Simply running out of kernel/system memory is signalled through ENOMEM.

EPERM/EACCES:
        Returned for an operation that is valid, but needs more privileges.
        E.g. root-only or, much more commonly, DRM master-only operations
        return this when called by unprivileged clients. There's no clear
        difference between EACCES and EPERM.

ENODEV:
        The device is not present anymore or is not yet fully initialized.

EOPNOTSUPP:
        Feature (like PRIME, modesetting, GEM) is not supported by the driver.

ENXIO:
        Remote failure, either a hardware transaction (like i2c), but also used
        when the exporting driver of a shared dma-buf or fence doesn't support a
        feature needed.

EINTR:
        DRM drivers assume that userspace restarts all IOCTLs. Any DRM IOCTL can
        return EINTR and in such a case should be restarted with the IOCTL
        parameters left unchanged.

EIO:
        The GPU died and couldn't be resurrected through a reset. Modesetting
        hardware failures are signalled through the "link status" connector
        property.

EINVAL:
        Catch-all for anything that is an invalid argument combination which
        cannot work.

IOCTLs also use other error codes like ETIME, EFAULT, EBUSY, ENOTTY but their
usage is in line with the common meanings. The above list tries to just document
DRM specific patterns. Note that ENOTTY has the slightly unintuitive meaning of
"this IOCTL does not exist", and is used exactly as such in DRM.
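
The EINTR rule above amounts to a small retry loop in userspace. The sketch
below is a generic illustration with a hypothetical helper name; it takes any
callable that raises ``OSError`` on failure and is not tied to a particular
ioctl binding.

.. code-block:: python

    import errno

    def restart_on_eintr(ioctl_call, *args):
        """Repeat ioctl_call(*args) for as long as it fails with EINTR.

        The arguments are passed unchanged on every attempt, as required
        for restarted DRM IOCTLs. Any other error is re-raised.
        """
        while True:
            try:
                return ioctl_call(*args)
            except OSError as exc:
                if exc.errno != errno.EINTR:
                    raise

C programs usually get this behaviour from the libdrm wrapper functions
mentioned at the top of this document rather than writing the loop themselves.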

.. kernel-doc:: include/drm/drm_ioctl.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_ioctl.c
   :export:

.. kernel-doc:: drivers/gpu/drm/drm_ioc32.c
   :export:

Testing and validation
======================

Testing Requirements for userspace API
--------------------------------------

New cross-driver userspace interface extensions, like new IOCTLs, new KMS
properties, new files in sysfs or anything else that constitutes an API change
should have driver-agnostic testcases in IGT for that feature, if such a test
can be reasonably made using IGT for the target hardware.

Validating changes with IGT
---------------------------

There's a collection of tests that aims to cover the whole functionality of
DRM drivers and that can be used to check that changes to DRM drivers or the
core don't regress existing functionality. This test suite is called IGT and
its code and instructions to build and run can be found in
https://gitlab.freedesktop.org/drm/igt-gpu-tools/.

Using VKMS to test DRM API
--------------------------

VKMS is a software-only model of a KMS driver that is useful for testing
and for running compositors. VKMS aims to enable a virtual display without
the need for a hardware display capability. These characteristics make VKMS
a perfect tool for validating the DRM core behavior and for supporting
compositor developers. VKMS makes it possible to test DRM functions in a
virtual machine without display, simplifying the validation of some of the
core changes.

To validate changes in the DRM API with VKMS, start by setting up the kernel:
make sure the VKMS module is enabled, compile the kernel with VKMS enabled and
install it on the target machine. VKMS can be run in a virtual machine
(QEMU, virtme or similar). Using KVM with at least 1GB of RAM and four cores
is recommended.

It's possible to run the IGT-tests in a VM in two ways:

	1. Use IGT inside a VM
	2. Use IGT from the host machine and write the results in a shared directory.

Following is an example of using a VM with a shared directory with
the host machine to run igt-tests. This example uses virtme::

	$ virtme-run --rwdir /path/for/shared_dir --kdir=path/for/kernel/directory --mods=auto

Run the igt-tests in the guest machine. This example runs the 'kms_flip'
tests::

	$ /path/for/igt-gpu-tools/scripts/run-tests.sh -p -s -t "kms_flip.*" -v

In this example, instead of building the igt_runner, Piglit is used
(-p option). It creates an HTML summary of the test results and saves
it in the folder "igt-gpu-tools/results". It executes only the igt-tests
matching the -t option.

Display CRC Support
-------------------

.. kernel-doc:: drivers/gpu/drm/drm_debugfs_crc.c
   :doc: CRC ABI

.. kernel-doc:: drivers/gpu/drm/drm_debugfs_crc.c
   :export:

Debugfs Support
---------------

.. kernel-doc:: include/drm/drm_debugfs.h
   :internal:

.. kernel-doc:: drivers/gpu/drm/drm_debugfs.c
   :export:

Sysfs Support
=============

.. kernel-doc:: drivers/gpu/drm/drm_sysfs.c
   :doc: overview

.. kernel-doc:: drivers/gpu/drm/drm_sysfs.c
   :export:


VBlank event handling
=====================

The DRM core exposes two vertical blank related ioctls:

:c:macro:`DRM_IOCTL_WAIT_VBLANK`
    This takes a struct drm_wait_vblank structure as its argument, and
    it is used to block or request a signal when a specified vblank
    event occurs.

:c:macro:`DRM_IOCTL_MODESET_CTL`
    This was only used for user-mode-setting drivers around modesetting
    changes to allow the kernel to update the vblank interrupt after
    mode setting, since on many devices the vertical blank counter is
    reset to 0 at some point during modeset. Modern drivers should not
    call this any more since with kernel mode setting it is a no-op.

Userspace API Structures
========================

.. kernel-doc:: include/uapi/drm/drm_mode.h
   :doc: overview

.. _crtc_index:

CRTC index
----------

CRTCs have both an object ID and an index, and they are not the same thing.
The index is used in cases where a densely packed identifier for a CRTC is
needed, for instance a bitmask of CRTCs. The member possible_crtcs of struct
drm_mode_get_plane is an example.

:c:macro:`DRM_IOCTL_MODE_GETRESOURCES` populates a structure with an array of
CRTC IDs, and the CRTC index is its position in this array.
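
The relationship between object ID, index and bitmask can be sketched as
follows. The helper names and the example IDs are made up for illustration;
the list stands in for the CRTC ID array returned by
``DRM_IOCTL_MODE_GETRESOURCES``.

.. code-block:: python

    def crtc_index(crtc_ids, crtc_id):
        """Return the position of crtc_id in the reported CRTC ID array."""
        return crtc_ids.index(crtc_id)

    def plane_supports_crtc(possible_crtcs, crtc_ids, crtc_id):
        """Test the bit for this CRTC's index in a possible_crtcs bitmask."""
        return bool(possible_crtcs & (1 << crtc_index(crtc_ids, crtc_id)))

With an ID array of ``[41, 52, 63]``, the CRTC with object ID 52 has index 1,
so a plane can be used on it only if bit 1 of ``possible_crtcs`` is set.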

.. kernel-doc:: include/uapi/drm/drm.h
   :internal:

.. kernel-doc:: include/uapi/drm/drm_mode.h
   :internal:


dma-buf interoperability
========================

Please see Documentation/userspace-api/dma-buf-alloc-exchange.rst for
information on how dma-buf is integrated and exposed within DRM.


Trace events
============

See Documentation/trace/tracepoints.rst for information about using
Linux Kernel Tracepoints.
In the DRM subsystem, some events are considered stable uAPI to avoid
breaking tools (e.g. GPUVis, umr) relying on them. Stable means that fields
cannot be removed, nor their formatting updated. Adding new fields is
possible, under the normal uAPI requirements.

Stable uAPI events
------------------

From ``drivers/gpu/drm/scheduler/gpu_scheduler_trace.h``

.. kernel-doc::  drivers/gpu/drm/scheduler/gpu_scheduler_trace.h
   :doc: uAPI trace events