.. _todo:

=========
TODO list
=========

This section contains a list of smaller janitorial tasks in the kernel DRM
graphics subsystem useful as newbie projects. Or for slow rainy days.

Subsystem-wide refactorings
===========================

De-midlayer drivers
-------------------

With the recent ``drm_bus`` cleanup patches for 3.17 it is no longer required
to have a ``drm_bus`` structure set up. Drivers can directly set up the
``drm_device`` structure instead of relying on bus methods in ``drm_usb.c``
and ``drm_pci.c``. The goal is to get rid of the driver's ``->load`` /
``->unload`` callbacks and open-code the load/unload sequence properly, using
the new two-stage ``drm_device`` setup/teardown.

Once all existing drivers are converted we can also remove those bus support
files for USB and platform devices.

All you need is a GPU for a non-converted driver (currently almost all of
them, but also all the virtual ones used by KVM, so everyone qualifies).

Contact: Daniel Vetter, Thierry Reding, respective driver maintainers


Remove custom dumb_map_offset implementations
---------------------------------------------

All GEM based drivers should be using drm_gem_create_mmap_offset() instead.
Audit each individual driver, make sure it'll work with the generic
implementation (there's lots of outdated locking leftovers in various
implementations), and then remove it.

Contact: Daniel Vetter, respective driver maintainers

Convert existing KMS drivers to atomic modesetting
--------------------------------------------------

3.19 has the atomic modeset interfaces and helpers, so drivers can now be
converted over. Modern compositors like Wayland or Surfaceflinger on Android
really want an atomic modeset interface, so this is all about the bright
future.

There is a conversion guide for atomic and all you need is a GPU for a
non-converted driver (again virtual HW drivers for KVM are still all
suitable).

As part of this drivers also need to convert to universal plane (which means
exposing primary & cursor as proper plane objects). But that's much easier to
do by directly using the new atomic helper driver callbacks.

Contact: Daniel Vetter, respective driver maintainers

Clean up the clipped coordination confusion around planes
---------------------------------------------------------

We have a helper to get this right with drm_plane_helper_check_update(), but
it's not consistently used. This should be fixed, preferably in the atomic
helpers (and drivers then moved over to clipped coordinates). Probably the
helper should also be moved from drm_plane_helper.c to the atomic helpers, to
avoid confusion - the other helpers in that file are all deprecated legacy
helpers.

Contact: Ville Syrjälä, Daniel Vetter, driver maintainers

Convert early atomic drivers to async commit helpers
----------------------------------------------------

For the first year the atomic modeset helpers didn't support asynchronous /
nonblocking commits, and every driver had to hand-roll them. This is fixed
now, but there's still a pile of existing drivers that easily could be
converted over to the new infrastructure.

One issue with the helpers is that they require that drivers handle completion
events for atomic commits correctly. But fixing these bugs is good anyway.

Contact: Daniel Vetter, respective driver maintainers

Fallout from atomic KMS
-----------------------

``drm_atomic_helper.c`` provides a batch of functions which implement legacy
IOCTLs on top of the new atomic driver interface. Which is really nice for
gradual conversion of drivers, but unfortunately the semantic mismatches are
a bit too severe. So there's some follow-up work to adjust the function
interfaces to fix these issues:

* atomic needs the lock acquire context. At the moment that's passed around
  implicitly with some horrible hacks, and it's also allocated with
  ``GFP_NOFAIL`` behind the scenes. All legacy paths need to start allocating
  the acquire context explicitly on stack and then also pass it down into
  drivers explicitly so that the legacy-on-atomic functions can use them.

  Except for some driver code this is done. This task should be finished by
  adding WARN_ON(!drm_drv_uses_atomic_modeset) in drm_modeset_lock_all().

* A bunch of the vtable hooks are now in the wrong place: DRM has a split
  between core vfunc tables (named ``drm_foo_funcs``), which are used to
  implement the userspace ABI. And then there's the optional hooks for the
  helper libraries (named ``drm_foo_helper_funcs``), which are purely for
  internal use. Some of these hooks should be moved from ``_funcs`` to
  ``_helper_funcs`` since they are not part of the core ABI. There's a
  ``FIXME`` comment in the kerneldoc for each such case in ``drm_crtc.h``.

Contact: Daniel Vetter

Get rid of dev->struct_mutex from GEM drivers
---------------------------------------------

``dev->struct_mutex`` is the Big DRM Lock from legacy days and infested
everything. Nowadays in modern drivers the only bit where it's mandatory is
serializing GEM buffer object destruction. Which unfortunately means drivers
have to keep track of that lock and either call ``unreference`` or
``unreference_locked`` depending upon context.

Core GEM doesn't have a need for ``struct_mutex`` any more since kernel 4.8,
and there's a ``gem_free_object_unlocked`` callback for any drivers which are
entirely ``struct_mutex`` free.

For drivers that need ``struct_mutex`` it should be replaced with a
driver-private lock. The tricky part is the BO free functions, since those
can't reliably take that lock any more. Instead state needs to be protected
with suitable subordinate locks or some cleanup work pushed to a worker
thread. For performance-critical drivers it might also be better to go with a
more fine-grained per-buffer object and per-context locking scheme. Currently
only the ``msm`` driver still uses ``struct_mutex``.

Contact: Daniel Vetter, respective driver maintainers

Convert instances of dev_info/dev_err/dev_warn to their DRM_DEV_* equivalent
----------------------------------------------------------------------------

For drivers which could have multiple instances, it is necessary to
differentiate between which is which in the logs. Since DRM_INFO/WARN/ERROR
don't do this, drivers used dev_info/warn/err to make this differentiation. We
now have DRM_DEV_* variants of the drm print macros, so we can start to convert
those drivers back to using drm-formatted specific log messages.

Before you start this conversion please contact the relevant maintainers to make
sure your work will be merged - not everyone agrees that the DRM dmesg macros
are better.

Contact: Sean Paul, Maintainer of the driver you plan to convert

Convert drivers to use simple modeset suspend/resume
----------------------------------------------------

Most drivers (except i915 and nouveau) that use
drm_atomic_helper_suspend/resume() can probably be converted to use
drm_mode_config_helper_suspend/resume(). Also there are still open-coded
versions of the atomic suspend/resume code in older atomic modeset drivers.

Contact: Maintainer of the driver you plan to convert

Convert drivers to use drm_fb_helper_fbdev_setup/teardown()
-----------------------------------------------------------

Most drivers can use drm_fb_helper_fbdev_setup() except maybe:

- amdgpu which has special logic to decide whether to call
  drm_helper_disable_unused_functions()

- armada which isn't atomic and doesn't call
  drm_helper_disable_unused_functions()

- i915 which calls drm_fb_helper_initial_config() in a worker

Drivers that use drm_framebuffer_remove() to clean up the fbdev framebuffer can
probably use drm_fb_helper_fbdev_teardown().

Contact: Maintainer of the driver you plan to convert

Clean up mmap forwarding
------------------------

A lot of drivers forward gem mmap calls to dma-buf mmap for imported buffers.
And also a lot of them forward dma-buf mmap to the gem mmap implementations.
Would be great to refactor this all into a set of small common helpers.

Contact: Daniel Vetter

Generic fbdev defio support
---------------------------

The defio support code in the fbdev core has some very specific requirements,
which means drivers need to have a special framebuffer for fbdev. Which prevents
us from using the generic fbdev emulation code everywhere. The main issue is
that it uses some fields in struct page itself, which breaks shmem gem objects
(and other things).

Possible solution would be to write our own defio mmap code in the drm fbdev
emulation. It would need to fully wrap the existing mmap ops, forwarding
everything after it has done the write-protect/mkwrite trickery:

- In the drm_fbdev_fb_mmap helper, if we need defio, change the
  default page prots to write-protected with something like this::

      vma->vm_page_prot = pgprot_wrprotect(vma->vm_page_prot);

- Set the mkwrite and fsync callbacks with similar implementations to the core
  fbdev defio stuff. These should all work on plain ptes, they don't actually
  require a struct page.

- Track the dirty pages in a separate structure (bitfield with one bit per page
  should work) to avoid clobbering struct page.
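
The last point could look roughly like the following userspace sketch. It is
only an illustration of the "one bit per page" bookkeeping, not kernel code:
all ``sketch_*`` names are made up for this example, and a real implementation
would presumably use the kernel bitmap helpers and be driven from the mkwrite
callback and the flush worker.

.. code-block:: c

    #include <limits.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdlib.h>

    #define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

    /* Hypothetical per-framebuffer dirty tracker: one bit per page. */
    struct sketch_dirty_map {
        size_t npages;
        size_t nlongs;
        unsigned long *bits;
    };

    /* Allocate a zeroed bitmap covering npages pages; 0 on success. */
    static int sketch_dirty_init(struct sketch_dirty_map *map, size_t npages)
    {
        map->nlongs = (npages + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
        map->bits = calloc(map->nlongs ? map->nlongs : 1, sizeof(*map->bits));
        if (!map->bits)
            return -1;
        map->npages = npages;
        return 0;
    }

    /* Would be called when a write-protected page is written to. */
    static void sketch_dirty_mark(struct sketch_dirty_map *map, size_t page)
    {
        if (page < map->npages)
            map->bits[page / SKETCH_BITS_PER_LONG] |=
                1UL << (page % SKETCH_BITS_PER_LONG);
    }

    static bool sketch_dirty_test(const struct sketch_dirty_map *map, size_t page)
    {
        if (page >= map->npages)
            return false;
        return (map->bits[page / SKETCH_BITS_PER_LONG] &
                (1UL << (page % SKETCH_BITS_PER_LONG))) != 0;
    }

    /* Would be called from the flush worker: count dirty pages, then clear. */
    static size_t sketch_dirty_flush(struct sketch_dirty_map *map)
    {
        size_t count = 0;

        for (size_t page = 0; page < map->npages; page++)
            if (sketch_dirty_test(map, page))
                count++;
        for (size_t i = 0; i < map->nlongs; i++)
            map->bits[i] = 0;
        return count;
    }

    static void sketch_dirty_fini(struct sketch_dirty_map *map)
    {
        free(map->bits);
        map->bits = NULL;
    }

Nothing in this structure touches struct page, which is the whole point: the
dirty state lives next to the framebuffer instead of inside the page.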

Might be good to also have some igt testcases for this.

Contact: Daniel Vetter, Noralf Trønnes

Put a reservation_object into drm_gem_object
--------------------------------------------

This would remove the need for the ->gem_prime_res_obj callback. It would also
allow us to implement generic helpers for waiting for a bo, allowing for quite a
bit of refactoring in the various wait ioctl implementations.

Contact: Daniel Vetter

idr_init_base()
---------------

DRM core and drivers use a lot of idr (integer lookup directories) for mapping
userspace IDs to internal objects, and in most places ID=0 means NULL and hence
is never used. Switching to idr_init_base() for these would make the idr more
efficient.
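
To see why a base helps, here is a deliberately tiny userspace toy. It is not
the kernel idr and every ``sketch_*`` name is invented for this example; it
only shows the idea of storing an object at slot ``id - base``, so that with a
base of 1 the first slot is put to use instead of being reserved for the ID
that is never handed out.

.. code-block:: c

    #include <stddef.h>

    #define SKETCH_SLOTS 4

    /* Toy ID table: slot index = id - base. Not the kernel idr. */
    struct sketch_idr {
        int base;
        void *slots[SKETCH_SLOTS];
    };

    static void sketch_idr_init_base(struct sketch_idr *idr, int base)
    {
        idr->base = base;
        for (size_t i = 0; i < SKETCH_SLOTS; i++)
            idr->slots[i] = NULL;
    }

    /* Store a non-NULL ptr in the first free slot; return its ID or -1. */
    static int sketch_idr_alloc(struct sketch_idr *idr, void *ptr)
    {
        for (int i = 0; i < SKETCH_SLOTS; i++) {
            if (!idr->slots[i]) {
                idr->slots[i] = ptr;
                return i + idr->base;
            }
        }
        return -1;
    }

    /* Look up an ID; anything below the base is simply not there. */
    static void *sketch_idr_find(const struct sketch_idr *idr, int id)
    {
        int i = id - idr->base;

        if (i < 0 || i >= SKETCH_SLOTS)
            return NULL;
        return idr->slots[i];
    }

With a base of 1 all four slots hold real objects with IDs 1 to 4, and a
lookup of ID 0 naturally returns NULL.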

Contact: Daniel Vetter

Defaults for .gem_prime_import and export
-----------------------------------------

Most drivers don't need to set drm_driver->gem_prime_import and
->gem_prime_export now that drm_gem_prime_import() and drm_gem_prime_export()
are the default.

struct drm_gem_object_funcs
---------------------------

GEM objects can now have a function table instead of having the callbacks on the
DRM driver struct. This is now the preferred way and drivers can be moved over.

Use DRM_MODESET_LOCK_ALL_* helpers instead of boilerplate
---------------------------------------------------------

For cases where drivers are attempting to grab the modeset locks with a local
acquire context. Replace the boilerplate code surrounding
drm_modeset_lock_all_ctx() with DRM_MODESET_LOCK_ALL_BEGIN() and
DRM_MODESET_LOCK_ALL_END() instead.

This should also be done for all places where drm_modeset_lock_all() is still
used.

As a reference, take a look at the conversions already completed in drm core.

Contact: Sean Paul, respective driver maintainers

Rename CMA helpers to DMA helpers
---------------------------------

CMA (standing for contiguous memory allocator) is really a bit of an accident
of what these were used for first, a much better name would be DMA helpers. In
the text these should even be called coherent DMA memory helpers (so maybe CDM,
but no one knows what that means) since underneath they just use
dma_alloc_coherent.

Contact: Laurent Pinchart, Daniel Vetter

Convert direct mode.vrefresh accesses to use drm_mode_vrefresh()
----------------------------------------------------------------

drm_display_mode.vrefresh isn't guaranteed to be populated. As such, using it
is risky and has been known to cause div-by-zero bugs. Fortunately, drm core
has a helper which will use mode.vrefresh if it's !0 and will calculate it from
the timings when it's 0.
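
The behaviour described above can be sketched in plain C as follows. This is a
simplified illustration, not the code in drm core: ``struct sketch_mode`` and
``sketch_mode_vrefresh()`` are hypothetical stand-ins, the pixel clock is
assumed to be in kHz, and adjustments for things like interlaced or doublescan
modes are left out.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical, trimmed-down stand-in for struct drm_display_mode. */
    struct sketch_mode {
        int clock;    /* pixel clock in kHz (assumption of this sketch) */
        int htotal;   /* pixels per line, including blanking */
        int vtotal;   /* lines per frame, including blanking */
        int vrefresh; /* cached refresh rate in Hz, 0 if not populated */
    };

    /* Return the refresh rate in Hz, deriving it from the timings if needed. */
    static int sketch_mode_vrefresh(const struct sketch_mode *mode)
    {
        int64_t num, den;

        if (mode->vrefresh > 0)
            return mode->vrefresh;

        /* Avoid the div-by-zero that direct users of the field run into. */
        if (mode->htotal <= 0 || mode->vtotal <= 0)
            return 0;

        num = (int64_t)mode->clock * 1000;
        den = (int64_t)mode->htotal * mode->vtotal;

        /* Round to the nearest integer Hz. */
        return (int)((num + den / 2) / den);
    }

A caller that previously divided by ``mode->vrefresh`` directly would call the
helper instead and still has to cope with a result of 0 for bogus timings.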

Use simple search/replace, or (more fun) cocci to replace instances of direct
vrefresh access with a call to the helper. Check out
https://lists.freedesktop.org/archives/dri-devel/2019-January/205186.html for
inspiration.

Once all instances of vrefresh have been converted, remove vrefresh from
drm_display_mode to avoid future use.

Contact: Sean Paul

Remove drm_display_mode.hsync
-----------------------------

We have drm_mode_hsync() to calculate this from hsync_start/end, since drivers
shouldn't/don't use this, remove this member to avoid any temptations to use it
in the future. If there is any debug code using drm_display_mode.hsync, convert
it to use drm_mode_hsync() instead.

Contact: Sean Paul

Core refactorings
=================

Clean up the DRM header mess
----------------------------

The DRM subsystem originally had only one huge global header, ``drmP.h``. This
is now split up, but many source files still include it. The remaining part of
the cleanup work here is to replace any ``#include <drm/drmP.h>`` by only the
headers needed (and fixing up any missing pre-declarations in the headers).

In the end no .c file should need to include ``drmP.h`` anymore.

Contact: Daniel Vetter

Add missing kerneldoc for exported functions
--------------------------------------------

The DRM reference documentation is still lacking kerneldoc in a few areas. The
task would be to clean up interfaces like moving functions around between
files to better group them and improving the interfaces like dropping return
values for functions that never fail. Then write kerneldoc for all exported
functions and an overview section and integrate it all into the drm book.

See https://dri.freedesktop.org/docs/drm/ for what's there already.

Contact: Daniel Vetter

Make panic handling work
------------------------

This is a really varied task with lots of little bits and pieces:

* The panic path can't be tested currently, leading to constant breaking. The
  main issue here is that panics can be triggered from hardirq contexts and
  hence all panic related callbacks can run in hardirq context. It would be
  awesome if we could test at least the fbdev helper code and driver code by
  e.g. triggering calls through drm debugfs files. hardirq context could be
  achieved by using an IPI to the local processor.

* There's a massive confusion of different panic handlers. DRM fbdev emulation
  helpers have one, but on top of that the fbcon code itself also has one. We
  need to make sure that they stop fighting each other.

* ``drm_can_sleep()`` is a mess. It hides real bugs in normal operations and
  isn't a full solution for panic paths. We need to make sure that it only
  returns true if there's a panic going on for real, and fix up all the
  fallout.

* The panic handler must never sleep, which also means it can't ever
  ``mutex_lock()``. Also it can't grab any other lock unconditionally, not
  even spinlocks (because NMI and hardirq can panic too). We need to either
  make sure to not call such paths, or trylock everything. Really tricky.

* For the locking reasons above it's pretty much impossible to attempt a
  synchronous modeset from panic handlers. The only thing we could try to
  achieve is an atomic ``set_base`` of the primary plane, and hope that it
  shows up. Everything else probably needs to be delayed to some worker or
  something else which happens later on. Otherwise it just kills the box
  harder, preventing the panic from going out on e.g. netconsole.

* There's also a proposal for a simplified DRM console instead of the
  full-blown fbcon and DRM fbdev emulation. Any kind of panic handling tricks
  should obviously work for both consoles, in case we ever get kmslog merged.
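
The "trylock everything" rule from the list above can be illustrated with a
small userspace sketch. It uses a C11 ``atomic_flag`` as a stand-in for a
driver lock; the ``sketch_*`` names are hypothetical and the "drawing" is just
a counter. The only point is the shape of the panic path: it never waits, and
it gives up when the lock is already held, since the holder may be the very
context that panicked.

.. code-block:: c

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Hypothetical driver state guarded by a simple test-and-set lock. */
    struct sketch_display {
        atomic_flag lock;
        int panic_frames_drawn;
    };

    static bool sketch_trylock(atomic_flag *lock)
    {
        /* test_and_set returns the old value: false means we got the lock. */
        return !atomic_flag_test_and_set_explicit(lock, memory_order_acquire);
    }

    static void sketch_unlock(atomic_flag *lock)
    {
        atomic_flag_clear_explicit(lock, memory_order_release);
    }

    /* Panic path: never block. Returns true if something was drawn. */
    static bool sketch_panic_draw(struct sketch_display *disp)
    {
        if (!sketch_trylock(&disp->lock))
            return false; /* skip the update instead of deadlocking */

        disp->panic_frames_drawn++;
        sketch_unlock(&disp->lock);
        return true;
    }

Skipping the update loses the panic output on that display, but the rest of
the panic handling (for example netconsole) still gets its chance.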

Contact: Daniel Vetter

Clean up the debugfs support
----------------------------

There's a bunch of issues with it:

- The drm_info_list ->show() function doesn't even bother to cast to the drm
  structure for you. This is lazy.

- We probably want to have some support for debugfs files on crtc/connectors and
  maybe other kms objects directly in core. There's even drm_print support in
  the funcs for these objects to dump kms state, so it's all there. And then the
  ->show() functions should obviously give you a pointer to the right object.

- The drm_info_list stuff is centered on drm_minor instead of drm_device. For
  anything we want to print drm_device (or maybe drm_file) is the right thing.

- The drm_driver->debugfs_init hook we have is just an artifact of the old
  midlayered load sequence. DRM debugfs should work more like sysfs, where you
  can create properties/files for an object anytime you want, and the core
  takes care of publishing/unpublishing all the files at register/unregister
  time. Drivers shouldn't need to worry about these technicalities, and fixing
  this (together with the drm_minor->drm_device move) would allow us to remove
  debugfs_init.

Contact: Daniel Vetter

KMS cleanups
------------

Some of these date from the very introduction of KMS in 2008 ...

- Make ->funcs and ->helper_private vtables optional. There's a bunch of empty
  function tables in drivers, but before we can remove them we need to make sure
  that all the users in helpers and drivers do correctly check for a NULL
  vtable.

- Clean up the various ->destroy callbacks. A lot of them just wrap the
  drm_*_cleanup implementations and can be removed. Some tack a kfree() at the
  end, for which we could add drm_*_cleanup_kfree(). And then there's the (for
  historical reasons) misnamed drm_primary_helper_destroy() function.
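
For the first item, the kind of check that callers need looks roughly like
this. The structures are made-up miniatures (``sketch_crtc`` and friends are
not DRM types) and the hook name is only borrowed for flavour; the pattern is
what matters: tolerate a missing vtable as well as a missing hook.

.. code-block:: c

    #include <stddef.h>

    struct sketch_crtc;

    /* Hypothetical optional helper vtable; every hook may be NULL. */
    struct sketch_crtc_helper_funcs {
        int (*atomic_check)(struct sketch_crtc *crtc);
    };

    struct sketch_crtc {
        const struct sketch_crtc_helper_funcs *helper_private; /* may be NULL */
        int checks_run;
    };

    /* Example hook a driver might provide. */
    static int sketch_count_check(struct sketch_crtc *crtc)
    {
        crtc->checks_run++;
        return 0;
    }

    /* Caller side: cope with both a missing vtable and a missing hook. */
    static int sketch_crtc_check(struct sketch_crtc *crtc)
    {
        const struct sketch_crtc_helper_funcs *funcs = crtc->helper_private;

        if (!funcs || !funcs->atomic_check)
            return 0; /* nothing to check counts as success */

        return funcs->atomic_check(crtc);
    }

Once every caller is written this way, a driver with nothing to hook up can
leave the pointer NULL instead of carrying an empty function table.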

Better Testing
==============

Enable trinity for DRM
----------------------

And fix up the fallout. Should be really interesting ...

Make KMS tests in i-g-t generic
-------------------------------

The i915 driver team maintains an extensive testsuite for the i915 DRM driver,
including tons of testcases for corner-cases in the modesetting API. It would
be awesome if those tests (at least the ones not relying on Intel-specific GEM
features) could be made to run on any KMS driver.

Basic work to run i-g-t tests on non-i915 is done, what's now missing is
mass-converting things over. For modeset tests we also first need a bit of
infrastructure to use dumb buffers for untiled buffers, to be able to run all
the non-i915 specific modeset tests.

Extend virtual test driver (VKMS)
---------------------------------

See the documentation of :ref:`VKMS <vkms>` for more details. This is an ideal
internship task, since it only requires a virtual machine and can be sized to
fit the available time.

Contact: Daniel Vetter

Driver Specific
===============

tinydrm
-------

Tinydrm is the helper driver for really simple fb drivers. The goal is to make
those drivers as simple as possible, so lots of room for refactoring:

- backlight helpers, probably best to put them into a new drm_backlight.c.
  This is because drivers/video is de-facto unmaintained. We could also
  move drivers/video/backlight to drivers/gpu/backlight and take it all
  over within drm-misc, but that's more work. Backlight helpers require a fair
  bit of reworking and refactoring. A simple example is the enabling of a backlight.
  Tinydrm has helpers for this. It would be good if other drivers can also use the
  helper. However, there are various cases we need to consider, i.e. different
  drivers seem to have different ways of enabling/disabling a backlight.
  We also need to consider the backlight drivers (like gpio_backlight). The situation
  is further complicated by the fact that the backlight is tied to fbdev
  via fb_notifier_callback() which has complicated logic. For further details, refer
  to the following discussion thread:
  https://groups.google.com/forum/#!topic/outreachy-kernel/8rBe30lwtdA

- spi helpers, probably best put into spi core/helper code. Thierry said
  the spi maintainer is fast&reactive, so shouldn't be a big issue.

- extract the mipi-dbi helper (well, the non-tinydrm specific parts at
  least) into a separate helper, like we have for mipi-dsi already. Or follow
  one of the ideas for having a shared dsi/dbi helper, abstracting away the
  transport details more.

- Quick aside: The unregister devm stuff is kinda getting the lifetimes of
  a drm_device wrong. Doesn't matter, since everyone else gets it wrong
  too :-)

Contact: Noralf Trønnes, Daniel Vetter

AMD DC Display Driver
---------------------

AMD DC is the display driver for AMD devices starting with Vega. There has been
a bunch of progress cleaning it up but there's still plenty of work to be done.

See drivers/gpu/drm/amd/display/TODO for tasks.

Contact: Harry Wentland, Alex Deucher

i915
----

- Our early/late pm callbacks could be removed in favour of using
  device_link_add to model the dependency between i915 and snd_had. See
  https://dri.freedesktop.org/docs/drm/driver-api/device_link.html

Outside DRM
===========

497