Idle/background work classes design doc:

Right now, our behaviour at idle isn't ideal: it was designed for servers that
would be under sustained load, to keep pending work at a "medium" level, to
let work build up so we can process it in more efficient batches, while also
giving headroom for bursts in load.

But for desktops or mobile - scenarios where work is less sustained and power
usage is more important - we want to operate differently, with a "rush to
idle" so the system can go to sleep. We don't want to be dribbling out
background work while the system should be idle.

The complicating factor is that there are a number of background tasks, which
form a hierarchy (or a digraph, depending on how you divide it up) - one
background task may generate work for another.

Thus proper idle detection needs to model this hierarchy, from top to bottom:

- Foreground writes
- Page cache writeback
- Copygc, rebalance
- Journal reclaim

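As a rough illustration, the hierarchy above can be written down as a small
digraph. This is a user-space Python sketch (3.9+ for ``graphlib``), not
bcachefs code: the class names are hypothetical, and the edges are assumptions
read off the list order, with each class mapped to the classes directly above
it.

```python
from graphlib import TopologicalSorter

# Hypothetical class names. Each key maps to the classes directly above it,
# i.e. the ones assumed to generate work for it. The exact edges are an
# assumption based on the order of the list in this document.
UPSTREAM = {
    "foreground_writes": set(),
    "page_cache_writeback": {"foreground_writes"},
    "copygc": {"page_cache_writeback"},
    "rebalance": {"page_cache_writeback"},
    "journal_reclaim": {"copygc", "rebalance"},
}


def producers_first(upstream):
    """Order the classes so that each one follows all of its producers."""
    return list(TopologicalSorter(upstream).static_order())


def all_upstream(upstream, cls):
    """Return every class that directly or indirectly feeds work to cls."""
    seen = set()
    stack = list(upstream[cls])
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(upstream[cur])
    return seen
```

With these assumed edges, journal reclaim sits below every other class, so it
always comes last in ``producers_first(UPSTREAM)``.
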
When we implement idle detection and rush to idle, we need to be careful not
to disturb the existing behaviour too much: it works reasonably well when the
system is under sustained load (though we could perhaps improve it in the case
of rebalance, which currently does not actively attempt to let work batch up).

SUSTAINED LOAD REGIME
---------------------

When the system is under continuous load, we want these jobs to run
continuously - this is perhaps best modelled with a P/D controller, where
they'll be trying to keep a target value (e.g. fragmented disk space,
available journal space) roughly in the middle of some range.

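A minimal sketch of what such a P/D controller could look like, written as
user-space Python purely for illustration. The gains, the normalised backlog
measurement and the clamped output range are all assumptions, not values taken
from bcachefs.

```python
class PDController:
    """Proportional/derivative controller producing a work rate in [0, 1]."""

    def __init__(self, target, kp, kd):
        self.target = target    # desired level, e.g. the middle of the range
        self.kp = kp            # proportional gain (hypothetical tuning value)
        self.kd = kd            # derivative gain (hypothetical tuning value)
        self.prev_error = None  # no derivative is available on the first sample

    def update(self, measured, dt):
        """Return how hard to work, given the measured backlog level.

        `measured` is assumed to be a backlog fraction (for example, the
        fraction of disk space that is fragmented); `dt` is the time since
        the previous sample and must be positive.
        """
        error = measured - self.target
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        output = self.kp * error + self.kd * derivative
        # Clamp: 0.0 means "let work accumulate", 1.0 means "work flat out".
        return min(1.0, max(0.0, output))
```

The proportional term pushes the backlog back towards the target, while the
derivative term reacts to how fast the backlog is growing, so a sudden burst
raises the work rate before the backlog has moved far from the middle of the
range.
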
The goal under sustained load is to balance our ability to handle load spikes
without running out of some resource (free disk space, free space in the
journal), while also letting some work accumulate to be batched (or become
unnecessary).

For example, we don't want to run copygc too aggressively, because then it
will be evacuating buckets that would have become empty (been overwritten or
deleted) anyway, and we don't want to wait until we're almost out of free
space because then the system will behave unpredictably - suddenly we're doing
a lot more work to service each write and the system becomes much slower.

IDLE REGIME
-----------

When the system becomes idle, we should start flushing our pending work
more quickly so the system can go to sleep.

Note that the definition of "idle" depends on where in the hierarchy a task
is - a task should start flushing work more quickly when the task above it has
stopped generating new work.

For example, rebalance should start flushing more quickly when page cache
writeback is idle, and journal reclaim should only start flushing more quickly
when both copygc and rebalance are idle.

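The rule above could be sketched roughly as follows, again in illustrative
user-space Python. ``WorkClass``, its ``busy`` flag and the wiring between
classes are hypothetical names and assumptions; only the directly-above
classes are consulted, as in the examples given here.

```python
class WorkClass:
    """One work class; `upstream` lists the classes directly above it."""

    def __init__(self, name, upstream=()):
        self.name = name
        self.upstream = list(upstream)
        self.busy = False  # "Am I doing work, or have I gone to sleep?"

    def upstream_idle(self):
        """True when no class directly above this one is still doing work."""
        return not any(u.busy for u in self.upstream)


# Assumed wiring, following the order in which the classes are listed.
foreground = WorkClass("foreground_writes")
writeback = WorkClass("page_cache_writeback", [foreground])
copygc = WorkClass("copygc", [writeback])
rebalance = WorkClass("rebalance", [writeback])
journal_reclaim = WorkClass("journal_reclaim", [copygc, rebalance])
```

A class would then switch to flushing more quickly only while
``upstream_idle()`` returns true, and fall back to letting work batch up as
soon as anything above it becomes busy again.
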
It's important to let work accumulate when more work is still incoming and we
still have room, because flushing is always more efficient if we let it batch
up. New writes may overwrite data before rebalance moves it, and tasks may be
generating more updates for the btree nodes that journal reclaim needs to flush.

On idle, how much work we do at each interval should be proportional to the
length of time we have been idle for. If we're idle only for a short duration,
we shouldn't flush everything right away; the system might wake up and start
generating new work soon, and flushing immediately might end up doing a lot of
work that would have been unnecessary if we'd allowed things to batch more.

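One possible shape for this, as an illustrative Python sketch: the amount of
pending work flushed per interval grows linearly with idle time until a ramp
period has elapsed. The function name and the ``ramp_seconds`` default are
hypothetical, not tuned values.

```python
def idle_work_budget(idle_seconds, pending, ramp_seconds=30.0):
    """Return how many pending items to flush in this interval.

    The budget is proportional to how long we have been idle, and reaches
    the full backlog once `ramp_seconds` (an assumed tunable) has passed.
    """
    if idle_seconds <= 0 or pending <= 0:
        return 0
    fraction = min(1.0, idle_seconds / ramp_seconds)
    return min(pending, int(pending * fraction))
```

With the default ramp, 100 pending items yield a budget of 25 after 7.5
seconds of idle time, 50 after 15 seconds, and the whole backlog from 30
seconds onwards. If the system wakes up early, most of the backlog has still
been left to batch.
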
To summarize, we will need:

 - A list of classes for background tasks that generate work, which will
   include one "foreground" class.
 - Tracking for each class - "Am I doing work, or have I gone to sleep?"
 - Each class should check the class above it when deciding how much work
   to issue.