======================================
Housekeeping
======================================


CPU isolation moves away kernel work that may otherwise run on any CPU.
The purpose of its related features is to reduce the OS jitter that some
extreme workloads cannot tolerate, such as some DPDK use cases.

The kernel work moved away by CPU isolation is commonly described as
"housekeeping" because it includes ground work that performs cleanups,
statistics maintenance and actions relying on them, memory release,
various deferrals, etc.

Sometimes housekeeping is just unbound work (unbound workqueues,
unbound timers, ...) that is easily assigned to non-isolated CPUs.
But sometimes housekeeping is tied to a specific CPU and requires
elaborate tricks to be offloaded to non-isolated CPUs (RCU_NOCB, remote
scheduler tick, etc.).

Thus, a housekeeping CPU can be considered as the reverse of an isolated
CPU. It is simply a CPU that can execute housekeeping work. There must
be at least one online housekeeping CPU at all times. The CPUs that
are not isolated are automatically assigned as housekeeping.

Housekeeping is currently divided into four features described
by ``enum hk_type``:

1. HK_TYPE_DOMAIN matches the work moved away by scheduler domain
   isolation performed through the ``isolcpus=domain`` boot parameter or
   isolated cpuset partitions in cgroup v2. This includes scheduler
   load balancing, unbound workqueues and timers.

2. HK_TYPE_KERNEL_NOISE matches the work moved away by tick isolation
   performed through the ``nohz_full=`` or ``isolcpus=nohz`` boot
   parameters. This includes remote scheduler tick, vmstat and lockup
   watchdog.

3. HK_TYPE_MANAGED_IRQ matches the IRQ handlers moved away by managed
   IRQ isolation performed through ``isolcpus=managed_irq``.

4. HK_TYPE_DOMAIN_BOOT matches the work moved away by scheduler domain
   isolation performed through ``isolcpus=domain`` only. It is similar
   to HK_TYPE_DOMAIN except it ignores the isolation performed by
   cpusets.


Housekeeping cpumasks
=================================

Housekeeping cpumasks include the CPUs that can execute the work moved
away by the matching isolation feature. These cpumasks are returned by
the following function::

    const struct cpumask *housekeeping_cpumask(enum hk_type type)

By default, if none of ``nohz_full=``, ``isolcpus=`` or cpuset isolated
partitions is used, which covers most use cases, this function returns
``cpu_possible_mask``.

Otherwise the function returns the complement of the set of CPUs isolated
by the matching feature. For example:

With ``isolcpus=domain,7`` the following will return a mask with all
possible CPUs except 7::

    housekeeping_cpumask(HK_TYPE_DOMAIN)

Similarly with ``nohz_full=5,6`` the following will return a mask with all
possible CPUs except 5 and 6::

    housekeeping_cpumask(HK_TYPE_KERNEL_NOISE)


Synchronization against cpusets
=================================

Cpuset can modify the HK_TYPE_DOMAIN housekeeping cpumask while creating,
modifying or deleting an isolated partition.

Users of the HK_TYPE_DOMAIN cpumask must therefore synchronize properly
against cpuset in order to make sure that:

1. The cpumask snapshot stays coherent.

2. No housekeeping work is queued on a newly isolated CPU.

3. Pending housekeeping work that was queued to a non-isolated CPU
   which has just become isolated through cpuset is flushed before the
   related created or modified isolated partition is made available to
   userspace.

This synchronization is maintained by an RCU-based scheme. The cpuset update
side waits for an RCU grace period after updating the HK_TYPE_DOMAIN
cpumask and before flushing pending work. On the read side, care must be
taken to perform both the housekeeping target selection and the work
enqueue within the same RCU read-side critical section.

A typical layout example would look like this on the update side
(``housekeeping_update()``)::

    rcu_assign_pointer(housekeeping_cpumasks[type], trial);
    synchronize_rcu();
    flush_workqueue(example_workqueue);

And then on the read side::

    rcu_read_lock();
    cpu = housekeeping_any_cpu(HK_TYPE_DOMAIN);
    queue_work_on(cpu, example_workqueue, work);
    rcu_read_unlock();
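
As a complement to the "Housekeeping cpumasks" section, the mask
arithmetic can be illustrated with a small userspace model. This is only
a sketch and not kernel code: it assumes at most 64 CPUs stored in a
single 64-bit word, and every ``model_*`` name is hypothetical and exists
only for this illustration. The kernel uses ``struct cpumask`` and its own
helpers instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: one bit per CPU, at most 64 CPUs (assumption). */
typedef uint64_t model_cpumask_t;

/* Mask with bits 0..nr_cpus-1 set, standing in for cpu_possible_mask. */
static inline model_cpumask_t model_possible_mask(unsigned int nr_cpus)
{
	assert(nr_cpus >= 1 && nr_cpus <= 64);
	return nr_cpus == 64 ? UINT64_MAX : (UINT64_C(1) << nr_cpus) - 1;
}

/* Single-CPU mask, used to build the set of isolated CPUs. */
static inline model_cpumask_t model_cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return UINT64_C(1) << cpu;
}

/*
 * Housekeeping mask modeled as the complement of the isolated CPUs
 * within the possible CPUs. The model only checks that the result is
 * not empty; it knows nothing about which CPUs are online.
 */
static inline model_cpumask_t model_housekeeping_mask(model_cpumask_t possible,
						       model_cpumask_t isolated)
{
	model_cpumask_t hk = possible & ~isolated;

	assert(hk != 0);
	return hk;
}

/* True if @cpu may execute housekeeping work according to @hk. */
static inline bool model_cpu_is_housekeeping(model_cpumask_t hk,
					     unsigned int cpu)
{
	assert(cpu < 64);
	return (hk >> cpu) & 1;
}
```

In this model, with 8 possible CPUs, isolating CPU 7 (as with
``isolcpus=domain,7``) yields the mask 0x7f, and isolating CPUs 5 and 6
(as with ``nohz_full=5,6``) yields 0x9f.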