.. SPDX-License-Identifier: GPL-2.0

==================
The Page Allocator
==================

The kernel page allocator services all general page allocation requests, such
as :code:`kmalloc`. CXL configuration steps affect the behavior of the page
allocator based on the selected `Memory Zone` and `NUMA node` in which the
capacity is placed.

This section mostly focuses on how these configurations affect the page
allocator (as of Linux v6.15) rather than the overall page allocator behavior.

NUMA nodes and mempolicy
========================
Unless a task explicitly registers a mempolicy, the default memory policy
of the Linux kernel is to allocate memory from the `local NUMA node` first,
and fall back to other nodes only if the local node is under memory pressure.

Generally, we expect to see local DRAM and CXL memory on separate NUMA nodes,
with the CXL memory being non-local. Technically, however, it is possible
for a compute node to have no local DRAM, and for CXL memory to be the
`local` capacity for that compute node.


Memory Zones
============
CXL capacity may be onlined in :code:`ZONE_NORMAL` or :code:`ZONE_MOVABLE`.

As of v6.15, the page allocator attempts to allocate from the highest
available and compatible zone for an allocation, starting with the local node.

An example of a `zone incompatibility` is attempting to service an allocation
marked :code:`GFP_KERNEL` from :code:`ZONE_MOVABLE`. Kernel allocations are
typically not migratable, and as a result can only be serviced from
:code:`ZONE_NORMAL` or lower.

Put simply, for allocations compatible with both zones, the page allocator
will prefer :code:`ZONE_MOVABLE` over :code:`ZONE_NORMAL` by default, but if
:code:`ZONE_MOVABLE` is depleted, it will fall back to allocating from
:code:`ZONE_NORMAL`.


CGroups and CPUSets
===================
Finally, assuming CXL memory is reachable via the page allocator (i.e. onlined
in :code:`ZONE_NORMAL`), the :code:`cpusets.mems_allowed` may be used by
containers to limit the accessibility of certain NUMA nodes for tasks in that
container. Users may wish to utilize this in multi-tenant systems where some
tasks prefer not to use slower memory.

In the reclaim section we'll discuss some limitations of this interface to
prevent demotions of shared data to CXL memory (if demotions are enabled).

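The zone preference described under `Memory Zones` can be illustrated with a
small toy model. This is only a sketch under simplifying assumptions, not
kernel code: it models a single NUMA node, ignores watermarks and reclaim, and
the names :code:`Zone`, :code:`highest_zone` and :code:`pick_zone` are
hypothetical helpers invented for this illustration.

.. code-block:: python

    from enum import IntEnum


    class Zone(IntEnum):
        """Toy zone ordering: a larger value means a higher zone."""
        DMA = 0
        DMA32 = 1
        NORMAL = 2
        MOVABLE = 3


    def highest_zone(movable):
        """Highest zone an allocation is compatible with.

        Movable allocations may use ZONE_MOVABLE; kernel-style
        (non-migratable) allocations are capped at ZONE_NORMAL.
        """
        return Zone.MOVABLE if movable else Zone.NORMAL


    def pick_zone(free_pages, movable, order=0):
        """Return the highest compatible zone with enough free pages.

        free_pages maps Zone -> free page count for one node.
        Returns None if no compatible zone can satisfy the request.
        """
        need = 1 << order          # pages required for this order
        limit = highest_zone(movable)
        for zone in sorted(free_pages, reverse=True):
            if zone > limit:
                continue           # zone incompatibility, skip it
            if free_pages[zone] >= need:
                return zone
        return None

In this model, a movable request is served from :code:`Zone.MOVABLE` while it
has free pages and falls back to :code:`Zone.NORMAL` once it is depleted,
whereas a non-movable request never lands in :code:`Zone.MOVABLE` even when
the lower zones are empty.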