.. SPDX-License-Identifier: GPL-2.0

=============================
DAMON-based LRU-lists Sorting
=============================

DAMON-based LRU-lists Sorting (DAMON_LRU_SORT) is a static kernel module
aimed at proactive and lightweight data access pattern based
(de)prioritization of pages on their LRU-lists, to make the LRU-lists a more
trustworthy data access pattern source.

Where Proactive LRU-lists Sorting is Required?
==============================================

As page-granularity access checking overhead could be significant on huge
systems, LRU lists are normally not proactively sorted but partially and
reactively sorted for special events including specific user requests, system
calls and memory pressure. As a result, LRU lists are sometimes not prepared
well enough to serve as a trustworthy access pattern source for some
situations, including selection of reclamation target pages under sudden
memory pressure.

Because DAMON can identify access patterns with best-effort accuracy while
inducing only a user-specified range of overhead, proactively running
DAMON_LRU_SORT could help make LRU lists a more trustworthy access pattern
source with low and controlled overhead.

How It Works?
=============

DAMON_LRU_SORT finds hot pages (pages of memory regions that show access
rates higher than a user-specified threshold) and cold pages (pages of
memory regions that show no access for a time longer than a user-specified
threshold) using DAMON, and prioritizes hot pages while deprioritizing cold
pages on their LRU-lists. To avoid consuming too much CPU for the
prioritizations, a CPU time usage limit can be configured. Under the limit,
it prioritizes hotter pages and deprioritizes colder pages first. System
administrators can also configure under what situation this scheme should be
automatically activated and deactivated with three memory pressure
watermarks.

Its default parameters for hotness/coldness thresholds and CPU quota limit are
conservatively chosen. That is, the module under its default parameters could
be widely used without harm for common situations, while providing a level of
benefit for systems having clear hot/cold access patterns under memory
pressure and consuming only a small, limited portion of CPU time.

Interface: Module Parameters
============================

To use this feature, you should first ensure your system is running on a kernel
that is built with ``CONFIG_DAMON_LRU_SORT=y``.
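
As a quick check, the kernel build configuration can be searched for the
option. The following is only a minimal sketch: the function name
``has_damon_lru_sort`` is made up for illustration, and the config file
locations mentioned in the comments are common conventions that vary by
distribution, not something this document specifies.

```shell
# Return success if the given kernel config file enables DAMON_LRU_SORT.
# The function name is hypothetical; only the option name comes from this
# document.
has_damon_lru_sort() {
    grep -qx 'CONFIG_DAMON_LRU_SORT=y' "$1"
}

# Possible invocations (paths are assumptions and may not exist everywhere):
#   has_damon_lru_sort "/boot/config-$(uname -r)" && echo "built in"
#   zcat /proc/config.gz > /tmp/kconfig && has_damon_lru_sort /tmp/kconfig
```

The match is anchored to the whole line so that a commented-out
``# CONFIG_DAMON_LRU_SORT is not set`` entry is not mistaken for an enabled
option.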

To let sysadmins enable or disable it and tune it for the given system,
DAMON_LRU_SORT utilizes module parameters. That is, you can put
``damon_lru_sort.<parameter>=<value>`` on the kernel boot command line or write
proper values to ``/sys/module/damon_lru_sort/parameters/<parameter>`` files.

Below is the description of each parameter.

enabled
-------

Enable or disable DAMON_LRU_SORT.

You can enable DAMON_LRU_SORT by setting the value of this parameter to ``Y``.
Setting it to ``N`` disables DAMON_LRU_SORT. Note that, even when enabled,
DAMON_LRU_SORT could do no real monitoring and LRU-lists sorting due to the
watermarks-based activation condition. Refer to the descriptions of the
watermarks parameters below for this.

commit_inputs
-------------

Make DAMON_LRU_SORT read the input parameters again, except ``enabled``.

Input parameters that are updated while DAMON_LRU_SORT is running are not
applied by default. Once this parameter is set to ``Y``, DAMON_LRU_SORT reads
the values of the parameters except ``enabled`` again. Once the re-reading is
done, this parameter is set to ``N``. If invalid parameters are found during
the re-reading, DAMON_LRU_SORT will be disabled.
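
A runtime update along these lines could be scripted as below. This is only a
sketch under assumptions: ``lru_sort_update`` and the ``PARAMS_DIR`` variable
are hypothetical names, not part of the kernel interface, and writing to the
real parameter files requires root.

```shell
# Directory holding the module parameters. PARAMS_DIR is a hypothetical
# variable that allows dry runs against a scratch directory.
PARAMS_DIR=${PARAMS_DIR:-/sys/module/damon_lru_sort/parameters}

# Write one or more NAME=VALUE pairs, then ask DAMON_LRU_SORT to re-read
# its inputs by setting commit_inputs to Y.
lru_sort_update() {
    for pair in "$@"; do
        echo "${pair#*=}" > "$PARAMS_DIR/${pair%%=*}" || return 1
    done
    echo Y > "$PARAMS_DIR/commit_inputs"
}
```

For example, ``lru_sort_update quota_ms=20 cold_min_age=60000000`` (values
chosen arbitrarily) writes both files and then commits them. Because invalid
parameters disable DAMON_LRU_SORT, reading ``enabled`` back afterwards is a
reasonable sanity check.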

hot_thres_access_freq
---------------------

Access frequency threshold for hot memory regions identification in permil.

If a memory region is accessed at this frequency or higher, DAMON_LRU_SORT
identifies the region as hot and marks it as accessed on the LRU list, so that
it is less likely to be reclaimed under memory pressure. 50% by default.

cold_min_age
------------

Time threshold for cold memory regions identification in microseconds.

If a memory region is not accessed for this time or longer, DAMON_LRU_SORT
identifies the region as cold and marks it as unaccessed on the LRU list, so
that it could be reclaimed first under memory pressure. 120 seconds by
default.

quota_ms
--------

Limit of time for trying the LRU lists sorting in milliseconds.

DAMON_LRU_SORT tries to use only up to this time within a time window
(quota_reset_interval_ms) for trying LRU lists sorting. This can be used
for limiting CPU consumption of DAMON_LRU_SORT. If the value is zero, the
limit is disabled.

10 ms by default.
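
As a rough illustration of what the quota means, the ratio of quota_ms to
quota_reset_interval_ms bounds the share of time spent on sorting. The helper
name below is hypothetical, and the 1000 ms window is the default of
quota_reset_interval_ms described later in this document.

```shell
# Upper bound, in percent, of the time share implied by the time quota.
# usage: quota_cpu_percent QUOTA_MS RESET_INTERVAL_MS
# Integer arithmetic is used, so the result is rounded down.
quota_cpu_percent() {
    echo $(( $1 * 100 / $2 ))
}

quota_cpu_percent 10 1000   # defaults: 10 ms within each 1000 ms window
```

With the defaults this prints ``1``, that is, at most about 1% of the time is
spent on LRU lists sorting.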

quota_reset_interval_ms
-----------------------

The time quota charge reset interval in milliseconds.

The charge reset interval for the quota of time (quota_ms). That is,
DAMON_LRU_SORT does not try LRU-lists sorting for more than quota_ms
milliseconds within quota_reset_interval_ms milliseconds.

1 second by default.

wmarks_interval
---------------

The watermarks check time interval in microseconds.

Minimal time to wait before checking the watermarks, when DAMON_LRU_SORT is
enabled but inactive due to its watermarks rule. 5 seconds by default.

wmarks_high
-----------

Free memory rate (per thousand) for the high watermark.

If free memory of the system in bytes per thousand bytes is higher than this,
DAMON_LRU_SORT becomes inactive, so it does nothing but periodically check the
watermarks. 200 (20%) by default.

wmarks_mid
----------

Free memory rate (per thousand) for the middle watermark.

If free memory of the system in bytes per thousand bytes is between this and
the low watermark, DAMON_LRU_SORT becomes active, so it starts the monitoring
and the LRU-lists sorting. 150 (15%) by default.

wmarks_low
----------

Free memory rate (per thousand) for the low watermark.

If free memory of the system in bytes per thousand bytes is lower than this,
DAMON_LRU_SORT becomes inactive, so it does nothing but periodically check the
watermarks. 50 (5%) by default.

sample_interval
---------------

Sampling interval for the monitoring in microseconds.

The sampling interval of DAMON for the cold memory monitoring. Please refer to
the DAMON documentation (:doc:`usage`) for more detail. 5ms by default.

aggr_interval
-------------

Aggregation interval for the monitoring in microseconds.

The aggregation interval of DAMON for the cold memory monitoring. Please
refer to the DAMON documentation (:doc:`usage`) for more detail. 100ms by
default.

min_nr_regions
--------------

Minimum number of monitoring regions.

The minimal number of monitoring regions of DAMON for the cold memory
monitoring. This can be used to set a lower bound on the monitoring quality.
However, setting this too high could result in increased monitoring overhead.
Please refer to the DAMON documentation (:doc:`usage`) for more detail. 10 by
default.

max_nr_regions
--------------

Maximum number of monitoring regions.

The maximum number of monitoring regions of DAMON for the cold memory
monitoring. This can be used to set an upper bound on the monitoring overhead.
However, setting this too low could result in bad monitoring quality. Please
refer to the DAMON documentation (:doc:`usage`) for more detail. 1000 by
default.

monitor_region_start
--------------------

Start of target memory region in physical address.

The start physical address of the memory region that DAMON_LRU_SORT will work
against. By default, the biggest System RAM region is used.

monitor_region_end
------------------

End of target memory region in physical address.

The end physical address of the memory region that DAMON_LRU_SORT will work
against. By default, the biggest System RAM region is used.
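
To get an idea of which range the default would likely cover on a given
machine, ``/proc/iomem`` can be inspected. The sketch below assumes that the
default corresponds to the largest top-level ``System RAM`` entry listed
there; this document does not spell that out, and the function name is
hypothetical. Note that ``/proc/iomem`` shows zeroed addresses to
unprivileged users, so it should be read as root. Whether the end address
should be written to ``monitor_region_end`` as shown or adjusted by one is
also not specified here.

```shell
# Print the largest top-level "System RAM" range found in
# /proc/iomem-style input on stdin, as START-END in hexadecimal.
biggest_system_ram() {
    best_size=0
    best_range=""
    while IFS= read -r line; do
        case "$line" in
            [0-9a-f]*" : System RAM") ;;   # top-level System RAM entry
            *) continue ;;                 # nested or unrelated resource
        esac
        range=${line%% *}                  # e.g. 00100000-7fffffff
        start=$((0x${range%-*}))
        end=$((0x${range#*-}))
        size=$((end - start + 1))
        if [ "$size" -gt "$best_size" ]; then
            best_size=$size
            best_range=$range
        fi
    done
    [ -n "$best_range" ] && echo "$best_range"
}

# Possible invocation (as root):
#   biggest_system_ram < /proc/iomem
```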

kdamond_pid
-----------

PID of the DAMON thread.

If DAMON_LRU_SORT is enabled, this becomes the PID of the worker thread. Else,
-1.

nr_lru_sort_tried_hot_regions
-----------------------------

Number of hot memory regions for which LRU-sorting was tried.

bytes_lru_sort_tried_hot_regions
--------------------------------

Total bytes of hot memory regions for which LRU-sorting was tried.

nr_lru_sorted_hot_regions
-------------------------

Number of hot memory regions that were successfully LRU-sorted.

bytes_lru_sorted_hot_regions
----------------------------

Total bytes of hot memory regions that were successfully LRU-sorted.

nr_hot_quota_exceeds
--------------------

Number of times that the time quota limit for hot regions has been exceeded.

nr_lru_sort_tried_cold_regions
------------------------------

Number of cold memory regions for which LRU-sorting was tried.

bytes_lru_sort_tried_cold_regions
---------------------------------

Total bytes of cold memory regions for which LRU-sorting was tried.

nr_lru_sorted_cold_regions
--------------------------

Number of cold memory regions that were successfully LRU-sorted.

bytes_lru_sorted_cold_regions
-----------------------------

Total bytes of cold memory regions that were successfully LRU-sorted.

nr_cold_quota_exceeds
---------------------

Number of times that the time quota limit for cold regions has been exceeded.

Example
=======

The runtime example commands below make DAMON_LRU_SORT find memory regions
having >=50% access frequency and LRU-prioritize them, while LRU-deprioritizing
memory regions that are not accessed for 120 seconds. The prioritization and
deprioritization are limited to use only up to 1% CPU time, to avoid
DAMON_LRU_SORT consuming too much CPU time for the (de)prioritization. The
commands also ask DAMON_LRU_SORT to do nothing if the system's free memory
rate is more than 50%, but to start the real work if it becomes lower than 40%.
If DAMON_RECLAIM doesn't make progress and therefore the free memory rate
becomes lower than 20%, they ask DAMON_LRU_SORT to do nothing again, so that
we can fall back to the LRU-list based page granularity reclamation. ::

    # cd /sys/module/damon_lru_sort/parameters
    # echo 500 > hot_thres_access_freq
    # echo 120000000 > cold_min_age
    # echo 10 > quota_ms
    # echo 1000 > quota_reset_interval_ms
    # echo 500 > wmarks_high
    # echo 400 > wmarks_mid
    # echo 200 > wmarks_low
    # echo Y > enabled
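
The watermark values used above can be compared against the current free
memory with a small helper such as the following. This is a sketch under
several assumptions: both function names are hypothetical, ``MemFree`` over
``MemTotal`` from ``/proc/meminfo`` is used only as an approximation of the
free memory rate the kernel computes, a value exactly at the middle watermark
is treated as active, and treating the range between the middle and high
watermarks as "state unchanged" follows the general DAMON watermarks behavior
rather than anything stated in this document.

```shell
# Print the free memory rate in permil from /proc/meminfo-style input.
free_mem_permil() {
    awk '/^MemTotal:/ { total = $2 }
         /^MemFree:/  { free = $2 }
         END { if (total > 0) printf "%d\n", free * 1000 / total }'
}

# Classify a free memory rate against the three watermarks (all in permil).
# usage: wmarks_state FREE HIGH MID LOW
wmarks_state() {
    if [ "$1" -gt "$2" ] || [ "$1" -lt "$4" ]; then
        echo inactive     # above high or below low watermark
    elif [ "$1" -le "$3" ]; then
        echo active       # between the low and middle watermarks
    else
        echo unchanged    # between middle and high: assumed to keep state
    fi
}

# Possible invocation with the example's watermarks:
#   wmarks_state "$(free_mem_permil < /proc/meminfo)" 500 400 200
```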