.. SPDX-License-Identifier: GPL-2.0

=============================
DAMON-based LRU-lists Sorting
=============================

DAMON-based LRU-lists Sorting (DAMON_LRU_SORT) is a static kernel module that
aims to be used for proactive and lightweight data access pattern based
(de)prioritization of pages on their LRU-lists, making LRU-lists a more
trustworthy data access pattern source.

Where Proactive LRU-lists Sorting is Required?
==============================================

As page-granularity access checking overhead could be significant on huge
systems, LRU lists are normally not proactively sorted but partially and
reactively sorted for special events including specific user requests, system
calls and memory pressure.  As a result, LRU lists are sometimes not well
enough prepared to be used as a trustworthy access pattern source for some
situations, including selection of reclamation target pages under sudden
memory pressure.

Because DAMON can identify access patterns with best-effort accuracy while
inducing only a user-specified range of overhead, proactively running
DAMON_LRU_SORT could be helpful for making LRU lists a more trustworthy access
pattern source with low and controlled overhead.

How It Works?
=============

DAMON_LRU_SORT finds hot pages (pages of memory regions that show access
rates higher than a user-specified threshold) and cold pages (pages of
memory regions that show no access for a time longer than a user-specified
threshold) using DAMON, and prioritizes hot pages while deprioritizing cold
pages on their LRU-lists.  To avoid it consuming too much CPU for the
prioritizations, a CPU time usage limit can be configured.  Under the limit,
it prioritizes hotter pages and deprioritizes colder pages first.  System
administrators can also configure under what situation this scheme should be
automatically activated and deactivated with three memory pressure
watermarks.

Its default parameters for hotness/coldness thresholds and CPU quota limit are
conservatively chosen.  That is, the module under its default parameters could
be widely used without harm for common situations, providing a level of
benefit for systems having clear hot/cold access patterns under memory
pressure while consuming only a small, limited portion of CPU time.

Interface: Module Parameters
============================

To use this feature, you should first ensure your system is running on a kernel
that is built with ``CONFIG_DAMON_LRU_SORT=y``.

To let sysadmins enable or disable it and tune for the given system,
DAMON_LRU_SORT utilizes module parameters.  That is, you can put
``damon_lru_sort.<parameter>=<value>`` on the kernel boot command line or write
proper values to ``/sys/module/damon_lru_sort/parameters/<parameter>`` files.
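
As a minimal sketch of the runtime path, the shell helpers below read and write
a single parameter file.  The function names and the
``DAMON_LRU_SORT_PARAMS`` variable are hypothetical conveniences rather than
part of the kernel interface; the variable only lets the directory be
overridden and defaults to the sysfs path named above.  Writing normally
requires root.

```shell
#!/bin/sh
# Hypothetical helpers; only the sysfs path itself comes from the text above.
: "${DAMON_LRU_SORT_PARAMS:=/sys/module/damon_lru_sort/parameters}"

# Print the current value of one parameter.
lru_sort_get() {
    cat "$DAMON_LRU_SORT_PARAMS/$1"
}

# Write a new value to one parameter.
lru_sort_set() {
    printf '%s\n' "$2" > "$DAMON_LRU_SORT_PARAMS/$1"
}
```

For instance, ``lru_sort_set quota_ms 10`` writes ``10`` to the ``quota_ms``
file, and ``lru_sort_get quota_ms`` reads the file back.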

Below are the descriptions of each parameter.

enabled
-------

Enable or disable DAMON_LRU_SORT.

You can enable DAMON_LRU_SORT by setting the value of this parameter to ``Y``.
Setting it to ``N`` disables DAMON_LRU_SORT.  Note that, even when enabled,
DAMON_LRU_SORT could do no real monitoring and LRU-lists sorting due to the
watermarks-based activation condition.  Refer to the descriptions of the
watermarks parameters below for this.

commit_inputs
-------------

Make DAMON_LRU_SORT read the input parameters again, except ``enabled``.

Input parameters that are updated while DAMON_LRU_SORT is running are not
applied by default.  Once this parameter is set to ``Y``, DAMON_LRU_SORT reads
the values of the parameters except ``enabled`` again.  Once the re-reading is
done, this parameter is set to ``N``.  If invalid parameters are found during
the re-reading, DAMON_LRU_SORT will be disabled.
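
The update-then-commit sequence described above could be wrapped as in the
following sketch.  ``lru_sort_update`` and ``DAMON_LRU_SORT_PARAMS`` are
hypothetical names introduced only for illustration; the parameter file names
are the ones documented here.

```shell
#!/bin/sh
# Hypothetical helper; the directory variable defaults to the sysfs path.
: "${DAMON_LRU_SORT_PARAMS:=/sys/module/damon_lru_sort/parameters}"

# Usage: lru_sort_update PARAMETER VALUE
# Write one parameter, then ask DAMON_LRU_SORT to re-read its inputs.
lru_sort_update() {
    printf '%s\n' "$2" > "$DAMON_LRU_SORT_PARAMS/$1" || return 1
    printf 'Y\n' > "$DAMON_LRU_SORT_PARAMS/commit_inputs"
}
```

Because invalid parameters disable DAMON_LRU_SORT, it may be worth reading
``enabled`` again after a call such as ``lru_sort_update cold_min_age 60000000``.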

hot_thres_access_freq
---------------------

Access frequency threshold for hot memory regions identification in permil.

If a memory region is accessed at this frequency or higher, DAMON_LRU_SORT
identifies the region as hot, and marks it as accessed on the LRU list, so that
it is less likely to be reclaimed under memory pressure.  500 (50%) by default.

cold_min_age
------------

Time threshold for cold memory regions identification in microseconds.

If a memory region is not accessed for this time or longer, DAMON_LRU_SORT
identifies the region as cold, and marks it as unaccessed on the LRU list, so
that it could be reclaimed first under memory pressure.  120 seconds by
default.

quota_ms
--------

Limit of time for trying the LRU lists sorting in milliseconds.

DAMON_LRU_SORT tries to use only up to this time within a time window
(quota_reset_interval_ms) for trying LRU lists sorting.  This can be used
for limiting CPU consumption of DAMON_LRU_SORT.  If the value is zero, the
limit is disabled.

10 ms by default.

quota_reset_interval_ms
-----------------------

The time quota charge reset interval in milliseconds.

The charge reset interval for the quota of time (quota_ms).  That is,
DAMON_LRU_SORT does not try LRU-lists sorting for more than quota_ms
milliseconds within quota_reset_interval_ms milliseconds.

1 second by default.
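
To see how ``quota_ms`` and ``quota_reset_interval_ms`` combine, the sketch
below computes the largest share of time the two values allow.  The function
name is hypothetical and the calculation is plain arithmetic on the two
parameters, not something the module itself reports.

```shell
#!/bin/sh
# Usage: quota_cpu_percent QUOTA_MS QUOTA_RESET_INTERVAL_MS
# Prints the largest share of time, in percent, that the two quota
# parameters allow for LRU-lists sorting.  Integer division rounds down.
quota_cpu_percent() {
    quota_ms=$1
    reset_interval_ms=$2
    if [ "$quota_ms" -eq 0 ]; then
        # A zero quota_ms disables the limit.
        echo unlimited
    elif [ "$reset_interval_ms" -eq 0 ]; then
        # Avoid dividing by zero in this sketch.
        echo "reset interval must be non-zero" >&2
        return 1
    else
        echo $(( quota_ms * 100 / reset_interval_ms ))
    fi
}

quota_cpu_percent 10 1000    # the documented defaults; prints 1
```

With the defaults of 10 ms per 1 second, the limit works out to 1% of the time.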

wmarks_interval
---------------

The watermarks check time interval in microseconds.

Minimal time to wait before checking the watermarks, when DAMON_LRU_SORT is
enabled but inactive due to its watermarks rule.  5 seconds by default.

wmarks_high
-----------

Free memory rate (per thousand) for the high watermark.

If free memory of the system in bytes per thousand bytes is higher than this,
DAMON_LRU_SORT becomes inactive, so it does nothing but periodically check the
watermarks.  200 (20%) by default.

wmarks_mid
----------

Free memory rate (per thousand) for the middle watermark.

If free memory of the system in bytes per thousand bytes is between this and
the low watermark, DAMON_LRU_SORT becomes active, so it starts the monitoring
and the LRU-lists sorting.  150 (15%) by default.

wmarks_low
----------

Free memory rate (per thousand) for the low watermark.

If free memory of the system in bytes per thousand bytes is lower than this,
DAMON_LRU_SORT becomes inactive, so it does nothing but periodically check the
watermarks.  50 (5%) by default.
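
The three watermark rules above can be sketched as a small state function.
This is an illustrative reading of the text, not the kernel's implementation:
the function name is hypothetical, and keeping the previous state when the free
memory rate lies between the middle and high watermarks (including the exact
boundary values) is an assumption of this sketch, since the text only describes
the other ranges.

```shell
#!/bin/sh
# Usage: wmarks_next_state FREE HIGH MID LOW PREV
# FREE, HIGH, MID and LOW are free memory rates per thousand.
# PREV is the current state, "active" or "inactive".
wmarks_next_state() {
    free=$1 high=$2 mid=$3 low=$4 prev=$5
    if [ "$free" -gt "$high" ] || [ "$free" -lt "$low" ]; then
        # Above the high or below the low watermark: do nothing.
        echo inactive
    elif [ "$free" -lt "$mid" ]; then
        # Between the middle and low watermarks: start working.
        echo active
    else
        # Between the middle and high watermarks: assumed to keep the
        # previous state.
        echo "$prev"
    fi
}
```

With the default watermarks, ``wmarks_next_state 100 200 150 50 inactive``
prints ``active``, while ``wmarks_next_state 300 200 150 50 active`` prints
``inactive``.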

sample_interval
---------------

Sampling interval for the monitoring in microseconds.

The sampling interval of DAMON for the cold memory monitoring.  Please refer to
the DAMON documentation (:doc:`usage`) for more detail.  5 ms by default.

aggr_interval
-------------

Aggregation interval for the monitoring in microseconds.

The aggregation interval of DAMON for the cold memory monitoring.  Please
refer to the DAMON documentation (:doc:`usage`) for more detail.  100 ms by
default.

min_nr_regions
--------------

Minimum number of monitoring regions.

The minimal number of monitoring regions of DAMON for the cold memory
monitoring.  This can be used to set a lower bound on the monitoring quality.
However, setting this too high could result in increased monitoring overhead.
Please refer to the DAMON documentation (:doc:`usage`) for more detail.  10 by
default.

max_nr_regions
--------------

Maximum number of monitoring regions.

The maximum number of monitoring regions of DAMON for the cold memory
monitoring.  This can be used to set an upper bound on the monitoring overhead.
However, setting this too low could result in bad monitoring quality.  Please
refer to the DAMON documentation (:doc:`usage`) for more detail.  1000 by
default.

monitor_region_start
--------------------

Start of target memory region in physical address.

The start physical address of the memory region that DAMON_LRU_SORT will work
against.  By default, the biggest System RAM region is used.

monitor_region_end
------------------

End of target memory region in physical address.

The end physical address of the memory region that DAMON_LRU_SORT will work
against.  By default, the biggest System RAM region is used.

kdamond_pid
-----------

PID of the DAMON thread.

If DAMON_LRU_SORT is enabled, this becomes the PID of the worker thread.  Else,
-1.
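
A script can use this parameter to check whether a worker thread currently
exists, as in the sketch below.  ``lru_sort_has_worker`` and
``DAMON_LRU_SORT_PARAMS`` are hypothetical names; only the ``kdamond_pid`` file
and its ``-1`` value come from the description above.

```shell
#!/bin/sh
# Hypothetical helper; the directory variable defaults to the sysfs path.
: "${DAMON_LRU_SORT_PARAMS:=/sys/module/damon_lru_sort/parameters}"

# Succeeds (exit status 0) if kdamond_pid holds a real PID, i.e. not -1.
lru_sort_has_worker() {
    pid=$(cat "$DAMON_LRU_SORT_PARAMS/kdamond_pid") || return 1
    [ "$pid" != "-1" ]
}
```

It can be used in a condition, for example
``if lru_sort_has_worker; then echo running; fi``.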

nr_lru_sort_tried_hot_regions
-----------------------------

Number of hot memory regions that DAMON_LRU_SORT tried to LRU-sort.

bytes_lru_sort_tried_hot_regions
--------------------------------

Total bytes of hot memory regions that DAMON_LRU_SORT tried to LRU-sort.

nr_lru_sorted_hot_regions
-------------------------

Number of hot memory regions that were successfully LRU-sorted.

bytes_lru_sorted_hot_regions
----------------------------

Total bytes of hot memory regions that were successfully LRU-sorted.

nr_hot_quota_exceeds
--------------------

Number of times that the time quota limit for hot regions has been exceeded.

nr_lru_sort_tried_cold_regions
------------------------------

Number of cold memory regions that DAMON_LRU_SORT tried to LRU-sort.

bytes_lru_sort_tried_cold_regions
---------------------------------

Total bytes of cold memory regions that DAMON_LRU_SORT tried to LRU-sort.

nr_lru_sorted_cold_regions
--------------------------

Number of cold memory regions that were successfully LRU-sorted.

bytes_lru_sorted_cold_regions
-----------------------------

Total bytes of cold memory regions that were successfully LRU-sorted.

nr_cold_quota_exceeds
---------------------

Number of times that the time quota limit for cold regions has been exceeded.
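
One possible use of these counters is to compare the successfully sorted bytes
with the tried bytes.  The sketch below does so for either the hot or the cold
side; the function name and the ``DAMON_LRU_SORT_PARAMS`` variable are
hypothetical, and the ratio is merely one way to read the statistics, not a
metric defined by the module.

```shell
#!/bin/sh
# Hypothetical helper; the directory variable defaults to the sysfs path.
: "${DAMON_LRU_SORT_PARAMS:=/sys/module/damon_lru_sort/parameters}"

# Usage: lru_sort_success_permil hot|cold
# Prints successfully sorted bytes per thousand tried bytes, or "n/a" if
# nothing has been tried yet.  Assumes the counters fit in shell integers.
lru_sort_success_permil() {
    tried=$(cat "$DAMON_LRU_SORT_PARAMS/bytes_lru_sort_tried_${1}_regions") || return 1
    sorted=$(cat "$DAMON_LRU_SORT_PARAMS/bytes_lru_sorted_${1}_regions") || return 1
    if [ "$tried" -eq 0 ]; then
        echo n/a
    else
        echo $(( sorted * 1000 / tried ))
    fi
}
```

Calling ``lru_sort_success_permil hot`` and ``lru_sort_success_permil cold``
reports the two sides separately.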

Example
=======

The runtime example commands below make DAMON_LRU_SORT find memory regions
having >=50% access frequency and LRU-prioritize them, while LRU-deprioritizing
memory regions that are not accessed for 120 seconds.  The prioritization and
deprioritization are limited to using only up to 1% CPU time to avoid
DAMON_LRU_SORT consuming too much CPU time for the (de)prioritization.  They
also ask DAMON_LRU_SORT to do nothing if the system's free memory rate is more
than 50%, but start the real work if it becomes lower than 40%.  If
DAMON_RECLAIM doesn't make progress and therefore the free memory rate becomes
lower than 20%, they ask DAMON_LRU_SORT to do nothing again, so that we can
fall back to the LRU-list based page granularity reclamation. ::

    # cd /sys/module/damon_lru_sort/parameters
    # echo 500 > hot_thres_access_freq
    # echo 120000000 > cold_min_age
    # echo 10 > quota_ms
    # echo 1000 > quota_reset_interval_ms
    # echo 500 > wmarks_high
    # echo 400 > wmarks_mid
    # echo 200 > wmarks_low
    # echo Y > enabled
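
The same settings could, in principle, be given at boot time using the
``damon_lru_sort.<parameter>=<value>`` form described in the module parameters
section.  As a sketch, the hypothetical helper below turns name/value pairs
into such a command-line fragment; whether every parameter is appropriate to
set at boot on a given system is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: read "name value" pairs from stdin and print them
# as one line of damon_lru_sort.<parameter>=<value> boot options.
lru_sort_cmdline() {
    out=
    while read -r name value; do
        out="${out:+$out }damon_lru_sort.$name=$value"
    done
    printf '%s\n' "$out"
}

# The values used in the runtime example above.
lru_sort_cmdline <<'EOF'
hot_thres_access_freq 500
cold_min_age 120000000
quota_ms 10
quota_reset_interval_ms 1000
wmarks_high 500
wmarks_mid 400
wmarks_low 200
enabled Y
EOF
```

The printed line can then be appended to the kernel boot command line by
whatever means the boot loader in use provides.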