.. _slub:

==========================
Short users guide for SLUB
==========================

The basic philosophy of SLUB is very different from SLAB. SLAB
requires rebuilding the kernel to activate debug options for all
slab caches. SLUB always includes full debugging but it is off by default.
SLUB can enable debugging only for selected slabs in order to avoid
an impact on overall system performance which may make a bug more
difficult to find.

In order to switch debugging on one can add an option ``slub_debug``
to the kernel command line. That will enable full debugging for
all slabs.

Typically one would then use the ``slabinfo`` command to get statistical
data and perform operations on the slabs. By default ``slabinfo`` only lists
slabs that have data in them. See ``slabinfo -h`` for more options when
running the command. ``slabinfo`` can be compiled with
::

        gcc -o slabinfo tools/mm/slabinfo.c

Some of the modes of operation of ``slabinfo`` require that slub debugging
be enabled on the command line. For example, no tracking information will be
available without debugging on and validation can only partially
be performed if debugging was not switched on.
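
Before relying on a ``slabinfo`` mode that needs debugging, it can help to
confirm how the running kernel was booted. The following is a minimal sketch,
not part of the kernel tree: the helper name ``slub_debug_setting`` is
hypothetical, and it only inspects a command line string such as the contents
of ``/proc/cmdline``.

```python
from typing import Optional


def slub_debug_setting(cmdline: str) -> Optional[str]:
    """Return the slub_debug value found in a kernel command line.

    None  -> slub_debug was not given at all
    ""    -> bare "slub_debug" (full debugging for all slabs)
    other -> the text after "slub_debug="
    """
    setting = None
    for token in cmdline.split():
        if token == "slub_debug":
            setting = ""
        elif token.startswith("slub_debug="):
            setting = token[len("slub_debug="):]
    return setting
```

On a live system the argument would typically be read with
``open("/proc/cmdline").read()``. Note that a kernel configured with
CONFIG_SLUB_DEBUG_ON may have debugging enabled even when the helper returns
``None``.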

Some more sophisticated uses of slub_debug:
-------------------------------------------

Parameters may be given to ``slub_debug``. If none is specified then full
debugging is enabled. Format:

slub_debug=<Debug-Options>
        Enable options for all slabs

slub_debug=<Debug-Options>,<slab name1>,<slab name2>,...
        Enable options only for select slabs (no spaces
        after a comma)

Multiple blocks of options for all slabs or selected slabs can be given, with
blocks of options delimited by ';'. The last of the "all slabs" blocks is
applied to all slabs except those that match one of the "select slabs" blocks.
Options of the first "select slabs" block that matches the slab's name are
applied.

Possible debug options are::

        F               Sanity checks on (enables SLAB_DEBUG_CONSISTENCY_CHECKS
                        Sorry SLAB legacy issues)
        Z               Red zoning
        P               Poisoning (object and padding)
        U               User tracking (free and alloc)
        T               Trace (please only use on single slabs)
        A               Enable failslab filter mark for the cache
        O               Switch debugging off for caches that would have
                        caused higher minimum slab orders
        -               Switch all debugging off (useful if the kernel is
                        configured with CONFIG_SLUB_DEBUG_ON)

For example, in order to boot just with sanity checks and red zoning one would
specify::

        slub_debug=FZ

Trying to find an issue in the dentry cache? Try::

        slub_debug=,dentry

to only enable debugging on the dentry cache. You may use an asterisk at the
end of the slab name, in order to cover all slabs with the same prefix. For
example, here's how you can poison the dentry cache as well as all kmalloc
slabs::

        slub_debug=P,kmalloc-*,dentry

Red zoning and tracking may realign the slab. We can just apply sanity checks
to the dentry cache with::

        slub_debug=F,dentry

Debugging options may require the minimum possible slab order to increase as
a result of storing the metadata (for example, caches with PAGE_SIZE object
sizes). This has a higher likelihood of resulting in slab allocation errors
in low memory situations or if there's high fragmentation of memory. To
switch off debugging for such caches by default, use::

        slub_debug=O

You can apply different options to different lists of slab names, using blocks
of options. This will enable red zoning for dentry and user tracking for
kmalloc. All other slabs will not get any debugging enabled::

        slub_debug=Z,dentry;U,kmalloc-*

You can also enable options (e.g. sanity checks and red zoning) for all caches
except some that are deemed too performance critical and don't need to be
debugged by specifying global debug options followed by a list of slab names
with "-" as options::

        slub_debug=FZ;-,zs_handle,zspage

The state of each debug option for a slab can be found in the respective files
under::

        /sys/kernel/slab/<slab name>/

If the file contains 1, the option is enabled, 0 means disabled. The debug
options from the ``slub_debug`` parameter translate to the following files::

        F       sanity_checks
        Z       red_zone
        P       poison
        U       store_user
        T       trace
        A       failslab

The failslab file is writable, so writing 1 or 0 will enable or disable
the option at runtime. The write returns -EINVAL if the cache is an alias.

Careful with tracing: It may spew out lots of information and never stop if
used on the wrong slab.
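
The block rules described above (blocks separated by ';', the last "all slabs"
block as the default, the first matching "select slabs" block taking
precedence, and a trailing asterisk as a prefix wildcard) can be sketched as
follows. This is an illustrative model only, not the kernel's parser: the
names ``options_for`` and ``sysfs_files`` are hypothetical, and treating
"full debugging" as the letters ``FZPU`` is an assumption.

```python
from typing import List

# Assumption: an empty option list ("full debugging") is modelled as F, Z, P, U.
FULL_DEBUG = "FZPU"

# Option letter -> file under /sys/kernel/slab/<slab name>/ (from the table above).
SYSFS_FILES = {
    "F": "sanity_checks",
    "Z": "red_zone",
    "P": "poison",
    "U": "store_user",
    "T": "trace",
    "A": "failslab",
}


def _matches(pattern: str, name: str) -> bool:
    """Exact match, or prefix match when the pattern ends with an asterisk."""
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return pattern == name


def options_for(spec: str, slab: str) -> str:
    """Return the option letters that a slub_debug value selects for one slab."""
    global_opts = ""   # no "all slabs" block seen yet: no debugging
    selected = None    # options of the first matching "select slabs" block
    for block in spec.split(";"):
        opts, *names = block.split(",")
        if not names:
            global_opts = opts or FULL_DEBUG      # last "all slabs" block wins
        elif selected is None and any(_matches(p, slab) for p in names):
            selected = opts or FULL_DEBUG         # first matching block wins
    result = global_opts if selected is None else selected
    return "" if result == "-" else result


def sysfs_files(options: str) -> List[str]:
    """Map option letters to the sysfs file names that report their state."""
    return [SYSFS_FILES[c] for c in options if c in SYSFS_FILES]
```

For instance, with ``slub_debug=Z,dentry;U,kmalloc-*`` the sketch yields
``Z`` for ``dentry``, ``U`` for ``kmalloc-8`` and no options for any other
slab, which matches the behaviour described in the text.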

Slab merging
============

If no debug options are specified then SLUB may merge similar slabs together
in order to reduce overhead and increase cache hotness of objects.
``slabinfo -a`` displays which slabs were merged together.

Slab validation
===============

SLUB can validate all objects if the kernel was booted with slub_debug. In
order to do so you must have the ``slabinfo`` tool. Then you can do
::

        slabinfo -v

which will test all objects. Output will be generated to the syslog.

This also works in a more limited way if boot was without slab debug.
In that case ``slabinfo -v`` simply tests all reachable objects. Usually
these are in the cpu slabs and the partial slabs. Full slabs are not
tracked by SLUB in a non debug situation.

Getting more performance
========================

To some degree SLUB's performance is limited by the need to take the
list_lock once in a while to deal with partial slabs. That overhead is
governed by the order of the allocation for each slab. The allocations
can be influenced by kernel parameters:

.. slub_min_objects=x           (default 4)
.. slub_min_order=x             (default 0)
.. slub_max_order=x             (default 3 (PAGE_ALLOC_COSTLY_ORDER))

``slub_min_objects``
        specifies how many objects must at least fit into one
        slab in order for the allocation order to be acceptable. In
        general slub will be able to perform this number of
        allocations on a slab without consulting centralized resources
        (list_lock) where contention may occur.

``slub_min_order``
        specifies a minimum order of slabs. It has a similar effect to
        ``slub_min_objects``.

``slub_max_order``
        specifies the order at which ``slub_min_objects`` should no
        longer be checked. This is useful to avoid SLUB trying to
        generate super large order pages to fit ``slub_min_objects``
        of a slab cache with large object sizes into one high order
        page. Setting the command line parameter
        ``debug_guardpage_minorder=N`` (N > 0) forces
        ``slub_max_order`` to 0, which causes slabs to be allocated
        with the minimum possible order.

SLUB Debug output
=================

Here is a sample of slub debug output::

 ====================================================================
 BUG kmalloc-8: Right Redzone overwritten
 --------------------------------------------------------------------

 INFO: 0xc90f6d28-0xc90f6d2b. First byte 0x00 instead of 0xcc
 INFO: Slab 0xc528c530 flags=0x400000c3 inuse=61 fp=0xc90f6d58
 INFO: Object 0xc90f6d20 @offset=3360 fp=0xc90f6d58
 INFO: Allocated in get_modalias+0x61/0xf5 age=53 cpu=1 pid=554

 Bytes b4 (0xc90f6d10): 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
 Object   (0xc90f6d20): 31 30 31 39 2e 30 30 35                         1019.005
 Redzone  (0xc90f6d28): 00 cc cc cc                                     .
 Padding  (0xc90f6d50): 5a 5a 5a 5a 5a 5a 5a 5a                         ZZZZZZZZ

   [<c010523d>] dump_trace+0x63/0x1eb
   [<c01053df>] show_trace_log_lvl+0x1a/0x2f
   [<c010601d>] show_trace+0x12/0x14
   [<c0106035>] dump_stack+0x16/0x18
   [<c017e0fa>] object_err+0x143/0x14b
   [<c017e2cc>] check_object+0x66/0x234
   [<c017eb43>] __slab_free+0x239/0x384
   [<c017f446>] kfree+0xa6/0xc6
   [<c02e2335>] get_modalias+0xb9/0xf5
   [<c02e23b7>] dmi_dev_uevent+0x27/0x3c
   [<c027866a>] dev_uevent+0x1ad/0x1da
   [<c0205024>] kobject_uevent_env+0x20a/0x45b
   [<c020527f>] kobject_uevent+0xa/0xf
   [<c02779f1>] store_uevent+0x4f/0x58
   [<c027758e>] dev_attr_store+0x29/0x2f
   [<c01bec4f>] sysfs_write_file+0x16e/0x19c
   [<c0183ba7>] vfs_write+0xd1/0x15a
   [<c01841d7>] sys_write+0x3d/0x72
   [<c0104112>] sysenter_past_esp+0x5f/0x99
   [<b7f7b410>] 0xb7f7b410
   =======================

 FIX kmalloc-8: Restoring Redzone 0xc90f6d28-0xc90f6d2b=0xcc

If SLUB encounters a corrupted object (full detection requires the kernel
to be booted with slub_debug) then the following output will be dumped
into the syslog:

1. Description of the problem encountered

   This will be a message in the system log starting with::

     ===============================================
     BUG <slab cache affected>: <What went wrong>
     -----------------------------------------------

     INFO: <corruption start>-<corruption_end> <more info>
     INFO: Slab <address> <slab information>
     INFO: Object <address> <object information>
     INFO: Allocated in <kernel function> age=<jiffies since alloc> cpu=<allocated by
        cpu> pid=<pid of the process>
     INFO: Freed in <kernel function> age=<jiffies since free> cpu=<freed by cpu>
        pid=<pid of the process>

   (Object allocation / free information is only available if SLAB_STORE_USER is
   set for the slab. slub_debug sets that option)

2. The object contents if an object was involved.

   Various types of lines can follow the BUG SLUB line:

   Bytes b4 <address> : <bytes>
        Shows a few bytes before the object where the problem was detected.
        Can be useful if the corruption does not stop with the start of the
        object.

   Object <address> : <bytes>
        The bytes of the object. If the object is inactive then the bytes
        typically contain poison values. Any non-poison value shows a
        corruption by a write after free.

   Redzone <address> : <bytes>
        The Redzone following the object. The Redzone is used to detect
        writes after the object. All bytes should always have the same
        value. If there is any deviation then it is due to a write after
        the object boundary.

        (Redzone information is only available if SLAB_RED_ZONE is set.
        slub_debug sets that option)

   Padding <address> : <bytes>
        Unused data to fill up the space in order to get the next object
        properly aligned. In the debug case we make sure that there are
        at least 4 bytes of padding. This allows the detection of writes
        before the object.

3. A stackdump

   The stackdump describes the location where the error was detected. The cause
   of the corruption is more likely to be found by looking at the function that
   allocated or freed the object.

4. Report on how the problem was dealt with in order to ensure the continued
   operation of the system.

   These are messages in the system log beginning with::

        FIX <slab cache affected>: <corrective action taken>

   In the above sample SLUB found that the Redzone of an active object has
   been overwritten. Here a string of 8 characters was written into a slab that
   has the length of 8 characters. However, an 8 character string needs a
   terminating 0. That zero has overwritten the first byte of the Redzone field.
   After reporting the details of the issue encountered the FIX SLUB message
   tells us that SLUB has restored the Redzone to its proper value and then
   system operations continue.

Emergency operations
====================

Minimal debugging (sanity checks alone) can be enabled by booting with::

        slub_debug=F

This will generally be enough to enable the resiliency features of slub
which will keep the system running even if a bad kernel component keeps
corrupting objects. This may be important for production systems.
Performance will be impacted by the sanity checks and there will be a
continual stream of error messages to the syslog but no additional memory
will be used (unlike full debugging).

No guarantees. The kernel component still needs to be fixed.
Performance may be optimized further by locating the slab that experiences
corruption and enabling debugging only for that cache.

For example::

        slub_debug=F,dentry

If the corruption occurs by writing after the end of the object then it
may be advisable to enable a Redzone to avoid corrupting the beginning
of other objects::

        slub_debug=FZ,dentry

Extended slabinfo mode and plotting
===================================

The ``slabinfo`` tool has a special 'extended' ('-X') mode that includes:
 - Slabcache Totals
 - Slabs sorted by size (up to -N <num> slabs, default 1)
 - Slabs sorted by loss (up to -N <num> slabs, default 1)

Additionally, in this mode ``slabinfo`` does not dynamically scale
sizes (G/M/K) and reports everything in bytes (this functionality is
also available to other slabinfo modes via the '-B' option), which makes
reporting more precise and accurate. Moreover, in some sense the '-X'
mode also simplifies the analysis of slabs' behaviour, because its
output can be plotted using the ``slabinfo-gnuplot.sh`` script. So it
pushes the analysis from looking through the numbers (tons of numbers)
to something easier -- visual analysis.

To generate plots:

a) collect slabinfo extended records, for example::

        while [ 1 ]; do slabinfo -X >> FOO_STATS; sleep 1; done

b) pass stats file(-s) to ``slabinfo-gnuplot.sh`` script::

        slabinfo-gnuplot.sh FOO_STATS [FOO_STATS2 .. FOO_STATSN]

   The ``slabinfo-gnuplot.sh`` script will pre-process the collected records
   and generate 3 png files (and 3 pre-processing cache files) per STATS
   file:
   - Slabcache Totals: FOO_STATS-totals.png
   - Slabs sorted by size: FOO_STATS-slabs-by-size.png
   - Slabs sorted by loss: FOO_STATS-slabs-by-loss.png

Another use case for ``slabinfo-gnuplot.sh`` is when you
need to compare slabs' behaviour "prior to" and "after" some code
modification. To help you out there, the ``slabinfo-gnuplot.sh`` script
can 'merge' the `Slabcache Totals` sections from different
measurements. To visually compare N plots:

a) Collect as many STATS1, STATS2, .. STATSN files as you need::

        while [ 1 ]; do slabinfo -X >> STATS<X>; sleep 1; done

b) Pre-process those STATS files::

        slabinfo-gnuplot.sh STATS1 STATS2 .. STATSN

c) Execute ``slabinfo-gnuplot.sh`` in '-t' mode, passing all of the
   generated pre-processed \*-totals::

        slabinfo-gnuplot.sh -t STATS1-totals STATS2-totals .. STATSN-totals

   This will produce a single plot (png file).

   Plots, expectedly, can be large so some fluctuations or small spikes
   can go unnoticed. To deal with that, ``slabinfo-gnuplot.sh`` has two
   options to 'zoom-in'/'zoom-out':

   a) ``-s %d,%d`` -- overwrites the default image width and height
   b) ``-r %d,%d`` -- specifies a range of samples to use (for example,
      in the ``slabinfo -X >> FOO_STATS; sleep 1;`` case, using a ``-r
      40,60`` range will plot only samples collected between the 40th and
      60th seconds).


DebugFS files for SLUB
======================

For more information about the current state of SLUB caches with the user
tracking debug option enabled, debugfs files are available, typically under
/sys/kernel/debug/slab/<cache>/ (created only for caches with enabled user
tracking). There are 2 types of these files with the following debug
information:

1. alloc_traces::

    Prints information about unique allocation traces of the currently
    allocated objects. The output is sorted by frequency of each trace.

    Information in the output:
    Number of objects, allocating function, possible memory wastage of
    kmalloc objects(total/per-object), minimal/average/maximal jiffies
    since alloc, pid range of the allocating processes, cpu mask of
    allocating cpus, numa node mask of origins of memory, and stack trace.

    Example:::

    338 pci_alloc_dev+0x2c/0xa0 waste=521872/1544 age=290837/291891/293509 pid=1 cpus=106 nodes=0-1
        __kmem_cache_alloc_node+0x11f/0x4e0
        kmalloc_trace+0x26/0xa0
        pci_alloc_dev+0x2c/0xa0
        pci_scan_single_device+0xd2/0x150
        pci_scan_slot+0xf7/0x2d0
        pci_scan_child_bus_extend+0x4e/0x360
        acpi_pci_root_create+0x32e/0x3b0
        pci_acpi_scan_root+0x2b9/0x2d0
        acpi_pci_root_add.cold.11+0x110/0xb0a
        acpi_bus_attach+0x262/0x3f0
        device_for_each_child+0xb7/0x110
        acpi_dev_for_each_child+0x77/0xa0
        acpi_bus_attach+0x108/0x3f0
        device_for_each_child+0xb7/0x110
        acpi_dev_for_each_child+0x77/0xa0
        acpi_bus_attach+0x108/0x3f0

2. free_traces::

    Prints information about unique freeing traces of the currently allocated
    objects. The freeing traces thus come from the previous life-cycle of the
    objects and are reported as not available for objects allocated for the first
    time. The output is sorted by frequency of each trace.

    Information in the output:
    Number of objects, freeing function, minimal/average/maximal jiffies since free,
    pid range of the freeing processes, cpu mask of freeing cpus, and stack trace.

    Example:::

    1980 <not-available> age=4294912290 pid=0 cpus=0
    51 acpi_ut_update_ref_count+0x6a6/0x782 age=236886/237027/237772 pid=1 cpus=1
        kfree+0x2db/0x420
        acpi_ut_update_ref_count+0x6a6/0x782
        acpi_ut_update_object_reference+0x1ad/0x234
        acpi_ut_remove_reference+0x7d/0x84
        acpi_rs_get_prt_method_data+0x97/0xd6
        acpi_get_irq_routing_table+0x82/0xc4
        acpi_pci_irq_find_prt_entry+0x8e/0x2e0
        acpi_pci_irq_lookup+0x3a/0x1e0
        acpi_pci_irq_enable+0x77/0x240
        pcibios_enable_device+0x39/0x40
        do_pci_enable_device.part.0+0x5d/0xe0
        pci_enable_device_flags+0xfc/0x120
        pci_enable_device+0x13/0x20
        virtio_pci_probe+0x9e/0x170
        local_pci_probe+0x48/0x80
        pci_device_probe+0x105/0x1c0

Christoph Lameter, May 30, 2007
Sergey Senozhatsky, October 23, 2015