.. _slub:

==========================
Short users guide for SLUB
==========================

The basic philosophy of SLUB is very different from SLAB. SLAB
requires rebuilding the kernel to activate debug options for all
slab caches. SLUB always includes full debugging but it is off by default.
SLUB can enable debugging only for selected slabs in order to avoid
an impact on overall system performance which may make a bug more
difficult to find.

In order to switch debugging on one can add an option ``slub_debug``
to the kernel command line. That will enable full debugging for
all slabs.

Typically one would then use the ``slabinfo`` command to get statistical
data and perform operations on the slabs. By default ``slabinfo`` only lists
slabs that have data in them. See ``slabinfo -h`` for more options when
running the command. ``slabinfo`` can be compiled with
::

    gcc -o slabinfo tools/vm/slabinfo.c

Some of the modes of operation of ``slabinfo`` require that slub debugging
be enabled on the command line. For example, no tracking information will be
available without debugging on and validation can only partially
be performed if debugging was not switched on.

Some more sophisticated uses of slub_debug:
-------------------------------------------

Parameters may be given to ``slub_debug``. If none is specified then full
debugging is enabled. Format:

slub_debug=<Debug-Options>
    Enable options for all slabs

slub_debug=<Debug-Options>,<slab name1>,<slab name2>,...
    Enable options only for select slabs (no spaces
    after a comma)

Multiple blocks of options for all slabs or selected slabs can be given, with
blocks of options delimited by ';'. The last of the "all slabs" blocks is applied
to all slabs except those that match one of the "select slabs" blocks. Options
of the first "select slabs" block that matches the slab's name are applied.
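
The block rules above can be modelled in a few lines of Python. This is only
an illustrative sketch of the rules as worded in this document, not the
kernel's parser. The names ``parse_blocks``, ``pattern_matches``, ``expand``
and ``options_for`` are hypothetical, and representing "full debugging" as the
letters ``FZPU`` is an assumption made for the example. The sketch also applies
the trailing-asterisk prefix matching and the ``-`` option that this section
describes further down.

```python
FULL_DEBUG = "FZPU"  # assumption: stand-in for "full debugging"


def parse_blocks(value):
    """Split a slub_debug value into (options, slab_patterns) blocks."""
    blocks = []
    for block in value.split(";"):
        options, *patterns = block.split(",")
        blocks.append((options, patterns))
    return blocks


def pattern_matches(pattern, slab):
    """A trailing asterisk matches every slab sharing the prefix."""
    if pattern.endswith("*"):
        return slab.startswith(pattern[:-1])
    return pattern == slab


def expand(options):
    """Empty options mean full debugging; '-' switches debugging off."""
    if options == "":
        return FULL_DEBUG
    if options == "-":
        return ""
    return options


def options_for(value, slab):
    """Return the option letters that would apply to one slab."""
    blocks = parse_blocks(value)
    # The first "select slabs" block naming this slab wins.
    for options, patterns in blocks:
        if any(pattern_matches(p, slab) for p in patterns):
            return expand(options)
    # Otherwise the last "all slabs" block applies, if there is one.
    global_blocks = [options for options, patterns in blocks if not patterns]
    if global_blocks:
        return expand(global_blocks[-1])
    return ""


print(options_for("Z,dentry;U,kmalloc-*", "kmalloc-64"))
```

An empty result means that no debugging would be enabled for that slab under
this model, which is the case for any slab not named in
``Z,dentry;U,kmalloc-*``.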

Possible debug options are::

    F       Sanity checks on (enables SLAB_DEBUG_CONSISTENCY_CHECKS
            Sorry SLAB legacy issues)
    Z       Red zoning
    P       Poisoning (object and padding)
    U       User tracking (free and alloc)
    T       Trace (please only use on single slabs)
    A       Enable failslab filter mark for the cache
    O       Switch debugging off for caches that would have
            caused higher minimum slab orders
    -       Switch all debugging off (useful if the kernel is
            configured with CONFIG_SLUB_DEBUG_ON)

For example, in order to boot just with sanity checks and red zoning one would
specify::

    slub_debug=FZ

Trying to find an issue in the dentry cache? Try::

    slub_debug=,dentry

to only enable debugging on the dentry cache. You may use an asterisk at the
end of the slab name, in order to cover all slabs with the same prefix. For
example, here's how you can poison the dentry cache as well as all kmalloc
slabs::

    slub_debug=P,kmalloc-*,dentry

Red zoning and tracking may realign the slab. We can just apply sanity checks
to the dentry cache with::

    slub_debug=F,dentry

Debugging options may require the minimum possible slab order to increase as
a result of storing the metadata (for example, caches with PAGE_SIZE object
sizes). This has a higher likelihood of resulting in slab allocation errors
in low memory situations or if there's high fragmentation of memory. To
switch off debugging for such caches by default, use::

    slub_debug=O

You can apply different options to different lists of slab names, using blocks
of options. This will enable red zoning for dentry and user tracking for
kmalloc. All other slabs will not get any debugging enabled::

    slub_debug=Z,dentry;U,kmalloc-*

You can also enable options (e.g. sanity checks and red zoning) for all caches
except some that are deemed too performance critical and don't need to be
debugged by specifying global debug options followed by a list of slab names
with "-" as options::

    slub_debug=FZ;-,zs_handle,zspage

The state of each debug option for a slab can be found in the respective files
under::

    /sys/kernel/slab/<slab name>/

If the file contains 1, the option is enabled, 0 means disabled. The debug
options from the ``slub_debug`` parameter translate to the following files::

    F       sanity_checks
    Z       red_zone
    P       poison
    U       store_user
    T       trace
    A       failslab

Careful with tracing: It may spew out lots of information and never stop if
used on the wrong slab.

Slab merging
============

If no debug options are specified then SLUB may merge similar slabs together
in order to reduce overhead and increase cache hotness of objects.
``slabinfo -a`` displays which slabs were merged together.

Slab validation
===============

SLUB can validate all objects if the kernel was booted with slub_debug. In
order to do so you must have the ``slabinfo`` tool. Then you can do
::

    slabinfo -v

which will test all objects. Output will be generated to the syslog.

This also works in a more limited way if the kernel was booted without slab
debugging. In that case ``slabinfo -v`` simply tests all reachable objects.
Usually these are in the cpu slabs and the partial slabs. Full slabs are not
tracked by SLUB in a non debug situation.
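
The per-cache files under ``/sys/kernel/slab/<slab name>/`` described earlier
can also be read from a script, for instance to confirm which options are
active for a cache before running ``slabinfo -v``. The following Python sketch
maps those files back to their ``slub_debug`` letters. The helper name
``enabled_options`` and its ``root`` parameter are hypothetical, and the code
assumes each file holds ``1`` or ``0`` as stated above. Treating a missing or
unreadable file as "disabled" is a simplification made for the example.

```python
import os

# Option letter -> sysfs file name, as listed in this document.
OPTION_FILES = {
    "F": "sanity_checks",
    "Z": "red_zone",
    "P": "poison",
    "U": "store_user",
    "T": "trace",
    "A": "failslab",
}


def enabled_options(slab, root="/sys/kernel/slab"):
    """Return the slub_debug letters whose files contain 1 for this slab."""
    letters = ""
    for letter, filename in OPTION_FILES.items():
        path = os.path.join(root, slab, filename)
        try:
            with open(path) as f:
                value = f.read().strip()
        except OSError:
            continue  # assumption: missing or unreadable counts as disabled
        if value == "1":
            letters += letter
    return letters


if __name__ == "__main__":
    # Prints an empty line if the cache directory does not exist.
    print(enabled_options("dentry"))
```

Passing a different ``root`` makes it possible to try the helper against a
copy of the directory tree instead of the live system.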

Getting more performance
========================

To some degree SLUB's performance is limited by the need to take the
list_lock once in a while to deal with partial slabs. That overhead is
governed by the order of the allocation for each slab. The allocations
can be influenced by kernel parameters:

- ``slub_min_objects=x`` (default 4)
- ``slub_min_order=x`` (default 0)
- ``slub_max_order=x`` (default 3 (PAGE_ALLOC_COSTLY_ORDER))

``slub_min_objects``
    specifies how many objects must at least fit into one
    slab in order for the allocation order to be acceptable. In
    general SLUB will be able to perform this number of
    allocations on a slab without consulting centralized resources
    (list_lock) where contention may occur.

``slub_min_order``
    specifies a minimum order of slabs. It has a similar effect to
    ``slub_min_objects``.

``slub_max_order``
    specifies the order at which ``slub_min_objects`` should no
    longer be checked. This is useful to avoid SLUB trying to
    generate super large order pages to fit ``slub_min_objects``
    of a slab cache with large object sizes into one high order
    page.
    Setting the command line parameter
    ``debug_guardpage_minorder=N`` (N > 0) forces
    ``slub_max_order`` to 0, which results in the minimum possible
    order of slab allocations.

SLUB Debug output
=================

Here is a sample of slub debug output::

    ====================================================================
    BUG kmalloc-8: Right Redzone overwritten
    --------------------------------------------------------------------

    INFO: 0xc90f6d28-0xc90f6d2b. First byte 0x00 instead of 0xcc
    INFO: Slab 0xc528c530 flags=0x400000c3 inuse=61 fp=0xc90f6d58
    INFO: Object 0xc90f6d20 @offset=3360 fp=0xc90f6d58
    INFO: Allocated in get_modalias+0x61/0xf5 age=53 cpu=1 pid=554

    Bytes b4 (0xc90f6d10): 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
    Object   (0xc90f6d20): 31 30 31 39 2e 30 30 35                         1019.005
    Redzone  (0xc90f6d28): 00 cc cc cc                                     .
    Padding  (0xc90f6d50): 5a 5a 5a 5a 5a 5a 5a 5a                         ZZZZZZZZ

      [<c010523d>] dump_trace+0x63/0x1eb
      [<c01053df>] show_trace_log_lvl+0x1a/0x2f
      [<c010601d>] show_trace+0x12/0x14
      [<c0106035>] dump_stack+0x16/0x18
      [<c017e0fa>] object_err+0x143/0x14b
      [<c017e2cc>] check_object+0x66/0x234
      [<c017eb43>] __slab_free+0x239/0x384
      [<c017f446>] kfree+0xa6/0xc6
      [<c02e2335>] get_modalias+0xb9/0xf5
      [<c02e23b7>] dmi_dev_uevent+0x27/0x3c
      [<c027866a>] dev_uevent+0x1ad/0x1da
      [<c0205024>] kobject_uevent_env+0x20a/0x45b
      [<c020527f>] kobject_uevent+0xa/0xf
      [<c02779f1>] store_uevent+0x4f/0x58
      [<c027758e>] dev_attr_store+0x29/0x2f
      [<c01bec4f>] sysfs_write_file+0x16e/0x19c
      [<c0183ba7>] vfs_write+0xd1/0x15a
      [<c01841d7>] sys_write+0x3d/0x72
      [<c0104112>] sysenter_past_esp+0x5f/0x99
      [<b7f7b410>] 0xb7f7b410
      =======================

    FIX kmalloc-8: Restoring Redzone 0xc90f6d28-0xc90f6d2b=0xcc

If SLUB encounters a corrupted object (full detection requires the kernel
to be booted with slub_debug) then the following output will be dumped
into the syslog:

1. Description of the problem encountered

   This will be a message in the system log starting with::

     ===============================================
     BUG <slab cache affected>: <What went wrong>
     -----------------------------------------------

     INFO: <corruption start>-<corruption_end> <more info>
     INFO: Slab <address> <slab information>
     INFO: Object <address> <object information>
     INFO: Allocated in <kernel function> age=<jiffies since alloc> cpu=<allocated by
        cpu> pid=<pid of the process>
     INFO: Freed in <kernel function> age=<jiffies since free> cpu=<freed by cpu>
        pid=<pid of the process>

   (Object allocation / free information is only available if SLAB_STORE_USER is
   set for the slab. slub_debug sets that option)

2. The object contents if an object was involved.

   Various types of lines can follow the BUG SLUB line:

   Bytes b4 <address> : <bytes>
        Shows a few bytes before the object where the problem was detected.
        Can be useful if the corruption does not stop with the start of the
        object.

   Object <address> : <bytes>
        The bytes of the object. If the object is inactive then the bytes
        typically contain poison values.
        Any non-poison value shows a
        corruption by a write after free.

   Redzone <address> : <bytes>
        The Redzone following the object. The Redzone is used to detect
        writes after the object. All bytes should always have the same
        value. If there is any deviation then it is due to a write after
        the object boundary.

        (Redzone information is only available if SLAB_RED_ZONE is set.
        slub_debug sets that option)

   Padding <address> : <bytes>
        Unused data to fill up the space in order to get the next object
        properly aligned. In the debug case we make sure that there are
        at least 4 bytes of padding. This allows the detection of writes
        before the object.

3. A stackdump

   The stackdump describes the location where the error was detected. The cause
   of the corruption is more likely to be found by looking at the function that
   allocated or freed the object.

4. Report on how the problem was dealt with in order to ensure the continued
   operation of the system.

   These are messages in the system log beginning with::

        FIX <slab cache affected>: <corrective action taken>

   In the above sample SLUB found that the Redzone of an active object has
   been overwritten.
   Here a string of 8 characters was written into a slab that
   has the length of 8 characters. However, an 8 character string needs a
   terminating 0. That zero has overwritten the first byte of the Redzone field.
   After reporting the details of the issue encountered the FIX SLUB message
   tells us that SLUB has restored the Redzone to its proper value and then
   system operations continue.

Emergency operations
====================

Minimal debugging (sanity checks alone) can be enabled by booting with::

    slub_debug=F

This will generally be enough to enable the resiliency features of slub
which will keep the system running even if a bad kernel component
keeps corrupting objects. This may be important for production systems.
Performance will be impacted by the sanity checks and there will be a
continual stream of error messages to the syslog but no additional memory
will be used (unlike full debugging).

No guarantees. The kernel component still needs to be fixed.
Performance
may be optimized further by locating the slab that experiences corruption
and enabling debugging only for that cache.

For example::

    slub_debug=F,dentry

If the corruption occurs by writing after the end of the object then it
may be advisable to enable a Redzone to avoid corrupting the beginning
of other objects::

    slub_debug=FZ,dentry

Extended slabinfo mode and plotting
===================================

The ``slabinfo`` tool has a special 'extended' ('-X') mode that includes:

- Slabcache Totals
- Slabs sorted by size (up to -N <num> slabs, default 1)
- Slabs sorted by loss (up to -N <num> slabs, default 1)

Additionally, in this mode ``slabinfo`` does not dynamically scale
sizes (G/M/K) and reports everything in bytes (this functionality is
also available to other slabinfo modes via the '-B' option), which makes
reporting more precise. Moreover, the '-X' mode also simplifies the
analysis of slabs' behaviour, because its
output can be plotted using the ``slabinfo-gnuplot.sh`` script. So it
pushes the analysis from looking through the numbers (tons of numbers)
to something easier -- visual analysis.

To generate plots:

a) collect slabinfo extended records, for example::

        while [ 1 ]; do slabinfo -X >> FOO_STATS; sleep 1; done

b) pass stats file(-s) to ``slabinfo-gnuplot.sh`` script::

        slabinfo-gnuplot.sh FOO_STATS [FOO_STATS2 .. FOO_STATSN]

   The ``slabinfo-gnuplot.sh`` script will pre-process the collected records
   and generate 3 png files (and 3 pre-processing cache files) per STATS
   file:

   - Slabcache Totals: FOO_STATS-totals.png
   - Slabs sorted by size: FOO_STATS-slabs-by-size.png
   - Slabs sorted by loss: FOO_STATS-slabs-by-loss.png

Another use case, when ``slabinfo-gnuplot.sh`` can be useful, is when you
need to compare slabs' behaviour "prior to" and "after" some code
modification. To help you out there, the ``slabinfo-gnuplot.sh`` script
can 'merge' the `Slabcache Totals` sections from different
measurements. To visually compare N plots:

a) Collect as many STATS1, STATS2, .. STATSN files as you need::

        while [ 1 ]; do slabinfo -X >> STATS<X>; sleep 1; done

b) Pre-process those STATS files::

        slabinfo-gnuplot.sh STATS1 STATS2 .. STATSN

c) Execute ``slabinfo-gnuplot.sh`` in '-t' mode, passing all of the
   generated pre-processed \*-totals::

        slabinfo-gnuplot.sh -t STATS1-totals STATS2-totals .. STATSN-totals

   This will produce a single plot (png file).

   Plots, expectedly, can be large so some fluctuations or small spikes
   can go unnoticed. To deal with that, ``slabinfo-gnuplot.sh`` has two
   options to 'zoom-in'/'zoom-out':

   a) ``-s %d,%d`` -- overwrites the default image width and height
   b) ``-r %d,%d`` -- specifies a range of samples to use (for example,
      in ``slabinfo -X >> FOO_STATS; sleep 1;`` case, using a
      ``-r 40,60`` range will plot only samples collected between 40th and
      60th seconds).


DebugFS files for SLUB
======================

For more information about the current state of SLUB caches with the user
tracking debug option enabled, debugfs files are available, typically under
/sys/kernel/debug/slab/<cache>/ (created only for caches with enabled user
tracking). There are 2 types of these files with the following debug
information:

1. ``alloc_traces``

   Prints information about unique allocation traces of the currently
   allocated objects. The output is sorted by frequency of each trace.

   Information in the output:
   Number of objects, allocating function, minimal/average/maximal jiffies since alloc,
   pid range of the allocating processes, cpu mask of allocating cpus, and stack trace.

   Example::

       1085 populate_error_injection_list+0x97/0x110 age=166678/166680/166682 pid=1 cpus=1
           __slab_alloc+0x6d/0x90
           kmem_cache_alloc_trace+0x2eb/0x300
           populate_error_injection_list+0x97/0x110
           init_error_injection+0x1b/0x71
           do_one_initcall+0x5f/0x2d0
           kernel_init_freeable+0x26f/0x2d7
           kernel_init+0xe/0x118
           ret_from_fork+0x22/0x30


2. ``free_traces``

   Prints information about unique freeing traces of the currently allocated
   objects. The freeing traces thus come from the previous life-cycle of the
   objects and are reported as not available for objects allocated for the first
   time. The output is sorted by frequency of each trace.

   Information in the output:
   Number of objects, freeing function, minimal/average/maximal jiffies since free,
   pid range of the freeing processes, cpu mask of freeing cpus, and stack trace.

   Example::

       1980 <not-available> age=4294912290 pid=0 cpus=0
       51 acpi_ut_update_ref_count+0x6a6/0x782 age=236886/237027/237772 pid=1 cpus=1
           kfree+0x2db/0x420
           acpi_ut_update_ref_count+0x6a6/0x782
           acpi_ut_update_object_reference+0x1ad/0x234
           acpi_ut_remove_reference+0x7d/0x84
           acpi_rs_get_prt_method_data+0x97/0xd6
           acpi_get_irq_routing_table+0x82/0xc4
           acpi_pci_irq_find_prt_entry+0x8e/0x2e0
           acpi_pci_irq_lookup+0x3a/0x1e0
           acpi_pci_irq_enable+0x77/0x240
           pcibios_enable_device+0x39/0x40
           do_pci_enable_device.part.0+0x5d/0xe0
           pci_enable_device_flags+0xfc/0x120
           pci_enable_device+0x13/0x20
           virtio_pci_probe+0x9e/0x170
           local_pci_probe+0x48/0x80
           pci_device_probe+0x105/0x1c0

Christoph Lameter, May 30, 2007
Sergey Senozhatsky, October 23, 2015