.. _slub:

==========================
Short users guide for SLUB
==========================

The basic philosophy of SLUB is very different from SLAB. SLAB
requires rebuilding the kernel to activate debug options for all
slab caches. SLUB always includes full debugging but it is off by default.
SLUB can enable debugging only for selected slabs in order to avoid
an impact on overall system performance which may make a bug more
difficult to find.

In order to switch debugging on one can add an option ``slub_debug``
to the kernel command line. That will enable full debugging for
all slabs.

Typically one would then use the ``slabinfo`` command to get statistical
data and perform operations on the slabs. By default ``slabinfo`` only lists
slabs that have data in them. See ``slabinfo -h`` for more options when
running the command. ``slabinfo`` can be compiled with
::

	gcc -o slabinfo tools/vm/slabinfo.c

Some of the modes of operation of ``slabinfo`` require that slub debugging
be enabled on the command line. For example, no tracking information will be
available without debugging on and validation can only partially
be performed if debugging was not switched on.

Some more sophisticated uses of slub_debug:
-------------------------------------------

Parameters may be given to ``slub_debug``. If none is specified then full
debugging is enabled. Format:

slub_debug=<Debug-Options>
	Enable options for all slabs

slub_debug=<Debug-Options>,<slab name1>,<slab name2>,...
	Enable options only for select slabs (no spaces
	after a comma)

Multiple blocks of options for all slabs or selected slabs can be given, with
blocks of options delimited by ';'. The last of the "all slabs" blocks is
applied to all slabs except those that match one of the "select slabs" blocks.
For a slab that does match, the options of the first "select slabs" block that
matches the slab's name are applied.
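
As a hedged illustration of these two rules, consider the following command
line (the option choices are arbitrary and only serve the example)::

	slub_debug=F;Z,kmalloc-*;P,kmalloc-64;FZPU

Going by the rules above, ``kmalloc-64`` matches the ``Z,kmalloc-*`` block
first, so it gets red zoning only and the later ``P,kmalloc-64`` block is never
consulted for it. All caches outside ``kmalloc-*`` get ``FZPU``, because that
is the last "all slabs" block and it supersedes the earlier ``F``.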

Possible debug options are::

	F		Sanity checks on (enables SLAB_DEBUG_CONSISTENCY_CHECKS
			Sorry SLAB legacy issues)
	Z		Red zoning
	P		Poisoning (object and padding)
	U		User tracking (free and alloc)
	T		Trace (please only use on single slabs)
	A		Enable failslab filter mark for the cache
	O		Switch debugging off for caches that would have
			caused higher minimum slab orders
	-		Switch all debugging off (useful if the kernel is
			configured with CONFIG_SLUB_DEBUG_ON)

For example, in order to boot just with sanity checks and red zoning one would specify::

	slub_debug=FZ

Trying to find an issue in the dentry cache? Try::

	slub_debug=,dentry

to only enable debugging on the dentry cache.  You may use an asterisk at the
end of the slab name, in order to cover all slabs with the same prefix.  For
example, here's how you can poison the dentry cache as well as all kmalloc
slabs::

	slub_debug=P,kmalloc-*,dentry

Red zoning and tracking may realign the slab.  We can just apply sanity checks
to the dentry cache with::

	slub_debug=F,dentry

Debugging options may require the minimum possible slab order to increase as
a result of storing the metadata (for example, caches with PAGE_SIZE object
sizes).  This has a higher likelihood of resulting in slab allocation errors
in low memory situations or if there's high fragmentation of memory.  To
switch off debugging for such caches by default, use::

	slub_debug=O

You can apply different options to different lists of slab names, using blocks
of options. This will enable red zoning for dentry and user tracking for
kmalloc. All other slabs will not get any debugging enabled::

	slub_debug=Z,dentry;U,kmalloc-*

You can also enable options (e.g. sanity checks and red zoning) for all caches
except some that are deemed too performance critical and don't need to be
debugged by specifying global debug options followed by a list of slab names
with "-" as options::

	slub_debug=FZ;-,zs_handle,zspage

The state of each debug option for a slab can be found in the respective files
under::

	/sys/kernel/slab/<slab name>/

If the file contains 1, the option is enabled, 0 means disabled. The debug
options from the ``slub_debug`` parameter translate to the following files::

	F	sanity_checks
	Z	red_zone
	P	poison
	U	store_user
	T	trace
	A	failslab
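
The mapping above can be read back for one cache with a few lines of shell.
This is only a sketch: ``slab_debug_flags`` and the ``SLAB_SYSFS`` variable are
names made up for this example and are not part of any kernel tool::

	# Directory holding the per-cache files; override to inspect a copy.
	SLAB_SYSFS=${SLAB_SYSFS:-/sys/kernel/slab}

	# Print "<letter> <file>=<value>" for each debug option of one cache.
	slab_debug_flags() {
		cache=$1
		for pair in F:sanity_checks Z:red_zone P:poison U:store_user T:trace A:failslab; do
			letter=${pair%%:*}
			file=${pair#*:}
			path=$SLAB_SYSFS/$cache/$file
			# Skip files that this kernel does not provide.
			if [ -r "$path" ]; then
				printf '%s %s=%s\n' "$letter" "$file" "$(cat "$path")"
			fi
		done
	}

Calling ``slab_debug_flags dentry`` then prints one line per option file that
exists for the dentry cache. Some files may be missing depending on the kernel
configuration, which is why unreadable paths are skipped rather than reported.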

Careful with tracing: It may spew out lots of information and never stop if
used on the wrong slab.

Slab merging
============

If no debug options are specified then SLUB may merge similar slabs together
in order to reduce overhead and increase cache hotness of objects.
``slabinfo -a`` displays which slabs were merged together.
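
If ``slabinfo`` is not at hand, a rough equivalent can be sketched on top of
sysfs. This assumes that mergeable caches appear under ``/sys/kernel/slab/`` as
symbolic links to a shared directory, which should be checked on the running
kernel; ``slab_aliases`` and ``SLAB_SYSFS`` are names invented for the
example::

	SLAB_SYSFS=${SLAB_SYSFS:-/sys/kernel/slab}

	# Print "<link target> <cache name>" for every symbolic link.
	slab_aliases() {
		for entry in "$SLAB_SYSFS"/*; do
			if [ -L "$entry" ]; then
				printf '%s %s\n' "$(readlink "$entry")" "${entry##*/}"
			fi
		done
	}

Piping the result through ``sort`` groups the names by target. Under the
assumption above, names that share a target are backed by the same slab cache.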

Slab validation
===============

SLUB can validate all objects if the kernel was booted with slub_debug. In
order to do so you must have the ``slabinfo`` tool. Then you can do
::

	slabinfo -v

which will test all objects. Output will be generated to the syslog.

This also works in a more limited way if boot was without slab debug.
In that case ``slabinfo -v`` simply tests all reachable objects. Usually
these are in the cpu slabs and the partial slabs. Full slabs are not
tracked by SLUB in a non debug situation.
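
To validate a single cache rather than all of them, the request can probably
be made through sysfs directly. The sketch below assumes that the cache
directory contains a ``validate`` file that triggers validation when ``1`` is
written to it; ``validate_cache`` and ``SLAB_SYSFS`` are names invented for the
example::

	SLAB_SYSFS=${SLAB_SYSFS:-/sys/kernel/slab}

	# Ask SLUB to validate one cache; findings go to the kernel log.
	validate_cache() {
		echo 1 > "$SLAB_SYSFS/$1/validate"
	}

Run as root, ``validate_cache dentry`` followed by ``dmesg | tail`` shows
whether anything was reported. The write may be rejected for caches that have
no debugging enabled.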

Getting more performance
========================

To some degree SLUB's performance is limited by the need to take the
list_lock once in a while to deal with partial slabs. That overhead is
governed by the order of the allocation for each slab. The allocations
can be influenced by kernel parameters::

	slub_min_objects=x		(default 4)
	slub_min_order=x		(default 0)
	slub_max_order=x		(default 3 (PAGE_ALLOC_COSTLY_ORDER))

``slub_min_objects``
	specifies how many objects must at least fit into one
	slab in order for the allocation order to be acceptable.  In
	general slub will be able to perform this number of
	allocations on a slab without consulting centralized resources
	(list_lock) where contention may occur.

``slub_min_order``
	specifies a minimum order of slabs. It has a similar effect to
	``slub_min_objects``.

``slub_max_order``
	specifies the order at which ``slub_min_objects`` should no
	longer be checked. This is useful to avoid SLUB trying to
	generate super large order pages to fit ``slub_min_objects``
	of a slab cache with large object sizes into one high order
	page. Setting the command line parameter
	``debug_guardpage_minorder=N`` (N > 0) forces
	``slub_max_order`` to 0, which results in the minimum possible
	order for slab allocations.
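
To see what the allocator settled on for a given cache, the resulting layout
can be read back from sysfs. A small sketch; ``slab_layout`` and ``SLAB_SYSFS``
are names invented here, and the four attribute files are assumed to be present
under ``/sys/kernel/slab/<slab name>/``::

	SLAB_SYSFS=${SLAB_SYSFS:-/sys/kernel/slab}

	# Print the size and order related attributes of one cache.
	slab_layout() {
		for f in object_size slab_size objs_per_slab order; do
			printf '%s=%s\n' "$f" "$(cat "$SLAB_SYSFS/$1/$f")"
		done
	}

A slab of order N spans 2^N pages, so comparing ``order`` and
``objs_per_slab`` before and after booting with, say, a different
``slub_min_objects`` value shows the effect of the parameter on that cache.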

SLUB Debug output
=================

Here is a sample of slub debug output::

 ====================================================================
 BUG kmalloc-8: Right Redzone overwritten
 --------------------------------------------------------------------

 INFO: 0xc90f6d28-0xc90f6d2b. First byte 0x00 instead of 0xcc
 INFO: Slab 0xc528c530 flags=0x400000c3 inuse=61 fp=0xc90f6d58
 INFO: Object 0xc90f6d20 @offset=3360 fp=0xc90f6d58
 INFO: Allocated in get_modalias+0x61/0xf5 age=53 cpu=1 pid=554

 Bytes b4 (0xc90f6d10): 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a 5a ........ZZZZZZZZ
 Object   (0xc90f6d20): 31 30 31 39 2e 30 30 35                         1019.005
 Redzone  (0xc90f6d28): 00 cc cc cc                                     .
 Padding  (0xc90f6d50): 5a 5a 5a 5a 5a 5a 5a 5a                         ZZZZZZZZ

   [<c010523d>] dump_trace+0x63/0x1eb
   [<c01053df>] show_trace_log_lvl+0x1a/0x2f
   [<c010601d>] show_trace+0x12/0x14
   [<c0106035>] dump_stack+0x16/0x18
   [<c017e0fa>] object_err+0x143/0x14b
   [<c017e2cc>] check_object+0x66/0x234
   [<c017eb43>] __slab_free+0x239/0x384
   [<c017f446>] kfree+0xa6/0xc6
   [<c02e2335>] get_modalias+0xb9/0xf5
   [<c02e23b7>] dmi_dev_uevent+0x27/0x3c
   [<c027866a>] dev_uevent+0x1ad/0x1da
   [<c0205024>] kobject_uevent_env+0x20a/0x45b
   [<c020527f>] kobject_uevent+0xa/0xf
   [<c02779f1>] store_uevent+0x4f/0x58
   [<c027758e>] dev_attr_store+0x29/0x2f
   [<c01bec4f>] sysfs_write_file+0x16e/0x19c
   [<c0183ba7>] vfs_write+0xd1/0x15a
   [<c01841d7>] sys_write+0x3d/0x72
   [<c0104112>] sysenter_past_esp+0x5f/0x99
   [<b7f7b410>] 0xb7f7b410
   =======================

 FIX kmalloc-8: Restoring Redzone 0xc90f6d28-0xc90f6d2b=0xcc

If SLUB encounters a corrupted object (full detection requires the kernel
to be booted with slub_debug) then the following output will be dumped
into the syslog:

1. Description of the problem encountered

   This will be a message in the system log starting with::

     ===============================================
     BUG <slab cache affected>: <What went wrong>
     -----------------------------------------------

     INFO: <corruption start>-<corruption_end> <more info>
     INFO: Slab <address> <slab information>
     INFO: Object <address> <object information>
     INFO: Allocated in <kernel function> age=<jiffies since alloc> cpu=<allocated by
	cpu> pid=<pid of the process>
     INFO: Freed in <kernel function> age=<jiffies since free> cpu=<freed by cpu>
	pid=<pid of the process>

   (Object allocation / free information is only available if SLAB_STORE_USER is
   set for the slab. slub_debug sets that option)

2. The object contents if an object was involved.

   Various types of lines can follow the BUG SLUB line:

   Bytes b4 <address> : <bytes>
	Shows a few bytes before the object where the problem was detected.
	Can be useful if the corruption does not stop with the start of the
	object.

   Object <address> : <bytes>
	The bytes of the object. If the object is inactive then the bytes
	typically contain poison values. Any non-poison value shows a
	corruption by a write after free.

   Redzone <address> : <bytes>
	The Redzone following the object. The Redzone is used to detect
	writes after the object. All bytes should always have the same
	value. If there is any deviation then it is due to a write after
	the object boundary.

	(Redzone information is only available if SLAB_RED_ZONE is set.
	slub_debug sets that option)

   Padding <address> : <bytes>
	Unused data to fill up the space in order to get the next object
	properly aligned. In the debug case we make sure that there are
	at least 4 bytes of padding. This allows the detection of writes
	before the object.

3. A stackdump

   The stackdump describes the location where the error was detected. The cause
   of the corruption is more likely to be found by looking at the function that
   allocated or freed the object.

4. Report on how the problem was dealt with in order to ensure the continued
   operation of the system.

   These are messages in the system log beginning with::

	FIX <slab cache affected>: <corrective action taken>

   In the above sample SLUB found that the Redzone of an active object has
   been overwritten. Here a string of 8 characters was written into a slab that
   has the length of 8 characters. However, an 8-character string needs a
   terminating 0. That zero has overwritten the first byte of the Redzone field.
   After reporting the details of the issue encountered the FIX SLUB message
   tells us that SLUB has restored the Redzone to its proper value and then
   system operations continue.
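
On a busy system these reports can be buried in other kernel messages. The
following sketch pulls out the BUG and FIX lines for one cache. It assumes the
log lines look exactly like the ``BUG <slab cache affected>:`` and
``FIX <slab cache affected>:`` formats shown above, and ``slub_reports`` is a
name invented for the example::

	# Print the BUG and FIX lines for one cache from a kernel log on stdin.
	slub_reports() {
		grep -F -e "BUG $1:" -e "FIX $1:"
	}

For instance, ``dmesg | slub_reports kmalloc-8`` lists what was detected in
``kmalloc-8`` and how SLUB dealt with it, without matching ``kmalloc-80``.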

Emergency operations
====================

Minimal debugging (sanity checks alone) can be enabled by booting with::

	slub_debug=F

This will generally be enough to enable the resiliency features of slub
which will keep the system running even if a bad kernel component
keeps corrupting objects. This may be important for production systems.
Performance will be impacted by the sanity checks and there will be a
continual stream of error messages to the syslog but no additional memory
will be used (unlike full debugging).

No guarantees. The kernel component still needs to be fixed. Performance
may be optimized further by locating the slab that experiences corruption
and enabling debugging only for that cache.

For example::

	slub_debug=F,dentry

If the corruption occurs by writing after the end of the object then it
may be advisable to enable a Redzone to avoid corrupting the beginning
of other objects::

	slub_debug=FZ,dentry

Extended slabinfo mode and plotting
===================================

The ``slabinfo`` tool has a special 'extended' ('-X') mode that includes:

- Slabcache Totals
- Slabs sorted by size (up to -N <num> slabs, default 1)
- Slabs sorted by loss (up to -N <num> slabs, default 1)

Additionally, in this mode ``slabinfo`` does not dynamically scale
sizes (G/M/K) and reports everything in bytes (this functionality is
also available to other slabinfo modes via '-B' option) which makes
reporting more precise and accurate. Moreover, in some sense the '-X'
mode also simplifies the analysis of slabs' behaviour, because its
output can be plotted using the ``slabinfo-gnuplot.sh`` script. So it
pushes the analysis from looking through the numbers (tons of numbers)
to something easier: visual analysis.

To generate plots:

a) collect slabinfo extended records, for example::

	while [ 1 ]; do slabinfo -X >> FOO_STATS; sleep 1; done

b) pass stats file(-s) to ``slabinfo-gnuplot.sh`` script::

	slabinfo-gnuplot.sh FOO_STATS [FOO_STATS2 .. FOO_STATSN]

   The ``slabinfo-gnuplot.sh`` script pre-processes the collected records
   and generates 3 png files (and 3 pre-processing cache files) per STATS
   file:

   - Slabcache Totals: FOO_STATS-totals.png
   - Slabs sorted by size: FOO_STATS-slabs-by-size.png
   - Slabs sorted by loss: FOO_STATS-slabs-by-loss.png

Another case where ``slabinfo-gnuplot.sh`` can be useful is when you
need to compare slabs' behaviour "prior to" and "after" some code
modification.  To help you out there, ``slabinfo-gnuplot.sh`` script
can 'merge' the `Slabcache Totals` sections from different
measurements. To visually compare N plots:

a) Collect as many STATS1, STATS2, .. STATSN files as you need::

	while [ 1 ]; do slabinfo -X >> STATS<X>; sleep 1; done

b) Pre-process those STATS files::

	slabinfo-gnuplot.sh STATS1 STATS2 .. STATSN

c) Execute ``slabinfo-gnuplot.sh`` in '-t' mode, passing all of the
   generated pre-processed \*-totals::

	slabinfo-gnuplot.sh -t STATS1-totals STATS2-totals .. STATSN-totals

   This will produce a single plot (png file).

   Plots, expectedly, can be large so some fluctuations or small spikes
   can go unnoticed. To deal with that, ``slabinfo-gnuplot.sh`` has two
   options to 'zoom-in'/'zoom-out':

   a) ``-s %d,%d`` -- overrides the default image width and height
   b) ``-r %d,%d`` -- specifies a range of samples to use (for example,
      in ``slabinfo -X >> FOO_STATS; sleep 1;`` case, using a ``-r
      40,60`` range will plot only samples collected between 40th and
      60th seconds).
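
Since ``-r`` selects samples by position, it helps to know how many samples a
file holds. One way to get that is to replace the endless collection loop with
a bounded one. This is a sketch only, and ``collect_slabinfo`` is a name
invented for the example::

	# Append <count> extended slabinfo records to <file>, one per second.
	collect_slabinfo() {
		count=$1
		file=$2
		i=0
		while [ "$i" -lt "$count" ]; do
			slabinfo -X >> "$file"
			sleep 1
			i=$((i + 1))
		done
	}

After ``collect_slabinfo 120 FOO_STATS`` the file holds 120 samples, and a
call such as ``slabinfo-gnuplot.sh -s 1600,900 -r 40,60 FOO_STATS`` would plot
samples 40 to 60 on a larger canvas. The image size used here is an arbitrary
choice.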

DebugFS files for SLUB
======================

For more information about the current state of SLUB caches with the user
tracking debug option enabled, debugfs files are available, typically under
/sys/kernel/debug/slab/<cache>/ (created only for caches with enabled user
tracking). There are 2 types of these files with the following debug
information:

1. alloc_traces

   Prints information about unique allocation traces of the currently
   allocated objects. The output is sorted by frequency of each trace.

   Information in the output: number of objects, allocating function,
   minimal/average/maximal jiffies since alloc, pid range of the allocating
   processes, cpu mask of allocating cpus, and stack trace.

   Example::

    1085 populate_error_injection_list+0x97/0x110 age=166678/166680/166682 pid=1 cpus=1
	__slab_alloc+0x6d/0x90
	kmem_cache_alloc_trace+0x2eb/0x300
	populate_error_injection_list+0x97/0x110
	init_error_injection+0x1b/0x71
	do_one_initcall+0x5f/0x2d0
	kernel_init_freeable+0x26f/0x2d7
	kernel_init+0xe/0x118
	ret_from_fork+0x22/0x30

2. free_traces

   Prints information about unique freeing traces of the currently allocated
   objects. The freeing traces thus come from the previous life-cycle of the
   objects and are reported as not available for objects allocated for the first
   time. The output is sorted by frequency of each trace.

   Information in the output: number of objects, freeing function,
   minimal/average/maximal jiffies since free, pid range of the freeing
   processes, cpu mask of freeing cpus, and stack trace.

   Example::

    1980 <not-available> age=4294912290 pid=0 cpus=0
    51 acpi_ut_update_ref_count+0x6a6/0x782 age=236886/237027/237772 pid=1 cpus=1
	kfree+0x2db/0x420
	acpi_ut_update_ref_count+0x6a6/0x782
	acpi_ut_update_object_reference+0x1ad/0x234
	acpi_ut_remove_reference+0x7d/0x84
	acpi_rs_get_prt_method_data+0x97/0xd6
	acpi_get_irq_routing_table+0x82/0xc4
	acpi_pci_irq_find_prt_entry+0x8e/0x2e0
	acpi_pci_irq_lookup+0x3a/0x1e0
	acpi_pci_irq_enable+0x77/0x240
	pcibios_enable_device+0x39/0x40
	do_pci_enable_device.part.0+0x5d/0xe0
	pci_enable_device_flags+0xfc/0x120
	pci_enable_device+0x13/0x20
	virtio_pci_probe+0x9e/0x170
	local_pci_probe+0x48/0x80
	pci_device_probe+0x105/0x1c0
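
Because both files are sorted by frequency, the most common traces are at the
top, so reading the first lines is often enough. A minimal sketch, in which
``slab_traces`` and ``SLAB_DEBUGFS`` are names invented for the example and
debugfs is assumed to be mounted at the usual location::

	SLAB_DEBUGFS=${SLAB_DEBUGFS:-/sys/kernel/debug/slab}

	# Show the first <lines> lines (default 20) of a trace file for one cache.
	# <kind> is either alloc_traces or free_traces.
	slab_traces() {
		cache=$1
		kind=$2
		head -n "${3:-20}" "$SLAB_DEBUGFS/$cache/$kind"
	}

For example, after booting with ``slub_debug=U,kmalloc-64``, running
``slab_traces kmalloc-64 alloc_traces`` as root shows the most frequent
allocation sites of that cache.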

Christoph Lameter, May 30, 2007
Sergey Senozhatsky, October 23, 2015