================
Delay accounting
================

Tasks encounter delays in execution when they wait
for some kernel resource to become available, e.g. a
runnable task may wait for a free CPU to run on.

The per-task delay accounting functionality measures
the delays experienced by a task while

a) waiting for a CPU (while being runnable)
b) waiting for completion of synchronous block I/O initiated by the task
c) swapping in pages
d) memory reclaim
e) thrashing
f) direct compaction
g) write-protect copy
h) IRQ/SOFTIRQ

and makes these statistics available to userspace through
the taskstats interface.

Such delays provide feedback for setting a task's CPU priority,
I/O priority and RSS limit values appropriately. Long delays for
an important task could be a trigger for raising its corresponding priority.

The functionality, through its use of the taskstats interface, also provides
delay statistics aggregated for all tasks (or threads) belonging to a
thread group (corresponding to a traditional Unix process). This is a commonly
needed aggregation that is more efficiently done by the kernel.

Userspace utilities, particularly resource management applications, can also
aggregate delay statistics into arbitrary groups. To enable this, delay
statistics of a task are available both during its lifetime and on its
exit, ensuring that continuous and complete monitoring can be done.


Interface
---------

Delay accounting uses the taskstats interface which is described
in detail in a separate document in this directory. Taskstats returns a
generic data structure to userspace corresponding to per-pid and per-tgid
statistics. The delay accounting functionality populates specific fields of
this structure. See

     include/uapi/linux/taskstats.h

for a description of the fields pertaining to delay accounting.
These fields are generally counters returning the cumulative
delay seen for CPU, synchronous block I/O, swapin, memory reclaim, thrashing,
direct compaction, write-protect copy, IRQ/SOFTIRQ etc.

Taking the difference of two successive readings of a given
counter (say cpu_delay_total) for a task will give the delay
experienced by the task waiting for the corresponding resource
in that interval.
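
The interval computation described above can be sketched in a few lines of
Python. The counter names follow the ``*_count`` and ``*_delay_total`` fields
of ``struct taskstats``. The helper name ``interval_delays`` and the snapshot
dictionaries are hypothetical stand-ins for values that a real client would
read through the taskstats interface, the "earlier" reading is invented for
illustration, and delay totals are assumed to be in nanoseconds.

```python
# Prefixes of the count/delay_total counter pairs in struct taskstats.
DELAY_FIELDS = ("cpu", "blkio", "swapin", "freepages",
                "thrashing", "compact", "wpcopy", "irq")


def interval_delays(prev, curr):
    """Return per-resource delay (ns) and event count between two readings.

    prev and curr are hypothetical dictionaries of cumulative counters for
    the same task; counters missing from a snapshot are treated as 0.
    """
    result = {}
    for name in DELAY_FIELDS:
        total_key = f"{name}_delay_total"
        count_key = f"{name}_count"
        result[name] = {
            "delay_ns": curr.get(total_key, 0) - prev.get(total_key, 0),
            "count": curr.get(count_key, 0) - prev.get(count_key, 0),
        }
    return result


if __name__ == "__main__":
    # Illustrative readings only; the earlier snapshot is made up.
    earlier = {"cpu_count": 30, "cpu_delay_total": 1_500_000}
    later = {"cpu_count": 39, "cpu_delay_total": 2_111_069}
    print(interval_delays(earlier, later)["cpu"])
```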

When a task exits, records containing the per-task statistics
are sent to userspace without requiring a command. If it is the last exiting
task of a thread group, the per-tgid statistics are also sent. More details
are given in the taskstats interface description.

The getdelays.c userspace utility in the tools/accounting directory allows simple
commands to be run and the corresponding delay statistics to be displayed. It
also serves as an example of using the taskstats interface.

Usage
-----

Compile the kernel with::

	CONFIG_TASK_DELAY_ACCT=y
	CONFIG_TASKSTATS=y

Delay accounting is disabled by default at boot up.
To enable it, add::

   delayacct

to the kernel boot options. The rest of the instructions below assume this has
been done. Alternatively, use the sysctl kernel.task_delayacct to switch the state
at runtime. Note, however, that only tasks started after enabling it will have
delayacct information.
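
As a minimal sketch, the current state of the sysctl can be checked from
userspace by reading its procfs file. The helper name ``delayacct_enabled``
is hypothetical, and the sketch assumes the sysctl is exposed at the usual
``/proc/sys/kernel/task_delayacct`` location and reads ``1`` when enabled.

```python
from pathlib import Path

# Assumed procfs location of the kernel.task_delayacct sysctl.
SYSCTL_PATH = "/proc/sys/kernel/task_delayacct"


def delayacct_enabled(path=SYSCTL_PATH):
    """Return True or False for the sysctl state, or None if it is unavailable."""
    try:
        text = Path(path).read_text().strip()
    except OSError:
        # File missing (e.g. CONFIG_TASK_DELAY_ACCT not set) or unreadable.
        return None
    return text == "1"


if __name__ == "__main__":
    state = delayacct_enabled()
    print({True: "enabled", False: "disabled", None: "unavailable"}[state])
```

A monitoring tool could run such a check at startup and warn that tasks
started before the switch was enabled will not carry delayacct information.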

After the system has booted up, use a utility
similar to getdelays.c to access the delays
seen by a given task or a task group (tgid).
The utility also allows a given command to be
executed and the corresponding delays to be
seen.

General format of the getdelays command::

	getdelays [-dilv] [-t tgid] [-p pid]

Get delays, since system boot, for pid 10::

	# ./getdelays -d -p 10
	(output similar to next case)

Get sum and peak of delays, since system boot, for all pids with tgid 242::

	bash-4.4# ./getdelays -d -t 242
	print delayacct stats ON
	TGID    242


	CPU         count     real total  virtual total    delay total  delay average      delay max      delay min
	               39      156000000      156576579        2111069          0.054ms     0.212296ms     0.031307ms
	IO          count    delay total  delay average      delay max      delay min
	                0              0          0.000ms     0.000000ms     0.000000ms
	SWAP        count    delay total  delay average      delay max      delay min
	                0              0          0.000ms     0.000000ms     0.000000ms
	RECLAIM     count    delay total  delay average      delay max      delay min
	                0              0          0.000ms     0.000000ms     0.000000ms
	THRASHING   count    delay total  delay average      delay max      delay min
	                0              0          0.000ms     0.000000ms     0.000000ms
	COMPACT     count    delay total  delay average      delay max      delay min
	                0              0          0.000ms     0.000000ms     0.000000ms
	WPCOPY      count    delay total  delay average      delay max      delay min
	              156       11215873          0.072ms     0.207403ms     0.033913ms
	IRQ         count    delay total  delay average      delay max      delay min
	                0              0          0.000ms     0.000000ms     0.000000ms
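
The "delay average" column in the sample output above is consistent with
dividing the cumulative delay (assumed here to be in nanoseconds) by the
event count. The sketch below reproduces the CPU and WPCOPY averages under
that assumption; ``average_delay_ms`` is a hypothetical helper, not a
function taken from getdelays.c.

```python
def average_delay_ms(delay_total_ns, count):
    """Average delay per event in milliseconds; 0.0 when no events occurred."""
    if count == 0:
        return 0.0
    return delay_total_ns / count / 1_000_000


if __name__ == "__main__":
    # Values copied from the CPU and WPCOPY rows of the sample output.
    print(f"CPU    {average_delay_ms(2111069, 39):.3f}ms")    # 0.054ms
    print(f"WPCOPY {average_delay_ms(11215873, 156):.3f}ms")  # 0.072ms
```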

Get I/O accounting for pid 1 (this works only with -p)::

	# ./getdelays -i -p 1
	printing IO accounting
	linuxrc: read=65536, write=0, cancelled_write=0

The above command can be used with -v to get more debug information.

After the system starts, use `delaytop` to get system-wide delay information,
which includes system-wide PSI information and the Top-N high-latency tasks.

By default, `delaytop` sorts tasks by CPU latency in descending order,
displays the top 20 high-latency tasks, and refreshes the latency
data every 2 seconds.

Get PSI information and the delays of the Top-N tasks, since system boot::

	bash# ./delaytop
	System Pressure Information: (avg10/avg60/avg300/total)
	CPU some:       0.0%/   0.0%/   0.0%/     345(ms)
	CPU full:       0.0%/   0.0%/   0.0%/       0(ms)
	Memory full:    0.0%/   0.0%/   0.0%/       0(ms)
	Memory some:    0.0%/   0.0%/   0.0%/       0(ms)
	IO full:        0.0%/   0.0%/   0.0%/      65(ms)
	IO some:        0.0%/   0.0%/   0.0%/      79(ms)
	IRQ full:       0.0%/   0.0%/   0.0%/       0(ms)
	Top 20 processes (sorted by CPU delay):
	  PID   TGID  COMMAND          CPU(ms)  IO(ms) SWAP(ms) RCL(ms) THR(ms) CMP(ms)  WP(ms) IRQ(ms)
	----------------------------------------------------------------------------------------------
	  161    161  zombie_memcg_re   1.40    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  130    130  blkcg_punt_bio    1.37    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  444    444  scsi_tmf_0        0.73    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 1280   1280  rsyslogd          0.53    0.04    0.00    0.00    0.00    0.00    0.00    0.00
	   12     12  ksoftirqd/0       0.47    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 1277   1277  nbd-server        0.44    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  308    308  kworker/2:2-sys   0.41    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	   55     55  netns             0.36    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 1187   1187  acpid             0.31    0.03    0.00    0.00    0.00    0.00    0.00    0.00
	 6184   6184  kworker/1:2-sys   0.24    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  186    186  kaluad            0.24    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	   18     18  ksoftirqd/1       0.24    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  185    185  kmpath_rdacd      0.23    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	  190    190  kstrp             0.23    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 2759   2759  agetty            0.20    0.03    0.00    0.00    0.00    0.00    0.00    0.00
	 1190   1190  kworker/0:3-sys   0.19    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 1272   1272  sshd              0.15    0.04    0.00    0.00    0.00    0.00    0.00    0.00
	 1156   1156  license           0.15    0.11    0.00    0.00    0.00    0.00    0.00    0.00
	  134    134  md                0.13    0.00    0.00    0.00    0.00    0.00    0.00    0.00
	 6142   6142  kworker/3:2-xfs   0.13    0.00    0.00    0.00    0.00    0.00    0.00    0.00
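
The default ordering shown above (CPU delay, descending, limited to the top
N tasks) can be illustrated with a short sketch. This is not how delaytop
itself is implemented; ``top_n_by_cpu_delay`` and the tuple layout are
hypothetical, and the rows are a shuffled subset of the sample output.

```python
def top_n_by_cpu_delay(tasks, n=20):
    """Sort (pid, command, cpu_delay_ms) tuples by CPU delay, highest first."""
    return sorted(tasks, key=lambda task: task[2], reverse=True)[:n]


if __name__ == "__main__":
    # A few rows taken from the sample output, deliberately out of order.
    sample = [
        (1280, "rsyslogd", 0.53),
        (161, "zombie_memcg_re", 1.40),
        (444, "scsi_tmf_0", 0.73),
        (130, "blkcg_punt_bio", 1.37),
    ]
    for pid, command, cpu_ms in top_n_by_cpu_delay(sample, n=3):
        print(f"{pid:5d}  {command:<16s}{cpu_ms:5.2f}")
```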

Dynamic interactive interface of delaytop::

	# ./delaytop -p pid
	Print delayacct stats

	# ./delaytop -P num
	Display the top N tasks

	# ./delaytop -n num
	Set delaytop refresh frequency (num times)

	# ./delaytop -d secs
	Specify refresh interval as secs