xref: /linux/Documentation/trace/rv/monitor_synthesis.rst (revision 3f2a5ba784b808109cac0aac921213e43143a216)
1Runtime Verification Monitor Synthesis
2======================================
3
4The starting point for the application of runtime verification (RV) techniques
5is the *specification* or *modeling* of the desired (or undesired) behavior
6of the system under scrutiny.
7
8The formal representation needs to be then *synthesized* into a *monitor*
9that can then be used in the analysis of the trace of the system. The
10*monitor* connects to the system via an *instrumentation* that converts
11the events from the *system* to the events of the *specification*.
12
13
14In Linux terms, the runtime verification monitors are encapsulated inside
15the *RV monitor* abstraction. The RV monitor includes a set of instances
16of the monitor (per-cpu monitor, per-task monitor, and so on), the helper
17functions that glue the monitor to the system reference model, and the
18trace output as a reaction to event parsing and exceptions, as depicted
19below::
20
21 Linux  +----- RV Monitor ----------------------------------+ Formal
22  Realm |                                                   |  Realm
23  +-------------------+     +----------------+     +-----------------+
24  |   Linux kernel    |     |     Monitor    |     |     Reference   |
25  |     Tracing       |  -> |   Instance(s)  | <-  |       Model     |
26  | (instrumentation) |     | (verification) |     | (specification) |
27  +-------------------+     +----------------+     +-----------------+
28         |                          |                       |
29         |                          V                       |
30         |                     +----------+                 |
31         |                     | Reaction |                 |
32         |                     +--+--+--+-+                 |
33         |                        |  |  |                   |
34         |                        |  |  +-> trace output ?  |
35         +------------------------|--|----------------------+
36                                  |  +----> panic ?
37                                  +-------> <user-specified>
38
39RV monitor synthesis
40--------------------
41
42The synthesis of a specification into the Linux *RV monitor* abstraction is
43automated by the rvgen tool and the header file containing common code for
44creating monitors. The header files are:
45
46  * rv/da_monitor.h for deterministic automaton monitor.
47  * rv/ltl_monitor.h for linear temporal logic monitor.
48
49rvgen
50-----
51
52The rvgen utility converts a specification into the C presentation and creating
53the skeleton of a kernel monitor in C.
54
55For example, it is possible to transform the wip.dot model present in
56[1] into a per-cpu monitor with the following command::
57
58  $ rvgen monitor -c da -s wip.dot -t per_cpu
59
60This will create a directory named wip/ with the following files:
61
62- wip.h: the wip model in C
63- wip.c: the RV monitor
64
65The wip.c file contains the monitor declaration and the starting point for
66the system instrumentation.
67
68Similarly, a linear temporal logic monitor can be generated with the following
69command::
70
71  $ rvgen monitor -c ltl -s pagefault.ltl -t per_task
72
73This generates pagefault/ directory with:
74
75- pagefault.h: The Buchi automaton (the non-deterministic state machine to
76  verify the specification)
77- pagefault.c: The skeleton for the RV monitor
78
79Monitor header files
80--------------------
81
82The header files:
83
84- `rv/da_monitor.h` for deterministic automaton monitor
85- `rv/ltl_monitor` for linear temporal logic monitor
86
87include common macros and static functions for implementing *Monitor
88Instance(s)*.
89
90The benefits of having all common functionalities in a single header file are
913-fold:
92
93  - Reduce the code duplication;
94  - Facilitate the bug fix/improvement;
95  - Avoid the case of developers changing the core of the monitor code to
96    manipulate the model in a (let's say) non-standard way.
97
98rv/da_monitor.h
99+++++++++++++++
100
101This initial implementation presents three different types of monitor instances:
102
103- ``#define DECLARE_DA_MON_GLOBAL(name, type)``
104- ``#define DECLARE_DA_MON_PER_CPU(name, type)``
105- ``#define DECLARE_DA_MON_PER_TASK(name, type)``
106
107The first declares the functions for a global deterministic automata monitor,
108the second for monitors with per-cpu instances, and the third with per-task
109instances.
110
111In all cases, the 'name' argument is a string that identifies the monitor, and
112the 'type' argument is the data type used by rvgen on the representation of
113the model in C.
114
115For example, the wip model with two states and three events can be
116stored in an 'unsigned char' type. Considering that the preemption control
117is a per-cpu behavior, the monitor declaration in the 'wip.c' file is::
118
119  DECLARE_DA_MON_PER_CPU(wip, unsigned char);
120
121The monitor is executed by sending events to be processed via the functions
122presented below::
123
124  da_handle_event_$(MONITOR_NAME)($(event from event enum));
125  da_handle_start_event_$(MONITOR_NAME)($(event from event enum));
126  da_handle_start_run_event_$(MONITOR_NAME)($(event from event enum));
127
128The function ``da_handle_event_$(MONITOR_NAME)()`` is the regular case where
129the event will be processed if the monitor is processing events.
130
131When a monitor is enabled, it is placed in the initial state of the automata.
132However, the monitor does not know if the system is in the *initial state*.
133
134The ``da_handle_start_event_$(MONITOR_NAME)()`` function is used to notify the
135monitor that the system is returning to the initial state, so the monitor can
136start monitoring the next event.
137
138The ``da_handle_start_run_event_$(MONITOR_NAME)()`` function is used to notify
139the monitor that the system is known to be in the initial state, so the
140monitor can start monitoring and monitor the current event.
141
142Using the wip model as example, the events "preempt_disable" and
143"sched_waking" should be sent to monitor, respectively, via [2]::
144
145  da_handle_event_wip(preempt_disable_wip);
146  da_handle_event_wip(sched_waking_wip);
147
148While the event "preempt_enabled" will use::
149
150  da_handle_start_event_wip(preempt_enable_wip);
151
152To notify the monitor that the system will be returning to the initial state,
153so the system and the monitor should be in sync.
154
155rv/ltl_monitor.h
156++++++++++++++++
157This file must be combined with the $(MODEL_NAME).h file (generated by `rvgen`)
158to be complete. For example, for the `pagefault` monitor, the `pagefault.c`
159source file must include::
160
161  #include "pagefault.h"
162  #include <rv/ltl_monitor.h>
163
164(the skeleton monitor file generated by `rvgen` already does this).
165
166`$(MODEL_NAME).h` (`pagefault.h` in the above example) includes the
167implementation of the Buchi automaton - a non-deterministic state machine that
168verifies the LTL specification. While `rv/ltl_monitor.h` includes the common
169helper functions to interact with the Buchi automaton and to implement an RV
170monitor. An important definition in `$(MODEL_NAME).h` is::
171
172  enum ltl_atom {
173      LTL_$(FIRST_ATOMIC_PROPOSITION),
174      LTL_$(SECOND_ATOMIC_PROPOSITION),
175      ...
176      LTL_NUM_ATOM
177  };
178
179which is the list of atomic propositions present in the LTL specification
180(prefixed with "LTL\_" to avoid name collision). This `enum` is passed to the
181functions interacting with the Buchi automaton.
182
183While generating code, `rvgen` cannot understand the meaning of the atomic
184propositions. Thus, that task is left for manual work. The recommended pratice
185is adding tracepoints to places where the atomic propositions change; and in the
186tracepoints' handlers: the Buchi automaton is executed using::
187
188  void ltl_atom_update(struct task_struct *task, enum ltl_atom atom, bool value)
189
190which tells the Buchi automaton that the atomic proposition `atom` is now
191`value`. The Buchi automaton checks whether the LTL specification is still
192satisfied, and invokes the monitor's error tracepoint and the reactor if
193violation is detected.
194
195Tracepoints and `ltl_atom_update()` should be used whenever possible. However,
196it is sometimes not the most convenient. For some atomic propositions which are
197changed in multiple places in the kernel, it is cumbersome to trace all those
198places. Furthermore, it may not be important that the atomic propositions are
199updated at precise times. For example, considering the following linear temporal
200logic::
201
202  RULE = always (RT imply not PAGEFAULT)
203
204This LTL states that a real-time task does not raise page faults. For this
205specification, it is not important when `RT` changes, as long as it has the
206correct value when `PAGEFAULT` is true.  Motivated by this case, another
207function is introduced::
208
209  void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon)
210
211This function is called whenever the Buchi automaton is triggered. Therefore, it
212can be manually implemented to "fetch" `RT`::
213
214  void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon)
215  {
216      ltl_atom_set(mon, LTL_RT, rt_task(task));
217  }
218
219Effectively, whenever `PAGEFAULT` is updated with a call to `ltl_atom_update()`,
220`RT` is also fetched. Thus, the LTL specification can be verified without
221tracing `RT` everywhere.
222
223For atomic propositions which act like events, they usually need to be set (or
224cleared) and then immediately cleared (or set). A convenient function is
225provided::
226
227  void ltl_atom_pulse(struct task_struct *task, enum ltl_atom atom, bool value)
228
229which is equivalent to::
230
231  ltl_atom_update(task, atom, value);
232  ltl_atom_update(task, atom, !value);
233
234To initialize the atomic propositions, the following function must be
235implemented::
236
237  ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation)
238
239This function is called for all running tasks when the monitor is enabled. It is
240also called for new tasks created after the enabling the monitor. It should
241initialize as many atomic propositions as possible, for example::
242
243  void ltl_atom_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation)
244  {
245      ltl_atom_set(mon, LTL_RT, rt_task(task));
246      if (task_creation)
247          ltl_atom_set(mon, LTL_PAGEFAULT, false);
248  }
249
250Atomic propositions not initialized by `ltl_atom_init()` will stay in the
251unknown state until relevant tracepoints are hit, which can take some time. As
252monitoring for a task cannot be done until all atomic propositions is known for
253the task, the monitor may need some time to start validating tasks which have
254been running before the monitor is enabled. Therefore, it is recommended to
255start the tasks of interest after enabling the monitor.
256
257Final remarks
258-------------
259
260With the monitor synthesis in place using the header files and
261rvgen, the developer's work should be limited to the instrumentation
262of the system, increasing the confidence in the overall approach.
263
264[1] For details about deterministic automata format and the translation
265from one representation to another, see::
266
267  Documentation/trace/rv/deterministic_automata.rst
268
269[2] rvgen appends the monitor's name suffix to the events enums to
270avoid conflicting variables when exporting the global vmlinux.h
271use by BPF programs.
272