1Runtime Verification Monitor Synthesis 2====================================== 3 4The starting point for the application of runtime verification (RV) techniques 5is the *specification* or *modeling* of the desired (or undesired) behavior 6of the system under scrutiny. 7 8The formal representation needs to be then *synthesized* into a *monitor* 9that can then be used in the analysis of the trace of the system. The 10*monitor* connects to the system via an *instrumentation* that converts 11the events from the *system* to the events of the *specification*. 12 13 14In Linux terms, the runtime verification monitors are encapsulated inside 15the *RV monitor* abstraction. The RV monitor includes a set of instances 16of the monitor (per-cpu monitor, per-task monitor, and so on), the helper 17functions that glue the monitor to the system reference model, and the 18trace output as a reaction to event parsing and exceptions, as depicted 19below:: 20 21 Linux +----- RV Monitor ----------------------------------+ Formal 22 Realm | | Realm 23 +-------------------+ +----------------+ +-----------------+ 24 | Linux kernel | | Monitor | | Reference | 25 | Tracing | -> | Instance(s) | <- | Model | 26 | (instrumentation) | | (verification) | | (specification) | 27 +-------------------+ +----------------+ +-----------------+ 28 | | | 29 | V | 30 | +----------+ | 31 | | Reaction | | 32 | +--+--+--+-+ | 33 | | | | | 34 | | | +-> trace output ? | 35 +------------------------|--|----------------------+ 36 | +----> panic ? 37 +-------> <user-specified> 38 39RV monitor synthesis 40-------------------- 41 42The synthesis of a specification into the Linux *RV monitor* abstraction is 43automated by the rvgen tool and the header file containing common code for 44creating monitors. The header files are: 45 46 * rv/da_monitor.h for deterministic automaton monitor. 47 * rv/ltl_monitor.h for linear temporal logic monitor. 48 49rvgen 50----- 51 52The rvgen utility converts a specification into the C presentation and creating 53the skeleton of a kernel monitor in C. 54 55For example, it is possible to transform the wip.dot model present in 56[1] into a per-cpu monitor with the following command:: 57 58 $ rvgen monitor -c da -s wip.dot -t per_cpu 59 60This will create a directory named wip/ with the following files: 61 62- wip.h: the wip model in C 63- wip.c: the RV monitor 64 65The wip.c file contains the monitor declaration and the starting point for 66the system instrumentation. 67 68Similarly, a linear temporal logic monitor can be generated with the following 69command:: 70 71 $ rvgen monitor -c ltl -s pagefault.ltl -t per_task 72 73This generates pagefault/ directory with: 74 75- pagefault.h: The Buchi automaton (the non-deterministic state machine to 76 verify the specification) 77- pagefault.c: The skeleton for the RV monitor 78 79Monitor header files 80-------------------- 81 82The header files: 83 84- `rv/da_monitor.h` for deterministic automaton monitor 85- `rv/ltl_monitor` for linear temporal logic monitor 86 87include common macros and static functions for implementing *Monitor 88Instance(s)*. 89 90The benefits of having all common functionalities in a single header file are 913-fold: 92 93 - Reduce the code duplication; 94 - Facilitate the bug fix/improvement; 95 - Avoid the case of developers changing the core of the monitor code to 96 manipulate the model in a (let's say) non-standard way. 97 98rv/da_monitor.h 99+++++++++++++++ 100 101This initial implementation presents three different types of monitor instances: 102 103- ``#define RV_MON_TYPE RV_MON_GLOBAL`` 104- ``#define RV_MON_TYPE RV_MON_PER_CPU`` 105- ``#define RV_MON_TYPE RV_MON_PER_TASK`` 106 107The first sets up functions declaration for a global deterministic automata 108monitor, the second for monitors with per-cpu instances, and the third with 109per-task instances. 110 111In all cases, the C file must include the $(MODEL_NAME).h file (generated by 112`rvgen`), for example, to define the per-cpu 'wip' monitor, the `wip.c` source 113file must include:: 114 115 #define RV_MON_TYPE RV_MON_PER_CPU 116 #include "wip.h" 117 #include <rv/da_monitor.h> 118 119The monitor is executed by sending events to be processed via the functions 120presented below:: 121 122 da_handle_event($(event from event enum)); 123 da_handle_start_event($(event from event enum)); 124 da_handle_start_run_event($(event from event enum)); 125 126The function ``da_handle_event()`` is the regular case where 127the event will be processed if the monitor is processing events. 128 129When a monitor is enabled, it is placed in the initial state of the automata. 130However, the monitor does not know if the system is in the *initial state*. 131 132The ``da_handle_start_event()`` function is used to notify the 133monitor that the system is returning to the initial state, so the monitor can 134start monitoring the next event. 135 136The ``da_handle_start_run_event()`` function is used to notify 137the monitor that the system is known to be in the initial state, so the 138monitor can start monitoring and monitor the current event. 139 140Using the wip model as example, the events "preempt_disable" and 141"sched_waking" should be sent to monitor, respectively, via [2]:: 142 143 da_handle_event(preempt_disable_wip); 144 da_handle_event(sched_waking_wip); 145 146While the event "preempt_enabled" will use:: 147 148 da_handle_start_event(preempt_enable_wip); 149 150To notify the monitor that the system will be returning to the initial state, 151so the system and the monitor should be in sync. 152 153rv/ltl_monitor.h 154++++++++++++++++ 155This file must be combined with the $(MODEL_NAME).h file (generated by `rvgen`) 156to be complete. For example, for the `pagefault` monitor, the `pagefault.c` 157source file must include:: 158 159 #include "pagefault.h" 160 #include <rv/ltl_monitor.h> 161 162(the skeleton monitor file generated by `rvgen` already does this). 163 164`$(MODEL_NAME).h` (`pagefault.h` in the above example) includes the 165implementation of the Buchi automaton - a non-deterministic state machine that 166verifies the LTL specification. While `rv/ltl_monitor.h` includes the common 167helper functions to interact with the Buchi automaton and to implement an RV 168monitor. An important definition in `$(MODEL_NAME).h` is:: 169 170 enum ltl_atom { 171 LTL_$(FIRST_ATOMIC_PROPOSITION), 172 LTL_$(SECOND_ATOMIC_PROPOSITION), 173 ... 174 LTL_NUM_ATOM 175 }; 176 177which is the list of atomic propositions present in the LTL specification 178(prefixed with "LTL\_" to avoid name collision). This `enum` is passed to the 179functions interacting with the Buchi automaton. 180 181While generating code, `rvgen` cannot understand the meaning of the atomic 182propositions. Thus, that task is left for manual work. The recommended practice 183is adding tracepoints to places where the atomic propositions change; and in the 184tracepoints' handlers: the Buchi automaton is executed using:: 185 186 void ltl_atom_update(struct task_struct *task, enum ltl_atom atom, bool value) 187 188which tells the Buchi automaton that the atomic proposition `atom` is now 189`value`. The Buchi automaton checks whether the LTL specification is still 190satisfied, and invokes the monitor's error tracepoint and the reactor if 191violation is detected. 192 193Tracepoints and `ltl_atom_update()` should be used whenever possible. However, 194it is sometimes not the most convenient. For some atomic propositions which are 195changed in multiple places in the kernel, it is cumbersome to trace all those 196places. Furthermore, it may not be important that the atomic propositions are 197updated at precise times. For example, considering the following linear temporal 198logic:: 199 200 RULE = always (RT imply not PAGEFAULT) 201 202This LTL states that a real-time task does not raise page faults. For this 203specification, it is not important when `RT` changes, as long as it has the 204correct value when `PAGEFAULT` is true. Motivated by this case, another 205function is introduced:: 206 207 void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon) 208 209This function is called whenever the Buchi automaton is triggered. Therefore, it 210can be manually implemented to "fetch" `RT`:: 211 212 void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon) 213 { 214 ltl_atom_set(mon, LTL_RT, rt_task(task)); 215 } 216 217Effectively, whenever `PAGEFAULT` is updated with a call to `ltl_atom_update()`, 218`RT` is also fetched. Thus, the LTL specification can be verified without 219tracing `RT` everywhere. 220 221For atomic propositions which act like events, they usually need to be set (or 222cleared) and then immediately cleared (or set). A convenient function is 223provided:: 224 225 void ltl_atom_pulse(struct task_struct *task, enum ltl_atom atom, bool value) 226 227which is equivalent to:: 228 229 ltl_atom_update(task, atom, value); 230 ltl_atom_update(task, atom, !value); 231 232To initialize the atomic propositions, the following function must be 233implemented:: 234 235 ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation) 236 237This function is called for all running tasks when the monitor is enabled. It is 238also called for new tasks created after the enabling the monitor. It should 239initialize as many atomic propositions as possible, for example:: 240 241 void ltl_atom_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation) 242 { 243 ltl_atom_set(mon, LTL_RT, rt_task(task)); 244 if (task_creation) 245 ltl_atom_set(mon, LTL_PAGEFAULT, false); 246 } 247 248Atomic propositions not initialized by `ltl_atom_init()` will stay in the 249unknown state until relevant tracepoints are hit, which can take some time. As 250monitoring for a task cannot be done until all atomic propositions is known for 251the task, the monitor may need some time to start validating tasks which have 252been running before the monitor is enabled. Therefore, it is recommended to 253start the tasks of interest after enabling the monitor. 254 255Final remarks 256------------- 257 258With the monitor synthesis in place using the header files and 259rvgen, the developer's work should be limited to the instrumentation 260of the system, increasing the confidence in the overall approach. 261 262[1] For details about deterministic automata format and the translation 263from one representation to another, see:: 264 265 Documentation/trace/rv/deterministic_automata.rst 266 267[2] rvgen appends the monitor's name suffix to the events enums to 268avoid conflicting variables when exporting the global vmlinux.h 269use by BPF programs. 270