1Runtime Verification Monitor Synthesis 2====================================== 3 4The starting point for the application of runtime verification (RV) techniques 5is the *specification* or *modeling* of the desired (or undesired) behavior 6of the system under scrutiny. 7 8The formal representation needs to be then *synthesized* into a *monitor* 9that can then be used in the analysis of the trace of the system. The 10*monitor* connects to the system via an *instrumentation* that converts 11the events from the *system* to the events of the *specification*. 12 13 14In Linux terms, the runtime verification monitors are encapsulated inside 15the *RV monitor* abstraction. The RV monitor includes a set of instances 16of the monitor (per-cpu monitor, per-task monitor, and so on), the helper 17functions that glue the monitor to the system reference model, and the 18trace output as a reaction to event parsing and exceptions, as depicted 19below:: 20 21 Linux +---- RV Monitor ----------------------------------+ Formal 22 Realm | | Realm 23 +-------------------+ +----------------+ +-----------------+ 24 | Linux kernel | | Monitor | | Reference | 25 | Tracing | -> | Instance(s) | <- | Model | 26 | (instrumentation) | | (verification) | | (specification) | 27 +-------------------+ +----------------+ +-----------------+ 28 | | | 29 | V | 30 | +----------+ | 31 | | Reaction | | 32 | +--+--+--+-+ | 33 | | | | | 34 | | | +-> trace output ? | 35 +------------------------|--|----------------------+ 36 | +----> panic ? 37 +-------> <user-specified> 38 39RV monitor synthesis 40-------------------- 41 42The synthesis of a specification into the Linux *RV monitor* abstraction is 43automated by the rvgen tool and the header file containing common code for 44creating monitors. The header files are: 45 46 * rv/da_monitor.h for deterministic automaton monitor. 47 * rv/ltl_monitor.h for linear temporal logic monitor. 48 * rv/ha_monitor.h for hybrid automaton monitor. 49 50rvgen 51----- 52 53The rvgen utility converts a specification into the C presentation and creating 54the skeleton of a kernel monitor in C. 55 56For example, it is possible to transform the wip.dot model present in 57[1] into a per-cpu monitor with the following command:: 58 59 $ rvgen monitor -c da -s wip.dot -t per_cpu 60 61This will create a directory named wip/ with the following files: 62 63- wip.h: the wip model in C 64- wip.c: the RV monitor 65 66The wip.c file contains the monitor declaration and the starting point for 67the system instrumentation. 68 69Similarly, a linear temporal logic monitor can be generated with the following 70command:: 71 72 $ rvgen monitor -c ltl -s pagefault.ltl -t per_task 73 74This generates pagefault/ directory with: 75 76- pagefault.h: The Buchi automaton (the non-deterministic state machine to 77 verify the specification) 78- pagefault.c: The skeleton for the RV monitor 79 80Monitor header files 81-------------------- 82 83The header files: 84 85- `rv/da_monitor.h` for deterministic automaton monitor 86- `rv/ltl_monitor` for linear temporal logic monitor 87 88include common macros and static functions for implementing *Monitor 89Instance(s)*. 90 91The benefits of having all common functionalities in a single header file are 923-fold: 93 94 - Reduce the code duplication; 95 - Facilitate the bug fix/improvement; 96 - Avoid the case of developers changing the core of the monitor code to 97 manipulate the model in a (let's say) non-standard way. 98 99rv/da_monitor.h 100+++++++++++++++ 101 102This initial implementation presents three different types of monitor instances: 103 104- ``#define RV_MON_TYPE RV_MON_GLOBAL`` 105- ``#define RV_MON_TYPE RV_MON_PER_CPU`` 106- ``#define RV_MON_TYPE RV_MON_PER_TASK`` 107 108The first sets up functions declaration for a global deterministic automata 109monitor, the second for monitors with per-cpu instances, and the third with 110per-task instances. 111 112In all cases, the C file must include the $(MODEL_NAME).h file (generated by 113`rvgen`), for example, to define the per-cpu 'wip' monitor, the `wip.c` source 114file must include:: 115 116 #define RV_MON_TYPE RV_MON_PER_CPU 117 #include "wip.h" 118 #include <rv/da_monitor.h> 119 120The monitor is executed by sending events to be processed via the functions 121presented below:: 122 123 da_handle_event($(event from event enum)); 124 da_handle_start_event($(event from event enum)); 125 da_handle_start_run_event($(event from event enum)); 126 127The function ``da_handle_event()`` is the regular case where 128the event will be processed if the monitor is processing events. 129 130When a monitor is enabled, it is placed in the initial state of the automata. 131However, the monitor does not know if the system is in the *initial state*. 132 133The ``da_handle_start_event()`` function is used to notify the 134monitor that the system is returning to the initial state, so the monitor can 135start monitoring the next event. 136 137The ``da_handle_start_run_event()`` function is used to notify 138the monitor that the system is known to be in the initial state, so the 139monitor can start monitoring and monitor the current event. 140 141Using the wip model as example, the events "preempt_disable" and 142"sched_waking" should be sent to monitor, respectively, via [2]:: 143 144 da_handle_event(preempt_disable_wip); 145 da_handle_event(sched_waking_wip); 146 147While the event "preempt_enabled" will use:: 148 149 da_handle_start_event(preempt_enable_wip); 150 151To notify the monitor that the system will be returning to the initial state, 152so the system and the monitor should be in sync. 153 154rv/ltl_monitor.h 155++++++++++++++++ 156This file must be combined with the $(MODEL_NAME).h file (generated by `rvgen`) 157to be complete. For example, for the `pagefault` monitor, the `pagefault.c` 158source file must include:: 159 160 #include "pagefault.h" 161 #include <rv/ltl_monitor.h> 162 163(the skeleton monitor file generated by `rvgen` already does this). 164 165`$(MODEL_NAME).h` (`pagefault.h` in the above example) includes the 166implementation of the Buchi automaton - a non-deterministic state machine that 167verifies the LTL specification. While `rv/ltl_monitor.h` includes the common 168helper functions to interact with the Buchi automaton and to implement an RV 169monitor. An important definition in `$(MODEL_NAME).h` is:: 170 171 enum ltl_atom { 172 LTL_$(FIRST_ATOMIC_PROPOSITION), 173 LTL_$(SECOND_ATOMIC_PROPOSITION), 174 ... 175 LTL_NUM_ATOM 176 }; 177 178which is the list of atomic propositions present in the LTL specification 179(prefixed with "LTL\_" to avoid name collision). This `enum` is passed to the 180functions interacting with the Buchi automaton. 181 182While generating code, `rvgen` cannot understand the meaning of the atomic 183propositions. Thus, that task is left for manual work. The recommended practice 184is adding tracepoints to places where the atomic propositions change; and in the 185tracepoints' handlers: the Buchi automaton is executed using:: 186 187 void ltl_atom_update(struct task_struct *task, enum ltl_atom atom, bool value) 188 189which tells the Buchi automaton that the atomic proposition `atom` is now 190`value`. The Buchi automaton checks whether the LTL specification is still 191satisfied, and invokes the monitor's error tracepoint and the reactor if 192violation is detected. 193 194Tracepoints and `ltl_atom_update()` should be used whenever possible. However, 195it is sometimes not the most convenient. For some atomic propositions which are 196changed in multiple places in the kernel, it is cumbersome to trace all those 197places. Furthermore, it may not be important that the atomic propositions are 198updated at precise times. For example, considering the following linear temporal 199logic:: 200 201 RULE = always (RT imply not PAGEFAULT) 202 203This LTL states that a real-time task does not raise page faults. For this 204specification, it is not important when `RT` changes, as long as it has the 205correct value when `PAGEFAULT` is true. Motivated by this case, another 206function is introduced:: 207 208 void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon) 209 210This function is called whenever the Buchi automaton is triggered. Therefore, it 211can be manually implemented to "fetch" `RT`:: 212 213 void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon) 214 { 215 ltl_atom_set(mon, LTL_RT, rt_task(task)); 216 } 217 218Effectively, whenever `PAGEFAULT` is updated with a call to `ltl_atom_update()`, 219`RT` is also fetched. Thus, the LTL specification can be verified without 220tracing `RT` everywhere. 221 222For atomic propositions which act like events, they usually need to be set (or 223cleared) and then immediately cleared (or set). A convenient function is 224provided:: 225 226 void ltl_atom_pulse(struct task_struct *task, enum ltl_atom atom, bool value) 227 228which is equivalent to:: 229 230 ltl_atom_update(task, atom, value); 231 ltl_atom_update(task, atom, !value); 232 233To initialize the atomic propositions, the following function must be 234implemented:: 235 236 ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation) 237 238This function is called for all running tasks when the monitor is enabled. It is 239also called for new tasks created after the enabling the monitor. It should 240initialize as many atomic propositions as possible, for example:: 241 242 void ltl_atom_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation) 243 { 244 ltl_atom_set(mon, LTL_RT, rt_task(task)); 245 if (task_creation) 246 ltl_atom_set(mon, LTL_PAGEFAULT, false); 247 } 248 249Atomic propositions not initialized by `ltl_atom_init()` will stay in the 250unknown state until relevant tracepoints are hit, which can take some time. As 251monitoring for a task cannot be done until all atomic propositions is known for 252the task, the monitor may need some time to start validating tasks which have 253been running before the monitor is enabled. Therefore, it is recommended to 254start the tasks of interest after enabling the monitor. 255 256rv/ha_monitor.h 257+++++++++++++++ 258 259The implementation of hybrid automaton monitors derives directly from the 260deterministic automaton one. Despite using a different header 261(``ha_monitor.h``) the functions to handle events are the same (e.g. 262``da_handle_event``). 263 264Additionally, the `rvgen` tool populates skeletons for the 265``ha_verify_constraint``, ``ha_get_env`` and ``ha_reset_env`` based on the 266monitor specification in the monitor source file. 267 268``ha_verify_constraint`` is typically ready as it is generated by `rvgen`: 269 270* standard constraints on edges are turned into the form:: 271 272 res = ha_get_env(ha_mon, ENV) < VALUE; 273 274* reset constraints are turned into the form:: 275 276 ha_reset_env(ha_mon, ENV); 277 278* constraints on the state are implemented using timers 279 280 - armed before entering the state 281 282 - cancelled while entering any other state 283 284 - untouched if the state does not change as a result of the event 285 286 - checked if the timer expired but the callback did not run 287 288 - available implementation are `HA_TIMER_HRTIMER` and `HA_TIMER_WHEEL` 289 290 - hrtimers are more precise but may have higher overhead 291 292 - select by defining `HA_TIMER_TYPE` before including the header:: 293 294 #define HA_TIMER_TYPE HA_TIMER_HRTIMER 295 296Constraint values can be specified in different forms: 297 298* literal value (with optional unit). E.g.:: 299 300 preemptive == 0 301 clk < 100ns 302 threshold <= 10j 303 304* constant value (uppercase string). E.g.:: 305 306 clk < MAX_NS 307 308* parameter (lowercase string). E.g.:: 309 310 clk <= threshold_jiffies 311 312* macro (uppercase string with parentheses). E.g.:: 313 314 clk < MAX_NS() 315 316* function (lowercase string with parentheses). E.g.:: 317 318 clk <= threshold_jiffies() 319 320In all cases, `rvgen` will try to understand the type of the environment 321variable from the name or unit. For instance, constants or parameters 322terminating with ``_NS`` or ``_jiffies`` are intended as clocks with ns and jiffy 323granularity, respectively. Literals with measure unit `j` are jiffies and if a 324time unit is specified (`ns` to `s`), `rvgen` will convert the value to `ns`. 325 326Constants need to be defined by the user (but unlike the name, they don't 327necessarily need to be defined as constants). Parameters get converted to 328module parameters and the user needs to provide a default value. 329Also function and macros are defined by the user, by default they get as an 330argument the ``ha_monitor``, a common usage would be to get the required value 331from the target, e.g. the task in per-task monitors, using the helper 332``ha_get_target(ha_mon)``. 333 334If `rvgen` determines that the variable is a clock, it provides the getter and 335resetter based on the unit. Otherwise, the user needs to provide an appropriate 336definition. 337Typically non-clock environment variables are not reset. In such case only the 338getter skeleton will be present in the file generated by `rvgen`. 339For instance, the getter for preemptive can be filled as:: 340 341 static u64 ha_get_env(struct ha_monitor *ha_mon, enum envs env) 342 { 343 if (env == preemptible) 344 return preempt_count() == 0; 345 return ENV_INVALID_VALUE; 346 } 347 348The function is supplied the ``ha_mon`` parameter in case some storage is 349required (as it is for clocks), but environment variables without reset do not 350require a storage and can ignore that argument. 351The number of environment variables requiring a storage is limited by 352``MAX_HA_ENV_LEN``, however such limitation doesn't stand for other variables. 353 354Finally, constraints on states are only valid for clocks and only if the 355constraint is of the form `clk < N`. This is because such constraints are 356implemented with the expiration of a timer. 357Typically the clock variables are reset just before arming the timer, but this 358doesn't have to be the case and the available functions take care of it. 359It is a responsibility of per-task monitors to make sure no timer is left 360running when the task exits. 361 362By default the generator implements timers with hrtimers (setting 363``HA_TIMER_TYPE`` to ``HA_TIMER_HRTIMER``), this gives better responsiveness 364but higher overhead. The timer wheel (``HA_TIMER_WHEEL``) is a good alternative 365for monitors with several instances (e.g. per-task) that achieves lower 366overhead with increased latency, yet without compromising precision. 367 368Final remarks 369------------- 370 371With the monitor synthesis in place using the header files and 372rvgen, the developer's work should be limited to the instrumentation 373of the system, increasing the confidence in the overall approach. 374 375[1] For details about deterministic automata format and the translation 376from one representation to another, see:: 377 378 Documentation/trace/rv/deterministic_automata.rst 379 380[2] rvgen appends the monitor's name suffix to the events enums to 381avoid conflicting variables when exporting the global vmlinux.h 382use by BPF programs. 383