QUIC Concurrency Architecture
=============================

Introduction
------------

Most QUIC implementations in C are offered as a simple state machine without any
included I/O solution. Applications must do significant integration work to
provide the necessary infrastructure for a QUIC implementation to integrate
with. Moreover, blocking I/O at an application level may not be supported.

OpenSSL QUIC seeks to offer a QUIC solution which can serve multiple use cases:

- Firstly, it seeks to offer the simple state machine model and a fully
  customisable network path (via a BIO) for those who want it;

- Secondly, it seeks to offer a turnkey solution with an in-the-box I/O
  and polling solution which can support blocking API calls in a Berkeley
  sockets-like way.

These usage modes are somewhat diametrically opposed. One involves libssl
consuming no resources but those it is given, with an application responsible
for synchronisation and a potentially custom network I/O path. This usage model
is not “smart”. Network traffic is connected to the state machine and state is
input and output from the state machine as needed by an application on a purely
non-blocking basis. Determining *when* to do anything is largely the
application's responsibility.

The other diametrically opposed usage mode involves libssl managing more things
internally to provide an easier to use solution. For example, it may involve
spinning up background threads to ensure connections are serviced regularly (as
in our existing client-side thread assisted mode).

In order to provide for these different use cases, the concept of concurrency
models is introduced. A concurrency model defines how “cleverly” the QUIC engine
will operate and how many background resources (e.g. threads, other OS
resources) will be established to support operation.

Concurrency Models
------------------

- **Unsynchronised Concurrency Model (UCM):** In the Unsynchronised Concurrency
  Model, calls to SSL objects are not synchronised. There is no locking on any
  API Personality Layer (APL) call (the omission of which is purely an
  optimisation). The application is either single-threaded or is otherwise
  responsible for doing synchronisation itself.

  Blocking API calls are not supported under this model. This model is intended
  primarily for single-threaded use as a simple state machine by advanced
  applications, and many such applications are likely to disable autoticking.

- **Contentive Concurrency Model (CCM):** In the
  Contentive Concurrency Model, calls to SSL objects are wrapped in locks and
  multi-threaded usage of a QUIC connection (for example, parallel writes to
  different QUIC stream SSL objects belonging to the same QUIC connection) is
  synchronised by a mutex.

  This is contentive in the sense that if a large number of threads are trying
  to write to different streams on the same connection, a large amount of lock
  contention will occur. As such, this concurrency model will neither scale nor
  provide good performance, at least within the context of concurrent use
  of a single connection.

  Under this model, APL calls by the application result in lock-wrapped
  mutations of QUIC core objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.) on the
  same thread.

  This model may be used either in a variant which does not support blocking
  (NB-CCM) or which does support blocking (B-CCM). The blocking variant must
  spin up additional OS resources to correctly support blocking semantics.

- **Thread Assisted Contentive Concurrency Model (TA-CCM):** This is currently
  implemented by our thread assisted mode for client-side QUIC usage. It does
  not realise the full state separation or performance of the Worker Concurrency
  Model (WCM) below.
  Instead, it simply spawns a background thread which ensures
  QUIC timer events are handled as needed. It makes use of the Contentive
  Concurrency Model for performing that handling, in that it obtains a lock when
  ticking a QUIC connection just as any call by an application would.

  This mode is likely to be deprecated in favour of the full Worker Concurrency
  Model (WCM), which will naturally subsume it.

- **Worker Concurrency Model (WCM):** In the Worker Concurrency Model,
  a background worker thread is spawned to manage connection processing. All
  interaction with an SSL object goes through this thread in some way.
  Interactions with SSL objects are essentially translated into commands and
  handled by the worker thread. To optimise performance and minimise lock
  contention, there is an emphasis on message passing over locking.
  Internal dataflow for application data can be managed in a zero-copy way to
  minimise the costs of this message passing.

  Under this model, QUIC core objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.) will
  live solely on the worker thread and access to these objects by an application
  thread will be entirely forbidden.

  Blocking API calls are supported under this model.
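
The relationship between UCM and CCM described above can be sketched in a few
lines of C. This is only an illustration under stated assumptions: `TOY_CONN`,
its `need_locking` flag and the `toy_conn_*` functions are hypothetical
stand-ins and are not part of OpenSSL; a POSIX mutex stands in for whatever
lock a real implementation would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection and its core state. */
typedef struct toy_conn_st {
    pthread_mutex_t mutex;
    int             need_locking; /* 0: UCM-like, 1: CCM-like */
    size_t          bytes_queued; /* stand-in for QUIC core state */
} TOY_CONN;

/* Returns 1 on success, 0 on failure. */
int toy_conn_init(TOY_CONN *c, int need_locking)
{
    assert(c != NULL);
    c->need_locking = need_locking;
    c->bytes_queued = 0;
    return pthread_mutex_init(&c->mutex, NULL) == 0;
}

/*
 * An APL-style call: the core state is mutated on the calling thread.
 * Under the CCM-like setting the mutation is wrapped in the lock; under
 * the UCM-like setting the lock is simply skipped.
 */
int toy_conn_write(TOY_CONN *c, size_t len)
{
    if (c->need_locking && pthread_mutex_lock(&c->mutex) != 0)
        return 0;

    c->bytes_queued += len;

    if (c->need_locking)
        pthread_mutex_unlock(&c->mutex);

    return 1;
}

void toy_conn_cleanup(TOY_CONN *c)
{
    pthread_mutex_destroy(&c->mutex);
}
```

The point of the sketch is that both settings run the same mutation on the
caller's thread; only the lock differs, which is why contention grows with the
number of threads sharing one connection.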

These concurrency models are summarised as follows:

| Model   | Sophistication | Concurrency           | Blocking Supported | OS Resources              | Timer Events    | RX Steering | Core State Affinity  |
|---------|----------------|-----------------------|--------------------|---------------------------|-----------------|-------------|----------------------|
| UCM     | Lowest         | ST only               | No                 | None                      | App Responsible | None        | App Thread           |
| CCM     |                | MT (Contentive)       | Optional           | Mutex, (Notifier)         | App Responsible | TBD         | App Threads          |
| TA-CCM† |                | MT (Contentive)       | Optional           | Mutex, Thread, (Notifier) | Managed         | TBD         | App & Assist Threads |
| WCM     | Highest        | MT (High Performance) | Yes                | Mutex, Thread, Notifier   | Managed         | Futureproof | Worker Thread        |

† To eventually be deprecated in favour of WCM.

Legend:

- **Concurrency:** “ST” means single-threaded; “MT” means multi-threaded.

- **Blocking Supported:** Whether blocking calls to e.g. `SSL_read` can be
  supported. If this is listed as “optional”, extra resources are required to
  support this under the listed model and these resources could be omitted if an
  application indicates it does not need this functionality at initialisation
  time.

- **OS Resources:** “Mutex” refers to mutex and condition variable resources.
  “Notifier” refers to a kind of OS resource needed to allow one thread to wake
  another thread which is currently blocking in an OS socket polling call such
  as poll(2) (e.g. an eventfd or socketpair). Resources listed in parentheses in
  the table above are required only if blocking support is desired.

- **Timer Events:** Is an application responsible for ensuring QUIC timeout
  events are handled in a timely manner?

- **RX Steering:** The matter of RX steering will be discussed in detail in a
  future document. Broadly speaking, RX steering concerns whether incoming
  traffic for multiple different QUIC connections on the same local port (e.g.
  for a server) can be vectored *by the OS* to different threads or whether the
  demuxing of incoming traffic for different connections has to be done manually
  on an in-process basis.

  The WCM model most readily supports RX steering and is futureproof in this
  regard. The feasibility of having the UCM and CCM models support RX steering
  is left for future analysis.

- **Core State Affinity:** Which threads are allowed to touch the QUIC core
  objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.).

Architecture
------------

To recap, the API Personality Layer (APL) refers to the code in `quic_impl.c`
which implements the libssl API personality (`SSL_write`, etc.). The APL is
cleanly separated from the QUIC core implementation (`QUIC_CHANNEL`, etc.).

Since UCM is basically a slight optimisation of CCM in which unnecessary locking
is elided, discussion from here on will focus on CCM and WCM except where
there are specific differences between CCM and UCM.

Supporting both CCM and WCM creates significant architectural challenges. Under
CCM, QUIC core objects have their state mutated under lock by arbitrary
application threads and these mutations happen during APL calls. By contrast, a
performant WCM architecture requires that APL calls be recorded and serviced in
an asynchronous fashion involving message passing to a worker thread. This
threatens to require highly divergent dispatch architectures for the two
concurrency models.

As such, the concept of a **Concurrency Management Layer (CML)** is introduced.
The CML lives between the APL and the QUIC core code. It is responsible for
dispatching in-thread mutations of QUIC core objects when operating under CCM,
and for dispatching messages to a worker thread under WCM.
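
The dispatch role of the CML can be illustrated with a small C sketch. It is a
toy under stated assumptions, not the proposed implementation: every `TOY_*`
and `toy_*` name is hypothetical, the "worker" is modelled as a queue drained
by an ordinary function call rather than by a real thread, and locking is
omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a QUIC core object such as QUIC_STREAM. */
typedef struct toy_stream_st {
    size_t bytes_queued;
} TOY_STREAM;

/* Hypothetical command, as a worker thread might receive it. */
typedef struct toy_cmd_st {
    TOY_STREAM *stream;
    size_t      len;
} TOY_CMD;

#define TOY_QUEUE_MAX 8

typedef struct toy_cml_st TOY_CML;
struct toy_cml_st {
    /* Dispatch hook: the only thing APL-side code calls. */
    int (*write)(TOY_CML *cml, TOY_STREAM *s, size_t len);
    TOY_CMD queue[TOY_QUEUE_MAX]; /* used by the worker variant only */
    size_t  num_queued;
};

/* Direct variant: mutate the core object on the calling thread. */
int toy_direct_write(TOY_CML *cml, TOY_STREAM *s, size_t len)
{
    (void)cml;
    /* A real direct CML would hold the relevant mutex here. */
    s->bytes_queued += len;
    return 1;
}

/* Worker variant: record the operation; the core object is untouched. */
int toy_worker_write(TOY_CML *cml, TOY_STREAM *s, size_t len)
{
    if (cml->num_queued == TOY_QUEUE_MAX)
        return 0; /* queue full; a caller would retry or block */

    cml->queue[cml->num_queued].stream = s;
    cml->queue[cml->num_queued].len    = len;
    cml->num_queued++;
    return 1;
}

/* What a worker thread's loop body would do with pending commands. */
void toy_worker_drain(TOY_CML *cml)
{
    size_t i;

    for (i = 0; i < cml->num_queued; ++i)
        cml->queue[i].stream->bytes_queued += cml->queue[i].len;

    cml->num_queued = 0;
}

/* APL-side entry point: identical for both variants. */
int toy_cml_write(TOY_CML *cml, TOY_STREAM *s, size_t len)
{
    assert(cml != NULL && cml->write != NULL);
    return cml->write(cml, s, len);
}
```

The APL-side call is the same in both cases; only the hook installed in the
CML decides whether the core object is mutated immediately or when the worker
next drains its queue.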

There are two different CMLs:

- **Direct CML (DCML)**, in which core objects are worked on in the same thread
  which made an APL call, under lock;

- **Worker CML (WCML)**, in which core objects are managed by a worker thread
  with communication via message passing. This CML is split into a front end
  (WCML-FE) and back end (WCML-BE).

The legacy thread assisted mode uses a bespoke method which is similar to the
approach used by the DCML.

CML Design
----------

The CML is designed to have as small an API surface area as possible to enable
unified handling of as many kinds of (APL) API operations as possible. The idea
is that complex APL calls are translated into simple operations on the CML.

At its core, the CML exposes some number of *pipes*. The number of pipes which
can be accessed via the CML varies as connections and streams are created and
destroyed. A pipe is a *unidirectional* transport for byte streams. Zero-copy
optimisations are expected to be implemented in future but are deferred.

The CML (`QUIC_CML`) allows the caller to refer to a pipe by providing an opaque
pipe handle (`QUIC_CML_PIPE`). If the pipe is a sending pipe, the caller can use
`ossl_cml_write` to try to add bytes to it. Conversely, if it is a receiving
pipe, the caller can use `ossl_cml_read` to try to read bytes from it.

The method `ossl_cml_block_until` allows the caller to block until at least one
of the provided pipe handles is ready. Ready means that at least one byte can be
written (for a sending pipe) or at least one byte can be read (for a receiving
pipe).

Note that only one `QUIC_CML` instance is expected per QUIC event
processing domain (i.e., per `QUIC_DOMAIN` / `QUIC_ENGINE` instance). The CML
fully abstracts the QUIC core objects such as `QUIC_ENGINE` or `QUIC_CHANNEL` so
that the APL never sees them.
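
The readiness rule used by `ossl_cml_block_until` can be expressed as a small
predicate. The following C sketch is only a model of that rule: the
`TOY_PIPE_STATE` structure and both functions are hypothetical, and a real
implementation would sleep until a pipe becomes ready rather than scan once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of a pipe's buffer state; not an OpenSSL type. */
typedef struct toy_pipe_state_st {
    int    is_send;  /* 1: sending pipe, 0: receiving pipe */
    size_t capacity; /* total buffer space in bytes */
    size_t len;      /* bytes currently buffered */
} TOY_PIPE_STATE;

/*
 * Ready means at least one byte can be written (sending pipe) or at least
 * one byte can be read (receiving pipe).
 */
int toy_pipe_ready(const TOY_PIPE_STATE *p)
{
    return p->is_send ? p->len < p->capacity : p->len > 0;
}

/* Returns the index of the first ready pipe, or -1 if none is ready. */
int toy_first_ready(const TOY_PIPE_STATE *const *pipes, size_t num_pipes)
{
    size_t i;

    assert(pipes != NULL || num_pipes == 0);

    for (i = 0; i < num_pipes; ++i)
        if (toy_pipe_ready(pipes[i]))
            return (int)i;

    return -1;
}
```

A blocking wait would loop on a check like `toy_first_ready`, sleeping on a
condition variable or notifier between checks until it returns a valid index
or a deadline passes.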

The caller retrieves a pipe handle using `ossl_cml_get_pipe`. This function
retrieves a pipe based on two values:

  - a CML pipe class;
  - a CML *selector*.

The CML selector is a tagged union structure which specifies what pipe is to be
retrieved. Abstractly, examples of selectors include:

```text
  Domain    ()
  Listener  (listener_id: uint)
  Conn      (conn_id: uint)
  Stream    (conn_id: uint, stream_id: u64)
```

In other words, the CML selector selects the “object” to retrieve a pipe from.

The CML pipe class is one of the following values:

- Request
- Notification
- App Send
- App Recv

The pipe classes available for a given selector vary. For example, the “App
Send” and “App Recv” pipes only exist on a stream, so it is invalid to request
such a pipe in conjunction with a different type of selector.

The “Request” and “App Send” classes expose send-only pipes, and the
“Notification” and “App Recv” classes expose receive-only pipes.

For any given CML selector, the Request pipe is used to send serialized commands
for asynchronous processing in relation to the entity selected by that selector.
Conversely, the Notification pipe returns asynchronous notifications. These
could be in relation to a previous request (e.g. indicating whether a command
succeeded), or unprompted notifications about other events.

The underlying pattern here is that there is a bidirectional channel for control
messages, and a bidirectional channel for application data, each composed of
two unidirectional pipes.

Pipe handles are stable for as long as the pipe they reference exists, so an APL
object can cache a pipe handle if desired.

All CML methods are thread safe. The CML implementation handles any necessary
locking (if any) internally.
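
One way to picture the selector and pipe class described above is as a tagged
union plus a validity check. The C sketch below is an assumption-laden
illustration: the `TOY_*` types and `toy_*` functions are hypothetical, the
layout of the real `QUIC_CML_SELECTOR` is not specified by this document, and
the sketch assumes that the control pipes exist for every kind of selector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_selector_type {
    TOY_SEL_DOMAIN,
    TOY_SEL_LISTENER,
    TOY_SEL_CONN,
    TOY_SEL_STREAM
};

enum toy_pipe_class {
    TOY_CLASS_REQUEST,      /* control; send */
    TOY_CLASS_NOTIFICATION, /* control; recv */
    TOY_CLASS_APP_SEND,     /* data; send */
    TOY_CLASS_APP_RECV      /* data; recv */
};

/* Hypothetical tagged union: `type` says which member of `u` is valid. */
typedef struct toy_selector_st {
    enum toy_selector_type type;
    union {
        struct { unsigned int listener_id; } listener;
        struct { unsigned int conn_id; } conn;
        struct { unsigned int conn_id; uint64_t stream_id; } stream;
    } u;
} TOY_SELECTOR;

/* Returns 1 if the pipe class exists for the selected kind of object. */
int toy_class_valid_for(const TOY_SELECTOR *sel, enum toy_pipe_class cls)
{
    assert(sel != NULL);

    switch (cls) {
    case TOY_CLASS_REQUEST:
    case TOY_CLASS_NOTIFICATION:
        return 1; /* assumption: control pipes exist on every object */
    case TOY_CLASS_APP_SEND:
    case TOY_CLASS_APP_RECV:
        return sel->type == TOY_SEL_STREAM; /* data pipes: streams only */
    }

    return 0;
}

/* Returns 1 for a send-only class, 0 for a receive-only class. */
int toy_class_is_send(enum toy_pipe_class cls)
{
    return cls == TOY_CLASS_REQUEST || cls == TOY_CLASS_APP_SEND;
}
```

A lookup function in the style of `ossl_cml_get_pipe` could reject invalid
class and selector combinations with a check like `toy_class_valid_for` before
resolving a handle.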

The `ossl_cml_write_available` and `ossl_cml_read_available` calls determine the
number of bytes which can currently be written to a send-only pipe, or read from
a receive-only pipe, respectively.

**Race conditions.** Because these are separate calls from `ossl_cml_write` and
`ossl_cml_read`, the values returned by these functions may become out of date
before the caller has a chance to call `ossl_cml_write` or `ossl_cml_read`.
However, such changes are guaranteed to be monotonically in favour of the
caller; for example, the value returned by `ossl_cml_write_available` will only
ever increase asynchronously (and only decrease as a result of an
`ossl_cml_write` call). Conversely, the value returned by
`ossl_cml_read_available` will only ever increase asynchronously (and only
decrease as a result of an `ossl_cml_read` call). Assuming that only one thread
makes calls to CML functions at a given time *for a given pipe*, this therefore
poses no issue for callers.

Concurrent use of `ossl_cml_write` or `ossl_cml_read` for a given pipe is not
intended (and would not make sense in any case). The caller is responsible for
synchronising such calls.

**Examples of pipe usage.** The application data pipes are used to serialize the
actual application data sent or received on a QUIC stream. The usage of the
request/notification pipes is more varied and used for control activity. There
is therefore a “control/data” separation here. The request and notification
pipes transport tagged unions.
Abstractly, commands and notifications might
include:

- Request: Reset Stream (error code: u64)
- Notification: Connection Terminated by Peer

**Example implementation of `SSL_write`.** An `SSL_write`-like API might be
implemented in the APL like this:

```c
int do_write(QUIC_CML *cml,
             QUIC_CML_PIPE notification_pipe,
             QUIC_CML_PIPE app_send_pipe,
             const void *buf, size_t buf_len)
{
    /* byte pointer: arithmetic on a void pointer is not valid C */
    const unsigned char *p = buf;
    QUIC_CML_PIPE pipes[2] = { notification_pipe, app_send_pipe };
    size_t bytes_written = 0;

    for (;;) {
        /* e.g. connection termination */
        process_any_notifications(notification_pipe);

        /* state checks, etc. */
        if (...->conn_terminated)
            return 0;

        if (buf_len == 0)
            return 1;

        /* bytes_written reports how many bytes the pipe accepted (may be 0) */
        if (!ossl_cml_write(cml, app_send_pipe, p, buf_len, &bytes_written))
            return 0;

        if (bytes_written == 0) {
            if (!should_block())
                break;

            /* wait, with no deadline, until either pipe becomes ready */
            ossl_cml_block_until(cml, pipes, 2, ossl_time_infinite());
            continue; /* try again */
        }

        p       += bytes_written;
        buf_len -= bytes_written;
    }

    return 1;
}
```

```c
/*
 * Creates a new CML using the Direct CML (DCML) implementation. need_locking
 * may be 0 to elide mutex usage if the application is guaranteed to synchronise
 * access or is purely single-threaded.
 */
QUIC_CML *ossl_cml_new_direct(int need_locking);

/* Creates a new CML using the Worker CML (WCML) implementation. */
QUIC_CML *ossl_cml_new_worker(size_t num_worker_threads);

/*
 * Starts the CML operating. Idempotent after it returns successfully. For the
 * WCML this might e.g. start background threads; for the DCML it is likely to
 * be a no-op (but must still be called).
 */
int ossl_cml_start(QUIC_CML *cml);

/*
 * Begins the CML shutdown process. Returns 1 once shutdown is complete; may
 * need to be called multiple times until shutdown is done.
 */
int ossl_cml_shutdown(QUIC_CML *cml);

/*
 * Immediate free of the CML. This is always safe but may cause handling
 * of a connection to be aborted abruptly as it is an immediate teardown
 * of all state.
 */
void ossl_cml_free(QUIC_CML *cml);

/*
 * Retrieves a pipe for a logical CML object described by selector. The pipe
 * handle, which is stable over the life of the logical CML object, is written
 * to *pipe_handle. class_ is a QUIC_CML_CLASS value.
 */
enum {
    QUIC_CML_CLASS_REQUEST,         /* control; send */
    QUIC_CML_CLASS_NOTIFICATION,    /* control; recv */
    QUIC_CML_CLASS_APP_SEND,        /* data; send */
    QUIC_CML_CLASS_APP_RECV         /* data; recv */
};

int ossl_cml_get_pipe(QUIC_CML *cml,
                      int class_,
                      const QUIC_CML_SELECTOR *selector,
                      QUIC_CML_PIPE *pipe_handle);

/*
 * Returns the number of bytes a sending pipe can currently accept. The returned
 * value may increase over time asynchronously but will only decrease in
 * response to an ossl_cml_write call.
 */
size_t ossl_cml_write_available(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle);

/*
 * Appends bytes into a sending pipe by copying them. The buffer can be freed
 * as soon as this call returns.
 */
int ossl_cml_write(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle,
                   const void *buf, size_t buf_len);

/*
 * Returns the number of bytes a receiving pipe currently has waiting to be
 * read. The returned value may increase over time asynchronously but will only
 * decrease in response to an ossl_cml_read call.
 */
size_t ossl_cml_read_available(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle);

/*
 * Reads bytes from a receiving pipe by copying them.
 */
int ossl_cml_read(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle,
                  void *buf, size_t buf_len);

/*
 * Blocks until at least one of the pipes in the array specified by
 * pipe_handles is ready, or until the deadline given is reached.
 *
 * A pipe is ready if:
 *
 *   - it is a sending pipe and one or more bytes can now be written;
 *   - it is a receiving pipe and one or more bytes can now be read.
 */
int ossl_cml_block_until(QUIC_CML *cml,
                         const QUIC_CML_PIPE *pipe_handles,
                         size_t num_pipe_handles,
                         OSSL_TIME deadline);
```
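
The monotonicity guarantee attached to `ossl_cml_write_available` makes a
query-then-write pattern safe for a single writer. The following C sketch
models that reasoning with a toy in-memory pipe. Everything here is
hypothetical: `TOY_BYTE_PIPE`, the `toy_*` functions and the 16-byte capacity
are illustrative only, and the consumer is simulated by an explicit
`toy_drain` call rather than another thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PIPE_CAP 16 /* arbitrary capacity for the illustration */

/* Hypothetical single-writer byte pipe; not the real CML. */
typedef struct toy_byte_pipe_st {
    unsigned char buf[TOY_PIPE_CAP];
    size_t        len; /* bytes currently buffered */
} TOY_BYTE_PIPE;

/* Space currently available to the writer. */
size_t toy_write_available(const TOY_BYTE_PIPE *p)
{
    return TOY_PIPE_CAP - p->len;
}

/* Copies in as many bytes as fit; reports the count via *written. */
int toy_write(TOY_BYTE_PIPE *p, const void *buf, size_t buf_len,
              size_t *written)
{
    size_t n = toy_write_available(p);

    if (n > buf_len)
        n = buf_len;

    memcpy(p->buf + p->len, buf, n);
    p->len  += n;
    *written = n;
    return 1;
}

/* Stands in for the other side consuming up to n bytes asynchronously. */
void toy_drain(TOY_BYTE_PIPE *p, size_t n)
{
    if (n > p->len)
        n = p->len;

    memmove(p->buf, p->buf + n, p->len - n);
    p->len -= n;
}

/*
 * Query-then-write: space can only grow between the two calls, so once the
 * query reports enough room the write is accepted in full.
 */
int toy_write_if_room(TOY_BYTE_PIPE *p, const void *buf, size_t buf_len)
{
    size_t avail = toy_write_available(p), written = 0;

    if (avail < buf_len)
        return 0; /* a caller would block or retry later */

    /* The consumer may run here; that can only increase the space. */
    if (!toy_write(p, buf, buf_len, &written))
        return 0;

    assert(written == buf_len);
    return 1;
}
```

The guarantee only holds while a single thread writes to the pipe; two
concurrent writers could each observe the same free space and then compete for
it, which is why the caller is responsible for synchronising such calls.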