QUIC Concurrency Architecture
=============================

Introduction
------------

Most QUIC implementations in C are offered as a simple state machine without any
included I/O solution. Applications must do significant integration work to
provide the necessary infrastructure for a QUIC implementation to integrate
with. Moreover, blocking I/O at an application level may not be supported.

OpenSSL QUIC seeks to offer a QUIC solution which can serve multiple use cases:

- Firstly, it seeks to offer the simple state machine model and a fully
  customisable network path (via a BIO) for those who want it;

- Secondly, it seeks to offer a turnkey solution with an in-the-box I/O
  and polling solution which can support blocking API calls in a Berkeley
  sockets-like way.

These usage modes are largely opposed. One involves libssl consuming no
resources but those it is given, with an application responsible for
synchronisation and a potentially custom network I/O path. This usage model
is not “smart”. Network traffic is connected to the state machine and state is
input and output from the state machine as needed by an application on a purely
non-blocking basis. Determining *when* to do anything is largely the
application's responsibility.

The other usage mode involves libssl managing more things internally to provide
an easier-to-use solution. For example, it may involve spinning up background
threads to ensure connections are serviced regularly (as in our existing
client-side thread assisted mode).

In order to provide for these different use cases, the concept of concurrency
models is introduced. A concurrency model defines how “cleverly” the QUIC engine
will operate and how many background resources (e.g. threads, other OS
resources) will be established to support operation.

Concurrency Models
------------------

- **Unsynchronised Concurrency Model (UCM):** In the Unsynchronised Concurrency
  Model, calls to SSL objects are not synchronised. There is no locking on any
  APL call (the omission of which is purely an optimisation). The application is
  either single-threaded or is otherwise responsible for doing synchronisation
  itself.

  Blocking API calls are not supported under this model. This model is intended
  primarily for single-threaded use as a simple state machine by advanced
  applications, and many such applications are likely to disable autoticking.

- **Contentive Concurrency Model (CCM):** In the
  Contentive Concurrency Model, calls to SSL objects are wrapped in locks and
  multi-threaded usage of a QUIC connection (for example, parallel writes to
  different QUIC stream SSL objects belonging to the same QUIC connection) is
  synchronised by a mutex.

  This is contentive in the sense that if a large number of threads are trying
  to write to different streams on the same connection, a large amount of lock
  contention will occur. As such, this concurrency model will neither scale nor
  perform well, at least within the context of concurrent use of a single
  connection.

  Under this model, APL calls by the application result in lock-wrapped
  mutations of QUIC core objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.) on the
  same thread.

  This model may be used either in a variant which does not support blocking
  (NB-CCM) or which does support blocking (B-CCM). The blocking variant must
  spin up additional OS resources to correctly support blocking semantics.

- **Thread Assisted Contentive Concurrency Model (TA-CCM):** This is currently
  implemented by our thread assisted mode for client-side QUIC usage. It does
  not realise the full state separation or performance of the Worker Concurrency
  Model (WCM) below. Instead, it simply spawns a background thread which ensures
  QUIC timer events are handled as needed. It makes use of the Contentive
  Concurrency Model for performing that handling, in that it obtains a lock when
  ticking a QUIC connection just as any call by an application would.

  This mode is likely to be deprecated in favour of the full Worker Concurrency
  Model (WCM), which will naturally subsume it.

- **Worker Concurrency Model (WCM):** In the Worker Concurrency Model,
  a background worker thread is spawned to manage connection processing. All
  interaction with an SSL object goes through this thread in some way.
  Interactions with SSL objects are essentially translated into commands and
  handled by the worker thread. To optimise performance and minimise lock
  contention, there is an emphasis on message passing over locking.
  Internal dataflow for application data can be managed in a zero-copy way to
  minimise the costs of this message passing.

  Under this model, QUIC core objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.) will
  live solely on the worker thread and access to these objects by an application
  thread will be entirely forbidden.

  Blocking API calls are supported under this model.

These concurrency models are summarised as follows:

| Model   | Sophistication | Concurrency           | Blocking Supported | OS Resources              | Timer Events    | RX Steering | Core State Affinity  |
|---------|----------------|-----------------------|--------------------|---------------------------|-----------------|-------------|----------------------|
| UCM     | Lowest         | ST only               | No                 | None                      | App Responsible | None        | App Thread           |
| CCM     |                | MT (Contentive)       | Optional           | Mutex, (Notifier)         | App Responsible | TBD         | App Threads          |
| TA-CCM† |                | MT (Contentive)       | Optional           | Mutex, Thread, (Notifier) | Managed         | TBD         | App & Assist Threads |
| WCM     | Highest        | MT (High Performance) | Yes                | Mutex, Thread, Notifier   | Managed         | Futureproof | Worker Thread        |

† To eventually be deprecated in favour of WCM.

Legend:

- **Blocking Supported:** Whether blocking calls to e.g. `SSL_read` can be
  supported. If this is listed as “optional”, extra resources are required to
  support this under the listed model and these resources could be omitted if an
  application indicates it does not need this functionality at initialisation
  time.

- **OS Resources:** “Mutex” refers to mutex and condition variable resources.
  “Notifier” refers to a kind of OS resource needed to allow one thread to wake
  another thread which is currently blocking in an OS socket polling call such
  as poll(2) (e.g. an eventfd or socketpair). Resources listed in parentheses in
  the table above are required only if blocking support is desired.

- **Timer Events:** Is an application responsible for ensuring QUIC timeout
  events are handled in a timely manner?

- **RX Steering:** The matter of RX steering will be discussed in detail in a
  future document. Broadly speaking, RX steering concerns whether incoming
  traffic for multiple different QUIC connections on the same local port (e.g.
  for a server) can be vectored *by the OS* to different threads or whether the
  demuxing of incoming traffic for different connections has to be done manually
  on an in-process basis.

  The WCM model most readily supports RX steering and is futureproof in this
  regard. The feasibility of having the UCM and CCM models support RX steering
  is left for future analysis.

- **Core State Affinity:** Which threads are allowed to touch the QUIC core
  objects (`QUIC_CHANNEL`, `QUIC_STREAM`, etc.).

Architecture
------------

To recap, the API Personality Layer (APL) refers to the code in `quic_impl.c`
which implements the libssl API personality (`SSL_write`, etc.). The APL is
cleanly separated from the QUIC core implementation (`QUIC_CHANNEL`, etc.).

Since UCM is basically a slight optimisation of CCM in which unnecessary locking
is elided, discussion from here on will focus on CCM and WCM except where
there are specific differences between CCM and UCM.

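The relationship between UCM and CCM can be illustrated with a minimal sketch.
It assumes a POSIX threads environment, and every name in it (`DEMO_CONN`,
`demo_lock`, `demo_queue_bytes`, etc.) is hypothetical rather than part of
libssl. The only difference between the two models here is whether the lock
helpers touch the mutex at all:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for state shared by a connection's SSL objects. */
typedef struct demo_conn_st {
    pthread_mutex_t mutex;
    int             need_locking;   /* 0 = UCM-style, 1 = CCM-style */
    size_t          bytes_queued;   /* placeholder for QUIC core state */
} DEMO_CONN;

static int demo_conn_init(DEMO_CONN *c, int need_locking)
{
    c->need_locking = need_locking;
    c->bytes_queued = 0;
    if (need_locking && pthread_mutex_init(&c->mutex, NULL) != 0)
        return 0;
    return 1;
}

static void demo_conn_cleanup(DEMO_CONN *c)
{
    if (c->need_locking)
        pthread_mutex_destroy(&c->mutex);
}

/* Under UCM these two helpers are no-ops; under CCM they wrap the mutex. */
static void demo_lock(DEMO_CONN *c)
{
    if (c->need_locking)
        pthread_mutex_lock(&c->mutex);
}

static void demo_unlock(DEMO_CONN *c)
{
    if (c->need_locking)
        pthread_mutex_unlock(&c->mutex);
}

/* An APL-style call: the mutation itself is identical under both models. */
static size_t demo_queue_bytes(DEMO_CONN *c, size_t n)
{
    size_t total;

    demo_lock(c);
    c->bytes_queued += n;
    total = c->bytes_queued;
    demo_unlock(c);
    return total;
}
```
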
Supporting both CCM and WCM creates significant architectural challenges. Under
CCM, QUIC core objects have their state mutated under lock by arbitrary
application threads and these mutations happen during APL calls. By contrast, a
performant WCM architecture requires that APL calls be recorded and serviced in
an asynchronous fashion involving message passing to a worker thread. This
threatens to require highly divergent dispatch architectures for the two
concurrency models.

As such, the concept of a **Concurrency Management Layer (CML)** is introduced.
The CML lives between the APL and the QUIC core code. It is responsible for
dispatching in-thread mutations of QUIC core objects when operating under CCM,
and for dispatching messages to a worker thread under WCM.

![Concurrency Models Diagram](images/quic-concurrency-models.svg)

There are two different CMLs:

- **Direct CML (DCML)**, in which core objects are worked on in the same thread
  which made an APL call, under lock;

- **Worker CML (WCML)**, in which core objects are managed by a worker thread
  with communication via message passing. This CML is split into a front end
  (WCML-FE) and back end (WCML-BE).

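The split between the two CMLs can be pictured as one interface with two
dispatch strategies. The following is a single-threaded sketch only; all
`DEMO_*` names are hypothetical, the "worker" queue is drained by an explicit
call rather than by a real thread, and no locking is shown. It is meant to show
the shape of the idea, not the intended implementation:

```c
#include <stddef.h>

/* Stands in for QUIC core state that only one side may mutate. */
typedef struct demo_core_st {
    size_t send_buffered;
} DEMO_CORE;

/* Hypothetical message: "append n bytes to the send buffer". */
typedef struct demo_msg_st {
    size_t n;
} DEMO_MSG;

#define DEMO_QUEUE_MAX 8

typedef struct demo_cml_st DEMO_CML;

struct demo_cml_st {
    /* Dispatch function chosen when the CML is created. */
    int (*submit)(DEMO_CML *cml, size_t n);
    DEMO_CORE core;
    DEMO_MSG  queue[DEMO_QUEUE_MAX];    /* used by the worker flavour only */
    size_t    queue_len;
};

/* Direct flavour: mutate core state on the calling thread. */
static int demo_direct_submit(DEMO_CML *cml, size_t n)
{
    /* A real DCML would hold the mutex here when locking is enabled. */
    cml->core.send_buffered += n;
    return 1;
}

/* Worker flavour, front end: record a message, never touch core state. */
static int demo_worker_submit(DEMO_CML *cml, size_t n)
{
    if (cml->queue_len == DEMO_QUEUE_MAX)
        return 0;
    cml->queue[cml->queue_len++].n = n;
    return 1;
}

/* Worker flavour, back end: would run on the worker thread. */
static void demo_worker_drain(DEMO_CML *cml)
{
    size_t i;

    for (i = 0; i < cml->queue_len; ++i)
        cml->core.send_buffered += cml->queue[i].n;
    cml->queue_len = 0;
}

static void demo_cml_init(DEMO_CML *cml, int use_worker)
{
    cml->submit = use_worker ? demo_worker_submit : demo_direct_submit;
    cml->core.send_buffered = 0;
    cml->queue_len = 0;
}
```

The caller uses `cml->submit(...)` in both cases. With the direct flavour the
core state changes before the call returns; with the worker flavour it changes
only once the back end has processed the queued message.
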
The legacy thread assisted mode uses a bespoke method which is similar to the
approach used by the DCML.

CML Design
----------

The CML is designed to have as small an API surface area as possible to enable
unified handling of as many kinds of (APL) API operations as possible. The idea
is that complex APL calls are translated into simple operations on the CML.

At its core, the CML exposes some number of *pipes*. The number of pipes which
can be accessed via the CML varies as connections and streams are created and
destroyed. A pipe is a *unidirectional* transport for byte streams. Zero-copy
optimisations are expected to be implemented in future but are deferred.

The CML (`QUIC_CML`) allows the caller to refer to a pipe by providing an opaque
pipe handle (`QUIC_CML_PIPE`). If the pipe is a sending pipe, the caller can use
`ossl_cml_write` to try to add bytes to it. Conversely, if it is a receiving
pipe, the caller can use `ossl_cml_read` to try to read bytes from it.

The method `ossl_cml_block_until` allows the caller to block until at least one
of the provided pipe handles is ready. Ready means that at least one byte can be
written (for a sending pipe) or at least one byte can be read (for a receiving
pipe).

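To make the pipe and readiness semantics concrete, here is a minimal sketch of
a unidirectional byte pipe backed by a fixed-size ring buffer. It is
single-threaded and purely illustrative: the `DEMO_PIPE` type, its capacity and
all function names are assumptions, not the CML's actual data structure.

```c
#include <stddef.h>

#define DEMO_PIPE_CAP 8

typedef struct demo_pipe_st {
    unsigned char buf[DEMO_PIPE_CAP];
    size_t head;    /* index of the next byte to read */
    size_t len;     /* number of bytes currently stored */
} DEMO_PIPE;

static void demo_pipe_init(DEMO_PIPE *p)
{
    p->head = 0;
    p->len = 0;
}

static size_t demo_pipe_write_available(const DEMO_PIPE *p)
{
    return DEMO_PIPE_CAP - p->len;
}

static size_t demo_pipe_read_available(const DEMO_PIPE *p)
{
    return p->len;
}

/* Accepts as many bytes as fit; returns the number accepted (may be 0). */
static size_t demo_pipe_write(DEMO_PIPE *p, const void *src, size_t n)
{
    const unsigned char *s = src;
    size_t i, room = demo_pipe_write_available(p);

    if (n > room)
        n = room;
    for (i = 0; i < n; ++i)
        p->buf[(p->head + p->len + i) % DEMO_PIPE_CAP] = s[i];
    p->len += n;
    return n;
}

/* Copies out up to n bytes; returns the number read (may be 0). */
static size_t demo_pipe_read(DEMO_PIPE *p, void *dst, size_t n)
{
    unsigned char *d = dst;
    size_t i;

    if (n > p->len)
        n = p->len;
    for (i = 0; i < n; ++i)
        d[i] = p->buf[(p->head + i) % DEMO_PIPE_CAP];
    p->head = (p->head + n) % DEMO_PIPE_CAP;
    p->len -= n;
    return n;
}

/* "Ready" as defined above: at least one byte can be transferred. */
static int demo_pipe_send_ready(const DEMO_PIPE *p)
{
    return demo_pipe_write_available(p) > 0;
}

static int demo_pipe_recv_ready(const DEMO_PIPE *p)
{
    return demo_pipe_read_available(p) > 0;
}
```
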
Note that there is only expected to be one `QUIC_CML` instance per QUIC event
processing domain (i.e., per `QUIC_DOMAIN` / `QUIC_ENGINE` instance). The CML
fully abstracts the QUIC core objects such as `QUIC_ENGINE` or `QUIC_CHANNEL` so
that the APL never sees them.

The caller retrieves a pipe handle using `ossl_cml_get_pipe`. This function
retrieves a pipe based on two values:

  - a CML pipe class;
  - a CML *selector*.

The CML selector is a tagged union structure which specifies what pipe is to be
retrieved. Abstractly, examples of selectors include:

```text
    Domain      ()
    Listener    (listener_id: uint)
    Conn        (conn_id:     uint)
    Stream      (conn_id:     uint, stream_id: u64)
```

In other words, the CML selector selects the “object” to retrieve a pipe from.

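One possible C rendering of such a tagged union is sketched below. The type
and field names (`DEMO_SELECTOR`, `DEMO_SEL_*`, etc.) are hypothetical and
simply follow the abstract listing above; the actual `QUIC_CML_SELECTOR` layout
is not specified in this document.

```c
#include <stdint.h>

typedef enum {
    DEMO_SEL_DOMAIN,
    DEMO_SEL_LISTENER,
    DEMO_SEL_CONN,
    DEMO_SEL_STREAM
} DEMO_SEL_TYPE;

typedef struct demo_selector_st {
    DEMO_SEL_TYPE type;             /* tag: which union member is valid */
    union {
        struct { unsigned int listener_id; } listener;
        struct { unsigned int conn_id; } conn;
        struct { unsigned int conn_id; uint64_t stream_id; } stream;
    } u;                            /* DEMO_SEL_DOMAIN carries no fields */
} DEMO_SELECTOR;

static DEMO_SELECTOR demo_sel_conn(unsigned int conn_id)
{
    DEMO_SELECTOR s;

    s.type = DEMO_SEL_CONN;
    s.u.conn.conn_id = conn_id;
    return s;
}

static DEMO_SELECTOR demo_sel_stream(unsigned int conn_id, uint64_t stream_id)
{
    DEMO_SELECTOR s;

    s.type = DEMO_SEL_STREAM;
    s.u.stream.conn_id = conn_id;
    s.u.stream.stream_id = stream_id;
    return s;
}

/* Two selectors name the same object if the tag and active fields match. */
static int demo_sel_equal(const DEMO_SELECTOR *a, const DEMO_SELECTOR *b)
{
    if (a->type != b->type)
        return 0;

    switch (a->type) {
    case DEMO_SEL_DOMAIN:
        return 1;
    case DEMO_SEL_LISTENER:
        return a->u.listener.listener_id == b->u.listener.listener_id;
    case DEMO_SEL_CONN:
        return a->u.conn.conn_id == b->u.conn.conn_id;
    case DEMO_SEL_STREAM:
        return a->u.stream.conn_id == b->u.stream.conn_id
            && a->u.stream.stream_id == b->u.stream.stream_id;
    }
    return 0;
}
```
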
The CML pipe class is one of the following values:

- Request
- Notification
- App Send
- App Recv

The pipe classes available for a given selector vary. For example, the “App
Send” and “App Recv” pipes only exist on a stream, so it is invalid to request
such a pipe in conjunction with a different type of selector.

The “Request” and “App Send” classes expose send-only streams, and the
“Notification” and “App Recv” classes expose receive-only streams.

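These two rules can be written down as a pair of small predicates. The sketch
below uses hypothetical `DEMO_*` names and additionally assumes that the
Request and Notification pipes exist for every selector type, which the text
implies but does not state outright:

```c
typedef enum {
    DEMO_CLASS_REQUEST,         /* control; send */
    DEMO_CLASS_NOTIFICATION,    /* control; recv */
    DEMO_CLASS_APP_SEND,        /* data; send */
    DEMO_CLASS_APP_RECV         /* data; recv */
} DEMO_CLASS;

typedef enum {
    DEMO_SEL_DOMAIN,
    DEMO_SEL_LISTENER,
    DEMO_SEL_CONN,
    DEMO_SEL_STREAM
} DEMO_SEL_TYPE;

/* Returns 1 for send-only classes, 0 for receive-only classes. */
static int demo_class_is_send(DEMO_CLASS c)
{
    return c == DEMO_CLASS_REQUEST || c == DEMO_CLASS_APP_SEND;
}

/* Returns 1 if a pipe of class c may be requested for selector type t. */
static int demo_class_valid_for(DEMO_CLASS c, DEMO_SEL_TYPE t)
{
    switch (c) {
    case DEMO_CLASS_REQUEST:
    case DEMO_CLASS_NOTIFICATION:
        /* Assumption: control pipes exist for every selector type. */
        return 1;
    case DEMO_CLASS_APP_SEND:
    case DEMO_CLASS_APP_RECV:
        /* Application data pipes only exist on a stream. */
        return t == DEMO_SEL_STREAM;
    }
    return 0;
}
```
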
For any given CML selector, the Request pipe is used to send serialized commands
for asynchronous processing in relation to the entity selected by that selector.
Conversely, the Notification pipe returns asynchronous notifications. These
could be in relation to a previous request (e.g. indicating whether a command
succeeded), or unprompted notifications about other events.

The underlying pattern here is that there is a bidirectional channel for control
messages, and a bidirectional channel for application data, each composed of
two unidirectional pipes.

Pipe handles are stable for as long as the pipe they reference exists, so an APL
object can cache a pipe handle if desired.

All CML methods are thread safe. The CML implementation handles any necessary
locking internally.

The `ossl_cml_write_available` and `ossl_cml_read_available` calls determine the
number of bytes which can currently be written to a send-only pipe, or read from
a receive-only pipe, respectively.

**Race conditions.** Because these calls are separate from `ossl_cml_write` and
`ossl_cml_read`, the values they return may become out of date before the
caller has a chance to call `ossl_cml_write` or `ossl_cml_read`.
However, such changes are guaranteed to be monotonically in favour of the
caller; for example, the value returned by `ossl_cml_write_available` will only
ever increase asynchronously (and only decrease as a result of an
`ossl_cml_write` call). Conversely, the value returned by
`ossl_cml_read_available` will only ever increase asynchronously (and only
decrease as a result of an `ossl_cml_read` call). Assuming that only one thread
makes calls to CML functions at a given time *for a given pipe*, this therefore
poses no issue for callers.

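The monotonicity argument can be traced with a small model of a sending pipe.
This sketch is single-threaded and uses hypothetical `demo_*` names; the
"asynchronous" consumer is simulated by an explicit call. The point is that a
stale snapshot of the available space can only underestimate, so a write sized
from that snapshot is still accepted in full:

```c
#include <stddef.h>

/* Models only the byte accounting of a sending pipe, not its contents. */
typedef struct demo_send_pipe_st {
    size_t capacity;
    size_t used;
} DEMO_SEND_PIPE;

static size_t demo_write_available(const DEMO_SEND_PIPE *p)
{
    return p->capacity - p->used;
}

/* Caller side: accepts up to n bytes and returns how many were accepted. */
static size_t demo_write(DEMO_SEND_PIPE *p, size_t n)
{
    size_t room = demo_write_available(p);

    if (n > room)
        n = room;
    p->used += n;
    return n;
}

/*
 * Asynchronous side: the consumer drains bytes from the pipe. This can only
 * free space, so it can only increase what demo_write_available reports.
 */
static void demo_async_drain(DEMO_SEND_PIPE *p, size_t n)
{
    if (n > p->used)
        n = p->used;
    p->used -= n;
}
```

For example, if a caller observes 4 bytes of space and the consumer then drains
6 bytes before the caller writes, the caller's 4-byte write still succeeds in
full; the snapshot was merely pessimistic.
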
Concurrent use of `ossl_cml_write` or `ossl_cml_read` for a given pipe is not
intended (and would not make sense in any case). The caller is responsible for
synchronising such calls.

**Examples of pipe usage.** The application data pipes are used to serialize the
actual application data sent or received on a QUIC stream. The
request/notification pipes have more varied uses and carry control activity.
There is therefore a “control/data” separation here. The request and
notification pipes transport tagged unions. Abstractly, requests and
notifications might include:

- Request: Reset Stream (error code: u64)
- Notification: Connection Terminated by Peer

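As an illustration of how a tagged union might be carried over a byte-oriented
pipe, the following sketch encodes and decodes a "Reset Stream" request. The
wire format (a one-byte tag followed by a big-endian 64-bit error code), the
tag value and all `DEMO_*` names are assumptions made for this example; the
document does not define a serialization format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag values for request messages. */
enum { DEMO_REQ_RESET_STREAM = 1 };

typedef struct demo_request_st {
    uint8_t tag;
    union {
        struct { uint64_t error_code; } reset_stream;
    } u;
} DEMO_REQUEST;

/* 1 tag byte + 8 bytes of error code. */
#define DEMO_REQ_RESET_STREAM_LEN 9

/* Returns the number of bytes written to out, or 0 on failure. */
static size_t demo_request_encode(const DEMO_REQUEST *req,
                                  unsigned char *out, size_t out_len)
{
    int i;

    switch (req->tag) {
    case DEMO_REQ_RESET_STREAM:
        if (out_len < DEMO_REQ_RESET_STREAM_LEN)
            return 0;
        out[0] = req->tag;
        for (i = 0; i < 8; ++i)
            out[1 + i] = (unsigned char)
                (req->u.reset_stream.error_code >> (56 - 8 * i));
        return DEMO_REQ_RESET_STREAM_LEN;
    default:
        return 0;
    }
}

/* Returns 1 on success, 0 if the input is truncated or the tag is unknown. */
static int demo_request_decode(const unsigned char *in, size_t in_len,
                               DEMO_REQUEST *req)
{
    uint64_t v = 0;
    int i;

    if (in_len < 1)
        return 0;

    switch (in[0]) {
    case DEMO_REQ_RESET_STREAM:
        if (in_len < DEMO_REQ_RESET_STREAM_LEN)
            return 0;
        for (i = 0; i < 8; ++i)
            v = (v << 8) | in[1 + i];
        req->tag = in[0];
        req->u.reset_stream.error_code = v;
        return 1;
    default:
        return 0;
    }
}
```
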
**Example implementation of `SSL_write`.** An `SSL_write`-like API might be
implemented in the APL like this:

```c
int do_write(QUIC_CML *cml,
             QUIC_CML_PIPE notification_pipe,
             QUIC_CML_PIPE app_send_pipe,
             const void *buf, size_t buf_len)
{
    /* Arithmetic on a void pointer is not valid C, so walk a byte pointer. */
    const unsigned char *p = buf;
    QUIC_CML_PIPE pipes[2];
    size_t bytes_written = 0;

    pipes[0] = notification_pipe;
    pipes[1] = app_send_pipe;

    for (;;) {
        /* e.g. connection termination */
        process_any_notifications(notification_pipe);

        /* state checks, etc. (elided in this sketch) */
        if (...->conn_terminated)
            return 0;

        if (buf_len == 0)
            return 1;

        /*
         * Assumes ossl_cml_write reports the number of bytes accepted via a
         * final out parameter, and accepts zero bytes when the pipe is full.
         */
        if (!ossl_cml_write(cml, app_send_pipe, p, buf_len, &bytes_written))
            return 0;

        if (bytes_written == 0) {
            if (!should_block())
                break;

            /* Wait with no deadline until either pipe becomes ready. */
            ossl_cml_block_until(cml, pipes, 2, ossl_time_infinite());
            continue; /* try again */
        }

        p       += bytes_written;
        buf_len -= bytes_written;
    }

    return 1;
}
```

```c
/*
 * Creates a new CML using the Direct CML (DCML) implementation. need_locking
 * may be 0 to elide mutex usage if the application is guaranteed to synchronise
 * access or is purely single-threaded.
 */
QUIC_CML *ossl_cml_new_direct(int need_locking);

/* Creates a new CML using the Worker CML (WCML) implementation. */
QUIC_CML *ossl_cml_new_worker(size_t num_worker_threads);

/*
 * Starts the CML operating. Idempotent after it returns successfully. For the
 * WCML this might e.g. start background threads; for the DCML it is likely to
 * be a no-op (but must still be called).
 */
int ossl_cml_start(QUIC_CML *cml);

/*
 * Begins the CML shutdown process. Returns 1 once shutdown is complete; may
 * need to be called multiple times until shutdown is done.
 */
int ossl_cml_shutdown(QUIC_CML *cml);

/*
 * Immediate free of the CML. This is always safe but may cause handling
 * of a connection to be aborted abruptly as it is an immediate teardown
 * of all state.
 */
void ossl_cml_free(QUIC_CML *cml);

/* Pipe classes (QUIC_CML_CLASS values). */
enum {
    QUIC_CML_CLASS_REQUEST,         /* control; send */
    QUIC_CML_CLASS_NOTIFICATION,    /* control; recv */
    QUIC_CML_CLASS_APP_SEND,        /* data; send */
    QUIC_CML_CLASS_APP_RECV         /* data; recv */
};

/*
 * Retrieves a pipe for a logical CML object described by selector. The pipe
 * handle, which is stable over the life of the logical CML object, is written
 * to *pipe_handle. class_ is a QUIC_CML_CLASS value.
 */
int ossl_cml_get_pipe(QUIC_CML                  *cml,
                      int                       class_,
                      const QUIC_CML_SELECTOR   *selector,
                      QUIC_CML_PIPE             *pipe_handle);

/*
 * Returns the number of bytes a sending pipe can currently accept. The returned
 * value may increase over time asynchronously but will only decrease in
 * response to an ossl_cml_write call.
 */
size_t ossl_cml_write_available(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle);

/*
 * Appends bytes into a sending pipe by copying them. The buffer can be freed
 * as soon as this call returns.
 */
int ossl_cml_write(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle,
                   const void *buf, size_t buf_len);

/*
 * Returns the number of bytes a receiving pipe currently has waiting to be
 * read. The returned value may increase over time asynchronously but will only
 * decrease in response to an ossl_cml_read call.
 */
size_t ossl_cml_read_available(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle);

/*
 * Reads bytes from a receiving pipe by copying them.
 */
int ossl_cml_read(QUIC_CML *cml, QUIC_CML_PIPE pipe_handle,
                  void *buf, size_t buf_len);

/*
 * Blocks until at least one of the pipes in the array specified by
 * pipe_handles is ready, or until the deadline given is reached.
 *
 * A pipe is ready if:
 *
 *   - it is a sending pipe and one or more bytes can now be written;
 *   - it is a receiving pipe and one or more bytes can now be read.
 */
int ossl_cml_block_until(QUIC_CML *cml,
                         const QUIC_CML_PIPE *pipe_handles,
                         size_t num_pipe_handles,
                         OSSL_TIME deadline);
```

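The lifecycle described in the comments above, in particular that shutdown
"may need to be called multiple times", suggests a retry loop on the caller's
side. The sketch below uses a mock object in place of a real CML: `DEMO_CML`,
its fields and the `demo_*` functions are hypothetical, and the mock simply
completes shutdown after a fixed number of calls. A real caller would
presumably wait or service other events between attempts rather than spin.

```c
#include <stddef.h>

/* Mock CML whose shutdown completes after a fixed number of calls. */
typedef struct demo_cml_st {
    int started;
    int shutdown_steps_left;
} DEMO_CML;

/* Idempotent: calling it again after success changes nothing. */
static int demo_cml_start(DEMO_CML *cml)
{
    cml->started = 1;
    return 1;
}

/* Returns 1 once shutdown is complete, 0 if it must be called again. */
static int demo_cml_shutdown(DEMO_CML *cml)
{
    if (cml->shutdown_steps_left > 0)
        --cml->shutdown_steps_left;
    return cml->shutdown_steps_left == 0;
}

/*
 * Calls shutdown until it reports completion. Returns the number of calls
 * made, or -1 if max_calls attempts were not enough.
 */
static int demo_shutdown_until_done(DEMO_CML *cml, int max_calls)
{
    int calls;

    for (calls = 1; calls <= max_calls; ++calls)
        if (demo_cml_shutdown(cml))
            return calls;

    return -1;
}
```
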