Lines Matching +full:1 +full:- +full:bit +full:- +full:only
8 * 1. Redistributions of source code must retain the above copyright
31 * ck_ec implements 32- and 64- bit event counts. Event counts let us
32 * easily integrate OS-level blocking (e.g., futexes) in lock-free
36 * Event counts come in four variants: 32 and 64 bit (with one bit
37 * stolen for internal signaling, so 31 and 63 bit counters), and
39 * consumers. The 32 bit variants are smaller, and more efficient,
40 * especially in single producer mode. The 64 bit variants are larger,
43 * The 32 bit variant is always available. The 64 bit variant is only
44 * available if CK supports 64-bit atomic operations. Currently,
45 * specialization for single producer is only implemented for x86 and
46 * x86-64, on compilers that support GCC extended inline assembly;
51 * 1. On the producer side:
53 * - Make changes to some shared data structure, without involving
55 * - After each change, call ck_ec_inc on the event count. The call
56 * acts as a write-write barrier, and wakes up any consumer blocked
61 * - Snapshot ck_ec_value of the event count. The call acts as a
63 * - Read and process the shared data structure.
64 * - Wait for new changes by calling ck_ec_wait with the snapshot value.
89 * When compiled as C11 or later, this header defines type-generic
91 * type-generic API.
137 * success, and -1 if ops->gettime failed (without touching errno).
141 * provided and non-NULL, until the current time is after that
142 * deadline. Use a deadline with tv_sec = 0 for a non-blocking
143 * execution. Returns 0 if the event counter has changed, and -1 on
148 * `pred` returns non-zero, or, if `deadline` is provided and
149 * non-NULL, until the current time is after that deadline. Use a
150 * deadline with tv_sec = 0 for a non-blocking execution. Returns 0 if
151 * the event counter has changed, `pred`'s return value if non-zero,
152 * and -1 on timeout. This function acts as a read (acquire) barrier.
157 * `pred` returns a non-zero value, that value is immediately returned
166 * count, with a single flag bit to denote the need to wake up waiting
170 * [x86-TSO](https://www.cl.cam.ac.uk/~pes20/weakmemory/cacm.pdf), and
171 * to non-atomic read-modify-write instructions (e.g., `inc mem`);
172 * these non-atomic RMW let us write to the same memory locations with
173 * atomic and non-atomic instructions, without suffering from process
176 * The reason we can mix atomic and non-atomic writes to the `counter`
177 * word is that every non-atomic write obviates the need for the
178 * atomically flipped flag bit: we only use non-atomic writes to
179 * update the event count, and the atomic flag only informs the
181 * We only require the non-atomic RMW counter update to prevent
186 * non-atomic writes. The key is instead x86-TSO's guarantee that a
191 * x86-TSO's constraint on reads suffices to guarantee that the
201 * woken up because the flag bit was silently cleared.
203 * In reality, the store queue in x86-TSO stands for in-flight
204 * instructions in the chip's out-of-order backend. In the vast
205 * majority of cases, instructions will only remain in flight for a
207 * the `counter` word for ~100 iterations after flipping its flag bit:
214 * than 1 second... if only because some interrupt will have forced
217 * particularly the pre-emption timer, are why single-producer updates
218 * must happen in a single non-atomic read-modify-write instruction.
219 * Having a single instruction as the critical section means we only
220 * have to consider the worst-case execution time for that
222 * instructions, which an unlucky pre-emption could delay for
227 * backoff is only necessary to handle rare cases where the flag flip
230 * becomes infinite: since the flag bit has been set for much longer
234 * The 64 bit ck_ec_wait pulls another trick: futexes only handle 32
235 * bit ints, so we must treat the 64 bit counter's low 32 bits as an
236 * int in futex_wait. That's a bit dodgy, but fine in practice, given
240 * the store queue (the out-of-order execution backend).
243 * or otherwise pre-empted? Migration must already incur a barrier, so
245 * pre-emption, that requires storing the architectural state, which
247 * all when pre-emption happens.
261 * support 63 bit counters.
268 * GCC inline assembly lets us exploit non-atomic read-modify-write
269 * instructions on x86/x86_64 for a fast single-producer mode.
289 * ck_ec_ops define system-specific functions to get the current time,
297 /* Populates out with the current time. Returns non-zero on failure. */
302 * deadline is non-NULL, stops waiting once that deadline is
309 * Same as wait32, but for a 64 bit counter. Only used if
312 * If underlying blocking primitive only supports 32 bit
314 * significant half of the 64 bit address.
323 * Same as wake32, but for a 64 bit counter. Only used if
326 * When wait64 truncates the control word at address to `only`
378 * ck_ec_mode structs are only passed to inline functions defined in
393 /* Flag is "sign" bit, value in bits 0:30. */
402 * Flag is bottom bit, value in bits 1:63. Eventcount only
403 * works on x86-64 (i.e., little endian), so the futex int
523 * Returns 0 on success, and -1 if clock_gettime failed, in which
532 * old_value, or, if deadline is non-NULL, until CLOCK_MONOTONIC is
535 * Returns 0 on success, and -1 on timeout.
562 * old_value, pred returns non-zero, or, if deadline is non-NULL,
565 * Returns 0 on success, -1 on timeout, and the return value of pred
566 * if it returns non-zero.
599 * Inline implementation details. 32 bit first, then 64 bit
604 ec->counter = value & ~(1UL << 31); in ck_ec32_init()
610 uint32_t ret = ck_pr_load_32(&ec->counter) & ~(1UL << 31); in ck_ec32_value()
618 return ck_pr_load_32(&ec->counter) & (1UL << 31); in ck_ec32_has_waiters()
629 ck_ec32_add(ec, mode, 1); in ck_ec32_inc()
636 * We don't want to wake if the sign bit is 0. We do want to in ck_ec32_inc()
637 * wake if the sign bit just flipped from 1 to 0. We don't in ck_ec32_inc()
638 * care what happens when our increment caused the sign bit to in ck_ec32_inc()
639 * flip from 0 to 1 (that's once per 2^31 increment). in ck_ec32_inc()
643 * old sign bit | new sign bit | SF | OF | ZF in ck_ec32_inc()
644 * ------------------------------------------- in ck_ec32_inc()
646 * 0 | 1 | 1 | 0 | ? in ck_ec32_inc()
647 * 1 | 1 | 1 | 0 | ? in ck_ec32_inc()
648 * 1 | 0 | 0 | 0 | 1 in ck_ec32_inc()
655 * The "le" condition checks if SF != OF, or ZF == 1, which in ck_ec32_inc()
660 : "+m"(ec->counter), "=@ccle"(flagged) \ in ck_ec32_inc()
664 __asm__ volatile(PREFIX " incl %0; setle %1" \ in ck_ec32_inc()
665 : "+m"(ec->counter), "=r"(flagged) \ in ck_ec32_inc()
669 if (mode->single_producer == true) { in ck_ec32_inc()
679 ck_ec32_wake(ec, mode->ops); in ck_ec32_inc()
690 const uint32_t flag_mask = 1U << 31; in ck_ec32_add_epilogue()
694 /* These two only differ if the flag bit is set. */ in ck_ec32_add_epilogue()
696 ck_ec32_wake(ec, mode->ops); in ck_ec32_add_epilogue()
709 old = ck_pr_faa_32(&ec->counter, delta); in ck_ec32_add_mp()
723 * is a no-op. in ck_ec32_add_sp()
731 __asm__ volatile("xaddl %1, %0" in ck_ec32_add_sp()
732 : "+m"(ec->counter), "+r"(old) in ck_ec32_add_sp()
743 if (mode->single_producer == true) { in ck_ec32_add()
759 return ck_ec_deadline_impl(new_deadline, mode->ops, timeout); in ck_ec_deadline()
777 return ck_ec32_wait_slow(ec, mode->ops, old_value, deadline); in ck_ec32_wait()
801 return ck_ec32_wait_pred_slow(ec, mode->ops, old_value, in ck_ec32_wait_pred()
808 ec->counter = value << 1; in ck_ec64_init()
814 uint64_t ret = ck_pr_load_64(&ec->counter) >> 1; in ck_ec64_value()
822 return ck_pr_load_64(&ec->counter) & 1; in ck_ec64_has_waiters()
831 (void)ck_ec64_add(ec, mode, 1); in ck_ec64_inc()
839 uint64_t ret = old >> 1; in ck_ec_add64_epilogue()
841 if (CK_CC_UNLIKELY(old & 1)) { in ck_ec_add64_epilogue()
842 ck_ec64_wake(ec, mode->ops); in ck_ec_add64_epilogue()
852 uint64_t inc = 2 * delta; /* The low bit is the flag bit. */ in ck_ec64_add_mp()
855 return ck_ec_add64_epilogue(ec, mode, ck_pr_faa_64(&ec->counter, inc)); in ck_ec64_add_mp()
859 /* Single-producer specialisation. */
869 * is a no-op. in ck_ec64_add_sp()
876 old = 2 * delta; /* The low bit is the flag bit. */ in ck_ec64_add_sp()
877 __asm__ volatile("xaddq %1, %0" in ck_ec64_add_sp()
878 : "+m"(ec->counter), "+r"(old) in ck_ec64_add_sp()
885 * Dispatch on mode->single_producer in this FORCE_INLINE function:
894 if (mode->single_producer == true) { in ck_ec64_add()
916 return ck_ec64_wait_slow(ec, mode->ops, old_value, deadline); in ck_ec64_wait()
941 return ck_ec64_wait_pred_slow(ec, mode->ops, old_value, in ck_ec64_wait_pred()