===================================
NT synchronization primitive driver
===================================

This page documents the user-space API for the ntsync driver.

ntsync is a support driver for emulation of NT synchronization
primitives by user-space NT emulators. It exists because a user-space
implementation built on existing tools cannot match Windows performance
while also offering accurate semantics. It is implemented entirely in
software, and does not drive any hardware device.

This interface is meant as a compatibility tool only, and should not
be used for general synchronization. Use generic, versatile interfaces
such as futex(2) and poll(2) instead.

Synchronization primitives
==========================

The ntsync driver exposes three types of synchronization primitives:
semaphores, mutexes, and events.

A semaphore holds a single volatile 32-bit counter, and a static 32-bit
integer denoting the maximum value. It is considered signaled (that is,
can be acquired without contention, or will wake up a waiting thread)
when the counter is nonzero. The counter is decremented by one when a
wait is satisfied. Both the initial and maximum count are established
when the semaphore is created.

A mutex holds a volatile 32-bit recursion count, and a volatile 32-bit
identifier denoting its owner. A mutex is considered signaled when its
owner is zero (indicating that it is not owned). The recursion count is
incremented when a wait is satisfied, and ownership is set to the given
identifier.

A mutex also holds an internal flag denoting whether its previous owner
has died; such a mutex is said to be abandoned. Owner death is not
tracked automatically based on thread death, but rather must be
communicated using ``NTSYNC_IOC_KILL_OWNER``. An abandoned mutex is
inherently considered unowned.
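
The semaphore and mutex state described above can be pictured with a
small user-space model. This is only an illustrative sketch: the
``model_*`` names are hypothetical and are not part of the ntsync API,
and the real driver performs these steps atomically inside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; not the driver's real data layout. */
struct model_sem {
    uint32_t count; /* volatile counter */
    uint32_t max;   /* fixed when the semaphore is created */
};

struct model_mutex {
    uint32_t count; /* recursion count */
    uint32_t owner; /* 0 means unowned */
    bool abandoned; /* set when the previous owner was reported dead */
};

/* A semaphore is signaled while its counter is nonzero. */
static bool model_sem_signaled(const struct model_sem *sem)
{
    return sem->count != 0;
}

/* Satisfying a wait decrements the counter by one. */
static bool model_sem_try_acquire(struct model_sem *sem)
{
    assert(sem->count <= sem->max);
    if (!model_sem_signaled(sem))
        return false;
    sem->count--;
    return true;
}

/* A mutex is signaled when it has no owner. */
static bool model_mutex_signaled(const struct model_mutex *mutex)
{
    return mutex->owner == 0;
}

/* Satisfying a wait records the owner and bumps the recursion count. */
static bool model_mutex_try_acquire(struct model_mutex *mutex, uint32_t owner)
{
    if (owner == 0 || !model_mutex_signaled(mutex))
        return false;
    mutex->owner = owner;
    mutex->count++;
    return true;
}
```

The model deliberately leaves out waiting, wakeups, and recursive
acquisition by the current owner; it only shows which fields change when
a wait is satisfied.
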

Except for the "unowned" semantics of zero, the actual value of the
owner identifier is not interpreted by the ntsync driver at all. The
intended use is to store a thread identifier; however, the ntsync
driver does not actually validate that a calling thread provides
consistent or unique identifiers.

An event is similar to a semaphore with a maximum count of one. It holds
a volatile boolean state denoting whether it is signaled or not. There
are two types of events, auto-reset and manual-reset. An auto-reset
event is designaled when a wait is satisfied; a manual-reset event is
not. The event type is specified when the event is created.

Unless specified otherwise, all operations on an object are atomic and
totally ordered with respect to other operations on the same object.

Objects are represented by files. When all file descriptors to an
object are closed, that object is deleted.

Char device
===========

The ntsync driver creates a single char device /dev/ntsync. Each file
description opened on the device represents a unique instance intended
to back an individual NT virtual machine. Objects created by one ntsync
instance may only be used with other objects created by the same
instance.

ioctl reference
===============

All operations on the device are done through ioctls. There are four
structures used in ioctl calls::

   struct ntsync_sem_args {
       __u32 count;
       __u32 max;
   };

   struct ntsync_mutex_args {
       __u32 owner;
       __u32 count;
   };

   struct ntsync_event_args {
       __u32 signaled;
       __u32 manual;
   };

   struct ntsync_wait_args {
       __u64 timeout;
       __u64 objs;
       __u32 count;
       __u32 owner;
       __u32 index;
       __u32 alert;
       __u32 flags;
       __u32 pad;
   };

Depending on the ioctl, members of the structure may be used as input,
output, or not at all.

The ioctls on the device file are as follows:

.. c:macro:: NTSYNC_IOC_CREATE_SEM

   Create a semaphore object. Takes a pointer to struct
   :c:type:`ntsync_sem_args`, which is used as follows:

   .. list-table::

      * - ``count``
        - Initial count of the semaphore.
      * - ``max``
        - Maximum count of the semaphore.

   Fails with ``EINVAL`` if ``count`` is greater than ``max``.
   On success, returns a file descriptor for the created semaphore.

.. c:macro:: NTSYNC_IOC_CREATE_MUTEX

   Create a mutex object. Takes a pointer to struct
   :c:type:`ntsync_mutex_args`, which is used as follows:

   .. list-table::

      * - ``count``
        - Initial recursion count of the mutex.
      * - ``owner``
        - Initial owner of the mutex.

   If ``owner`` is nonzero and ``count`` is zero, or if ``owner`` is
   zero and ``count`` is nonzero, the ioctl fails with ``EINVAL``.
   On success, returns a file descriptor for the created mutex.

.. c:macro:: NTSYNC_IOC_CREATE_EVENT

   Create an event object. Takes a pointer to struct
   :c:type:`ntsync_event_args`, which is used as follows:

   .. list-table::

      * - ``signaled``
        - If nonzero, the event is initially signaled, otherwise
          nonsignaled.
      * - ``manual``
        - If nonzero, the event is a manual-reset event, otherwise
          auto-reset.

   On success, returns a file descriptor for the created event.

The ioctls on the individual objects are as follows:

.. c:macro:: NTSYNC_IOC_SEM_POST

   Post to a semaphore object. Takes a pointer to a 32-bit integer,
   which on input holds the count to be added to the semaphore, and on
   output contains its previous count.

   If adding to the semaphore's current count would raise the latter
   past the semaphore's maximum count, the ioctl fails with
   ``EOVERFLOW`` and the semaphore is not affected.
   If raising the semaphore's count causes it to become signaled,
   eligible threads waiting on this semaphore will be woken and the
   semaphore's count decremented appropriately.

.. c:macro:: NTSYNC_IOC_MUTEX_UNLOCK

   Release a mutex object. Takes a pointer to struct
   :c:type:`ntsync_mutex_args`, which is used as follows:

   .. list-table::

      * - ``owner``
        - Specifies the owner trying to release this mutex.
      * - ``count``
        - On output, contains the previous recursion count.

   If ``owner`` is zero, the ioctl fails with ``EINVAL``. If ``owner``
   is not the current owner of the mutex, the ioctl fails with
   ``EPERM``.

   The mutex's count will be decremented by one. If decrementing the
   mutex's count causes it to become zero, the mutex is marked as
   unowned and signaled, and eligible threads waiting on it will be
   woken as appropriate.

.. c:macro:: NTSYNC_IOC_SET_EVENT

   Signal an event object. Takes a pointer to a 32-bit integer, which on
   output contains the previous state of the event.

   Eligible threads will be woken, and auto-reset events will be
   designaled appropriately.

.. c:macro:: NTSYNC_IOC_RESET_EVENT

   Designal an event object. Takes a pointer to a 32-bit integer, which
   on output contains the previous state of the event.

.. c:macro:: NTSYNC_IOC_PULSE_EVENT

   Wake threads waiting on an event object while leaving it in an
   unsignaled state. Takes a pointer to a 32-bit integer, which on
   output contains the previous state of the event.

   A pulse operation can be thought of as a set followed by a reset,
   performed as a single atomic operation. If two threads are waiting on
   an auto-reset event which is pulsed, only one will be woken. If two
   threads are waiting on a manual-reset event which is pulsed, both
   will be woken.
   However, in both cases, the event will be unsignaled afterwards, and
   a simultaneous read operation will always report the event as
   unsignaled.

.. c:macro:: NTSYNC_IOC_READ_SEM

   Read the current state of a semaphore object. Takes a pointer to
   struct :c:type:`ntsync_sem_args`, which is used as follows:

   .. list-table::

      * - ``count``
        - On output, contains the current count of the semaphore.
      * - ``max``
        - On output, contains the maximum count of the semaphore.

.. c:macro:: NTSYNC_IOC_READ_MUTEX

   Read the current state of a mutex object. Takes a pointer to struct
   :c:type:`ntsync_mutex_args`, which is used as follows:

   .. list-table::

      * - ``owner``
        - On output, contains the current owner of the mutex, or zero
          if the mutex is not currently owned.
      * - ``count``
        - On output, contains the current recursion count of the mutex.

   If the mutex is marked as abandoned, the ioctl fails with
   ``EOWNERDEAD``. In this case, ``count`` and ``owner`` are set to
   zero.

.. c:macro:: NTSYNC_IOC_READ_EVENT

   Read the current state of an event object. Takes a pointer to struct
   :c:type:`ntsync_event_args`, which is used as follows:

   .. list-table::

      * - ``signaled``
        - On output, contains the current state of the event.
      * - ``manual``
        - On output, contains 1 if the event is a manual-reset event,
          and 0 otherwise.

.. c:macro:: NTSYNC_IOC_KILL_OWNER

   Mark a mutex as unowned and abandoned if it is owned by the given
   owner. Takes an input-only pointer to a 32-bit integer denoting the
   owner. If the owner is zero, the ioctl fails with ``EINVAL``. If the
   owner does not own the mutex, the ioctl fails with ``EPERM``.

   Eligible threads waiting on the mutex will be woken as appropriate
   (and such waits will fail with ``EOWNERDEAD``, as described below).

.. c:macro:: NTSYNC_IOC_WAIT_ANY

   Poll on any of a list of objects, atomically acquiring at most one.
   Takes a pointer to struct :c:type:`ntsync_wait_args`, which is
   used as follows:

   .. list-table::

      * - ``timeout``
        - Absolute timeout in nanoseconds. If ``NTSYNC_WAIT_REALTIME``
          is set, the timeout is measured against the REALTIME clock;
          otherwise it is measured against the MONOTONIC clock. If the
          timeout is equal to or earlier than the current time, the
          ioctl returns immediately without sleeping. If ``timeout``
          is ``U64_MAX``, the ioctl will sleep until an object is
          signaled, and will not fail with ``ETIMEDOUT``.
      * - ``objs``
        - Pointer to an array of ``count`` file descriptors
          (specified as an integer so that the structure has the same
          size regardless of architecture). If any object is
          invalid, the ioctl fails with ``EINVAL``.
      * - ``count``
        - Number of objects specified in the ``objs`` array.
          If greater than ``NTSYNC_MAX_WAIT_COUNT``, the ioctl fails
          with ``EINVAL``.
      * - ``owner``
        - Mutex owner identifier. If any object in ``objs`` is a mutex,
          the ioctl will attempt to acquire that mutex on behalf of
          ``owner``. If ``owner`` is zero, the ioctl fails with
          ``EINVAL``.
      * - ``index``
        - On success, contains the index (into ``objs``) of the object
          which was signaled. If ``alert`` was signaled instead,
          this contains ``count``.
      * - ``alert``
        - Optional event object file descriptor. If nonzero, it must
          refer to a valid event; this "alert" event terminates the
          wait when it is signaled.
      * - ``flags``
        - Zero or more flags. Currently the only flag is
          ``NTSYNC_WAIT_REALTIME``, which causes the timeout to be
          measured against the REALTIME clock instead of MONOTONIC.
      * - ``pad``
        - Unused, must be set to zero.
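
   As a minimal sketch of how a caller might prepare the ``timeout`` and
   ``objs`` values described in the table: the helper names
   ``abs_timeout_ns`` and ``ptr_to_u64`` are hypothetical, and only
   ``clock_gettime(2)`` is an existing interface. The sketch assumes the
   MONOTONIC and REALTIME clocks named above correspond to
   ``CLOCK_MONOTONIC`` and ``CLOCK_REALTIME``.

   ```c
   #define _POSIX_C_SOURCE 200809L
   #include <assert.h>
   #include <stdint.h>
   #include <time.h>

   /* A pointer must fit in the 64-bit objs field. */
   static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
                 "pointer wider than 64 bits");

   /*
    * Hypothetical helper: turn a relative delay into an absolute
    * deadline in nanoseconds on the given clock. Returns 0 if the
    * clock cannot be read.
    */
   static uint64_t abs_timeout_ns(clockid_t clock, uint64_t delay_ns)
   {
       struct timespec now;

       if (clock_gettime(clock, &now) != 0)
           return 0;
       return (uint64_t)now.tv_sec * 1000000000ull +
              (uint64_t)now.tv_nsec + delay_ns;
   }

   /* Hypothetical helper: pass an array of descriptors as an integer. */
   static uint64_t ptr_to_u64(const int *fds)
   {
       return (uint64_t)(uintptr_t)fds;
   }
   ```

   A caller would store ``abs_timeout_ns(CLOCK_MONOTONIC, delay)`` in
   ``timeout`` when ``flags`` is zero, and ``ptr_to_u64(fds)`` in
   ``objs``.
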

   This ioctl attempts to acquire one of the given objects. If unable
   to do so, it sleeps until an object becomes signaled, subsequently
   acquiring it, or the timeout expires. In the latter case the ioctl
   fails with ``ETIMEDOUT``. The ioctl only acquires one object, even
   if multiple objects are signaled.

   A semaphore is considered to be signaled if its count is nonzero, and
   is acquired by decrementing its count by one. A mutex is considered
   to be signaled if it is unowned or if its owner matches the ``owner``
   argument, and is acquired by incrementing its recursion count by one
   and setting its owner to the ``owner`` argument. An auto-reset event
   is acquired by designaling it; a manual-reset event is not affected
   by acquisition.

   Acquisition is atomic and totally ordered with respect to other
   operations on the same object. If two wait operations (with different
   ``owner`` identifiers) are queued on the same mutex, only one is
   signaled. If two wait operations are queued on the same semaphore,
   and a value of one is posted to it, only one is signaled.

   If an abandoned mutex is acquired, the ioctl fails with
   ``EOWNERDEAD``. Although this is a failure return, the ioctl may
   otherwise be considered successful. The mutex is marked as owned by
   the given owner (with a recursion count of 1) and as no longer
   abandoned, and ``index`` is still set to the index of the mutex.

   The ``alert`` argument is an "extra" event which can terminate the
   wait, independently of all other objects.

   It is valid to pass the same object more than once, including by
   passing the same event in the ``objs`` array and in ``alert``. If a
   wakeup occurs due to that object being signaled, ``index`` is set to
   the lowest index corresponding to that object.

   The ioctl may fail with ``EINTR`` if a signal is received.

.. c:macro:: NTSYNC_IOC_WAIT_ALL

   Poll on a list of objects, atomically acquiring all of them. Takes a
   pointer to struct :c:type:`ntsync_wait_args`, which is used
   identically to ``NTSYNC_IOC_WAIT_ANY``, except that on success
   ``index`` is always set to zero unless the wait was woken via
   ``alert``.

   This ioctl attempts to simultaneously acquire all of the given
   objects. If unable to do so, it sleeps until all objects become
   simultaneously signaled, subsequently acquiring them, or the timeout
   expires. In the latter case the ioctl fails with ``ETIMEDOUT`` and no
   objects are modified.

   Objects may become signaled and subsequently designaled (through
   acquisition by other threads) while this thread is sleeping. Only
   once all objects are simultaneously signaled does the ioctl acquire
   them and return. The entire acquisition is atomic and totally ordered
   with respect to other operations on any of the given objects.

   If an abandoned mutex is acquired, the ioctl fails with
   ``EOWNERDEAD``. Similarly to ``NTSYNC_IOC_WAIT_ANY``, all objects are
   nevertheless marked as acquired. Note that if multiple mutex objects
   are specified, there is no way to know which were marked as
   abandoned.

   As with "any" waits, the ``alert`` argument is an "extra" event which
   can terminate the wait. Critically, however, an "all" wait will
   succeed if all members in ``objs`` are signaled, *or* if ``alert`` is
   signaled. In the latter case ``index`` will be set to ``count``. As
   with "any" waits, if both conditions are met, the former takes
   priority, and objects in ``objs`` will be acquired.

   Unlike ``NTSYNC_IOC_WAIT_ANY``, it is not valid to pass the same
   object more than once, nor is it valid to pass the same object in
   ``objs`` and in ``alert``. If this is attempted, the ioctl fails
   with ``EINVAL``.
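
The all-or-nothing behaviour of an "all" wait can be illustrated with a
user-space model over plain counters. This is a sketch only:
``model_acquire_all`` is a hypothetical name, the objects are modelled
as semaphore-like counters, and the real driver performs the check and
the acquisition as one atomic step inside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of an "all" wait over semaphore-like counters.
 * Either every counter is decremented, or none is touched.
 */
static bool model_acquire_all(uint32_t *const *counts, size_t n)
{
    size_t i, j;

    assert(counts != NULL || n == 0);

    /* Reject duplicates, mirroring the EINVAL rule for "all" waits. */
    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (counts[i] == counts[j])
                return false;

    /* First pass: every object must be signaled at the same time. */
    for (i = 0; i < n; i++)
        if (*counts[i] == 0)
            return false;

    /* Second pass: acquire them all. */
    for (i = 0; i < n; i++)
        (*counts[i])--;
    return true;
}
```

In the model, a failed attempt simply returns ``false`` and leaves every
counter unchanged; the real ioctl would instead sleep until all objects
are signaled or the timeout expires.
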