.\"
.\" Copyright 2000 Massachusetts Institute of Technology
.\"
.\" Permission to use, copy, modify, and distribute this software and
.\" its documentation for any purpose and without fee is hereby
.\" granted, provided that both the above copyright notice and this
.\" permission notice appear in all copies, that both the above
.\" copyright notice and this permission notice appear in all
.\" supporting documentation, and that the name of M.I.T. not be used
.\" in advertising or publicity pertaining to distribution of the
.\" software without specific, written prior permission.  M.I.T. makes
.\" no representations about the suitability of this software for any
.\" purpose.  It is provided "as is" without express or implied
.\" warranty.
.\"
.\" THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''.  M.I.T. DISCLAIMS
.\" ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE,
.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT
.\" SHALL M.I.T. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
.\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
.\" ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd January 30, 2023
.Dt SHM_OPEN 2
.Os
.Sh NAME
.Nm memfd_create , shm_create_largepage , shm_open , shm_rename , shm_unlink
.Nd "shared memory object operations"
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/mman.h
.In fcntl.h
.Ft int
.Fn memfd_create "const char *name" "unsigned int flags"
.Ft int
.Fo shm_create_largepage
.Fa "const char *path"
.Fa "int flags"
.Fa "int psind"
.Fa "int alloc_policy"
.Fa "mode_t mode"
.Fc
.Ft int
.Fn shm_open "const char *path" "int flags" "mode_t mode"
.Ft int
.Fn shm_rename "const char *path_from" "const char *path_to" "int flags"
.Ft int
.Fn shm_unlink "const char *path"
.Sh DESCRIPTION
The
.Fn shm_open
function opens (or optionally creates) a
POSIX
shared memory object named
.Fa path .
The
.Fa flags
argument contains a subset of the flags used by
.Xr open 2 .
An access mode of either
.Dv O_RDONLY
or
.Dv O_RDWR
must be included in
.Fa flags .
The optional flags
.Dv O_CREAT ,
.Dv O_EXCL ,
and
.Dv O_TRUNC
may also be specified.
.Pp
If
.Dv O_CREAT
is specified,
then a new shared memory object named
.Fa path
will be created if it does not exist.
In this case,
the shared memory object is created with mode
.Fa mode
subject to the process' umask value.
If both the
.Dv O_CREAT
and
.Dv O_EXCL
flags are specified and a shared memory object named
.Fa path
already exists,
then
.Fn shm_open
will fail with
.Er EEXIST .
.Pp
Newly created objects start off with a size of zero.
If an existing shared memory object is opened with
.Dv O_RDWR
and the
.Dv O_TRUNC
flag is specified,
then the shared memory object will be truncated to a size of zero.
The size of the object can be adjusted via
.Xr ftruncate 2
and queried via
.Xr fstat 2 .
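.Pp
The creation and sizing rules described above can be sketched as follows.
This is a minimal illustration rather than a reference implementation:
the helper name
.Fn shm_resize_demo
and the object name supplied by the caller are hypothetical,
and only the portable flags are used.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a named object exclusively, record its
 * initial size, resize it with ftruncate(2), and record the size that
 * fstat(2) reports afterwards.
 * Returns 0 on success and -1 on failure (errno is left as set by the
 * failing call).  The object is unlinked again before returning.
 */
int
shm_resize_demo(const char *path, off_t newsize, off_t *before, off_t *after)
{
	struct stat sb;
	int fd, rv = -1;

	/* Remove any stale object left behind by an earlier run. */
	(void)shm_unlink(path);

	/* O_CREAT | O_EXCL fails with EEXIST if the name is already in use. */
	fd = shm_open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return (-1);

	/* A newly created object starts with a size of zero. */
	if (fstat(fd, &sb) < 0)
		goto out;
	*before = sb.st_size;

	/* Adjust the size, then query it again. */
	if (ftruncate(fd, newsize) < 0)
		goto out;
	if (fstat(fd, &sb) < 0)
		goto out;
	*after = sb.st_size;
	rv = 0;
out:
	(void)close(fd);
	(void)shm_unlink(path);
	return (rv);
}
```
.Ed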
.Pp
The new descriptor is set to close during
.Xr execve 2
system calls;
see
.Xr close 2
and
.Xr fcntl 2 .
.Pp
The constant
.Dv SHM_ANON
may be used for the
.Fa path
argument to
.Fn shm_open .
In this case, an anonymous, unnamed shared memory object is created.
Since the object has no name,
it cannot be removed via a subsequent call to
.Fn shm_unlink ,
or moved with a call to
.Fn shm_rename .
Instead,
the shared memory object will be garbage collected when the last reference to
the shared memory object is removed.
The shared memory object may be shared with other processes by sharing the
file descriptor via
.Xr fork 2
or
.Xr sendmsg 2 .
Attempting to open an anonymous shared memory object with
.Dv O_RDONLY
will fail with
.Er EINVAL .
All other flags are ignored.
.Pp
The
.Fn shm_create_largepage
function behaves similarly to
.Fn shm_open ,
except that the
.Dv O_CREAT
flag is implicitly specified, and the returned
.Dq largepage
object is always backed by aligned, physically contiguous chunks of memory.
This ensures that the object can be mapped using so-called
.Dq superpages ,
which can improve application performance in some workloads by reducing the
number of translation lookaside buffer (TLB) entries required to access a
mapping of the object,
and by reducing the number of page faults performed when accessing a mapping.
This happens automatically for all largepage objects.
.Pp
An existing largepage object can be opened using the
.Fn shm_open
function.
Largepage shared memory objects behave slightly differently from non-largepage
objects:
.Bl -bullet -offset indent
.It
Memory for a largepage object is allocated when the object is
extended using the
.Xr ftruncate 2
system call, whereas memory for regular shared memory objects is allocated
lazily and may be paged out to a swap device when not in use.
.It
The size of a mapping of a largepage object must be a multiple of the
underlying large page size.
Most attributes of such a mapping can only be modified at the granularity
of the large page size.
For example, when using
.Xr munmap 2
to unmap a portion of a largepage object mapping, or when using
.Xr mprotect 2
to adjust protections of a mapping of a largepage object, the starting address
must be large page size-aligned, and the length of the operation must be a
multiple of the large page size.
If not, the corresponding system call will fail and set
.Va errno
to
.Er EINVAL .
.El
.Pp
The
.Fa psind
argument to
.Fn shm_create_largepage
specifies the size of large pages used to back the object.
This argument is an index into the page sizes array returned by
.Xr getpagesizes 3 .
In particular, all large pages backing a largepage object must be of the
same size.
For example, on a system with large page sizes of 2MB and 1GB, a 2GB largepage
object will consist of either 1024 2MB pages, or 2 1GB pages, depending on
the value specified for the
.Fa psind
argument.
The
.Fa alloc_policy
parameter specifies what happens when an attempt to use
.Xr ftruncate 2
to allocate memory for the object fails.
The following values are accepted:
.Bl -tag -offset indent -width SHM_
.It Dv SHM_LARGEPAGE_ALLOC_DEFAULT
If the (non-blocking) memory allocation fails because there is insufficient free
contiguous memory, the kernel will attempt to defragment physical memory and
try another allocation.
The subsequent allocation may or may not succeed.
If this subsequent allocation also fails,
.Xr ftruncate 2
will fail and set
.Va errno
to
.Er ENOMEM .
.It Dv SHM_LARGEPAGE_ALLOC_NOWAIT
If the memory allocation fails,
.Xr ftruncate 2
will fail and set
.Va errno
to
.Er ENOMEM .
.It Dv SHM_LARGEPAGE_ALLOC_HARD
The kernel will attempt defragmentation until the allocation succeeds,
or an unblocked signal is delivered to the thread.
However, it is possible for physical memory to be fragmented such that the
allocation will never succeed.
.El
.Pp
The
.Dv FIOSSHMLPGCNF
and
.Dv FIOGSHMLPGCNF
.Xr ioctl 2
commands can be used with a largepage shared memory object to set and get
largepage object parameters.
Both commands operate on the following structure:
.Bd -literal
struct shm_largepage_conf {
	int psind;
	int alloc_policy;
};

.Ed
The
.Dv FIOGSHMLPGCNF
command populates this structure with the current values of these parameters,
while the
.Dv FIOSSHMLPGCNF
command modifies the largepage object.
Currently only the
.Va alloc_policy
parameter may be modified.
Internally,
.Fn shm_create_largepage
works by creating a regular shared memory object using
.Fn shm_open ,
and then converting it into a largepage object using the
.Dv FIOSSHMLPGCNF
ioctl command.
.Pp
The
.Fn shm_rename
system call atomically removes a shared memory object named
.Fa path_from
and relinks it at
.Fa path_to .
If another object is already linked at
.Fa path_to ,
that object will be unlinked, unless one of the following flags is provided:
.Bl -tag -offset indent -width Dv
.It Dv SHM_RENAME_EXCHANGE
Atomically exchange the shared memory objects at
.Fa path_from
and
.Fa path_to .
.It Dv SHM_RENAME_NOREPLACE
Return an error if a shared memory object exists at
.Fa path_to ,
rather than unlinking it.
.El
.Pp
The
.Fn shm_unlink
system call removes a shared memory object named
.Fa path .
.Pp
The
.Fn memfd_create
function creates an anonymous shared memory object, identical to that created
by
.Fn shm_open
when
.Dv SHM_ANON
is specified.
Newly created objects start off with a size of zero.
The size of the new object must be adjusted via
.Xr ftruncate 2 .
.Pp
The
.Fa name
argument must not be
.Dv NULL ,
but it may be an empty string.
The length of the
.Fa name
argument may not exceed
.Dv NAME_MAX
minus six characters for the prefix
.Dq memfd: ,
which will be prepended.
The
.Fa name
argument is intended solely for debugging purposes and will never be used by the
kernel to identify a memfd.
Names are therefore not required to be unique.
.Pp
The following
.Fa flags
may be specified to
.Fn memfd_create :
.Bl -tag -width MFD_ALLOW_SEALING
.It Dv MFD_CLOEXEC
Set
.Dv FD_CLOEXEC
on the resulting file descriptor.
.It Dv MFD_ALLOW_SEALING
Allow adding seals to the resulting file descriptor using the
.Dv F_ADD_SEALS
.Xr fcntl 2
command.
.It Dv MFD_HUGETLB
This flag is currently unsupported.
.El
.Sh RETURN VALUES
If successful,
.Fn memfd_create ,
.Fn shm_create_largepage ,
and
.Fn shm_open
return a non-negative integer,
and
.Fn shm_rename
and
.Fn shm_unlink
return zero.
All functions return -1 on failure, and set
.Va errno
to indicate the error.
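.Pp
The return value conventions above can be observed with a short sketch.
It is an illustration under stated assumptions, not a definitive test:
the structure
.Vt struct shm_rv ,
the helper
.Fn shm_return_values_demo ,
and the object name supplied by the caller are all hypothetical.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/mman.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical record of what each call returned. */
struct shm_rv {
	int create_fd;		/* first exclusive create */
	int excl_rv;		/* second exclusive create */
	int excl_errno;		/* errno after the second create */
	int unlink_rv;		/* first shm_unlink() */
	int unlink2_rv;		/* second shm_unlink() */
	int unlink2_errno;	/* errno after the second unlink */
};

/*
 * Hypothetical helper: run a fixed sequence of calls against one name
 * and report the raw return values and errno codes.
 */
struct shm_rv
shm_return_values_demo(const char *path)
{
	struct shm_rv r;

	/* Start from a clean slate in case the name is left over. */
	(void)shm_unlink(path);

	/* Success yields a non-negative descriptor. */
	r.create_fd = shm_open(path, O_RDWR | O_CREAT | O_EXCL, 0600);

	/* The name now exists, so an exclusive create is refused. */
	errno = 0;
	r.excl_rv = shm_open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	r.excl_errno = errno;
	if (r.excl_rv >= 0)
		(void)close(r.excl_rv);

	/* Removing an existing name returns zero. */
	r.unlink_rv = shm_unlink(path);

	/* Removing it a second time fails because it is gone. */
	errno = 0;
	r.unlink2_rv = shm_unlink(path);
	r.unlink2_errno = errno;

	if (r.create_fd >= 0)
		(void)close(r.create_fd);
	return (r);
}
```
.Ed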
.Sh COMPATIBILITY
The
.Fn shm_create_largepage
and
.Fn shm_rename
functions are
.Fx
extensions, as is support for the
.Dv SHM_ANON
value in
.Fn shm_open .
.Pp
The
.Fa path ,
.Fa path_from ,
and
.Fa path_to
arguments do not necessarily represent a pathname (although they do in
most other implementations).
Two processes opening the same
.Fa path
are guaranteed to access the same shared memory object if and only if
.Fa path
begins with a slash
.Pq Ql \&/
character.
.Pp
Only the
.Dv O_RDONLY ,
.Dv O_RDWR ,
.Dv O_CREAT ,
.Dv O_EXCL ,
and
.Dv O_TRUNC
flags may be used in portable programs.
.Pp
POSIX
specifications state that the result of using
.Xr open 2 ,
.Xr read 2 ,
or
.Xr write 2
on a shared memory object, or on the descriptor returned by
.Fn shm_open ,
is undefined.
However, the
.Fx
kernel implementation explicitly includes support for
.Xr read 2
and
.Xr write 2 .
.Pp
.Fx
also supports zero-copy transmission of data from shared memory
objects with
.Xr sendfile 2 .
.Pp
Neither shared memory objects nor their contents persist across reboots.
.Pp
Writes do not extend shared memory objects, so
.Xr ftruncate 2
must be called before any data can be written.
See
.Sx EXAMPLES .
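.Pp
The guarantee that a slash-prefixed
.Fa path
always names the same object can be illustrated with
.Xr mmap 2 .
The sketch below is a simplification: a single process opens the name twice,
standing in for two cooperating processes, and the helper name
.Fn shm_same_object_demo
and the object name supplied by the caller are hypothetical.
.Bd -literal
```c
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the same slash-prefixed name twice, map
 * both descriptors, store a byte through one mapping and read it back
 * through the other.
 * Returns 1 if the byte is visible, 0 if it is not, and -1 on error.
 */
int
shm_same_object_demo(const char *path, size_t len)
{
	void *a = MAP_FAILED, *b = MAP_FAILED;
	int fd1 = -1, fd2 = -1, rv = -1;

	/* Remove any stale object left behind by an earlier run. */
	(void)shm_unlink(path);

	fd1 = shm_open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd1 < 0)
		goto out;
	/* New objects are empty, so size the object before mapping it. */
	if (ftruncate(fd1, (off_t)len) < 0)
		goto out;

	/* A second open of the same name refers to the same object. */
	fd2 = shm_open(path, O_RDWR, 0);
	if (fd2 < 0)
		goto out;

	a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd1, 0);
	b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd2, 0);
	if (a == MAP_FAILED || b == MAP_FAILED)
		goto out;

	/* Store through the first mapping, load through the second. */
	((unsigned char *)a)[0] = 0x5a;
	rv = (((unsigned char *)b)[0] == 0x5a);
out:
	if (a != MAP_FAILED)
		(void)munmap(a, len);
	if (b != MAP_FAILED)
		(void)munmap(b, len);
	if (fd1 >= 0)
		(void)close(fd1);
	if (fd2 >= 0)
		(void)close(fd2);
	(void)shm_unlink(path);
	return (rv);
}
```
.Ed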
.Sh EXAMPLES
The following example would fail if the call to
.Xr ftruncate 2
were omitted:
.Bd -literal -compact
#include <sys/types.h>
#include <sys/mman.h>
#include <err.h>
#include <fcntl.h>
#include <stdint.h>
#include <sysexits.h>
#include <unistd.h>

int
main(void)
{
	uint8_t buffer[getpagesize()];
	ssize_t len;
	int fd;

	fd = shm_open(SHM_ANON, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		err(EX_OSERR, "%s: shm_open", __func__);
	/* Writes do not extend the object, so size it first. */
	if (ftruncate(fd, getpagesize()) < 0)
		err(EX_IOERR, "%s: ftruncate", __func__);
	len = pwrite(fd, buffer, getpagesize(), 0);
	if (len < 0)
		err(EX_IOERR, "%s: pwrite", __func__);
	if (len != getpagesize())
		errx(EX_IOERR, "%s: pwrite length mismatch", __func__);
	return (0);
}
.Ed
.Sh ERRORS
.Fn memfd_create
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EBADF
The
.Fa name
argument was NULL.
.It Bq Er EINVAL
The
.Fa name
argument was too long.
.Pp
An invalid or unsupported flag was included in
.Fa flags .
.It Bq Er EMFILE
The process has already reached its limit for open file descriptors.
.It Bq Er ENFILE
The system file table is full.
.It Bq Er ENOSYS
In
.Fn memfd_create ,
.Dv MFD_HUGETLB
was specified in
.Fa flags ,
and this system does not support forced hugetlb mappings.
.El
.Pp
.Fn shm_open
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EINVAL
A flag other than
.Dv O_RDONLY ,
.Dv O_RDWR ,
.Dv O_CREAT ,
.Dv O_EXCL ,
or
.Dv O_TRUNC
was included in
.Fa flags .
.It Bq Er EMFILE
The process has already reached its limit for open file descriptors.
.It Bq Er ENFILE
The system file table is full.
.It Bq Er EINVAL
.Dv O_RDONLY
was specified while creating an anonymous shared memory object via
.Dv SHM_ANON .
.It Bq Er EFAULT
The
.Fa path
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er EINVAL
The
.Fa path
does not begin with a slash
.Pq Ql \&/
character.
.It Bq Er ENOENT
.Dv O_CREAT
is not specified and the named shared memory object does not exist.
.It Bq Er EEXIST
.Dv O_CREAT
and
.Dv O_EXCL
are specified and the named shared memory object does exist.
.It Bq Er EACCES
The required permissions (for reading or reading and writing) are denied.
.It Bq Er ECAPMODE
The process is running in capability mode (see
.Xr capsicum 4 )
and attempted to create a named shared memory object.
.El
.Pp
.Fn shm_create_largepage
can fail for the reasons listed above.
It also fails with these error codes for the following conditions:
.Bl -tag -width Er
.It Bq Er ENOTTY
The kernel does not support large pages on the current platform.
.El
.Pp
The following errors are defined for
.Fn shm_rename :
.Bl -tag -width Er
.It Bq Er EFAULT
The
.Fa path_from
or
.Fa path_to
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er ENOENT
The shared memory object at
.Fa path_from
does not exist.
.It Bq Er EACCES
The required permissions are denied.
.It Bq Er EEXIST
An shm exists at
.Fa path_to ,
and the
.Dv SHM_RENAME_NOREPLACE
flag was provided.
.El
.Pp
.Fn shm_unlink
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EFAULT
The
.Fa path
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er ENOENT
The named shared memory object does not exist.
.It Bq Er EACCES
The required permissions are denied.
.Fn shm_unlink
requires write permission to the shared memory object.
.El
.Sh SEE ALSO
.Xr posixshmcontrol 1 ,
.Xr close 2 ,
.Xr fstat 2 ,
.Xr ftruncate 2 ,
.Xr ioctl 2 ,
.Xr mmap 2 ,
.Xr munmap 2 ,
.Xr sendfile 2
.Sh STANDARDS
The
.Fn memfd_create
function is expected to be compatible with the Linux system call of the same
name.
.Pp
The
.Fn shm_open
and
.Fn shm_unlink
functions are believed to conform to
.St -p1003.1b-93 .
.Sh HISTORY
The
.Fn memfd_create
function appeared in
.Fx 13.0 .
.Pp
The
.Fn shm_open
and
.Fn shm_unlink
functions first appeared in
.Fx 4.3 .
The functions were reimplemented as system calls using shared memory objects
directly rather than files in
.Fx 8.0 .
.Pp
.Fn shm_rename
first appeared in
.Fx 13.0
as a
.Fx
extension.
.Sh AUTHORS
.An Garrett A. Wollman Aq Mt wollman@FreeBSD.org
(C library support and this manual page)
.Pp
.An Matthew Dillon Aq Mt dillon@FreeBSD.org
.Pq Dv MAP_NOSYNC
.Pp
.An Matthew Bryan Aq Mt matthew.bryan@isilon.com
.Pq Dv shm_rename implementation