.\"
.\" Copyright 2000 Massachusetts Institute of Technology
.\"
.\" Permission to use, copy, modify, and distribute this software and
.\" its documentation for any purpose and without fee is hereby
.\" granted, provided that both the above copyright notice and this
.\" permission notice appear in all copies, that both the above
.\" copyright notice and this permission notice appear in all
.\" supporting documentation, and that the name of M.I.T. not be used
.\" in advertising or publicity pertaining to distribution of the
.\" software without specific, written prior permission.  M.I.T. makes
.\" no representations about the suitability of this software for any
.\" purpose.  It is provided "as is" without express or implied
.\" warranty.
.\"
.\" THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''.  M.I.T. DISCLAIMS
.\" ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE,
.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT
.\" SHALL M.I.T. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
.\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
.\" ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd March 26, 2025
.Dt SHM_OPEN 2
.Os
.Sh NAME
.Nm memfd_create , shm_create_largepage , shm_open , shm_rename , shm_unlink
.Nd "shared memory object operations"
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/mman.h
.In fcntl.h
.Ft int
.Fn memfd_create "const char *name" "unsigned int flags"
.Ft int
.Fo shm_create_largepage
.Fa "const char *path"
.Fa "int flags"
.Fa "int psind"
.Fa "int alloc_policy"
.Fa "mode_t mode"
.Fc
.Ft int
.Fn shm_open "const char *path" "int flags" "mode_t mode"
.Ft int
.Fn shm_rename "const char *path_from" "const char *path_to" "int flags"
.Ft int
.Fn shm_unlink "const char *path"
.Sh DESCRIPTION
The
.Fn shm_open
function opens (or optionally creates) a
POSIX
shared memory object named
.Fa path .
The
.Fa flags
argument contains a subset of the flags used by
.Xr open 2 .
An access mode of either
.Dv O_RDONLY
or
.Dv O_RDWR
must be included in
.Fa flags .
The optional flags
.Dv O_CREAT ,
.Dv O_EXCL ,
.Dv O_TRUNC ,
and
.Dv O_CLOFORK
may also be specified.
.Pp
If
.Dv O_CREAT
is specified,
then a new shared memory object named
.Fa path
will be created if it does not exist.
In this case,
the shared memory object is created with mode
.Fa mode
subject to the process' umask value.
If both the
.Dv O_CREAT
and
.Dv O_EXCL
flags are specified and a shared memory object named
.Fa path
already exists,
then
.Fn shm_open
will fail with
.Er EEXIST .
.Pp
Newly created objects start off with a size of zero.
If an existing shared memory object is opened with
.Dv O_RDWR
and the
.Dv O_TRUNC
flag is specified,
then the shared memory object will be truncated to a size of zero.
The size of the object can be adjusted via
.Xr ftruncate 2
and queried via
.Xr fstat 2 .
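.Pp
As a minimal sketch of the creation and sizing rules above, the following
helpers create an object exclusively, grow it with
.Xr ftruncate 2 ,
and read the size back with
.Xr fstat 2 .
The names
.Fn shm_create_sized
and
.Fn shm_object_size
are hypothetical and are not part of the library; only interfaces described
in this page and in POSIX are assumed.

```c
/* Request POSIX.1-2008 declarations; only portable interfaces are used. */
#define _POSIX_C_SOURCE 200809L

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a new object at path, which must not
 * already exist, and size it to len bytes.
 * Returns the descriptor, or -1 with errno set.
 */
int
shm_create_sized(const char *path, off_t len)
{
	int fd, saved;

	/* O_EXCL makes shm_open() fail with EEXIST if path is taken. */
	fd = shm_open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return (-1);
	/* A new object starts at size zero, so it must be grown. */
	if (ftruncate(fd, len) < 0) {
		saved = errno;
		close(fd);
		shm_unlink(path);
		errno = saved;
		return (-1);
	}
	return (fd);
}

/* Hypothetical helper: report the current size of the object. */
off_t
shm_object_size(int fd)
{
	struct stat sb;

	if (fstat(fd, &sb) < 0)
		return (-1);
	return (sb.st_size);
}
```

A second call with the same path is expected to fail with
.Er EEXIST
until the object has been removed with
.Fn shm_unlink .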
.Pp
The new descriptor is set to close during
.Xr execve 2
system calls;
see
.Xr close 2
and
.Xr fcntl 2 .
.Pp
The constant
.Dv SHM_ANON
may be used for the
.Fa path
argument to
.Fn shm_open .
In this case, an anonymous, unnamed shared memory object is created.
Since the object has no name,
it cannot be removed via a subsequent call to
.Fn shm_unlink ,
or moved with a call to
.Fn shm_rename .
Instead,
the shared memory object will be garbage collected when the last reference to
the shared memory object is removed.
The shared memory object may be shared with other processes by sharing the
file descriptor via
.Xr fork 2
or
.Xr sendmsg 2 .
Attempting to open an anonymous shared memory object with
.Dv O_RDONLY
will fail with
.Er EINVAL .
All other flags are ignored.
.Pp
The
.Fn shm_create_largepage
function behaves similarly to
.Fn shm_open ,
except that the
.Dv O_CREAT
flag is implicitly specified, and the returned
.Dq largepage
object is always backed by aligned, physically contiguous chunks of memory.
This ensures that the object can be mapped using so-called
.Dq superpages ,
which can improve application performance in some workloads by reducing the
number of translation lookaside buffer (TLB) entries required to access a
mapping of the object,
and by reducing the number of page faults performed when accessing a mapping.
This happens automatically for all largepage objects.
.Pp
An existing largepage object can be opened using the
.Fn shm_open
function.
Largepage shared memory objects behave slightly differently from non-largepage
objects:
.Bl -bullet -offset indent
.It
Memory for a largepage object is allocated when the object is
extended using the
.Xr ftruncate 2
system call, whereas memory for regular shared memory objects is allocated
lazily and may be paged out to a swap device when not in use.
.It
The size of a mapping of a largepage object must be a multiple of the
underlying large page size.
Most attributes of such a mapping can only be modified at the granularity
of the large page size.
For example, when using
.Xr munmap 2
to unmap a portion of a largepage object mapping, or when using
.Xr mprotect 2
to adjust protections of a mapping of a largepage object, the starting address
must be large page size-aligned, and the length of the operation must be a
multiple of the large page size.
If not, the corresponding system call will fail and set
.Va errno
to
.Er EINVAL .
.El
.Pp
The
.Fa psind
argument to
.Fn shm_create_largepage
specifies the size of large pages used to back the object.
This argument is an index into the page sizes array returned by
.Xr getpagesizes 3 .
In particular, all large pages backing a largepage object must be of the
same size.
For example, on a system with large page sizes of 2MB and 1GB, a 2GB largepage
object will consist of either 1024 2MB pages, or 2 1GB pages, depending on
the value specified for the
.Fa psind
argument.
The
.Fa alloc_policy
parameter specifies what happens when an attempt to use
.Xr ftruncate 2
to allocate memory for the object fails.
The following values are accepted:
.Bl -tag -offset indent -width SHM_
.It Dv SHM_LARGEPAGE_ALLOC_DEFAULT
If the (non-blocking) memory allocation fails because there is insufficient free
contiguous memory, the kernel will attempt to defragment physical memory and
try another allocation.
The subsequent allocation may or may not succeed.
If this subsequent allocation also fails,
.Xr ftruncate 2
will fail and set
.Va errno
to
.Er ENOMEM .
.It Dv SHM_LARGEPAGE_ALLOC_NOWAIT
If the memory allocation fails,
.Xr ftruncate 2
will fail and set
.Va errno
to
.Er ENOMEM .
.It Dv SHM_LARGEPAGE_ALLOC_HARD
The kernel will attempt defragmentation until the allocation succeeds,
or an unblocked signal is delivered to the thread.
However, it is possible for physical memory to be fragmented such that the
allocation will never succeed.
.El
.Pp
The
.Dv FIOSSHMLPGCNF
and
.Dv FIOGSHMLPGCNF
.Xr ioctl 2
commands can be used with a largepage shared memory object to get and set
largepage object parameters.
Both commands operate on the following structure:
.Bd -literal
struct shm_largepage_conf {
	int psind;
	int alloc_policy;
};

.Ed
The
.Dv FIOGSHMLPGCNF
command populates this structure with the current values of these parameters,
while the
.Dv FIOSSHMLPGCNF
command modifies the largepage object.
Currently only the
.Va alloc_policy
parameter may be modified.
Internally,
.Fn shm_create_largepage
works by creating a regular shared memory object using
.Fn shm_open ,
and then converting it into a largepage object using the
.Dv FIOSSHMLPGCNF
ioctl command.
.Pp
The
.Fn shm_rename
system call atomically removes a shared memory object named
.Fa path_from
and relinks it at
.Fa path_to .
If another object is already linked at
.Fa path_to ,
that object will be unlinked, unless one of the following flags is provided:
.Bl -tag -offset indent -width Er
.It Dv SHM_RENAME_EXCHANGE
Atomically exchange the shared memory objects at
.Fa path_from
and
.Fa path_to .
.It Dv SHM_RENAME_NOREPLACE
Return an error if a shared memory object exists at
.Fa path_to ,
rather than unlinking it.
.El
.Pp
The
.Fn shm_unlink
system call removes a shared memory object named
.Fa path .
.Pp
The
.Fn memfd_create
function creates an anonymous shared memory object, identical to that created
by
.Fn shm_open
when
.Dv SHM_ANON
is specified.
Newly created objects start off with a size of zero.
The size of the new object must be adjusted via
.Xr ftruncate 2 .
.Pp
The
.Fa name
argument must not be
.Dv NULL ,
but it may be an empty string.
The length of the
.Fa name
argument may not exceed
.Dv NAME_MAX
minus six characters for the prefix
.Dq memfd: ,
which will be prepended.
The
.Fa name
argument is intended solely for debugging purposes and will never be used by the
kernel to identify a memfd.
Names are therefore not required to be unique.
.Pp
The following
.Fa flags
may be specified to
.Fn memfd_create :
.Bl -tag -width MFD_ALLOW_SEALING
.It Dv MFD_CLOEXEC
Set
.Dv FD_CLOEXEC
on the resulting file descriptor.
.It Dv MFD_ALLOW_SEALING
Allow adding seals to the resulting file descriptor using the
.Dv F_ADD_SEALS
.Xr fcntl 2
command.
.It Dv MFD_HUGETLB
Create a memfd backed by a
.Dq largepage
object.
One of the
.Dv MFD_HUGE_*
flags defined in
.In sys/mman.h
may be included to specify a fixed size.
If a specific size is not requested, the smallest supported large page size is
selected.
.Pp
The behavior documented above for the
.Fn shm_create_largepage
.Fa psind
argument also applies to largepage objects created by
.Fn memfd_create ,
and the
.Dv SHM_LARGEPAGE_ALLOC_DEFAULT
policy will always be used.
.El
.Sh RETURN VALUES
If successful,
.Fn memfd_create
and
.Fn shm_open
both return a non-negative integer,
and
.Fn shm_rename
and
.Fn shm_unlink
return zero.
All functions return -1 on failure, and set
.Va errno
to indicate the error.
.Sh COMPATIBILITY
The
.Fn shm_create_largepage
and
.Fn shm_rename
functions are
.Fx
extensions, as is support for the
.Dv SHM_ANON
value in
.Fn shm_open .
.Pp
The
.Fa path ,
.Fa path_from ,
and
.Fa path_to
arguments do not necessarily represent a pathname (although they do in
most other implementations).
Two processes opening the same
.Fa path
are guaranteed to access the same shared memory object if and only if
.Fa path
begins with a slash
.Pq Ql \&/
character.
.Pp
Only the
.Dv O_RDONLY ,
.Dv O_RDWR ,
.Dv O_CREAT ,
.Dv O_EXCL ,
and
.Dv O_TRUNC
flags may be used in portable programs.
.Pp
POSIX
specifications state that the result of using
.Xr open 2 ,
.Xr read 2 ,
or
.Xr write 2
on a shared memory object, or on the descriptor returned by
.Fn shm_open ,
is undefined.
However, the
.Fx
kernel implementation explicitly includes support for
.Xr read 2
and
.Xr write 2 .
.Pp
.Fx
also supports zero-copy transmission of data from shared memory
objects with
.Xr sendfile 2 .
.Pp
Neither shared memory objects nor their contents persist across reboots.
.Pp
Writes do not extend shared memory objects, so
.Xr ftruncate 2
must be called before any data can be written.
See
.Sx EXAMPLES .
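.Pp
The portability notes above can be combined into one sketch: a program that
uses only the portable flags, a
.Fa path
beginning with a slash, and
.Xr mmap 2
in place of
.Xr read 2
and
.Xr write 2 .
The helper name
.Fn shm_roundtrip
is hypothetical and is shown for illustration only; it is one possible
arrangement, not a prescribed one.

```c
/* Request POSIX.1-2008 declarations; only portable interfaces are used. */
#define _POSIX_C_SOURCE 200809L

#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: store msg in the named object through a shared
 * mapping, then read it back through a second, read-only descriptor.
 * Returns 0 if the text read back matches, -1 otherwise.
 */
int
shm_roundtrip(const char *path, const char *msg)
{
	size_t len = strlen(msg) + 1;
	char *wr, *rd;
	int fd, rfd, ret = -1;

	/* Portable flags only; path is assumed to begin with a slash. */
	fd = shm_open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return (-1);
	/* The object has size zero here, so extend it before use. */
	if (ftruncate(fd, (off_t)len) < 0)
		goto out;
	wr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (wr == MAP_FAILED)
		goto out;
	memcpy(wr, msg, len);

	/* A second open of the same path refers to the same object. */
	rfd = shm_open(path, O_RDONLY, 0);
	if (rfd >= 0) {
		rd = mmap(NULL, len, PROT_READ, MAP_SHARED, rfd, 0);
		if (rd != MAP_FAILED) {
			if (memcmp(rd, msg, len) == 0)
				ret = 0;
			munmap(rd, len);
		}
		close(rfd);
	}
	munmap(wr, len);
out:
	close(fd);
	shm_unlink(path);
	return (ret);
}
```

The sketch removes the object before returning, so a real consumer in another
process would need the name to stay linked for as long as it is needed.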
.Sh EXAMPLES
This example fails without the call to
.Xr ftruncate 2 :
.Bd -literal -compact

	uint8_t buffer[getpagesize()];
	ssize_t len;
	int fd;

	fd = shm_open(SHM_ANON, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		err(EX_OSERR, "%s: shm_open", __func__);
	if (ftruncate(fd, getpagesize()) < 0)
		err(EX_IOERR, "%s: ftruncate", __func__);
	len = pwrite(fd, buffer, getpagesize(), 0);
	if (len < 0)
		err(EX_IOERR, "%s: pwrite", __func__);
	if (len != getpagesize())
		errx(EX_IOERR, "%s: pwrite length mismatch", __func__);
.Ed
.Sh ERRORS
.Fn memfd_create
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EBADF
The
.Fa name
argument was NULL.
.It Bq Er EINVAL
The
.Fa name
argument was too long.
.Pp
An invalid or unsupported flag was included in
.Fa flags .
.It Bq Er EINVAL
A hugetlb mapping was requested, but
.Dv MFD_HUGETLB
was not specified in
.Fa flags .
.It Bq Er EMFILE
The process has already reached its limit for open file descriptors.
.It Bq Er ENFILE
The system file table is full.
.It Bq Er ENOSYS
.Dv MFD_HUGETLB
was specified in
.Fa flags ,
and this system does not support forced hugetlb mappings.
.It Bq Er EOPNOTSUPP
This system does not support the requested hugetlb page size.
.El
.Pp
.Fn shm_open
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EINVAL
A flag other than
.Dv O_RDONLY ,
.Dv O_RDWR ,
.Dv O_CREAT ,
.Dv O_EXCL ,
or
.Dv O_TRUNC
was included in
.Fa flags .
.It Bq Er EMFILE
The process has already reached its limit for open file descriptors.
.It Bq Er ENFILE
The system file table is full.
.It Bq Er EINVAL
.Dv O_RDONLY
was specified while creating an anonymous shared memory object via
.Dv SHM_ANON .
.It Bq Er EFAULT
The
.Fa path
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er EINVAL
The
.Fa path
does not begin with a slash
.Pq Ql \&/
character.
.It Bq Er ENOENT
.Dv O_CREAT
is not specified and the named shared memory object does not exist.
.It Bq Er EEXIST
.Dv O_CREAT
and
.Dv O_EXCL
are specified and the named shared memory object does exist.
.It Bq Er EACCES
The required permissions (for reading or reading and writing) are denied.
.It Bq Er ECAPMODE
The process is running in capability mode (see
.Xr capsicum 4 )
and attempted to create a named shared memory object.
.El
.Pp
.Fn shm_create_largepage
can fail for the reasons listed above.
It also fails with these error codes for the following conditions:
.Bl -tag -width Er
.It Bq Er ENOTTY
The kernel does not support large pages on the current platform.
.El
.Pp
The following errors are defined for
.Fn shm_rename :
.Bl -tag -width Er
.It Bq Er EFAULT
The
.Fa path_from
or
.Fa path_to
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er ENOENT
The shared memory object at
.Fa path_from
does not exist.
.It Bq Er EACCES
The required permissions are denied.
.It Bq Er EEXIST
A shared memory object exists at
.Fa path_to ,
and the
.Dv SHM_RENAME_NOREPLACE
flag was provided.
.El
.Pp
.Fn shm_unlink
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EFAULT
The
.Fa path
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er ENOENT
The named shared memory object does not exist.
.It Bq Er EACCES
The required permissions are denied.
.Fn shm_unlink
requires write permission to the shared memory object.
.El
.Sh SEE ALSO
.Xr posixshmcontrol 1 ,
.Xr close 2 ,
.Xr fstat 2 ,
.Xr ftruncate 2 ,
.Xr ioctl 2 ,
.Xr mmap 2 ,
.Xr munmap 2 ,
.Xr sendfile 2
.Sh STANDARDS
The
.Fn memfd_create
function is expected to be compatible with the Linux system call of the same
name.
.Pp
The
.Fn shm_open
and
.Fn shm_unlink
functions are believed to conform to
.St -p1003.1b-93 .
.Sh HISTORY
The
.Fn memfd_create
function appeared in
.Fx 13.0 .
.Pp
The
.Fn shm_open
and
.Fn shm_unlink
functions first appeared in
.Fx 4.3 .
The functions were reimplemented as system calls using shared memory objects
directly rather than files in
.Fx 8.0 .
.Pp
.Fn shm_rename
first appeared in
.Fx 13.0
as a
.Fx
extension.
.Sh AUTHORS
.An Garrett A. Wollman Aq Mt wollman@FreeBSD.org
(C library support and this manual page)
.Pp
.An Matthew Dillon Aq Mt dillon@FreeBSD.org
.Pq Dv MAP_NOSYNC
.Pp
.An Matthew Bryan Aq Mt matthew.bryan@isilon.com
.Pq Dv shm_rename implementation