.\"
.\" Copyright 2000 Massachusetts Institute of Technology
.\"
.\" Permission to use, copy, modify, and distribute this software and
.\" its documentation for any purpose and without fee is hereby
.\" granted, provided that both the above copyright notice and this
.\" permission notice appear in all copies, that both the above
.\" copyright notice and this permission notice appear in all
.\" supporting documentation, and that the name of M.I.T. not be used
.\" in advertising or publicity pertaining to distribution of the
.\" software without specific, written prior permission.  M.I.T. makes
.\" no representations about the suitability of this software for any
.\" purpose.  It is provided "as is" without express or implied
.\" warranty.
.\"
.\" THIS SOFTWARE IS PROVIDED BY M.I.T. ``AS IS''.  M.I.T. DISCLAIMS
.\" ALL EXPRESS OR IMPLIED WARRANTIES WITH REGARD TO THIS SOFTWARE,
.\" INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT
.\" SHALL M.I.T. BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
.\" SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
.\" LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
.\" USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
.\" ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
.\" OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
.\" OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd March 26, 2025
.Dt SHM_OPEN 2
.Os
.Sh NAME
.Nm memfd_create , shm_create_largepage , shm_open , shm_rename , shm_unlink
.Nd "shared memory object operations"
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/mman.h
.In fcntl.h
.Ft int
.Fn memfd_create "const char *name" "unsigned int flags"
.Ft int
.Fo shm_create_largepage
.Fa "const char *path"
.Fa "int flags"
.Fa "int psind"
.Fa "int alloc_policy"
.Fa "mode_t mode"
.Fc
.Ft int
.Fn shm_open "const char *path" "int flags" "mode_t mode"
.Ft int
.Fn shm_rename "const char *path_from" "const char *path_to" "int flags"
.Ft int
.Fn shm_unlink "const char *path"
.Sh DESCRIPTION
The
.Fn shm_open
function opens (or optionally creates) a
POSIX
shared memory object named
.Fa path .
The
.Fa flags
argument contains a subset of the flags used by
.Xr open 2 .
An access mode of either
.Dv O_RDONLY
or
.Dv O_RDWR
must be included in
.Fa flags .
The optional flags
.Dv O_CREAT ,
.Dv O_EXCL ,
.Dv O_TRUNC ,
and
.Dv O_CLOFORK
may also be specified.
.Pp
If
.Dv O_CREAT
is specified,
then a new shared memory object named
.Fa path
will be created if it does not exist.
In this case,
the shared memory object is created with mode
.Fa mode
subject to the process' umask value.
If both the
.Dv O_CREAT
and
.Dv O_EXCL
flags are specified and a shared memory object named
.Fa path
already exists,
then
.Fn shm_open
will fail with
.Er EEXIST .
.Pp
Newly created objects start off with a size of zero.
If an existing shared memory object is opened with
.Dv O_RDWR
and the
.Dv O_TRUNC
flag is specified,
then the shared memory object will be truncated to a size of zero.
The size of the object can be adjusted via
.Xr ftruncate 2
and queried via
.Xr fstat 2 .
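.Pp
The create, resize, and query sequence described above can be sketched as
follows.
This is a minimal illustration rather than a reference implementation; the
helper name
.Fn create_and_size
and the object name used to call it are hypothetical.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create a new named object, grow it from zero to `size' bytes, and
 * return the size reported by fstat(2), or -1 on any failure.
 * The object is unlinked again before returning.
 */
static off_t
create_and_size(const char *path, off_t size)
{
	struct stat sb;
	off_t result = -1;
	int fd;

	/* O_EXCL makes the call fail with EEXIST if `path' is taken. */
	fd = shm_open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return (-1);
	/* A new object starts out empty, so it is sized before use. */
	if (fstat(fd, &sb) == 0 && sb.st_size == 0 &&
	    ftruncate(fd, size) == 0 && fstat(fd, &sb) == 0)
		result = sb.st_size;
	close(fd);
	shm_unlink(path);
	return (result);
}
```

A caller might use it as
.Li create_and_size("/example", 4096) ,
where the name is only a placeholder.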
.Pp
The new descriptor is set to close during
.Xr execve 2
system calls;
see
.Xr close 2
and
.Xr fcntl 2 .
.Pp
The constant
.Dv SHM_ANON
may be used for the
.Fa path
argument to
.Fn shm_open .
In this case, an anonymous, unnamed shared memory object is created.
Since the object has no name,
it cannot be removed via a subsequent call to
.Fn shm_unlink ,
or moved with a call to
.Fn shm_rename .
Instead,
the shared memory object will be garbage collected when the last reference to
the shared memory object is removed.
The shared memory object may be shared with other processes by sharing the
file descriptor via
.Xr fork 2
or
.Xr sendmsg 2 .
Attempting to open an anonymous shared memory object with
.Dv O_RDONLY
will fail with
.Er EINVAL .
All other flags are ignored.
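.Pp
As an illustration of sharing an anonymous object across
.Xr fork 2 ,
the sketch below maps the object, lets a child store a value, and reads it
back in the parent.
The helper names are hypothetical, and the fallback branch for systems
without
.Dv SHM_ANON
(create under a placeholder name, then unlink at once) is an assumption added
for portability, not something this page specifies.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>

/* Open an unnamed object; the #else branch is a non-atomic stand-in. */
static int
open_anon_shm(void)
{
#ifdef SHM_ANON
	return (shm_open(SHM_ANON, O_RDWR | O_CREAT, 0600));
#else
	int fd;

	fd = shm_open("/anon_example_tmp", O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd >= 0)
		shm_unlink("/anon_example_tmp");
	return (fd);
#endif
}

/* Return the value the child wrote into the shared mapping, or -1. */
static int
share_with_child(void)
{
	int fd, status, value = -1;
	int *p;
	pid_t pid;

	fd = open_anon_shm();
	if (fd < 0)
		return (-1);
	if (ftruncate(fd, sizeof(*p)) != 0) {
		close(fd);
		return (-1);
	}
	p = mmap(NULL, sizeof(*p), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping still references the object */
	if (p == MAP_FAILED)
		return (-1);
	*p = 0;
	pid = fork();
	if (pid == 0) {
		*p = 42;	/* child writes through the inherited mapping */
		_exit(0);
	}
	if (pid > 0 && waitpid(pid, &status, 0) == pid)
		value = *p;
	munmap(p, sizeof(*p));
	return (value);
}
```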
.Pp
The
.Fn shm_create_largepage
function behaves similarly to
.Fn shm_open ,
except that the
.Dv O_CREAT
flag is implicitly specified, and the returned
.Dq largepage
object is always backed by aligned, physically contiguous chunks of memory.
This ensures that the object can be mapped using so-called
.Dq superpages ,
which can improve application performance in some workloads by reducing the
number of translation lookaside buffer (TLB) entries required to access a
mapping of the object,
and by reducing the number of page faults performed when accessing a mapping.
This happens automatically for all largepage objects.
.Pp
An existing largepage object can be opened using the
.Fn shm_open
function.
Largepage shared memory objects behave slightly differently from non-largepage
objects:
.Bl -bullet -offset indent
.It
Memory for a largepage object is allocated when the object is
extended using the
.Xr ftruncate 2
system call, whereas memory for regular shared memory objects is allocated
lazily and may be paged out to a swap device when not in use.
.It
The size of a mapping of a largepage object must be a multiple of the
underlying large page size.
Most attributes of such a mapping can only be modified at the granularity
of the large page size.
For example, when using
.Xr munmap 2
to unmap a portion of a largepage object mapping, or when using
.Xr mprotect 2
to adjust protections of a mapping of a largepage object, the starting address
must be large page size-aligned, and the length of the operation must be a
multiple of the large page size.
If not, the corresponding system call will fail and set
.Va errno
to
.Er EINVAL .
.El
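.Pp
The alignment rule in the last item can be checked before calling
.Xr munmap 2
or
.Xr mprotect 2 .
The helper below is a hypothetical sketch of that arithmetic only; it assumes
the caller already knows the (non-zero) large page size backing the mapping.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if the range starting at `addr' with length `len' is
 * aligned to, and a multiple of, the large page size `lpsize'.
 */
static bool
largepage_range_ok(uintptr_t addr, size_t len, size_t lpsize)
{
	return (addr % lpsize == 0 && len % lpsize == 0);
}
```

With 2MB large pages, a 4MB range starting on a 2MB boundary passes, while a
range that starts 4KB past a boundary, or that is only 4KB long, does not.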
.Pp
The
.Fa psind
argument to
.Fn shm_create_largepage
specifies the size of large pages used to back the object.
This argument is an index into the page sizes array returned by
.Xr getpagesizes 3 .
In particular, all large pages backing a largepage object must be of the
same size.
For example, on a system with large page sizes of 2MB and 1GB, a 2GB largepage
object will consist of either 1024 2MB pages, or 2 1GB pages, depending on
the value specified for the
.Fa psind
argument.
The
.Fa alloc_policy
parameter specifies what happens when an attempt to use
.Xr ftruncate 2
to allocate memory for the object fails.
The following values are accepted:
.Bl -tag -offset indent -width SHM_
.It Dv SHM_LARGEPAGE_ALLOC_DEFAULT
If the (non-blocking) memory allocation fails because there is insufficient free
contiguous memory, the kernel will attempt to defragment physical memory and
try another allocation.
The subsequent allocation may or may not succeed.
If this subsequent allocation also fails,
.Xr ftruncate 2
will fail and set
.Va errno
to
.Er ENOMEM .
.It Dv SHM_LARGEPAGE_ALLOC_NOWAIT
If the memory allocation fails,
.Xr ftruncate 2
will fail and set
.Va errno
to
.Er ENOMEM .
.It Dv SHM_LARGEPAGE_ALLOC_HARD
The kernel will attempt defragmentation until the allocation succeeds,
or an unblocked signal is delivered to the thread.
However, it is possible for physical memory to be fragmented such that the
allocation will never succeed.
.El
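.Pp
One way to choose
.Fa psind
is to search the array filled in by
.Xr getpagesizes 3
for the wanted size.
The sketch below is illustrative only: both helper names are hypothetical, the
eight-element array bound is arbitrary, the search skips index 0 on the
assumption that it holds the base page size, and the
.Fx Ns -specific
calls are compiled only when
.Dv SHM_LARGEPAGE_ALLOC_NOWAIT
is defined.

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/* Return the index of `want' among the large page sizes, or -1. */
static int
find_psind(const size_t *sizes, int n, size_t want)
{
	/* Index 0 is taken to be the base page size, so start at 1. */
	for (int i = 1; i < n; i++)
		if (sizes[i] == want)
			return (i);
	return (-1);
}

/* Create a largepage object backed by 2MB pages, failing fast. */
static int
create_2m_largepage(const char *path)
{
#ifdef SHM_LARGEPAGE_ALLOC_NOWAIT
	size_t sizes[8];	/* arbitrary upper bound for this sketch */
	int n, psind;

	n = getpagesizes(sizes, 8);
	if (n < 0)
		return (-1);
	psind = find_psind(sizes, n, (size_t)2 << 20);
	if (psind < 0) {
		errno = EOPNOTSUPP;
		return (-1);
	}
	return (shm_create_largepage(path, O_RDWR, psind,
	    SHM_LARGEPAGE_ALLOC_NOWAIT, 0600));
#else
	(void)path;
	errno = ENOSYS;		/* no largepage support assumed here */
	return (-1);
#endif
}
```

Memory is only allocated once the returned descriptor is extended with
.Xr ftruncate 2 ,
so that call is where the chosen allocation policy takes effect.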
.Pp
The
.Dv FIOSSHMLPGCNF
and
.Dv FIOGSHMLPGCNF
.Xr ioctl 2
commands can be used with a largepage shared memory object to set and get
largepage object parameters, respectively.
Both commands operate on the following structure:
.Bd -literal
struct shm_largepage_conf {
	int psind;
	int alloc_policy;
};
.Ed
The
.Dv FIOGSHMLPGCNF
command populates this structure with the current values of these parameters,
while the
.Dv FIOSSHMLPGCNF
command modifies the largepage object.
Currently only the
.Va alloc_policy
parameter may be modified.
Internally,
.Fn shm_create_largepage
works by creating a regular shared memory object using
.Fn shm_open ,
and then converting it into a largepage object using the
.Dv FIOSSHMLPGCNF
ioctl command.
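.Pp
A read-modify-write of the allocation policy could look roughly like this.
The helper name is hypothetical, and the sketch assumes that
.In sys/ioctl.h
makes both command constants visible; where they are not defined it simply
fails with
.Er ENOSYS .

```c
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <errno.h>

/*
 * Switch an existing largepage object to the NOWAIT policy.
 * Returns 0 on success, or -1 with errno set.
 */
static int
set_nowait_policy(int fd)
{
#if defined(FIOGSHMLPGCNF) && defined(FIOSSHMLPGCNF)
	struct shm_largepage_conf conf;

	/* Fetch the current values so that psind is left unchanged. */
	if (ioctl(fd, FIOGSHMLPGCNF, &conf) != 0)
		return (-1);
	conf.alloc_policy = SHM_LARGEPAGE_ALLOC_NOWAIT;
	return (ioctl(fd, FIOSSHMLPGCNF, &conf));
#else
	(void)fd;
	errno = ENOSYS;
	return (-1);
#endif
}
```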
.Pp
The
.Fn shm_rename
system call atomically removes a shared memory object named
.Fa path_from
and relinks it at
.Fa path_to .
If another object is already linked at
.Fa path_to ,
that object will be unlinked, unless one of the following flags is provided:
.Bl -tag -offset indent -width Dv
.It Dv SHM_RENAME_EXCHANGE
Atomically exchange the shared memory objects at
.Fa path_from
and
.Fa path_to .
.It Dv SHM_RENAME_NOREPLACE
Return an error if a shared memory object exists at
.Fa path_to ,
rather than unlinking it.
.El
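.Pp
A common use of the no-replace flag is to fill an object under a temporary
name and then publish it under its final name without clobbering an existing
object.
The wrapper below is a hypothetical sketch of that pattern; on systems where
.Dv SHM_RENAME_NOREPLACE
is not defined it is assumed that no equivalent exists, and the call fails
with
.Er ENOSYS .

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Move the object at `tmp_path' to `final_path' unless something is
 * already linked there.  Returns 0 on success, or -1 with errno set.
 */
static int
publish_shm(const char *tmp_path, const char *final_path)
{
#ifdef SHM_RENAME_NOREPLACE
	return (shm_rename(tmp_path, final_path, SHM_RENAME_NOREPLACE));
#else
	(void)tmp_path;
	(void)final_path;
	errno = ENOSYS;
	return (-1);
#endif
}
```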
.Pp
The
.Fn shm_unlink
system call removes a shared memory object named
.Fa path .
.Pp
The
.Fn memfd_create
function creates an anonymous shared memory object, identical to that created
by
.Fn shm_open
when
.Dv SHM_ANON
is specified.
Newly created objects start off with a size of zero.
The size of the new object must be adjusted via
.Xr ftruncate 2 .
.Pp
The
.Fa name
argument must not be
.Dv NULL ,
but it may be an empty string.
The length of the
.Fa name
argument may not exceed
.Dv NAME_MAX
minus six characters for the prefix
.Dq memfd: ,
which will be prepended.
The
.Fa name
argument is intended solely for debugging purposes and will never be used by the
kernel to identify a memfd.
Names are therefore not required to be unique.
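.Pp
The name rules and the initial resize can be combined in a small wrapper.
This is a sketch under stated assumptions: the helper name and the
.Dv MEMFD_PREFIX_LEN
constant are hypothetical, and the
.Dv _GNU_SOURCE
definition is only there so that the same code also builds against glibc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* assumption: only needed when building with glibc */
#endif
#include <sys/types.h>
#include <sys/mman.h>
#include <limits.h>
#include <string.h>
#include <unistd.h>

#define MEMFD_PREFIX_LEN 6	/* length of the "memfd:" prefix */

/* Create a memfd of `size' bytes; return the descriptor or -1. */
static int
make_memfd(const char *name, off_t size)
{
	int fd;

	/* Reject names the page says are not acceptable. */
	if (name == NULL ||
	    strlen(name) > (size_t)(NAME_MAX - MEMFD_PREFIX_LEN))
		return (-1);
	fd = memfd_create(name, MFD_CLOEXEC);
	if (fd < 0)
		return (-1);
	/* The object starts out empty and must be sized explicitly. */
	if (ftruncate(fd, size) != 0) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```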
.Pp
The following
.Fa flags
may be specified to
.Fn memfd_create :
.Bl -tag -width MFD_ALLOW_SEALING
.It Dv MFD_CLOEXEC
Set
.Dv FD_CLOEXEC
on the resulting file descriptor.
.It Dv MFD_ALLOW_SEALING
Allow adding seals to the resulting file descriptor using the
.Dv F_ADD_SEALS
.Xr fcntl 2
command.
.It Dv MFD_HUGETLB
Create a memfd backed by a
.Dq largepage
object.
One of the
.Dv MFD_HUGE_*
flags defined in
.In sys/mman.h
may be included to specify a fixed size.
If a specific size is not requested, the smallest supported large page size is
selected.
.Pp
The behavior documented above for the
.Fn shm_create_largepage
.Fa psind
argument also applies to largepage objects created by
.Fn memfd_create ,
and the
.Dv SHM_LARGEPAGE_ALLOC_DEFAULT
policy will always be used.
.El
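.Pp
To illustrate
.Dv MFD_ALLOW_SEALING ,
the sketch below fixes the size of a memfd and then checks that a later resize
is refused.
The seal names
.Dv F_SEAL_GROW
and
.Dv F_SEAL_SHRINK
are taken from
.Xr fcntl 2
rather than from this page, and the helper name and the memfd name are
hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* assumption: only needed when building with glibc */
#endif
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Return 1 if resizing is refused after sealing, 0 if it still
 * succeeds, or -1 if the memfd could not be set up.
 */
static int
sealed_size_is_fixed(off_t size)
{
	int fd, refused;

	fd = memfd_create("sealed-example", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return (-1);
	if (ftruncate(fd, size) != 0 ||
	    fcntl(fd, F_ADD_SEALS, F_SEAL_GROW | F_SEAL_SHRINK) != 0) {
		close(fd);
		return (-1);
	}
	/* With both seals in place the size can no longer change. */
	refused = (ftruncate(fd, size * 2) != 0);
	close(fd);
	return (refused);
}
```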
.Sh RETURN VALUES
If successful,
.Fn memfd_create ,
.Fn shm_create_largepage ,
and
.Fn shm_open
return a non-negative integer file descriptor,
and
.Fn shm_rename
and
.Fn shm_unlink
return zero.
All functions return -1 on failure, and set
.Va errno
to indicate the error.
.Sh COMPATIBILITY
The
.Fn shm_create_largepage
and
.Fn shm_rename
functions are
.Fx
extensions, as is support for the
.Dv SHM_ANON
value in
.Fn shm_open .
.Pp
The
.Fa path ,
.Fa path_from ,
and
.Fa path_to
arguments do not necessarily represent a pathname (although they do in
most other implementations).
Two processes opening the same
.Fa path
are guaranteed to access the same shared memory object if and only if
.Fa path
begins with a slash
.Pq Ql \&/
character.
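.Pp
A program that wants this guarantee can validate names before use.
The check below is a hypothetical helper, not part of any library; the
1023-character limit is the one quoted for
.Er ENAMETOOLONG
in
.Sx ERRORS ,
and the helper must not be handed
.Dv SHM_ANON ,
which is not a string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SHM_PATH_MAX_CHARS 1023	/* limit quoted under ENAMETOOLONG */

/* Return true if `path' starts with a slash and is not too long. */
static bool
shm_path_is_shareable(const char *path)
{
	return (path != NULL && path[0] == '/' &&
	    strlen(path) <= SHM_PATH_MAX_CHARS);
}
```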
.Pp
Only the
.Dv O_RDONLY ,
.Dv O_RDWR ,
.Dv O_CREAT ,
.Dv O_EXCL ,
and
.Dv O_TRUNC
flags may be used in portable programs.
.Pp
POSIX
specifications state that the result of using
.Xr open 2 ,
.Xr read 2 ,
or
.Xr write 2
on a shared memory object, or on the descriptor returned by
.Fn shm_open ,
is undefined.
However, the
.Fx
kernel implementation explicitly includes support for
.Xr read 2
and
.Xr write 2 .
.Pp
.Fx
also supports zero-copy transmission of data from shared memory
objects with
.Xr sendfile 2 .
.Pp
Neither shared memory objects nor their contents persist across reboots.
.Pp
Writes do not extend shared memory objects, so
.Xr ftruncate 2
must be called before any data can be written.
See
.Sx EXAMPLES .
.Sh EXAMPLES
This example fails without the call to
.Xr ftruncate 2 :
.Bd -literal -compact

#include <sys/mman.h>
#include <err.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sysexits.h>
#include <unistd.h>

int
main(void)
{
        uint8_t buffer[getpagesize()];
        ssize_t len;
        int fd;

        /* Give the buffer defined contents before writing it out. */
        memset(buffer, 0, sizeof(buffer));
        fd = shm_open(SHM_ANON, O_RDWR | O_CREAT, 0600);
        if (fd < 0)
                err(EX_OSERR, "%s: shm_open", __func__);
        /* Writes never extend the object, so size it first. */
        if (ftruncate(fd, getpagesize()) < 0)
                err(EX_IOERR, "%s: ftruncate", __func__);
        len = pwrite(fd, buffer, getpagesize(), 0);
        if (len < 0)
                err(EX_IOERR, "%s: pwrite", __func__);
        if (len != getpagesize())
                errx(EX_IOERR, "%s: pwrite length mismatch", __func__);
        return (0);
}
.Ed
.Sh ERRORS
.Fn memfd_create
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EBADF
The
.Fa name
argument was
.Dv NULL .
.It Bq Er EINVAL
The
.Fa name
argument was too long.
.Pp
An invalid or unsupported flag was included in
.Fa flags .
.It Bq Er EINVAL
A hugetlb mapping was requested, but
.Dv MFD_HUGETLB
was not specified in
.Fa flags .
.It Bq Er EMFILE
The process has already reached its limit for open file descriptors.
.It Bq Er ENFILE
The system file table is full.
.It Bq Er ENOSYS
.Dv MFD_HUGETLB
was specified in
.Fa flags ,
and this system does not support forced hugetlb mappings.
.It Bq Er EOPNOTSUPP
This system does not support the requested hugetlb page size.
.El
.Pp
.Fn shm_open
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EINVAL
A flag other than
.Dv O_RDONLY ,
.Dv O_RDWR ,
.Dv O_CREAT ,
.Dv O_EXCL ,
.Dv O_TRUNC ,
or
.Dv O_CLOFORK
was included in
.Fa flags .
.It Bq Er EMFILE
The process has already reached its limit for open file descriptors.
.It Bq Er ENFILE
The system file table is full.
.It Bq Er EINVAL
.Dv O_RDONLY
was specified while creating an anonymous shared memory object via
.Dv SHM_ANON .
.It Bq Er EFAULT
The
.Fa path
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er EINVAL
The
.Fa path
does not begin with a slash
.Pq Ql \&/
character.
.It Bq Er ENOENT
.Dv O_CREAT
is not specified and the named shared memory object does not exist.
.It Bq Er EEXIST
.Dv O_CREAT
and
.Dv O_EXCL
are specified and the named shared memory object does exist.
.It Bq Er EACCES
The required permissions (for reading or reading and writing) are denied.
.It Bq Er ECAPMODE
The process is running in capability mode (see
.Xr capsicum 4 )
and attempted to create a named shared memory object.
.El
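.Pp
The
.Er EEXIST
case is often handled by falling back to opening the existing object.
The following is a minimal sketch of that pattern with a hypothetical helper
name; it does not close the window in which another process could unlink the
object between the two calls, in which case the second call fails with
.Er ENOENT .

```c
#include <sys/types.h>
#include <sys/mman.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/*
 * Create the object at `path', or attach to it if it already exists.
 * Sets *created accordingly and returns the descriptor, or -1.
 */
static int
create_or_attach(const char *path, bool *created)
{
	int fd;

	fd = shm_open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd >= 0) {
		*created = true;	/* caller should size the object */
		return (fd);
	}
	if (errno != EEXIST)
		return (-1);
	*created = false;
	return (shm_open(path, O_RDWR, 0));
}
```

Only the creator needs to call
.Xr ftruncate 2 ,
since a newly created object starts with a size of zero.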
.Pp
.Fn shm_create_largepage
can fail for the reasons listed above.
It also fails with these error codes for the following conditions:
.Bl -tag -width Er
.It Bq Er ENOTTY
The kernel does not support large pages on the current platform.
.El
.Pp
The following errors are defined for
.Fn shm_rename :
.Bl -tag -width Er
.It Bq Er EFAULT
The
.Fa path_from
or
.Fa path_to
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er ENOENT
The shared memory object at
.Fa path_from
does not exist.
.It Bq Er EACCES
The required permissions are denied.
.It Bq Er EEXIST
A shared memory object exists at
.Fa path_to ,
and the
.Dv SHM_RENAME_NOREPLACE
flag was provided.
.El
.Pp
.Fn shm_unlink
fails with these error codes for these conditions:
.Bl -tag -width Er
.It Bq Er EFAULT
The
.Fa path
argument points outside the process' allocated address space.
.It Bq Er ENAMETOOLONG
The entire pathname exceeds 1023 characters.
.It Bq Er ENOENT
The named shared memory object does not exist.
.It Bq Er EACCES
The required permissions are denied.
.Fn shm_unlink
requires write permission to the shared memory object.
.El
.Sh SEE ALSO
.Xr posixshmcontrol 1 ,
.Xr close 2 ,
.Xr fstat 2 ,
.Xr ftruncate 2 ,
.Xr ioctl 2 ,
.Xr mmap 2 ,
.Xr munmap 2 ,
.Xr sendfile 2
.Sh STANDARDS
The
.Fn memfd_create
function is expected to be compatible with the Linux system call of the same
name.
.Pp
The
.Fn shm_open
and
.Fn shm_unlink
functions are believed to conform to
.St -p1003.1b-93 .
.Sh HISTORY
The
.Fn memfd_create
function appeared in
.Fx 13.0 .
.Pp
The
.Fn shm_open
and
.Fn shm_unlink
functions first appeared in
.Fx 4.3 .
The functions were reimplemented as system calls using shared memory objects
directly rather than files in
.Fx 8.0 .
.Pp
.Fn shm_rename
first appeared in
.Fx 13.0
as a
.Fx
extension.
.Sh AUTHORS
.An Garrett A. Wollman Aq Mt wollman@FreeBSD.org
(C library support and this manual page)
.Pp
.An Matthew Dillon Aq Mt dillon@FreeBSD.org
.Pq Dv MAP_NOSYNC
.Pp
.An Matthew Bryan Aq Mt matthew.bryan@isilon.com
.Po
.Fn shm_rename
implementation
.Pc
646.Pq Dv shm_rename implementation
647