.\" Copyright (c) 2003, David G. Lawrence
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice unmodified, this list of conditions, and the following
.\"    disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd June 24, 2025
.Dt SENDFILE 2
.Os
.Sh NAME
.Nm sendfile
.Nd send a file to a socket
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In sys/types.h
.In sys/socket.h
.In sys/uio.h
.Ft int
.Fo sendfile
.Fa "int fd" "int s" "off_t offset" "size_t nbytes"
.Fa "struct sf_hdtr *hdtr" "off_t *sbytes" "int flags"
.Fc
.Sh DESCRIPTION
The
.Fn sendfile
system call
sends a regular file or shared memory object specified by descriptor
.Fa fd
out a stream socket specified by descriptor
.Fa s .
.Pp
The
.Fa offset
argument specifies where to begin in the file.
Should
.Fa offset
fall beyond the end of file, the system will return
success and report 0 bytes sent as described below.
The
.Fa nbytes
argument specifies how many bytes of the file should be sent,
with 0 having the special meaning of send until the end of file
has been reached.
.Pp
An optional header and/or trailer can be sent before and after the
file data by specifying a pointer to a
.Vt "struct sf_hdtr" ,
which has the following structure:
.Pp
.Bd -literal -offset indent -compact
struct sf_hdtr {
	struct iovec *headers;	/* pointer to header iovecs */
	int hdr_cnt;		/* number of header iovecs */
	struct iovec *trailers;	/* pointer to trailer iovecs */
	int trl_cnt;		/* number of trailer iovecs */
};
.Ed
.Pp
The
.Fa headers
and
.Fa trailers
pointers, if
.Pf non- Dv NULL ,
point to arrays of
.Vt "struct iovec"
structures.
See the
.Fn writev
system call for information on the iovec structure.
The number of iovecs in these
arrays is specified by
.Fa hdr_cnt
and
.Fa trl_cnt .
.Pp
If
.Pf non- Dv NULL ,
the system will write the total number of bytes sent on the socket to the
variable pointed to by
.Fa sbytes .
.Pp
The least significant 16 bits of the
.Fa flags
argument are a bitmap of these values:
.Bl -tag -offset indent -width "SF_USER_READAHEAD"
.It Dv SF_NODISKIO
This flag causes
.Nm
to return
.Er EBUSY
instead of blocking when a busy page is encountered.
This rare situation can happen if some other process is now working
with the same region of the file.
It is advised to retry the operation after a short period.
.Pp
Note that in older
.Fx
versions
.Dv SF_NODISKIO
had a slightly different meaning.
The flag prevented
.Nm
from running I/O operations when an invalid (not cached) page was encountered,
thus avoiding blocking on I/O.
Starting with
.Fx 11
.Nm
sending files off the
.Xr ffs 4
filesystem does not block on I/O
.Pq see Sx IMPLEMENTATION NOTES ,
so the condition no longer applies.
However, it is safe if an application utilizes
.Dv SF_NODISKIO
and on
.Er EBUSY
performs the same action as it did in
older
.Fx
versions, e.g.,
.Xr aio_read 2 ,
.Xr read 2
or
.Nm
in a different context.
.It Dv SF_NOCACHE
The data sent to the socket will not be cached by the virtual memory system,
and will be freed directly to the pool of free pages.
.It Dv SF_USER_READAHEAD
.Nm
has some internal heuristics to do readahead when sending data.
This flag forces
.Nm
to override any heuristically calculated readahead and use exactly the
application-specified readahead.
See
.Sx SETTING READAHEAD
for more details on readahead.
.El
.Pp
When using a socket marked for non-blocking I/O,
.Fn sendfile
may send fewer bytes than requested.
In this case, the number of bytes successfully
written is returned in
.Fa *sbytes
(if specified),
and the error
.Er EAGAIN
is returned.
.Sh SETTING READAHEAD
.Nm
uses internal heuristics based on request size and file system layout
to do readahead.
Additionally, an application may request extra readahead.
The most significant 16 bits of
.Fa flags
specify the number of pages that
.Nm
may read ahead when reading the file.
A macro
.Fn SF_FLAGS
is provided to combine the readahead amount and flags.
An example specifying a readahead of 16 pages and the
.Dv SF_NOCACHE
flag:
.Pp
.Bd -literal -offset indent -compact
	SF_FLAGS(16, SF_NOCACHE)
.Ed
.Pp
.Nm
will use either the application-specified readahead or the internally
calculated one, whichever is bigger.
Setting the
.Dv SF_USER_READAHEAD
flag turns off all heuristics and sets the maximum possible readahead
length to the number of pages specified via flags.
.Sh IMPLEMENTATION NOTES
The
.Fx
implementation of
.Fn sendfile
does not block on disk I/O when it sends a file off the
.Xr ffs 4
filesystem.
The syscall returns success before the actual I/O completes, and data
is put into the socket later unattended.
However, the order of data in the socket is preserved, so it is safe
to do further writes to the socket.
.Pp
The
.Fx
implementation of
.Fn sendfile
is "zero-copy", meaning that it has been optimized so that copying of
the file data is avoided.
.Sh TUNING
.Ss physical paging buffers
.Fn sendfile
uses the vnode pager to read file pages into memory.
The pager uses a pool of physical buffers to run its I/O operations.
When the system runs out of pbufs,
.Nm
will block and report state
.Dq Li zonelimit .
The size of the pool can be tuned with the
.Va vm.vnode_pbufs
.Xr loader.conf 5
tunable and can be checked with the
.Xr sysctl 8
OID of the same name at runtime.
.Ss sendfile(2) buffers
On some architectures, this system call internally uses a special
.Fn sendfile
buffer
.Pq Vt "struct sf_buf"
to handle sending file data to the client.
If the sending socket is
blocking, and there are not enough
.Fn sendfile
buffers available,
.Fn sendfile
will block and report a state of
.Dq Li sfbufa .
If the sending socket is non-blocking and there are not enough
.Fn sendfile
buffers available, the call will block and wait for the
necessary buffers to become available before finishing the call.
.Pp
The number of
.Vt sf_buf Ns 's
allocated should be proportional to the number of nmbclusters used to
send data to a client via
.Fn sendfile .
Tune accordingly to avoid blocking!
Busy installations that make extensive use of
.Fn sendfile
may want to increase these values to be in line with their
.Va kern.ipc.nmbclusters
(see
.Xr tuning 7
for details).
.Pp
The number of
.Fn sendfile
buffers available is determined at boot time by either the
.Va kern.ipc.nsfbufs
.Xr loader.conf 5
variable or the
.Dv NSFBUFS
kernel configuration tunable.
The number of
.Fn sendfile
buffers scales with
.Va kern.maxusers .
The
.Va kern.ipc.nsfbufsused
and
.Va kern.ipc.nsfbufspeak
read-only
.Xr sysctl 8
variables show current and peak
.Fn sendfile
buffer usage respectively.
These values may also be viewed through
.Nm netstat Fl m .
.Pp
If the
.Xr sysctl 8
OID
.Va kern.ipc.nsfbufs
does not exist, your architecture does not need to use
.Fn sendfile
buffers because their task can be efficiently performed
by the generic virtual memory structures.
.Sh RETURN VALUES
.Rv -std sendfile
.Sh ERRORS
.Bl -tag -width Er
.It Bq Er EAGAIN
The socket is marked for non-blocking I/O and not all data was sent due to
the socket buffer being filled.
If specified, the number of bytes successfully sent will be returned in
.Fa *sbytes .
.It Bq Er EBADF
The
.Fa fd
argument
is not a valid file descriptor.
.It Bq Er EBADF
The
.Fa s
argument
is not a valid socket descriptor.
.It Bq Er EBUSY
A busy page was encountered and
.Dv SF_NODISKIO
had been specified.
Partial data may have been sent.
.It Bq Er EFAULT
An invalid address was specified for an argument.
.It Bq Er EINTR
A signal interrupted
.Fn sendfile
before it could be completed.
If specified, the number
of bytes successfully sent will be returned in
.Fa *sbytes .
.It Bq Er EINVAL
The
.Fa fd
argument
is not a regular file.
.It Bq Er EINVAL
The
.Fa s
argument
is not a
.Dv SOCK_STREAM
type socket.
.It Bq Er EINVAL
The
.Fa offset
argument
is negative.
.It Bq Er EIO
An error occurred while reading from
.Fa fd .
.It Bq Er EINTEGRITY
Corrupted data was detected while reading from
.Fa fd .
.It Bq Er ENOTCAPABLE
The
.Fa fd
or the
.Fa s
argument has insufficient rights.
.It Bq Er ENOBUFS
The system was unable to allocate an internal buffer.
.It Bq Er ENOTCONN
The
.Fa s
argument
points to an unconnected socket.
.It Bq Er ENOTSOCK
The
.Fa s
argument
is not a socket.
.It Bq Er EOPNOTSUPP
The file system for descriptor
.Fa fd
does not support
.Fn sendfile .
.It Bq Er EPIPE
The socket peer has closed the connection.
.El
.Sh SEE ALSO
.Xr netstat 1 ,
.Xr open 2 ,
.Xr send 2 ,
.Xr socket 2 ,
.Xr writev 2 ,
.Xr loader.conf 5 ,
.Xr tuning 7 ,
.Xr sysctl 8
.Rs
.%A K. Elmeleegy
.%A A. Chanda
.%A A. L. Cox
.%A W. Zwaenepoel
.%T A Portable Kernel Abstraction for Low-Overhead Ephemeral Mapping Management
.%J The Proceedings of the 2005 USENIX Annual Technical Conference
.%P pp 223-236
.%D 2005
.Re
.Sh HISTORY
The
.Fn sendfile
system call
first appeared in
.Fx 3.0 .
This manual page first appeared in
.Fx 3.1 .
In
.Fx 10
support for sending shared memory descriptors was introduced.
In
.Fx 11
a non-blocking implementation was introduced.
.Sh AUTHORS
The initial implementation of the
.Fn sendfile
system call
and this manual page were written by
.An David G. Lawrence Aq Mt dg@dglawrence.com .
The
.Fx 11
implementation was written by
.An Gleb Smirnoff Aq Mt glebius@FreeBSD.org .
.Sh BUGS
The
.Fn sendfile
system call will not fail, i.e., return
.Dv -1
and set
.Va errno
to
.Er EFAULT ,
if provided an invalid address for
.Fa sbytes .
The
.Fn sendfile
system call does not support SCTP sockets;
it will return
.Dv -1
and set
.Va errno
to
.Er EINVAL .
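.Sh EXAMPLES
The following is a minimal sketch, not a definitive implementation, of
sending an entire regular file out a connected blocking stream socket.
It assumes that no headers or trailers are used, so the count stored in
.Fa *sbytes
covers file data only and can be added to the offset before retrying after
.Er EINTR .
The function name
.Fn send_whole_file
is hypothetical and chosen only for illustration.
.Pp
.Bd -literal -offset indent -compact
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#include <errno.h>

/*
 * Hypothetical helper: send all of fd, starting at offset 0, out the
 * connected blocking stream socket s.
 * Returns 0 on success, or -1 with errno as set by sendfile().
 */
static int
send_whole_file(int fd, int s)
{
	off_t offset = 0;
	off_t sent;

	for (;;) {
		sent = 0;
		/* An nbytes of 0 means send until end of file. */
		if (sendfile(fd, s, offset, 0, NULL, &sent, 0) == 0)
			return (0);
		if (errno != EINTR)
			return (-1);
		/* Interrupted: skip the bytes already sent and retry. */
		offset += sent;
	}
}
.Ed
.Pp
A caller using a socket marked for non-blocking I/O would additionally
need to handle
.Er EAGAIN ,
for example by advancing the offset in the same way and waiting for the
socket to become writable before retrying.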