.\" Copyright (c) 2011-2014 Matteo Landi, Luigi Rizzo, Universita` di Pisa
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" This document is derived in part from the enet man page (enet.4)
.\" distributed with 4.3BSD Unix.
.\"
.\" $FreeBSD$
.\"
.Dd February 13, 2014
.Dt NETMAP 4
.Os
.Sh NAME
.Nm netmap
.Nd a framework for fast packet I/O
.br
.Nm VALE
.Nd a fast VirtuAl Local Ethernet using the netmap API
.br
.Nm netmap pipes
.Nd a shared memory packet transport channel
.Sh SYNOPSIS
.Cd device netmap
.Sh DESCRIPTION
.Nm
is a framework for extremely fast and efficient packet I/O
for both userspace and kernel clients.
It runs on FreeBSD and Linux,
and includes
.Nm VALE ,
a very fast and modular in-kernel software switch/dataplane,
and
.Nm netmap pipes ,
a shared memory packet transport channel.
All these are accessed interchangeably with the same API.
.Pp
.Nm , VALE
and
.Nm netmap pipes
are at least one order of magnitude faster than
standard OS mechanisms
(sockets, bpf, tun/tap interfaces, native switches, pipes),
reaching 14.88 million packets per second (Mpps)
with much less than one core on a 10 Gbit NIC,
about 20 Mpps per core for VALE ports,
and over 100 Mpps for netmap pipes.
.Pp
Userspace clients can dynamically switch NICs into
.Nm
mode and send and receive raw packets through
memory mapped buffers.
Similarly,
.Nm VALE
switch instances and ports, and
.Nm netmap pipes
can be created dynamically,
providing high speed packet I/O between processes,
virtual machines, NICs and the host stack.
.Pp
.Nm
supports both non-blocking I/O through
.Xr ioctl 2 ,
and synchronization and blocking I/O through a file descriptor
and standard OS mechanisms such as
.Xr select 2 ,
.Xr poll 2 ,
.Xr epoll 2
and
.Xr kqueue 2 .
.Nm VALE
and
.Nm netmap pipes
are implemented by a single kernel module, which also emulates the
.Nm
API over standard drivers for devices without native
.Nm
support.
For best performance,
.Nm
requires explicit support in device drivers.
.Pp
In the rest of this (long) manual page we document
various aspects of the
.Nm
and
.Nm VALE
architecture, features and usage.
.Pp
.Sh ARCHITECTURE
.Nm
supports raw packet I/O through a
.Em port ,
which can be connected to a physical interface
.Em ( NIC ) ,
to the host stack,
or to a
.Nm VALE
switch.
Ports use preallocated circular queues of buffers
.Em ( rings )
residing in an mmapped region.
There is one ring for each transmit/receive queue of a
NIC or virtual port.
An additional ring pair connects to the host stack.
.Pp
After binding a file descriptor to a port, a
.Nm
client can send or receive packets in batches through
the rings, and possibly implement zero-copy forwarding
between ports.
.Pp
All NICs operating in
.Nm
mode use the same memory region,
accessible to all processes that own
.Pa /dev/netmap
file descriptors bound to NICs.
Independent
.Nm VALE
and
.Nm netmap pipe
ports
by default use separate memory regions,
but can be independently configured to share memory.
.Pp
.Sh ENTERING AND EXITING NETMAP MODE
The following section describes the system calls to create
and control
.Nm netmap
ports (including
.Nm VALE
and
.Nm netmap pipe
ports).
Simpler, higher level functions are described in section
.Sx LIBRARIES .
.Pp
Ports and rings are created and controlled through a file descriptor,
created by opening a special device
.Dl fd = open("/dev/netmap", O_RDWR);
and then bound to a specific port with an
.Dl ioctl(fd, NIOCREGIF, (struct nmreq *)arg);
.Pp
.Nm
has multiple modes of operation controlled by the
.Vt struct nmreq
argument.
.Va arg.nr_name
specifies the port name, as follows:
.Bl -tag -width XXXX
.It Dv OS network interface name (e.g. 'em0', 'eth1', ... )
the data path of the NIC is disconnected from the host stack,
and the file descriptor is bound to the NIC (one or all queues),
or to the host stack;
.It Dv valeXXX:YYY (arbitrary XXX and YYY)
the file descriptor is bound to port YYY of a VALE switch called XXX,
both dynamically created if necessary.
The string cannot exceed IFNAMSIZ characters, and YYY cannot
be the name of any existing OS network interface.
.El
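.Pp
As an illustration of the length limit on VALE port names,
the following is a minimal sketch of how a client might compose
a name before copying it into
.Va nr_name .
The helper
.Fn sketch_vale_name
and the constant SKETCH_IFNAMSIZ are hypothetical and are not part of the
.Nm
API.
The value 16 is an assumption; real code should use IFNAMSIZ from
.In net/if.h .
The sketch conservatively reserves one byte for the string terminator.
.Bd -literal
```c
#include <assert.h>
#include <stdio.h>

/* Assumption: stand-in for IFNAMSIZ, which real code takes from <net/if.h>. */
#define SKETCH_IFNAMSIZ 16

/*
 * Hypothetical helper, not a netmap function: write "vale<sw>:<port>"
 * into dst.  Returns 0 on success, or -1 if the name (plus its
 * terminator) does not fit in SKETCH_IFNAMSIZ bytes.
 */
static int
sketch_vale_name(char dst[SKETCH_IFNAMSIZ], const char *sw, const char *port)
{
    int n;

    assert(dst != NULL && sw != NULL && port != NULL);
    n = snprintf(dst, SKETCH_IFNAMSIZ, "vale%s:%s", sw, port);
    if (n < 0 || n >= SKETCH_IFNAMSIZ)
        return (-1);
    return (0);
}
```
.Ed
.Pp
A caller would reject the request when the helper returns -1,
rather than pass a truncated name to NIOCREGIF.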
.Pp
On return,
.Va arg
indicates the size of the shared memory region,
and the number, size and location of all the
.Nm
data structures, which can be accessed by mmapping the memory
.Dl char *mem = mmap(0, arg.nr_memsize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
.Pp
Non-blocking I/O is done with special
.Xr ioctl 2 ;
.Xr select 2
and
.Xr poll 2
on the file descriptor permit blocking I/O.
.Xr epoll 2
and
.Xr kqueue 2
are not supported on
.Nm
file descriptors.
.Pp
While a NIC is in
.Nm
mode, the OS will still believe the interface is up and running.
OS-generated packets for that NIC end up in a
.Nm
ring, and another ring is used to send packets into the OS network stack.
A
.Xr close 2
on the file descriptor removes the binding,
and returns the NIC to normal mode (reconnecting the data path
to the host stack), or destroys the virtual port.
.Pp
.Sh DATA STRUCTURES
The data structures in the mmapped memory region are detailed in
.Pa sys/net/netmap.h ,
which is the ultimate reference for the
.Nm
API.
The main structures and fields are indicated below:
.Bl -tag -width XXX
.It Dv struct netmap_if (one per interface)
.Bd -literal
struct netmap_if {
    ...
    const uint32_t   ni_flags;      /* properties              */
    ...
    const uint32_t   ni_tx_rings;   /* NIC tx rings            */
    const uint32_t   ni_rx_rings;   /* NIC rx rings            */
    uint32_t         ni_bufs_head;  /* head of extra bufs list */
    ...
};
.Ed
.Pp
Indicates the number of available rings
.Pa ( struct netmap_ring )
and their position in the mmapped region.
The number of tx and rx rings
.Pa ( ni_tx_rings , ni_rx_rings )
normally depends on the hardware.
NICs also have an extra tx/rx ring pair connected to the host stack.
.Em NIOCREGIF
can also request additional unbound buffers in the same memory space,
to be used as temporary storage for packets.
.Pa ni_bufs_head
contains the index of the first of these free buffers,
which are connected in a list (the first uint32_t of each
buffer being the index of the next buffer in the list).
A 0 indicates the end of the list.
.Pp
.It Dv struct netmap_ring (one per ring)
.Bd -literal
struct netmap_ring {
    ...
    const uint32_t num_slots;   /* slots in each ring            */
    const uint32_t nr_buf_size; /* size of each buffer           */
    ...
    uint32_t       head;        /* (u) first buf owned by user   */
    uint32_t       cur;         /* (u) wakeup position           */
    const uint32_t tail;        /* (k) first buf owned by kernel */
    ...
    uint32_t       flags;
    struct timeval ts;          /* (k) time of last rxsync()     */
    ...
    struct netmap_slot slot[0]; /* array of slots                */
};
.Ed
.Pp
Implements transmit and receive rings, with read/write
pointers, metadata and an array of
.Pa slots
describing the buffers.
.Pp
.It Dv struct netmap_slot (one per buffer)
.Bd -literal
struct netmap_slot {
    uint32_t buf_idx;           /* buffer index                 */
    uint16_t len;               /* packet length                */
    uint16_t flags;             /* buf changed, etc.            */
    uint64_t ptr;               /* address for indirect buffers */
};
.Ed
.Pp
Describes a packet buffer, which normally is identified by
an index and resides in the mmapped region.
.It Dv packet buffers
Fixed size (normally 2 KB) packet buffers allocated by the kernel.
.El
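.Pp
The list of extra buffers rooted at
.Pa ni_bufs_head
can be walked as in the following minimal sketch.
It assumes, for illustration only, that the buffers are laid out
contiguously starting at a known base address; real code would
locate each buffer with NETMAP_BUF.
The function
.Fn sketch_count_extra_bufs
and its parameters are hypothetical and not part of the
.Nm
API.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: count the buffers on a free list whose first
 * index is 'head'.  'base' points at buffer 0, 'buf_size' is the size
 * of each buffer and 'nbufs' bounds the valid indexes.  Index 0 ends
 * the list.  Returns the count, or -1 on a bad index or a loop.
 */
static int
sketch_count_extra_bufs(const char *base, size_t buf_size,
    uint32_t nbufs, uint32_t head)
{
    uint32_t idx = head;
    uint32_t count = 0;

    assert(base != NULL && buf_size >= sizeof(uint32_t));
    while (idx != 0) {
        if (idx >= nbufs || count >= nbufs)
            return (-1);
        /* The first uint32_t of each buffer is the next index. */
        memcpy(&idx, base + (size_t)idx * buf_size, sizeof(idx));
        count++;
    }
    return ((int)count);
}
```
.Ed
.Pp
The bound on the count is only a guard against a corrupted list;
a well-formed list always ends with index 0.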
.Pp
The offset of the
.Pa struct netmap_if
in the mmapped region is indicated by the
.Pa nr_offset
field in the structure returned by
.Pa NIOCREGIF .
From there, all other objects are reachable through
relative references (offsets or indexes).
Macros and functions in
.In net/netmap_user.h
help convert them into actual pointers:
.Pp
.Dl struct netmap_if  *nifp = NETMAP_IF(mem, arg.nr_offset);
.Dl struct netmap_ring *txr = NETMAP_TXRING(nifp, ring_index);
.Dl struct netmap_ring *rxr = NETMAP_RXRING(nifp, ring_index);
.Pp
.Dl char *buf = NETMAP_BUF(ring, buffer_index);
.Sh RINGS, BUFFERS AND DATA I/O
.Va Rings
are circular queues of packets with three indexes/pointers
.Va ( head , cur , tail ) ;
one slot is always kept empty.
The ring size
.Va ( num_slots )
should not be assumed to be a power of two.
.br
(NOTE: older versions of netmap used head/count format to indicate
the content of a ring).
.Pp
.Va head
is the first slot available to userspace;
.br
.Va cur
is the wakeup point:
select/poll will unblock when
.Va tail
passes
.Va cur ;
.br
.Va tail
is the first slot reserved to the kernel.
.Pp
Slot indexes MUST only move forward;
for convenience, the function
.Dl nm_ring_next(ring, index)
returns the next index modulo the ring size.
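.Pp
The wrap-around rule can be sketched as follows.
This is a local analogue written for illustration, not the
library function itself;
.Fn sketch_ring_next
is a hypothetical name and takes the ring size directly
instead of a ring pointer.
.Bd -literal
```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the index that follows i in a ring of num_slots slots.
 * A comparison is used rather than a bit mask because the ring
 * size is not guaranteed to be a power of two.
 */
static uint32_t
sketch_ring_next(uint32_t num_slots, uint32_t i)
{
    assert(num_slots > 0 && i < num_slots);
    return ((i + 1 == num_slots) ? 0 : i + 1);
}
```
.Ed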
.Pp
.Va head
and
.Va cur
are only modified by the user program;
.Va tail
is only modified by the kernel.
The kernel only reads/writes the
.Vt struct netmap_ring
slots and buffers
during the execution of a netmap-related system call.
The only exceptions are slots (and buffers) in the range
.Va tail\  . . . head-1 ,
that are explicitly assigned to the kernel.
.Pp
.Ss TRANSMIT RINGS
On transmit rings, after a
.Nm
system call, slots in the range
.Va head\  . . . tail-1
are available for transmission.
User code should fill the slots sequentially
and advance
.Va head
and
.Va cur
past slots ready to transmit.
.Va cur
may be moved further ahead if the user code needs
more slots before further transmissions (see
.Sx SCATTER GATHER I/O ) .
.Pp
At the next NIOCTXSYNC/select()/poll(),
slots up to
.Va head-1
are pushed to the port, and
.Va tail
may advance if further slots have become available.
Below is an example of the evolution of a TX ring:
.Pp
.Bd -literal
    after the syscall, slots between cur and tail are (a)vailable
              head=cur   tail
               |          |
               v          v
     TX  [.....aaaaaaaaaaa.............]

    user creates new packets to (T)ransmit
                head=cur tail
                    |     |
                    v     v
     TX  [.....TTTTTaaaaaa.............]

    NIOCTXSYNC/poll()/select() sends packets and reports new slots
                head=cur      tail
                    |          |
                    v          v
     TX  [..........aaaaaaaaaaa........]
.Ed
.Pp
select() and poll() will block if there is no space in the ring, i.e.
.Dl ring->cur == ring->tail
and return when new slots have become available.
.Pp
High speed applications may want to amortize the cost of system calls
by preparing as many packets as possible before issuing them.
.Pp
A transmit ring with pending transmissions has
.Dl ring->head != ring->tail + 1 (modulo the ring size).
The function
.Va int nm_tx_pending(ring)
implements this test.
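.Pp
A minimal sketch of the same test, written against plain index
values instead of a ring pointer, is shown below.
The name
.Fn sketch_tx_pending
is hypothetical; the sketch only restates the condition given
above and is not the library implementation.
.Bd -literal
```c
#include <assert.h>
#include <stdint.h>

/*
 * Return nonzero if a transmit ring still has slots owned by the
 * kernel, i.e. if head differs from tail + 1 modulo the ring size.
 */
static int
sketch_tx_pending(uint32_t num_slots, uint32_t head, uint32_t tail)
{
    uint32_t next;

    assert(num_slots > 0 && head < num_slots && tail < num_slots);
    next = (tail + 1 == num_slots) ? 0 : tail + 1;
    return (next != head);
}
```
.Ed
.Pp
A program that wants to close the file descriptor only after
everything has been sent could keep issuing NIOCTXSYNC
until such a test reports no pending slots.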
.Pp
.Ss RECEIVE RINGS
On receive rings, after a
.Nm
system call, the slots in the range
.Va head\& . . . tail-1
contain received packets.
User code should process them and advance
.Va head
and
.Va cur
past slots it wants to return to the kernel.
.Va cur
may be moved further ahead if the user code wants to
wait for more packets
without returning all the previous slots to the kernel.
.Pp
At the next NIOCRXSYNC/select()/poll(),
slots up to
.Va head-1
are returned to the kernel for further receives, and
.Va tail
may advance to report new incoming packets.
.br
Below is an example of the evolution of an RX ring:
.Bd -literal
    after the syscall, there are some (h)eld and some (R)eceived slots
           head  cur     tail
            |     |       |
            v     v       v
     RX  [..hhhhhhRRRRRRRR..........]

    user advances head and cur, releasing some slots and holding others
               head cur  tail
                 |  |     |
                 v  v     v
     RX  [..*****hhhRRRRRR..........]

    NIOCRXSYNC/poll()/select() recovers slots and reports new packets
               head cur        tail
                 |  |           |
                 v  v           v
     RX  [.......hhhRRRRRRRRRRRR....]
.Ed
.Pp
.Sh SLOTS AND PACKET BUFFERS
Normally, packets should be stored in the netmap-allocated buffers
assigned to slots when ports are bound to a file descriptor.
One packet is fully contained in a single buffer.
.Pp
The following flags affect slot and buffer processing:
.Bl -tag -width XXX
.It NS_BUF_CHANGED
it MUST be used when the buf_idx in the slot is changed.
This can be used to implement
zero-copy forwarding, see
.Sx ZERO-COPY FORWARDING .
.Pp
.It NS_REPORT
reports when this buffer has been transmitted.
Normally,
.Nm
notifies transmit completions in batches, hence signals
can be delayed indefinitely.
This flag helps detect
when packets have been sent and a file descriptor can be closed.
.It NS_FORWARD
When a ring is in 'transparent' mode (see
.Sx TRANSPARENT MODE ) ,
packets marked with this flag are forwarded to the other endpoint
at the next system call, thus restoring (in a selective way)
the connection between a NIC and the host stack.
.It NS_NO_LEARN
tells the forwarding code that the SRC MAC address for this
packet must not be used in the learning bridge code.
.It NS_INDIRECT
indicates that the packet's payload is in a user-supplied buffer,
whose user virtual address is in the 'ptr' field of the slot.
The size can reach 65535 bytes.
.br
This is only supported on the transmit ring of
.Nm VALE
ports, and it helps reduce data copies in the interconnection
of virtual machines.
.It NS_MOREFRAG
indicates that the packet continues with subsequent buffers;
the last buffer in a packet must have the flag clear.
.El
.Sh SCATTER GATHER I/O
Packets can span multiple slots if the
.Va NS_MOREFRAG
flag is set in all but the last slot.
The maximum length of a chain is 64 buffers.
This is normally used with
.Nm VALE
ports when connecting virtual machines, as they generate large
TSO segments that are not split unless they reach a physical device.
.Pp
NOTE: The length field always refers to the individual
fragment; there is no place with the total length of a packet.
.Pp
On receive rings the macro
.Va NS_RFRAGS(slot)
indicates the remaining number of slots for this packet,
including the current one.
Slots with a value greater than 1 also have NS_MOREFRAG set.
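.Pp
Since no field holds the total length, a receiver has to add up
the fragments itself.
The following is a minimal sketch of that loop; the type
.Vt struct sketch_frag ,
the function
.Fn sketch_packet_len
and the flag value are hypothetical and only stand in for the
real slot structure and NS_MOREFRAG definition.
.Bd -literal
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder value for illustration; use the header definition. */
#define SKETCH_NS_MOREFRAG 0x0020

/* Hypothetical subset of the slot fields needed here. */
struct sketch_frag {
    uint16_t len;
    uint16_t flags;
};

/*
 * Sum the fragment lengths of the packet that starts at slot
 * 'first', walking forward until a slot without the MOREFRAG flag.
 * Returns the total length, or -1 if 'tail' is reached before the
 * last fragment (the packet is not complete yet).
 */
static long
sketch_packet_len(const struct sketch_frag *slots, uint32_t num_slots,
    uint32_t first, uint32_t tail)
{
    long total = 0;
    uint32_t i = first;

    assert(slots != NULL && num_slots > 0);
    assert(first < num_slots && tail < num_slots);
    while (i != tail) {
        total += slots[i].len;
        if ((slots[i].flags & SKETCH_NS_MOREFRAG) == 0)
            return (total);
        i = (i + 1 == num_slots) ? 0 : i + 1;
    }
    return (-1);
}
```
.Ed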
.Sh IOCTLS
.Nm
uses two ioctls (NIOCTXSYNC, NIOCRXSYNC)
for non-blocking I/O.
They take no argument.
Two more ioctls (NIOCGINFO, NIOCREGIF) are used
to query and configure ports, with the following argument:
.Bd -literal
struct nmreq {
    char      nr_name[IFNAMSIZ]; /* (i) port name                  */
    uint32_t  nr_version;        /* (i) API version                */
    uint32_t  nr_offset;         /* (o) nifp offset in mmap region */
    uint32_t  nr_memsize;        /* (o) size of the mmap region    */
    uint32_t  nr_tx_slots;       /* (i/o) slots in tx rings        */
    uint32_t  nr_rx_slots;       /* (i/o) slots in rx rings        */
    uint16_t  nr_tx_rings;       /* (i/o) number of tx rings       */
    uint16_t  nr_rx_rings;       /* (i/o) number of rx rings       */
    uint16_t  nr_ringid;         /* (i/o) ring(s) we care about    */
    uint16_t  nr_cmd;            /* (i) special command            */
    uint16_t  nr_arg1;           /* (i/o) extra arguments          */
    uint16_t  nr_arg2;           /* (i/o) extra arguments          */
    uint32_t  nr_arg3;           /* (i/o) extra arguments          */
    uint32_t  nr_flags;          /* (i/o) open mode                */
    ...
};
.Ed
.Pp
A file descriptor obtained through
.Pa /dev/netmap
also supports the ioctls supported by network devices, see
.Xr netintro 4 .
.Pp
.Bl -tag -width XXXX
.It Dv NIOCGINFO
returns EINVAL if the named port does not support netmap.
Otherwise, it returns 0 and (advisory) information
about the port.
Note that all the information below can change before the
interface is actually put in netmap mode.
.Pp
.Bl -tag -width XX
.It Pa nr_memsize
indicates the size of the
.Nm
memory region.
NICs in
.Nm
mode all share the same memory region,
whereas
.Nm VALE
ports have independent regions for each port.
.It Pa nr_tx_slots , nr_rx_slots
indicate the size of transmit and receive rings.
.It Pa nr_tx_rings , nr_rx_rings
indicate the number of transmit
and receive rings.
Both ring number and sizes may be configured at runtime
using interface-specific functions (e.g.
.Xr ethtool 8 ) .
.El
.It Dv NIOCREGIF
binds the port named in
.Va nr_name
to the file descriptor.
For a physical device this also switches it into
.Nm
mode, disconnecting
it from the host stack.
Multiple file descriptors can be bound to the same port,
with proper synchronization left to the user.
.Pp
.Dv NIOCREGIF
can also bind a file descriptor to one endpoint of a
.Em netmap pipe ,
consisting of two netmap ports with a crossover connection.
A netmap pipe shares the same memory space as the parent port,
and is meant to enable configurations where a master process acts
as a dispatcher towards slave processes.
.Pp
To enable this function, the
.Pa nr_arg1
field of the structure can be used as a hint to the kernel to
indicate how many pipes we expect to use, and reserve extra space
in the memory region.
.Pp
On return, it gives the same info as NIOCGINFO,
with
.Pa nr_ringid
and
.Pa nr_flags
indicating the identity of the rings controlled through the file
descriptor.
.Pp
.Va nr_flags
and
.Va nr_ringid
select which rings are controlled through this file descriptor.
Possible values of
.Va nr_flags
are listed below, together with the naming schemes
that application libraries (such as the
.Fn nm_open
function described below) can use to indicate the specific set of rings.
In the examples below, "netmap:foo" is any valid netmap port name.
.Pp
.Bl -tag -width XXXXX
.It NR_REG_ALL_NIC "netmap:foo"
(default) all hardware ring pairs;
.It NR_REG_SW "netmap:foo^"
the ``host rings'', connecting to the host stack;
.It NR_REG_NIC_SW "netmap:foo+"
all hardware rings and the host rings;
.It NR_REG_ONE_NIC "netmap:foo-i"
only the i-th hardware ring pair, where the number is in
.Va nr_ringid ;
.It NR_REG_PIPE_MASTER "netmap:foo{i"
the master side of the netmap pipe whose identifier (i) is in
.Va nr_ringid ;
.It NR_REG_PIPE_SLAVE "netmap:foo}i"
the slave side of the netmap pipe whose identifier (i) is in
.Va nr_ringid .
.Pp
The identifier of a pipe should be thought of as part of the pipe name,
and does not need to be sequential.
On return the pipe
will only have a single ring pair with index 0,
irrespective of the value of i.
.El
.Pp
By default, a
.Xr poll 2
or
.Xr select 2
call pushes out any pending packets on the transmit ring, even if
no write events are specified.
The feature can be disabled by or-ing
.Va NETMAP_NO_TX_POLL
into the value written to
.Va nr_ringid .
When this flag is set,
packets are transmitted only when
.Va ioctl(NIOCTXSYNC)
is called, or when select()/poll() is called with a write event
(POLLOUT/wfdset) or with a full ring.
.Pp
When registering a virtual interface that is dynamically created on a
.Xr vale 4
switch, the desired number of rings (1 by default,
and currently up to 16) can be specified using the
.Va nr_tx_rings
and
.Va nr_rx_rings
fields.
.It Dv NIOCTXSYNC
tells the hardware of new packets to transmit, and updates the
number of slots available for transmission.
.It Dv NIOCRXSYNC
tells the hardware of consumed packets, and asks for newly available
packets.
.El
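.Pp
The port-name suffixes listed under
.Dv NIOCREGIF
can be decoded with a few lines of C.
The following is only a minimal sketch of such a parser, not the code used by
.Fn nm_open :
the
.Vt enum reg_mode
values and the
.Fn parse_port_suffix
function are hypothetical names standing in for the
.Dv NR_REG_*
constants, and interface names that themselves contain one of the
suffix characters are not handled.
.Bd -literal
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the NR_REG_* values of nr_flags. */
enum reg_mode {
	REG_ALL_NIC, REG_SW, REG_NIC_SW,
	REG_ONE_NIC, REG_PIPE_MASTER, REG_PIPE_SLAVE
};

/*
 * Decode the suffix of a "netmap:foo..." port name.
 * Stores the ring or pipe identifier (0 if absent) in *id and
 * returns the mode, or -1 if the name lacks the "netmap:" prefix.
 */
int
parse_port_suffix(const char *name, int *id)
{
	const char *p;

	*id = 0;
	if (strncmp(name, "netmap:", 7) != 0)
		return (-1);
	/* Find the first suffix character after the prefix, if any. */
	p = strpbrk(name + 7, "^+-{}");
	if (p == NULL)
		return (REG_ALL_NIC);
	switch (*p) {
	case '^':
		return (REG_SW);
	case '+':
		return (REG_NIC_SW);
	case '-':
		*id = atoi(p + 1);
		return (REG_ONE_NIC);
	case '{':
		*id = atoi(p + 1);
		return (REG_PIPE_MASTER);
	default:	/* closing brace: slave side of a pipe */
		*id = atoi(p + 1);
		return (REG_PIPE_SLAVE);
	}
}
.Ed
.Pp
Under these assumptions, "netmap:foo{7" decodes to the master side of
pipe 7, and a plain "netmap:foo" selects all hardware ring pairs.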
.Sh SELECT, POLL, EPOLL, KQUEUE
.Xr select 2
and
.Xr poll 2
on a
.Nm
file descriptor process rings as indicated in
.Sx TRANSMIT RINGS
and
.Sx RECEIVE RINGS ,
respectively when write (POLLOUT) and read (POLLIN) events are requested.
Both block if no slots are available in the ring
.Va ( ring->cur == ring->tail ) .
Depending on the platform,
.Xr epoll 7
and
.Xr kqueue 2
are supported too.
.Pp
Packets in transmit rings are normally pushed out
(and buffers reclaimed) even without
requesting write events.
Passing the
.Dv NETMAP_NO_TX_POLL
flag to
.Dv NIOCREGIF
disables this feature.
By default, receive rings are processed only if read
events are requested.
Passing the
.Dv NETMAP_DO_RX_POLL
flag to
.Dv NIOCREGIF
updates receive rings even without read events.
Note that on epoll and kqueue,
.Dv NETMAP_NO_TX_POLL
and
.Dv NETMAP_DO_RX_POLL
only have an effect when some event is posted for the file descriptor.
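.Pp
The
.Va ring->cur == ring->tail
test mentioned earlier in this section can be illustrated in isolation.
The following is a minimal sketch that assumes a simplified
.Vt struct ring
holding only the index fields; it is a stand-in for the real
.Vt struct netmap_ring ,
and
.Fn ring_empty ,
.Fn ring_next
and
.Fn ring_space
are hypothetical names, not the helpers provided by the
.Nm
headers.
.Bd -literal
#include <stdint.h>

/* Simplified stand-in for the index fields of a netmap ring. */
struct ring {
	uint32_t num_slots;	/* total slots in the ring */
	uint32_t cur;		/* next slot the application uses */
	uint32_t tail;		/* first slot owned by the kernel */
};

/* No slot is available when cur has caught up with tail. */
int
ring_empty(const struct ring *r)
{
	return (r->cur == r->tail);
}

/* Index following i, wrapping to 0 at the end of the ring. */
uint32_t
ring_next(const struct ring *r, uint32_t i)
{
	return (i + 1 == r->num_slots ? 0 : i + 1);
}

/* Slots available between cur and tail, modulo the ring size. */
uint32_t
ring_space(const struct ring *r)
{
	return ((r->tail + r->num_slots - r->cur) % r->num_slots);
}
.Ed
.Pp
With 8 slots, cur equal to 6 and tail equal to 2, four slots
(6, 7, 0 and 1) are available; once cur reaches 2 the ring is empty
and a blocking call would sleep.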
.Sh LIBRARIES
The
.Nm
API is meant to be used directly, both because of its simplicity and
for efficient integration with applications.
.Pp
For convenience, the
.In net/netmap_user.h
header provides a few macros and functions to ease creating
a file descriptor and doing I/O with a
.Nm
port.
These are loosely modeled after the
.Xr pcap 3
API, to ease porting of libpcap-based applications to
.Nm .
To use these extra functions, programs should
.Dl #define NETMAP_WITH_LIBS
before
.Dl #include <net/netmap_user.h>
.Pp
The following functions are available:
.Bl -tag -width XXXXX
.It Va struct nm_desc * nm_open(const char *ifname, const struct nmreq *req, uint64_t flags, const struct nm_desc *arg)
similar to
.Fn pcap_open_live ,
binds a file descriptor to a port.
.Bl -tag -width XX
.It Va ifname
is a port name, in the form "netmap:XXX" for a NIC and "valeXXX:YYY" for a
.Nm VALE
port.
.It Va req
provides the initial values for the argument to the
.Dv NIOCREGIF
ioctl.
The nr_flags and nr_ringid values are overwritten by parsing
ifname and flags, and other fields can be overridden through
the other two arguments.
.It Va arg
points to a struct nm_desc containing arguments (e.g. from a previously
open file descriptor) that should override the defaults.
The fields are used as described below.
.It Va flags
can be set to a combination of the following flags:
.Va NETMAP_NO_TX_POLL ,
.Va NETMAP_DO_RX_POLL
(copied into nr_ringid);
.Va NM_OPEN_NO_MMAP
(if arg points to the same memory region,
avoids the mmap and uses the values from it);
.Va NM_OPEN_IFNAME
(ignores ifname and uses the values in arg);
.Va NM_OPEN_ARG1 ,
.Va NM_OPEN_ARG2 ,
.Va NM_OPEN_ARG3
(uses the fields from arg);
.Va NM_OPEN_RING_CFG
(uses the ring number and sizes from arg).
.El
.It Va int nm_close(struct nm_desc *d)
closes the file descriptor, unmaps memory, and frees resources.
.It Va int nm_inject(struct nm_desc *d, const void *buf, size_t size)
similar to pcap_inject(), pushes a packet to a ring, and returns the size
of the packet if successful, or 0 on error.
.It Va int nm_dispatch(struct nm_desc *d, int cnt, nm_cb_t cb, u_char *arg)
similar to pcap_dispatch(), applies a callback to incoming packets.
.It Va u_char * nm_nextpkt(struct nm_desc *d, struct nm_pkthdr *hdr)
similar to pcap_next(), fetches the next packet.
.El
.Sh SUPPORTED DEVICES
.Nm
natively supports the following devices:
.Pp
On FreeBSD:
.Xr em 4 ,
.Xr igb 4 ,
.Xr ixgbe 4 ,
.Xr lem 4 ,
.Xr re 4 .
.Pp
On Linux:
.Xr e1000 4 ,
.Xr e1000e 4 ,
.Xr igb 4 ,
.Xr ixgbe 4 ,
.Xr mlx4 4 ,
.Xr forcedeth 4 ,
.Xr r8169 4 .
.Pp
NICs without native support can still be used in
.Nm
mode through emulation.
Performance is inferior to native netmap
mode but still significantly higher than sockets, and approaching
that of in-kernel solutions such as Linux's
.Nm pktgen .
.Pp
Emulation is also available for devices with native netmap support,
which can be used for testing or performance comparison.
The sysctl variable
.Va dev.netmap.admode
globally controls how netmap mode is implemented.
.Sh SYSCTL VARIABLES AND MODULE PARAMETERS
Some aspects of the operation of
.Nm
are controlled through sysctl variables on FreeBSD
.Em ( dev.netmap.* )
and module parameters on Linux
.Em ( /sys/module/netmap_lin/parameters/* ) :
.Pp
.Bl -tag -width indent
.It Va dev.netmap.admode: 0
Controls the use of native or emulated adapter mode.
0 uses the best available option, 1 forces native and
fails if not available, 2 forces emulated and hence never fails.
.It Va dev.netmap.generic_ringsize: 1024
Ring size used for emulated netmap mode.
.It Va dev.netmap.generic_mit: 100000
Controls interrupt moderation for emulated mode.
.It Va dev.netmap.mmap_unreg: 0
.It Va dev.netmap.fwd: 0
Forces NS_FORWARD mode.
.It Va dev.netmap.flags: 0
.It Va dev.netmap.txsync_retry: 2
.It Va dev.netmap.no_pendintr: 1
Forces recovery of transmit buffers on system calls.
.It Va dev.netmap.mitigate: 1
Propagates interrupt mitigation to user processes.
.It Va dev.netmap.no_timestamp: 0
Disables the update of the timestamp in the netmap ring.
.It Va dev.netmap.verbose: 0
Verbose kernel messages.
.It Va dev.netmap.buf_num: 163840
.It Va dev.netmap.buf_size: 2048
.It Va dev.netmap.ring_num: 200
.It Va dev.netmap.ring_size: 36864
.It Va dev.netmap.if_num: 100
.It Va dev.netmap.if_size: 1024
Sizes and number of objects (netmap_if, netmap_ring, buffers)
for the global memory region.
The only parameter worth modifying is
.Va dev.netmap.buf_num
as it impacts the total amount of memory used by netmap.
.It Va dev.netmap.buf_curr_num: 0
.It Va dev.netmap.buf_curr_size: 0
.It Va dev.netmap.ring_curr_num: 0
.It Va dev.netmap.ring_curr_size: 0
.It Va dev.netmap.if_curr_num: 0
.It Va dev.netmap.if_curr_size: 0
Actual values in use.
.It Va dev.netmap.bridge_batch: 1024
Batch size used when moving packets across a
.Nm VALE
switch.
Values above 64 generally guarantee good
performance.
.El
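.Pp
To see why
.Va dev.netmap.buf_num
dominates memory usage, the default values listed above can be
combined into a rough estimate.
The following is a sketch under the assumption that the global region
is simply the sum of the three object pools, ignoring any rounding or
alignment done by the allocator;
.Fn netmap_mem_estimate
is a hypothetical helper, not part of
.Nm .
.Bd -literal
#include <stdint.h>

/*
 * Rough size in bytes of the global memory region, assuming it is
 * the plain sum of the buffer, ring and netmap_if pools.
 */
uint64_t
netmap_mem_estimate(uint64_t buf_num, uint64_t buf_size,
    uint64_t ring_num, uint64_t ring_size,
    uint64_t if_num, uint64_t if_size)
{
	return (buf_num * buf_size + ring_num * ring_size +
	    if_num * if_size);
}
.Ed
.Pp
With the default values, buffers account for 335544320 of about
343019520 bytes (320 MiB out of roughly 327 MiB).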
.Sh SYSTEM CALLS
.Nm
uses
.Xr select 2 ,
.Xr poll 2 ,
.Xr epoll 7
and
.Xr kqueue 2
to wake up processes when significant events occur, and
.Xr mmap 2
to map memory.
.Xr ioctl 2
is used to configure ports and
.Nm VALE
switches.
.Pp
Applications may need to create threads and bind them to
specific cores to improve performance, using standard
OS primitives, see
.Xr pthread 3 .
In particular,
.Xr pthread_setaffinity_np 3
may be of use.
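.Pp
Binding a thread to a core might look as follows.
This is only a sketch, assuming either FreeBSD or Linux with glibc;
.Fn pin_self_to_core
and the
.Dv CPU_SET_TYPE
macro are hypothetical names introduced here to paper over the
different type names used by the two systems.
.Bd -literal
#define _GNU_SOURCE		/* exposes the CPU_* macros on Linux */
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/cpuset.h>
#include <pthread_np.h>
#define CPU_SET_TYPE	cpuset_t
#else
#define CPU_SET_TYPE	cpu_set_t
#endif

/*
 * Pin the calling thread to one core.
 * Returns 0 on success or an errno value on failure.
 */
int
pin_self_to_core(int core)
{
	CPU_SET_TYPE set;

	if (core < 0 || core >= CPU_SETSIZE)
		return (EINVAL);
	CPU_ZERO(&set);
	CPU_SET(core, &set);
	return (pthread_setaffinity_np(pthread_self(), sizeof(set),
	    &set));
}
.Ed
.Pp
A typical arrangement would be one thread per ring pair, each thread
calling the helper once before entering its packet loop.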
.Sh CAVEATS
No matter how fast the CPU and OS are,
achieving line rate on 10G and faster interfaces
requires hardware with sufficient performance.
Several NICs are unable to sustain line rate with
small packet sizes.
Insufficient PCIe or memory bandwidth
can also cause reduced performance.
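.Pp
To put the small-packet case in numbers, the packet rate needed to
saturate a link can be computed from the frame size.
The following is a sketch assuming standard Ethernet framing, where
each frame carries 20 extra bytes on the wire (8 bytes of preamble and
start-of-frame delimiter plus a 12-byte inter-frame gap);
.Fn line_rate_pps
is a hypothetical helper, not part of
.Nm .
.Bd -literal
#include <stdint.h>

/* Per-frame wire overhead in bytes: preamble, SFD, inter-frame gap. */
#define ETH_WIRE_OVERHEAD	20

/*
 * Packets per second needed to fill a link of link_bps bits/s with
 * frames of frame_bytes bytes (Ethernet header and CRC included).
 */
uint64_t
line_rate_pps(uint64_t link_bps, uint32_t frame_bytes)
{
	uint64_t wire_bits;

	wire_bits = 8 * (uint64_t)(frame_bytes + ETH_WIRE_OVERHEAD);
	return (link_bps / wire_bits);
}
.Ed
.Pp
For a 10 Gbit/s link and 64-byte frames this gives about 14.88 million
packets per second, against roughly 0.81 million for 1518-byte frames.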
.Pp
Another frequent reason for low performance is the use
of flow control on the link: a slow receiver can limit
the transmit speed.
Be sure to disable flow control when running high
speed experiments.
.Ss SPECIAL NIC FEATURES
.Nm
is orthogonal to some NIC features such as
multiqueue, schedulers, and packet filters.
.Pp
Multiple transmit and receive rings are supported natively
and can be configured with ordinary OS tools,
such as
.Xr ethtool 8
or
device-specific sysctl variables.
The same goes for Receive Packet Steering (RPS)
and filtering of incoming traffic.
.Pp
.Nm
.Em does not use
features such as
.Em checksum offloading , TCP segmentation offloading ,
.Em encryption , VLAN encapsulation/decapsulation ,
etc.
When using netmap to exchange packets with the host stack,
make sure to disable these features.
.Sh EXAMPLES
.Ss TEST PROGRAMS
.Nm
comes with a few programs that can be used for testing or
simple applications.
See the
.Pa examples/
directory in
.Nm
distributions, or the
.Pa tools/tools/netmap/
directory in FreeBSD distributions.
.Pp
.Nm pkt-gen
is a general purpose traffic source/sink.
.Pp
As an example
.Dl pkt-gen -i ix0 -f tx -l 60
can generate an infinite stream of minimum size packets, and
.Dl pkt-gen -i ix0 -f rx
is a traffic sink.
Both print traffic statistics, to help monitor
how the system performs.
.Pp
.Nm pkt-gen
has many options that can be used to set packet sizes, addresses,
rates, and use multiple send/receive threads and cores.
.Pp
.Nm bridge
is another test program which interconnects two
.Nm
ports.
It can be used for transparent forwarding between
interfaces, as in
.Dl bridge -i ix0 -i ix1
or even to connect the NIC to the host stack using netmap
.Dl bridge -i ix0 -i ix0
.Ss USING THE NATIVE API
The following code implements a traffic generator:
.Pp
.Bd -literal -compact
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <strings.h>
#include <net/netmap_user.h>
\&...
void sender(void)
{
    struct netmap_if *nifp;
    struct netmap_ring *ring;
    struct nmreq nmr;
    struct pollfd fds;
    void *p;
    char *buf;
    uint32_t i;
    int fd;

    fd = open("/dev/netmap", O_RDWR);
    bzero(&nmr, sizeof(nmr));
    strcpy(nmr.nr_name, "ix0");
    nmr.nr_version = NETMAP_API;
    ioctl(fd, NIOCREGIF, &nmr);	/* bind ix0, switch it to netmap mode */
    p = mmap(NULL, nmr.nr_memsize, PROT_READ | PROT_WRITE,
	MAP_SHARED, fd, 0);
    nifp = NETMAP_IF(p, nmr.nr_offset);
    ring = NETMAP_TXRING(nifp, 0);
    fds.fd = fd;
    fds.events = POLLOUT;
    for (;;) {
	poll(&fds, 1, -1);	/* wait for free transmit slots */
	while (!nm_ring_empty(ring)) {
	    i = ring->cur;
	    buf = NETMAP_BUF(ring, ring->slot[i].buf_idx);
	    ... prepare packet in buf ...
	    ring->slot[i].len = ... packet length ...
	    ring->head = ring->cur = nm_ring_next(ring, i);
	}
    }
}
.Ed
.Ss HELPER FUNCTIONS
A simple receiver can be implemented using the helper functions:
.Bd -literal -compact
#define NETMAP_WITH_LIBS
#include <poll.h>
#include <net/netmap_user.h>
\&...
void receiver(void)
{
    struct nm_desc *d;
    struct pollfd fds;
    u_char *buf;
    struct nm_pkthdr h;
    ...
    d = nm_open("netmap:ix0", NULL, 0, NULL);
    fds.fd = NETMAP_FD(d);
    fds.events = POLLIN;
    for (;;) {
	poll(&fds, 1, -1);	/* wait for incoming packets */
	while ( (buf = nm_nextpkt(d, &h)) )
	    consume_pkt(buf, h.len);
    }
    nm_close(d);
}
.Ed
.Ss ZERO-COPY FORWARDING
Since physical interfaces share the same memory region,
it is possible to do packet forwarding between ports
by swapping buffers.
The buffer from the transmit ring is used
to replenish the receive ring:
.Bd -literal -compact
    uint32_t tmp;
    struct netmap_slot *src, *dst;
    ...
    src = &rxr->slot[rxr->cur];
    dst = &txr->slot[txr->cur];
    tmp = dst->buf_idx;
    dst->buf_idx = src->buf_idx;
    dst->len = src->len;
    dst->flags = NS_BUF_CHANGED;
    src->buf_idx = tmp;
    src->flags = NS_BUF_CHANGED;
    rxr->head = rxr->cur = nm_ring_next(rxr, rxr->cur);
    txr->head = txr->cur = nm_ring_next(txr, txr->cur);
    ...
.Ed
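.Pp
The swap can also be exercised on its own, without a NIC.
The following is a minimal sketch in which
.Vt struct slot
is a simplified stand-in for
.Vt struct netmap_slot ,
.Dv SLOT_BUF_CHANGED
is a placeholder for
.Dv NS_BUF_CHANGED
(its value here is an assumption), and
.Fn swap_slots
is a hypothetical helper name.
.Bd -literal
#include <stdint.h>

#define SLOT_BUF_CHANGED	0x0001	/* placeholder flag value */

/* Simplified stand-in for struct netmap_slot. */
struct slot {
	uint32_t buf_idx;	/* index of the packet buffer */
	uint16_t len;		/* length of the packet */
	uint16_t flags;		/* per-slot flags */
};

/*
 * Move the packet in the receive slot src to the transmit slot dst
 * by exchanging buffer indexes, so that no payload is copied.
 */
void
swap_slots(struct slot *src, struct slot *dst)
{
	uint32_t tmp = dst->buf_idx;

	dst->buf_idx = src->buf_idx;
	dst->len = src->len;
	dst->flags = SLOT_BUF_CHANGED;	/* the slot now has a new buffer */
	src->buf_idx = tmp;		/* old TX buffer refills the RX slot */
	src->flags = SLOT_BUF_CHANGED;
}
.Ed
.Pp
Only the two buffer indexes and the length move between slots, which
is why the cost of forwarding does not depend on the packet size.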
.Ss ACCESSING THE HOST STACK
The host stack is for all practical purposes just a regular ring pair,
which you can access with the netmap API, e.g. with
.Dl nm_open("netmap:eth0^", ... ) ;
All packets that the host would send to an interface in
.Nm
mode end up in the RX ring, whereas all packets queued to the
TX ring are sent up to the host stack.
.Ss VALE SWITCH
A simple way to test the performance of a
.Nm VALE
switch is to attach a sender and a receiver to it,
e.g. running the following in two different terminals:
.Dl pkt-gen -i vale1:a -f rx # receiver
.Dl pkt-gen -i vale1:b -f tx # sender
The same example can be used to test netmap pipes, by simply
changing port names, e.g.
.Dl pkt-gen -i vale:x{3 -f rx # receiver on the master side
.Dl pkt-gen -i vale:x}3 -f tx # sender on the slave side
.Pp
The following command attaches an interface and the host stack
to a switch:
.Dl vale-ctl -h vale2:em0
Other
.Nm
clients attached to the same switch can now communicate
with the network card or the host.
.Sh SEE ALSO
http://info.iet.unipi.it/~luigi/netmap/
.Pp
Luigi Rizzo, Revisiting network I/O APIs: the netmap framework,
Communications of the ACM, 55 (3), pp.45-51, March 2012
.Pp
Luigi Rizzo, netmap: a novel framework for fast packet I/O,
Usenix ATC'12, June 2012, Boston
.Pp
Luigi Rizzo, Giuseppe Lettieri,
VALE, a switched ethernet for virtual machines,
ACM CoNEXT'12, December 2012, Nice
.Pp
Luigi Rizzo, Giuseppe Lettieri, Vincenzo Maffione,
Speeding up packet I/O in virtual machines,
ACM/IEEE ANCS'13, October 2013, San Jose
.Sh AUTHORS
.An -nosplit
The
.Nm
framework was originally designed and implemented at the
Universita` di Pisa in 2011 by
.An Luigi Rizzo ,
and further extended with help from
.An Matteo Landi ,
.An Gaetano Catalli ,
.An Giuseppe Lettieri ,
.An Vincenzo Maffione .
.Pp
.Nm
and
.Nm VALE
have been funded by the European Commission within FP7 Projects
CHANGE (257422) and OPENLAB (287581).