===================
Setting up NFS/RDMA
===================

:Author:
  NetApp and Open Grid Computing (May 29, 2008)

.. warning::
  This document is probably obsolete.

Overview
========

This document describes how to install and set up the Linux NFS/RDMA client
and server software.

The NFS/RDMA client was first included in Linux 2.6.24. The NFS/RDMA server
was first included in the following release, Linux 2.6.25.

In our testing, we have obtained excellent performance results (full 10 Gbit
wire bandwidth at minimal client CPU) under many workloads. The code passes
the full Connectathon test suite and operates over both InfiniBand and iWARP
RDMA adapters.

Getting Help
============

If you get stuck, you can ask questions on the
nfs-rdma-devel@lists.sourceforge.net mailing list.

Installation
============

These instructions are a step-by-step guide to building a machine for
use with NFS/RDMA.

- Install an RDMA device

  Any device supported by the drivers in drivers/infiniband/hw is acceptable.

  Testing has been performed using several Mellanox-based IB cards, the
  Ammasso AMS1100 iWARP adapter, and the Chelsio cxgb3 iWARP adapter.

- Install a Linux distribution and tools

  The first kernel release to contain both the NFS/RDMA client and server was
  Linux 2.6.25. Therefore, a distribution compatible with this or a subsequent
  Linux kernel release should be installed.

  The procedures described in this document have been tested with
  distributions from Red Hat's Fedora Project (http://fedora.redhat.com/).

- Install nfs-utils-1.1.2 or greater on the client

  An NFS/RDMA mount point can be obtained by using the mount.nfs command in
  nfs-utils-1.1.2 or greater (nfs-utils-1.1.1 was the first nfs-utils
  version with support for NFS/RDMA mounts, but for various reasons we
  recommend using nfs-utils-1.1.2 or greater). To see which version of
  mount.nfs you are using, type:

  .. code-block:: sh

    $ /sbin/mount.nfs -V

  If the version is less than 1.1.2 or the command does not exist,
  you should install the latest version of nfs-utils.

  Download the latest package from: https://www.kernel.org/pub/linux/utils/nfs

  Uncompress the package and follow the installation instructions.

  If you will not need the idmapper and gssd executables (you do not need
  these to create an NFS/RDMA-enabled mount command), the installation
  process can be simplified by disabling these features when running
  configure:

  .. code-block:: sh

    $ ./configure --disable-gss --disable-nfsv4

  To build nfs-utils, you will need the tcp_wrappers package installed. For
  more information on this, see the package's README and INSTALL files.

  After building the nfs-utils package, there will be a mount.nfs binary in
  the utils/mount directory. This binary can be used to initiate NFS v2, v3,
  or v4 mounts. To initiate a v4 mount, the binary must be called
  mount.nfs4. The standard technique is to create a symlink named
  mount.nfs4 that points to mount.nfs.

  This mount.nfs binary should be installed at /sbin/mount.nfs as follows:

  .. code-block:: sh

    $ sudo cp utils/mount/mount.nfs /sbin/mount.nfs

  In this location, mount.nfs will be invoked automatically for NFS mounts
  by the system mount command.

  .. note::
    mount.nfs, and therefore nfs-utils-1.1.2 or greater, is only needed
    on the NFS client machine. You do not need this specific version of
    nfs-utils on the server. Furthermore, only the mount.nfs command from
    nfs-utils-1.1.2 is needed on the client.
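
  The version check above can also be scripted. What follows is a minimal
  sketch, not something shipped with nfs-utils: the helper name version_ge
  is hypothetical, the comparison relies on the -V option of GNU sort, and
  the script assumes that "mount.nfs -V" prints a dotted version number
  such as 1.1.2 somewhere in its output.

  .. code-block:: sh

    # Hypothetical helper: succeed if version $1 is greater than or
    # equal to version $2. Assumes GNU sort, whose -V option orders
    # version strings.
    version_ge() {
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
    }

    # Assumption: "mount.nfs -V" prints a dotted version number; take
    # the first one found. The result is empty if the command is missing.
    have=$(/sbin/mount.nfs -V 2>/dev/null | grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1)

    if [ -n "$have" ] && version_ge "$have" 1.1.2; then
        echo "mount.nfs $have is new enough for NFS/RDMA"
    else
        echo "mount.nfs is missing or older than 1.1.2"
    fi

  If the second message is printed, install a newer nfs-utils as described
  above.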

- Install a Linux kernel with NFS/RDMA

  The NFS/RDMA client and server are both included in the mainline Linux
  kernel version 2.6.25 and later. This and other versions of the Linux
  kernel can be found at: https://www.kernel.org/pub/linux/kernel/

  Download the sources and place them in an appropriate location.

- Configure the RDMA stack

  Make sure your kernel configuration has RDMA support enabled. Under
  Device Drivers -> InfiniBand support, update the kernel configuration
  to enable InfiniBand support [NOTE: the option name is misleading. Enabling
  InfiniBand support is required for all RDMA devices (IB, iWARP, etc.)].

  Enable the appropriate IB HCA support (mlx4, mthca, ehca, ipath, etc.) or
  iWARP adapter support (amso, cxgb3, etc.).

  If you are using InfiniBand, be sure to enable IP-over-InfiniBand support.

- Configure the NFS client and server

  Your kernel configuration must also have NFS file system support and/or
  NFS server support enabled. These and other NFS-related configuration
  options can be found under File Systems -> Network File Systems.
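
  Before building, the options from the two configuration steps above can
  be checked in the kernel configuration file. This is a minimal sketch:
  the function name check_rdma_config is hypothetical, and the symbol names
  (CONFIG_INFINIBAND, CONFIG_INFINIBAND_IPOIB, CONFIG_SUNRPC, CONFIG_NFS_FS
  and CONFIG_NFSD) are the ones used by mainline kernels and should be
  verified against your own tree.

  .. code-block:: sh

    # Hypothetical helper: report each option as y, m, or "not set".
    # $1 is the path to a kernel configuration file.
    check_rdma_config() {
        for opt in CONFIG_INFINIBAND CONFIG_INFINIBAND_IPOIB \
                   CONFIG_SUNRPC CONFIG_NFS_FS CONFIG_NFSD; do
            # Print the value only when the option is set to y or m.
            val=$(sed -n "s/^${opt}=\([ym]\)\$/\1/p" "$1")
            echo "${opt}: ${val:-not set}"
        done
    }

    # Run from the top of the kernel source tree after configuring it.
    if [ -f .config ]; then
        check_rdma_config .config
    fi

  Any option reported as "not set" that you need should be enabled before
  moving on to the build step.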

- Build, install, reboot

  The NFS/RDMA code will be enabled automatically if NFS and RDMA
  are turned on. The NFS/RDMA client and server are configured via the hidden
  SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The
  value of SUNRPC_XPRT_RDMA will be:

    #. N if either SUNRPC or INFINIBAND is N; in this case the NFS/RDMA client
       and server will not be built

    #. M if both SUNRPC and INFINIBAND are on (M or Y) and at least one is M;
       in this case the NFS/RDMA client and server will be built as modules

    #. Y if both SUNRPC and INFINIBAND are Y; in this case the NFS/RDMA client
       and server will be built into the kernel

  Therefore, if you have followed the steps above and turned on NFS and RDMA,
  the NFS/RDMA client and server will be built.

  Build a new kernel, install it, boot it.
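
  The three cases listed above for SUNRPC_XPRT_RDMA amount to taking the
  lower of the two settings, ordered N < M < Y. The sketch below only
  illustrates that rule: the function name xprt_rdma_value is hypothetical,
  it uses the lowercase n, m and y found in a .config file, and it is not
  how Kconfig itself is implemented.

  .. code-block:: sh

    # Hypothetical illustration of the rule above: the result is the
    # lower of the two settings, ordered n < m < y.
    # $1 is the SUNRPC setting, $2 is the INFINIBAND setting.
    xprt_rdma_value() {
        case "$1$2" in
            yy)       echo y ;;   # both built in
            ym|my|mm) echo m ;;   # both on, at least one is a module
            *)        echo n ;;   # at least one is off
        esac
    }

    xprt_rdma_value y m
    xprt_rdma_value y n

  The two calls print m and then n, matching the second and first cases in
  the list.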

Check RDMA and NFS Setup
========================

Before configuring the NFS/RDMA software, it is a good idea to test
your new kernel to ensure that it is working correctly.
In particular, verify that the RDMA stack is functioning as expected
and that standard NFS over TCP/IP and/or UDP/IP is working properly.

- Check RDMA Setup

  If you built the RDMA components as modules, load them at
  this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel
  card:

  .. code-block:: sh

    $ modprobe ib_mthca
    $ modprobe ib_ipoib

  If you are using InfiniBand, make sure there is a Subnet Manager (SM)
  running on the network. If your IB switch has an embedded SM, you can
  use it. Otherwise, you will need to run an SM, such as OpenSM, on one
  of your end nodes.

  If an SM is running on your network, you should see the following:

  .. code-block:: sh

    $ cat /sys/class/infiniband/driverX/ports/1/state
    4: ACTIVE

  where driverX is mthca0, ipath5, ehca3, etc.

  To further test the InfiniBand software stack, use IPoIB (this
  assumes you have two IB hosts named host1 and host2):

  .. code-block:: sh

    host1$ ip link set dev ib0 up
    host1$ ip address add dev ib0 a.b.c.x
    host2$ ip link set dev ib0 up
    host2$ ip address add dev ib0 a.b.c.y
    host1$ ping a.b.c.y
    host2$ ping a.b.c.x

  For other device types, follow the appropriate procedures.
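
  The port state check shown earlier in this step can be repeated for
  every RDMA device and port on the host. This is a minimal sketch that
  assumes the sysfs layout used in that example; the function name
  list_port_states and its optional directory argument are hypothetical
  and exist only so that the loop can be pointed at a different tree.

  .. code-block:: sh

    # Hypothetical helper: print "device/port: state" for each RDMA port.
    # $1 optionally overrides the sysfs directory that is scanned.
    list_port_states() {
        base=${1:-/sys/class/infiniband}
        for f in "$base"/*/ports/*/state; do
            [ -r "$f" ] || continue     # the glob matched nothing
            port=${f%/state}            # strip the trailing /state
            port=${port##*/}            # keep the port number
            dev=${f#"$base"/}           # strip the leading directory
            dev=${dev%%/*}              # keep the device name
            echo "$dev/$port: $(cat "$f")"
        done
    }

    list_port_states

  On an InfiniBand fabric with a running SM, each connected port should be
  reported as ACTIVE. No output means that no RDMA device was found.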

- Check NFS Setup

  For the NFS components enabled above (client and/or server),
  test their functionality over standard Ethernet using TCP/IP or UDP/IP.

NFS/RDMA Setup
==============

We recommend that you use two machines, one to act as the client and
one to act as the server.

One-time configuration:
-----------------------

- On the server system, configure the /etc/exports file and start the NFS/RDMA server.

  Export entries with the following formats have been tested::

    /vol0   192.168.0.47(fsid=0,rw,async,insecure,no_root_squash)
    /vol0   192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash)

  Each address is a client's IPoIB address for an InfiniBand HCA or a
  client's iWARP address for an RNIC; the second form covers a whole
  subnet of such clients.

  .. note::
    The "insecure" option must be used because the NFS/RDMA client does
    not use a reserved port.
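
  Since the note above requires the "insecure" option, it can help to look
  for export entries that lack it. The sketch below is a simplified check,
  not a full parser for the exports format: the function name
  exports_missing_insecure is hypothetical, and it assumes one export per
  line, no line continuations, and a single client specification per line.

  .. code-block:: sh

    # Hypothetical helper: print export lines whose options do not
    # include "insecure". $1 optionally overrides /etc/exports.
    # Simplification: comment and blank lines are skipped, and each
    # remaining line is treated as a single export entry.
    exports_missing_insecure() {
        file=${1:-/etc/exports}
        [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
        grep -Ev '^[[:space:]]*(#|$)' "$file" | grep -Ev '[(,]insecure[,)]'
    }

  Calling exports_missing_insecure with no argument checks /etc/exports.
  Each line it prints is an entry to review; the pattern matches
  "insecure" only as a whole option, so an entry that uses "secure" is
  still reported.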

Each time a machine boots:
--------------------------

- Load and configure the RDMA drivers

  For InfiniBand using a Mellanox adapter:

  .. code-block:: sh

    $ modprobe ib_mthca
    $ modprobe ib_ipoib
    $ ip link set dev ib0 up
    $ ip addr add dev ib0 a.b.c.d

  .. note::
    Please use unique addresses for the client and server!

- Start the NFS server

  If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
  kernel config), load the RDMA transport module:

  .. code-block:: sh

    $ modprobe svcrdma

  Regardless of how the server was built (module or built-in), start the
  server:

  .. code-block:: sh

    $ /etc/init.d/nfs start

  or

  .. code-block:: sh

    $ service nfs start

  Instruct the server to listen on the RDMA transport:

  .. code-block:: sh

    $ echo rdma 20049 > /proc/fs/nfsd/portlist
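
  To confirm that the server accepted the new listener, the same file can
  be read back. This is a minimal sketch that assumes the file lists one
  transport name and port per line, in the same form that was written to
  it; the function name has_rdma_listener is hypothetical.

  .. code-block:: sh

    # Hypothetical helper: succeed if an "rdma <port>" line is present.
    # $1 is the port to look for (default 20049); $2 optionally
    # overrides the portlist file.
    has_rdma_listener() {
        grep -qx "rdma ${1:-20049}" "${2:-/proc/fs/nfsd/portlist}" 2>/dev/null
    }

    if has_rdma_listener; then
        echo "nfsd is listening on the RDMA transport"
    else
        echo "no RDMA listener found"
    fi

  If no listener is found, check that the NFS server is running and that
  the RDMA transport module is loaded, then repeat the echo command above.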

- On the client system

  If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
  kernel config), load the RDMA client module:

  .. code-block:: sh

    $ modprobe xprtrdma

  Regardless of how the client was built (module or built-in), use this
  command to mount the NFS/RDMA server:

  .. code-block:: sh

    $ mount -o rdma,port=20049 <IPoIB-server-name-or-address>:/<export> /mnt

  To verify that the mount is using RDMA, run "cat /proc/mounts" and check
  the "proto" field for the given mount.

  Congratulations! You're using NFS/RDMA!
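
  To script the /proc/mounts check described above, something like the
  following can be used. It is a minimal sketch that assumes the usual
  layout of that file, in which the second field is the mount point, the
  third is the file system type and the fourth holds the comma-separated
  mount options; the function name rdma_mounts is hypothetical.

  .. code-block:: sh

    # Hypothetical helper: print the mount point of every NFS mount
    # whose options include proto=rdma.
    # $1 optionally overrides /proc/mounts.
    rdma_mounts() {
        awk '$3 ~ /^nfs/ && $4 ~ /(^|,)proto=rdma(,|$)/ { print $2 }' \
            "${1:-/proc/mounts}"
    }

    rdma_mounts

  An empty result means that no NFS mount on the client is currently using
  the RDMA transport.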