=========
RPC Cache
=========

This document gives a brief introduction to the caching
mechanisms in the sunrpc layer that is used, in particular,
for NFS authentication.

Caches
======

The caching replaces the old exports table and allows for
a wide variety of values to be cached.

There are a number of caches that are similar in structure though
quite possibly very different in content and use.  There is a corpus
of common code for managing these caches.

Examples of caches that are likely to be needed are:

  - mapping from IP address to client name
  - mapping from client name and filesystem to export options
  - mapping from UID to list of GIDs, to work around NFS's limitation
    of 16 gids.
  - mappings between local UID/GID and remote UID/GID for sites that
    do not have uniform uid assignment
  - mapping from network identity to public key for crypto authentication.

The common code handles such things as:

   - general cache lookup with correct locking
   - supporting 'NEGATIVE' as well as positive entries
   - allowing an EXPIRED time on cache items, and removing
     items after they expire and are no longer in use.
   - making requests to user-space to fill in cache entries
   - allowing user-space to directly set entries in the cache
   - delaying RPC requests that depend on as-yet incomplete
     cache entries, and replaying those requests when the cache entry
     is complete.
   - cleaning out old entries as they expire.

Creating a Cache
----------------

-  A cache needs a datum to store.  This is in the form of a
   structure definition that must contain a struct cache_head
   as an element, usually the first.
   It will also contain a key and some content.
   Each cache element is reference counted and contains
   expiry and update times for use in cache management.
-  A cache needs a "cache_detail" structure that
   describes the cache.  This stores the hash table, some
   parameters for cache management, and some operations detailing how
   to work with particular cache items.

   The operations are:

    struct cache_head \*alloc(void)
      This simply allocates appropriate memory and returns
      a pointer to the cache_head embedded within the
      structure.

    void cache_put(struct kref \*)
      This is called when the last reference to an item is
      dropped.  The pointer passed is to the 'ref' field
      in the cache_head.  cache_put should release any
      references created by 'cache_init' and, if CACHE_VALID
      is set, any references created by cache_update.
      It should then release the memory allocated by
      'alloc'.

    int match(struct cache_head \*orig, struct cache_head \*new)
      Test if the keys in the two structures match.  Return
      1 if they do, 0 if they don't.

    void init(struct cache_head \*orig, struct cache_head \*new)
      Set the 'key' fields in 'new' from 'orig'.  This may
      include taking references to shared objects.

    void update(struct cache_head \*orig, struct cache_head \*new)
      Set the 'content' fields in 'new' from 'orig'.

    int cache_show(struct seq_file \*m, struct cache_detail \*cd, struct cache_head \*h)
      Optional.  Used to provide a /proc file that lists the
      contents of a cache.  This should show one item,
      usually on just one line.

    int cache_request(struct cache_detail \*cd, struct cache_head \*h, char \*\*bpp, int \*blen)
      Format a request to be sent to user-space for an item
      to be instantiated.  \*bpp is a buffer of size \*blen.
      \*bpp should be moved forward over the encoded message,
      and \*blen should be reduced to show how much free
      space remains.  Return 0 on success or <0 if not
      enough room or other problem.

    int cache_parse(struct cache_detail \*cd, char \*buf, int len)
      A message from user space has arrived to fill out a
      cache entry.  It is in 'buf' of length 'len'.
      cache_parse should parse this, find the item in the
      cache with sunrpc_cache_lookup_rcu, and update the item
      with sunrpc_cache_update.


-  A cache needs to be registered using cache_register().  This
   includes it on a list of caches that will be regularly
   cleaned to discard old data.
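
The shape of a datum and of the alloc, match, init and update operations
can be sketched in user-space C.  This is only an illustration: every
``toy_`` name, the field names of the stand-in head structure, and the
choice of an IP-address-to-client-name mapping are assumptions made for
the example, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for struct cache_head; the field names here are assumptions. */
struct toy_cache_head {
	int  refcount;       /* items are reference counted */
	long expiry_time;    /* used by cache management */
	long last_refresh;   /* time of the last update */
};

/* Recover the enclosing structure from a pointer to an embedded member. */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The datum: embedded head first, then the key, then the content. */
struct toy_ip_map {
	struct toy_cache_head h;
	char addr[48];       /* key: textual IP address */
	char client[64];     /* content: client name */
};

#define TO_IP_MAP(p) toy_container_of(p, struct toy_ip_map, h)

/* alloc: allocate a zeroed datum and return a pointer to its embedded head. */
static struct toy_cache_head *toy_alloc(void)
{
	struct toy_ip_map *im = calloc(1, sizeof(*im));

	return im ? &im->h : NULL;
}

/* match: return 1 if the keys of the two items are equal, 0 otherwise. */
static int toy_match(struct toy_cache_head *orig, struct toy_cache_head *new_h)
{
	return strcmp(TO_IP_MAP(orig)->addr, TO_IP_MAP(new_h)->addr) == 0;
}

/* init: set the key fields in new_h from orig. */
static void toy_init(struct toy_cache_head *orig, struct toy_cache_head *new_h)
{
	memcpy(TO_IP_MAP(new_h)->addr, TO_IP_MAP(orig)->addr,
	       sizeof(TO_IP_MAP(new_h)->addr));
}

/* update: set the content fields in new_h from orig. */
static void toy_update(struct toy_cache_head *orig, struct toy_cache_head *new_h)
{
	memcpy(TO_IP_MAP(new_h)->client, TO_IP_MAP(orig)->client,
	       sizeof(TO_IP_MAP(new_h)->client));
}
```

Placing the head first keeps the conversion between the head pointer and
the datum trivial, but the offset-based macro works wherever the head is
embedded.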

Using a cache
-------------

To find a value in a cache, call sunrpc_cache_lookup_rcu passing a pointer
to the cache_head in a sample item with the 'key' fields filled in.
This will be passed to ->match to identify the target entry.  If no
entry is found, a new entry will be created, added to the cache, and
marked as not containing valid data.
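
The find-or-create behaviour described above can be illustrated with a
small user-space model.  It is a sketch under simplifying assumptions: a
single chain instead of a hash table, an integer key, no locking or RCU,
and hypothetical ``toy_`` names throughout.

```c
#include <stdlib.h>

struct toy_item {
	struct toy_item *next;
	int key;
	int valid;   /* 0 until the content has been filled in */
	int value;   /* content */
};

struct toy_cache {
	struct toy_item *chain;   /* one chain only, for brevity */
};

/*
 * Return the item whose key matches 'sample'.  If there is none, add a
 * new item with the same key, marked as not containing valid data.
 */
static struct toy_item *toy_lookup(struct toy_cache *c,
				   const struct toy_item *sample)
{
	struct toy_item *it;

	for (it = c->chain; it; it = it->next)
		if (it->key == sample->key)   /* the role played by ->match */
			return it;

	it = calloc(1, sizeof(*it));          /* the role played by ->alloc */
	if (!it)
		return NULL;
	it->key = sample->key;                /* the role played by ->init */
	it->valid = 0;
	it->next = c->chain;
	c->chain = it;
	return it;
}
```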

The item returned is typically passed to cache_check which will check
if the data is valid, and may initiate an up-call to get fresh data.
cache_check will return -ENOENT if the entry is negative or if an
up-call is needed but not possible, -EAGAIN if an up-call is pending,
or 0 if the data is valid.
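
A caller typically branches on these three outcomes.  The sketch below
models only the return values listed above; the ``toy_`` state and action
names are hypothetical and do not correspond to kernel symbols.

```c
#include <errno.h>

/* Hypothetical states an entry can be in when it is checked. */
enum toy_state { TOY_VALID, TOY_NEGATIVE, TOY_PENDING, TOY_NO_UPCALL };

/* Map an entry state to the return value described in the text. */
static int toy_cache_check(enum toy_state s)
{
	switch (s) {
	case TOY_VALID:
		return 0;          /* data is valid */
	case TOY_PENDING:
		return -EAGAIN;    /* an up-call is pending */
	case TOY_NEGATIVE:
	case TOY_NO_UPCALL:
	default:
		return -ENOENT;    /* negative, or up-call not possible */
	}
}

/* What a caller might do with each result. */
enum toy_action { TOY_USE_ENTRY, TOY_DEFER_REQUEST, TOY_REJECT_REQUEST };

static enum toy_action toy_handle(int rv)
{
	if (rv == 0)
		return TOY_USE_ENTRY;
	if (rv == -EAGAIN)
		return TOY_DEFER_REQUEST;
	return TOY_REJECT_REQUEST;
}
```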

cache_check can be passed a "struct cache_req\*".  This structure is
typically embedded in the actual request and can be used to create a
deferred copy of the request (struct cache_deferred_req).  This is
done when the found cache item is not up to date, but there is reason to
believe that userspace might provide information soon.  When the cache
item does become valid, the deferred copy of the request will be
revisited (->revisit).  It is expected that this method will
reschedule the request for processing.

The value returned by sunrpc_cache_lookup_rcu can also be passed to
sunrpc_cache_update to set the content for the item.  A second item is
passed which should hold the content.  If the item found by _lookup
has valid data, then it is discarded and a new item is created.  This
saves any user of an item from worrying about content changing while
it is being inspected.  If the item found by _lookup does not contain
valid data, then the content is copied across and CACHE_VALID is set.

Populating a cache
------------------

Each cache has a name, and when the cache is registered, a directory
with that name is created in /proc/net/rpc.

This directory contains a file called 'channel' which is a channel
for communicating between kernel and user for populating the cache.
This directory may later contain other files for interacting
with the cache.

The 'channel' works a bit like a datagram socket. Each 'write' is
passed as a whole to the cache for parsing and interpretation.
Each cache can treat the write requests differently, but it is
expected that a message written will contain:

  - a key
  - an expiry time
  - a content.

with the intention that an item in the cache with the given key
should be created or updated to have the given content, and the
expiry time should be set on that item.
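
As an illustration, a helper could build such a message as a single
line.  This is a sketch only: the function name, the field order (key,
expiry, content), and the use of an integer for the expiry time are
assumptions, since each cache defines its own message layout, and the
fields are assumed to contain no characters that need quoting.

```c
#include <stdio.h>

/*
 * Format "key expiry content\n" into buf.
 * Returns the message length, or -1 if it does not fit.
 */
static int toy_format_update(char *buf, size_t len, const char *key,
			     long expiry, const char *content)
{
	int n = snprintf(buf, len, "%s %ld %s\n", key, expiry, content);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}
```

The whole message would then be handed to the channel in one write, since
each write is parsed as a unit.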

Reading from a channel is a bit more interesting.  When a cache
lookup fails, or when it succeeds but finds an entry that may soon
expire, a request is lodged for that cache item to be updated by
user-space.  These requests appear in the channel file.

Successive reads will return successive requests.
If there are no more requests to return, read will return EOF, but a
select or poll for read will block waiting for another request to be
added.

Thus a user-space helper is likely to::

  open the channel.
    select for readable
    read a request
    write a response
  loop.

If it dies and needs to be restarted, any requests that have not been
answered will still appear in the file and will be read by the new
instance of the helper.

Each cache should define a "cache_parse" method which takes a message
written from user-space and processes it.  It should return an error
(which propagates back to the write syscall) or 0.

Each cache should also define a "cache_request" method which
takes a cache item and encodes a request into the buffer
provided.

.. note::
  If a cache has no active readers on the channel, and has had no
  active readers for more than 60 seconds, further requests will not be
  added to the channel but instead all lookups that do not find a valid
  entry will fail.  This is partly for backward compatibility: The
  previous nfs exports table was deemed to be authoritative and a
  failed lookup meant a definite 'no'.

Request/response format
-----------------------

While each cache is free to use its own format for requests
and responses over the channel, the following is recommended as
appropriate and support routines are available to help:
Each request or response record should be printable ASCII
with precisely one newline character which should be at the end.
Fields within the record should be separated by spaces, normally one.
If spaces, newlines, or nul characters are needed in a field they
must be quoted.  Two mechanisms are available:

-  If a field begins '\x' then it must contain an even number of
   hex digits, and pairs of these digits provide the bytes in the
   field.
-  Otherwise a \ in the field must be followed by 3 octal digits
   which give the code for a byte.  Other characters are treated
   as themselves.  At the very least, space, newline, nul, and
   '\' must be quoted in this way.
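
The two quoting mechanisms can be sketched in user-space C as follows.
These are not the kernel's support routines: ``toy_quote``,
``toy_unquote`` and ``toy_hexval`` are hypothetical names.  The encoder
follows the (bpp, blen) buffer convention described for cache_request and
escapes every non-printable byte, which is more than the minimum required
above.  The decoder handles a single field with no separators.

```c
#include <ctype.h>

/* Append one field at *bpp using 3-digit octal escapes; 0 or -1. */
static int toy_quote(char **bpp, int *blen,
		     const unsigned char *src, int srclen)
{
	char *bp = *bpp;
	int left = *blen;
	int i;

	for (i = 0; i < srclen; i++) {
		unsigned char c = src[i];

		if (c <= ' ' || c >= 0x7f || c == '\\') {
			if (left < 4)
				return -1;
			*bp++ = '\\';
			*bp++ = (char)('0' + ((c >> 6) & 3));
			*bp++ = (char)('0' + ((c >> 3) & 7));
			*bp++ = (char)('0' + (c & 7));
			left -= 4;
		} else {
			if (left < 1)
				return -1;
			*bp++ = (char)c;
			left--;
		}
	}
	*bpp = bp;       /* moved forward over the encoded field */
	*blen = left;    /* free space that remains */
	return 0;
}

static int toy_hexval(int c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = tolower(c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/* Decode one field into out; returns the byte count or -1. */
static int toy_unquote(const char *f, int flen,
		       unsigned char *out, int outsize)
{
	int i = 0, n = 0;

	if (flen >= 2 && f[0] == '\\' && f[1] == 'x') {
		if ((flen - 2) % 2)
			return -1;   /* odd number of hex digits */
		for (i = 2; i < flen; i += 2) {
			int hi = toy_hexval((unsigned char)f[i]);
			int lo = toy_hexval((unsigned char)f[i + 1]);

			if (hi < 0 || lo < 0 || n >= outsize)
				return -1;
			out[n++] = (unsigned char)((hi << 4) | lo);
		}
		return n;
	}
	while (i < flen) {
		if (n >= outsize)
			return -1;
		if (f[i] != '\\') {
			out[n++] = (unsigned char)f[i++];
			continue;
		}
		/* A backslash must be followed by three octal digits. */
		if (i + 3 >= flen ||
		    f[i + 1] < '0' || f[i + 1] > '3' ||
		    f[i + 2] < '0' || f[i + 2] > '7' ||
		    f[i + 3] < '0' || f[i + 3] > '7')
			return -1;
		out[n++] = (unsigned char)(((f[i + 1] - '0') << 6) |
					   ((f[i + 2] - '0') << 3) |
					   (f[i + 3] - '0'));
		i += 4;
	}
	return n;
}
```

The first octal digit is limited to 0-3 so that the decoded value always
fits in one byte.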