.. SPDX-License-Identifier: GPL-2.0

============
Fiemap Ioctl
============

The fiemap ioctl is an efficient method for userspace to get file
extent mappings. Instead of block-by-block mapping (such as bmap), fiemap
returns a list of extents.


Request Basics
--------------

A fiemap request is encoded within struct fiemap::

  struct fiemap {
	__u64	fm_start;	 /* logical offset (inclusive) at
				  * which to start mapping (in) */
	__u64	fm_length;	 /* logical length of mapping which
				  * userspace cares about (in) */
	__u32	fm_flags;	 /* FIEMAP_FLAG_* flags for request (in/out) */
	__u32	fm_mapped_extents; /* number of extents that were
				    * mapped (out) */
	__u32	fm_extent_count; /* size of fm_extents array (in) */
	__u32	fm_reserved;
	struct fiemap_extent fm_extents[0]; /* array of mapped extents (out) */
  };


fm_start and fm_length specify the logical range within the file
for which the process would like mappings. Extents returned mirror
those on disk - that is, the logical offset of the first returned extent
may start before fm_start, and the range covered by the last returned
extent may end after fm_start + fm_length. All offsets and lengths are
in bytes.
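As a rough illustration of how a request might be set up, the sketch below
allocates the header and the extent array as one block and fills in the
input fields. The helper name fiemap_request_alloc() is hypothetical and not
part of any kernel or libc interface; the sketch assumes the userspace header
<linux/fiemap.h> is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <linux/fiemap.h>

/* The extent array is expected to begin right after the fixed header. */
static_assert(offsetof(struct fiemap, fm_extents) == sizeof(struct fiemap),
	      "unexpected struct fiemap layout");

/*
 * Allocate a zeroed struct fiemap with room for `count` extents and set
 * the input fields. Hypothetical helper; returns NULL on allocation
 * failure. The caller releases the result with free().
 */
static struct fiemap *fiemap_request_alloc(__u64 start, __u64 length,
					   __u32 flags, __u32 count)
{
	struct fiemap *fm;

	/* Header and fm_extents[] live in a single allocation. */
	fm = calloc(1, sizeof(*fm) +
		       (size_t)count * sizeof(struct fiemap_extent));
	if (!fm)
		return NULL;

	fm->fm_start = start;		/* in: first byte of interest */
	fm->fm_length = length;		/* in: number of bytes of interest */
	fm->fm_flags = flags;		/* in/out: FIEMAP_FLAG_* */
	fm->fm_extent_count = count;	/* in: capacity of fm_extents[] */
	return fm;
}
```

The output field fm_mapped_extents and the reserved field are left at zero by
calloc().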

Certain flags to modify the way in which mappings are looked up can be
set in fm_flags. If the kernel doesn't understand some particular
flags, it will return EBADR and the contents of fm_flags will contain
the set of flags which caused the error. If the kernel is compatible
with all flags passed, the contents of fm_flags will be unmodified.
It is up to userspace to determine whether rejection of a particular
flag is fatal to its operation. This scheme is intended to allow the
fiemap interface to grow in the future without losing
compatibility with old software.

fm_extent_count specifies the number of elements in the fm_extents[] array
that can be used to return extents.  If fm_extent_count is zero, then the
fm_extents[] array is ignored (no extents will be returned), and the
fm_mapped_extents count will hold the number of extents needed in
fm_extents[] to hold the file's current mapping.  Note that there is
nothing to prevent the file from changing between calls to FIEMAP.
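The count-then-fetch pattern described above could look roughly like the
following sketch. fiemap_fetch_all() is a hypothetical helper name; the code
assumes the FS_IOC_FIEMAP request number from <linux/fs.h> and uses
FIEMAP_MAX_OFFSET from <linux/fiemap.h> to ask for the whole file.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

/*
 * Map the whole file behind `fd` in two steps: first count the extents,
 * then fetch them. On success stores a heap-allocated struct fiemap in
 * *out (caller frees) and returns 0; on failure returns -1 with errno set.
 */
static int fiemap_fetch_all(int fd, struct fiemap **out)
{
	struct fiemap probe = {
		.fm_start = 0,
		.fm_length = FIEMAP_MAX_OFFSET,
		.fm_extent_count = 0,	/* count only; no extents returned */
	};
	struct fiemap *fm;
	int saved;

	assert(out != NULL);
	*out = NULL;

	if (ioctl(fd, FS_IOC_FIEMAP, &probe) < 0)
		return -1;

	fm = calloc(1, sizeof(*fm) + (size_t)probe.fm_mapped_extents *
				     sizeof(struct fiemap_extent));
	if (!fm)
		return -1;
	fm->fm_length = FIEMAP_MAX_OFFSET;
	fm->fm_extent_count = probe.fm_mapped_extents;

	if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
		saved = errno;		/* keep the ioctl's error code */
		free(fm);
		errno = saved;
		return -1;
	}
	*out = fm;
	return 0;
}
```

Because the file may change between the two calls, the second call can still
come back with a full array that does not reach the end of the file; a robust
caller would check for that and retry.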

The following flags can be set in fm_flags:

FIEMAP_FLAG_SYNC
  If this flag is set, the kernel will sync the file before mapping extents.

FIEMAP_FLAG_XATTR
  If this flag is set, the extents returned will describe the inode's
  extended attribute lookup tree, instead of its data tree.


Extent Mapping
--------------

Extent information is returned within the embedded fm_extents array
which userspace must allocate along with the fiemap structure. The
number of elements in the fm_extents[] array should be passed via
fm_extent_count. The number of extents mapped by the kernel will be
returned via fm_mapped_extents. If the number of fiemap_extent
structures allocated is less than would be required to map the
requested range, as many extents as fit in the fm_extents[] array
will be returned and fm_mapped_extents will be equal to
fm_extent_count. In that case, the last extent in the array will not
complete the requested range and will not have the FIEMAP_EXTENT_LAST
flag set (see the next section on extent flags).

Each extent is described by a single fiemap_extent structure as
returned in fm_extents::

    struct fiemap_extent {
	    __u64	fe_logical;  /* logical offset in bytes for the start of
				* the extent */
	    __u64	fe_physical; /* physical offset in bytes for the start
				* of the extent */
	    __u64	fe_length;   /* length in bytes for the extent */
	    __u64	fe_reserved64[2];
	    __u32	fe_flags;    /* FIEMAP_EXTENT_* flags for this extent */
	    __u32	fe_reserved[3];
    };

All offsets and lengths are in bytes and mirror those on disk.  It is valid
for an extent's logical offset to start before the request or its logical
length to extend past the request.  Unless FIEMAP_EXTENT_NOT_ALIGNED is
returned, fe_logical, fe_physical, and fe_length will be aligned to the
block size of the file system.  With the exception of extents flagged as
FIEMAP_EXTENT_MERGED, adjacent extents will not be merged.

The fe_flags field contains flags which describe the extent returned.
A special flag, FIEMAP_EXTENT_LAST, is always set on the last extent in
the file so that the process making fiemap calls can determine when no
more extents are available, without having to call the ioctl again.

Some flags are intentionally vague and will always be set in the
presence of other more specific flags. This way a program looking for
a general property does not have to know all existing and future flags
which imply that property.

For example, if FIEMAP_EXTENT_DATA_INLINE or FIEMAP_EXTENT_DATA_TAIL
are set, FIEMAP_EXTENT_NOT_ALIGNED will also be set. A program looking
for inline or tail-packed data can key on the specific flag. Software
which simply wants to avoid operating on non-aligned extents, however,
can just key on FIEMAP_EXTENT_NOT_ALIGNED, and not have to
worry about all present and future flags which might imply unaligned
data. Note that the opposite is not true - it would be valid for
FIEMAP_EXTENT_NOT_ALIGNED to appear alone.

FIEMAP_EXTENT_LAST
  This is generally the last extent in the file. A mapping attempt past
  this extent may return nothing. Some implementations set this flag to
  indicate this extent is the last one in the range queried by the user
  (via fiemap->fm_length).

FIEMAP_EXTENT_UNKNOWN
  The location of this extent is currently unknown. This may indicate
  the data is stored on an inaccessible volume or that no storage has
  been allocated for the file yet.

FIEMAP_EXTENT_DELALLOC
  This will also set FIEMAP_EXTENT_UNKNOWN.

  Delayed allocation - while there is data for this extent, its
  physical location has not been allocated yet.

FIEMAP_EXTENT_ENCODED
  This extent does not consist of plain filesystem blocks but is
  encoded (e.g. encrypted or compressed).  Reading the data in this
  extent via I/O to the block device will have undefined results.

Note that it is *always* undefined to try to update the data
in-place by writing to the indicated location without the
assistance of the filesystem, or to access the data using the
information returned by the FIEMAP interface while the filesystem
is mounted.  In other words, user applications may only read the
extent data via I/O to the block device while the filesystem is
unmounted, and then only if the FIEMAP_EXTENT_ENCODED flag is
clear; user applications must not try reading or writing to the
filesystem via the block device under any other circumstances.

FIEMAP_EXTENT_DATA_ENCRYPTED
  This will also set FIEMAP_EXTENT_ENCODED.
  The data in this extent has been encrypted by the file system.

FIEMAP_EXTENT_NOT_ALIGNED
  Extent offsets and length are not guaranteed to be block aligned.

FIEMAP_EXTENT_DATA_INLINE
  This will also set FIEMAP_EXTENT_NOT_ALIGNED.
  Data is located within a metadata block.

FIEMAP_EXTENT_DATA_TAIL
  This will also set FIEMAP_EXTENT_NOT_ALIGNED.
  Data is packed into a block with data from other files.

FIEMAP_EXTENT_UNWRITTEN
  Unwritten extent - the extent is allocated but its data has not been
  initialized.  This indicates the extent's data will be all zero if read
  through the filesystem but the contents are undefined if read directly from
  the device.

FIEMAP_EXTENT_MERGED
  This will be set when a file does not support extents, i.e., it uses a block
  based addressing scheme.  Since returning an extent for each block back to
  userspace would be highly inefficient, the kernel will try to merge most
  adjacent blocks into 'extents'.
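For debugging output it can help to turn fe_flags into readable text. The
sketch below covers only the flags listed above; the short names in the table
and the helper fiemap_flags_str() are made up for this example.

```c
#include <assert.h>
#include <string.h>
#include <linux/fiemap.h>

/* Flags documented above, paired with illustrative short names. */
static const struct {
	__u32 bit;
	const char *name;
} fiemap_flag_names[] = {
	{ FIEMAP_EXTENT_LAST,		"last" },
	{ FIEMAP_EXTENT_UNKNOWN,	"unknown" },
	{ FIEMAP_EXTENT_DELALLOC,	"delalloc" },
	{ FIEMAP_EXTENT_ENCODED,	"encoded" },
	{ FIEMAP_EXTENT_DATA_ENCRYPTED,	"data_encrypted" },
	{ FIEMAP_EXTENT_NOT_ALIGNED,	"not_aligned" },
	{ FIEMAP_EXTENT_DATA_INLINE,	"data_inline" },
	{ FIEMAP_EXTENT_DATA_TAIL,	"data_tail" },
	{ FIEMAP_EXTENT_UNWRITTEN,	"unwritten" },
	{ FIEMAP_EXTENT_MERGED,		"merged" },
};

/*
 * Write a comma-separated list of the names of the flags set in `flags`
 * into buf. The result is always NUL-terminated; names that do not fit
 * are dropped, and bits missing from the table are ignored. Returns buf.
 */
static char *fiemap_flags_str(__u32 flags, char *buf, size_t size)
{
	size_t used = 0, i;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(fiemap_flag_names) /
			sizeof(fiemap_flag_names[0]); i++) {
		const char *name = fiemap_flag_names[i].name;
		size_t need = strlen(name) + (used ? 1 : 0);

		if (!(flags & fiemap_flag_names[i].bit))
			continue;
		if (used + need >= size)
			break;		/* no room left: truncate here */
		if (used)
			buf[used++] = ',';
		memcpy(buf + used, name, strlen(name) + 1);
		used += strlen(name);
	}
	return buf;
}
```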


VFS -> File System Implementation
---------------------------------

File systems wishing to support fiemap must implement a ->fiemap callback on
their inode_operations structure. The fs ->fiemap call is responsible for
defining its set of supported fiemap flags, and calling a helper function on
each discovered extent::

  struct inode_operations {
       ...

       int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
                     u64 len);

->fiemap is passed struct fiemap_extent_info which describes the
fiemap request::

  struct fiemap_extent_info {
	unsigned int fi_flags;		/* Flags as passed from user */
	unsigned int fi_extents_mapped;	/* Number of mapped extents */
	unsigned int fi_extents_max;	/* Size of fiemap_extent array */
	struct fiemap_extent *fi_extents_start;	/* Start of fiemap_extent array */
  };

It is intended that the file system should not need to access any of this
structure directly. Filesystem handlers should tolerate signals and return
EINTR once a fatal signal is received.


Flag checking should be done at the beginning of the ->fiemap callback via the
fiemap_prep() helper::

  int fiemap_prep(struct inode *inode, struct fiemap_extent_info *fieinfo,
		  u64 start, u64 *len, u32 supported_flags);

The fieinfo structure should be passed in as received from ioctl_fiemap(). The
set of fiemap flags which the fs understands should be passed via
supported_flags. If fiemap_prep() finds invalid user flags, it will place the
bad values in fieinfo->fi_flags and return -EBADR. If the file system gets
-EBADR from fiemap_prep(), it should immediately exit, returning that error
back to ioctl_fiemap().  Additionally, the range is validated against the
supported maximum file size.


For each extent in the request range, the file system should call
the helper function, fiemap_fill_next_extent()::

  int fiemap_fill_next_extent(struct fiemap_extent_info *info, u64 logical,
			      u64 phys, u64 len, u32 flags);

fiemap_fill_next_extent() will use the passed values to populate the
next free extent in the fm_extents array. 'General' extent flags will
automatically be set from specific flags on behalf of the calling file
system so that the userspace API is not broken.

fiemap_fill_next_extent() returns 0 on success, and 1 when the
user-supplied fm_extents array is full. If an error is encountered
while copying the extent to user memory, -EFAULT will be returned.