.. SPDX-License-Identifier: GPL-2.0

============
Fiemap Ioctl
============

The fiemap ioctl is an efficient method for userspace to get file
extent mappings. Instead of block-by-block mapping (such as bmap), fiemap
returns a list of extents.


Request Basics
--------------

A fiemap request is encoded within struct fiemap::

  struct fiemap {
      __u64 fm_start;           /* logical offset (inclusive) at
                                 * which to start mapping (in) */
      __u64 fm_length;          /* logical length of mapping which
                                 * userspace cares about (in) */
      __u32 fm_flags;           /* FIEMAP_FLAG_* flags for request (in/out) */
      __u32 fm_mapped_extents;  /* number of extents that were
                                 * mapped (out) */
      __u32 fm_extent_count;    /* size of fm_extents array (in) */
      __u32 fm_reserved;
      struct fiemap_extent fm_extents[0]; /* array of mapped extents (out) */
  };


fm_start and fm_length specify the logical range within the file
which the process would like mappings for.
Extents returned mirror those on disk - that is, the logical offset of
the first returned extent may start before fm_start, and the range
covered by the last returned extent may end after the requested range
(fm_start + fm_length). All offsets and lengths are in bytes.

Certain flags to modify the way in which mappings are looked up can be
set in fm_flags. If the kernel doesn't understand some particular
flags, it will return EBADR and the contents of fm_flags will contain
the set of flags which caused the error. If the kernel is compatible
with all flags passed, the contents of fm_flags will be unmodified.
It is up to userspace to determine whether rejection of a particular
flag is fatal to its operation. This scheme is intended to allow the
fiemap interface to grow in the future without losing
compatibility with old software.

fm_extent_count specifies the number of elements in the fm_extents[] array
that can be used to return extents. If fm_extent_count is zero, then the
fm_extents[] array is ignored (no extents will be returned), and the
fm_mapped_extents count will hold the number of extents needed in
fm_extents[] to hold the file's current mapping. Note that there is
nothing to prevent the file from changing between calls to FIEMAP.
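
The zero-count behaviour described above suggests a two-call pattern: ask
for the extent count first, then allocate and ask again. The following is
a minimal userspace sketch of that pattern, not a reference implementation.
The helper names fiemap_alloc_size() and fiemap_query() are hypothetical
and exist only for this illustration; FS_IOC_FIEMAP is taken from
<linux/fs.h>. Because the file may change between the two calls, the
second call can still return fewer extents than the file then has.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, struct fiemap_extent */

/* Hypothetical helper: bytes needed for a request holding count extents. */
size_t fiemap_alloc_size(__u32 count)
{
	return sizeof(struct fiemap) + (size_t)count * sizeof(struct fiemap_extent);
}

/*
 * Hypothetical helper: map [start, start + length) of fd.
 * Returns a malloc'ed struct fiemap (caller frees) or NULL with errno set.
 */
struct fiemap *fiemap_query(int fd, __u64 start, __u64 length, __u32 flags)
{
	/* First call: fm_extent_count == 0 only asks how many extents exist. */
	struct fiemap probe = {
		.fm_start = start,
		.fm_length = length,
		.fm_flags = flags,
		.fm_extent_count = 0,
	};
	struct fiemap *fm;
	int saved;

	if (ioctl(fd, FS_IOC_FIEMAP, &probe) < 0)
		return NULL;

	/* Second call: allocate room for that many extents and fetch them. */
	fm = calloc(1, fiemap_alloc_size(probe.fm_mapped_extents));
	if (!fm)
		return NULL;
	fm->fm_start = start;
	fm->fm_length = length;
	fm->fm_flags = flags;
	fm->fm_extent_count = probe.fm_mapped_extents;

	if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
		saved = errno;	/* free() may clobber errno */
		free(fm);
		errno = saved;
		return NULL;
	}
	return fm;
}
```

A caller would typically pass FIEMAP_MAX_OFFSET as the length to cover the
whole file and, on an EBADR failure, decide for itself whether the
rejected flags are fatal.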

The following flags can be set in fm_flags:

FIEMAP_FLAG_SYNC
  If this flag is set, the kernel will sync the file before mapping extents.

FIEMAP_FLAG_XATTR
  If this flag is set, the extents returned will describe the inode's
  extended attribute lookup tree, instead of its data tree.


Extent Mapping
--------------

Extent information is returned within the embedded fm_extents array
which userspace must allocate along with the fiemap structure. The
number of elements in the fm_extents[] array should be passed via
fm_extent_count. The number of extents mapped by the kernel will be
returned via fm_mapped_extents. If the number of fiemap_extent structures
allocated is less than would be required to map the requested range,
the maximum number of extents that can be mapped in the fm_extents[]
array will be returned and fm_mapped_extents will be equal to
fm_extent_count. In that case, the last extent in the array will not
complete the requested range and will not have the FIEMAP_EXTENT_LAST
flag set (see the next section on extent flags).
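
When the array is too small for the requested range, a caller can continue
from the end of the last extent it received. The sketch below illustrates
one way to do that with a fixed-size batch; it is an assumption-laden
example rather than a prescribed method. BATCH, fiemap_batch_done() and
fiemap_count_extents() are hypothetical names introduced here.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* FS_IOC_FIEMAP */
#include <linux/fiemap.h>  /* struct fiemap, FIEMAP_EXTENT_LAST */

#define BATCH 32  /* arbitrary batch size chosen for this sketch */

/*
 * Hypothetical helper: inspect one batch of returned extents.
 * Returns 1 when no further call is needed, 0 otherwise; in the latter
 * case *next_start is set to the byte just past the last returned extent.
 */
int fiemap_batch_done(const struct fiemap_extent *ext, __u32 mapped,
		      __u64 *next_start)
{
	const struct fiemap_extent *last;

	if (mapped == 0)
		return 1;	/* nothing (more) mapped in the range */
	last = &ext[mapped - 1];
	if (last->fe_flags & FIEMAP_EXTENT_LAST)
		return 1;
	*next_start = last->fe_logical + last->fe_length;
	return 0;
}

/* Hypothetical helper: count all extents of fd, BATCH at a time; -1 on error. */
long fiemap_count_extents(int fd)
{
	size_t size = sizeof(struct fiemap) + BATCH * sizeof(struct fiemap_extent);
	struct fiemap *fm = malloc(size);
	__u64 start = 0, next = 0;
	long total = 0;

	if (!fm)
		return -1;
	for (;;) {
		memset(fm, 0, size);
		fm->fm_start = start;
		fm->fm_length = FIEMAP_MAX_OFFSET - start;
		fm->fm_extent_count = BATCH;
		if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
			free(fm);
			return -1;
		}
		total += fm->fm_mapped_extents;
		if (fiemap_batch_done(fm->fm_extents, fm->fm_mapped_extents, &next))
			break;
		if (next <= start)
			break;	/* defensive: refuse to loop without progress */
		start = next;
	}
	free(fm);
	return total;
}
```

Since the file can change between calls, the total is only a snapshot; a
caller that needs a stable view has to prevent concurrent modification by
other means.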

Each extent is described by a single fiemap_extent structure as
returned in fm_extents::

  struct fiemap_extent {
      __u64 fe_logical;   /* logical offset in bytes for the start of
                           * the extent */
      __u64 fe_physical;  /* physical offset in bytes for the start
                           * of the extent */
      __u64 fe_length;    /* length in bytes for the extent */
      __u64 fe_reserved64[2];
      __u32 fe_flags;     /* FIEMAP_EXTENT_* flags for this extent */
      __u32 fe_reserved[3];
  };

All offsets and lengths are in bytes and mirror those on disk. It is valid
for an extent's logical offset to start before the request or its logical
length to extend past the request. Unless FIEMAP_EXTENT_NOT_ALIGNED is
returned, fe_logical, fe_physical, and fe_length will be aligned to the
block size of the file system. With the exception of extents flagged as
FIEMAP_EXTENT_MERGED, adjacent extents will not be merged.

The fe_flags field contains flags which describe the extent returned.
A special flag, FIEMAP_EXTENT_LAST, is always set on the last extent in
the file so that the process making fiemap calls can determine when no
more extents are available, without having to call the ioctl again.

Some flags are intentionally vague and will always be set in the
presence of other more specific flags. This way a program looking for
a general property does not have to know all existing and future flags
which imply that property.

For example, if FIEMAP_EXTENT_DATA_INLINE or FIEMAP_EXTENT_DATA_TAIL
are set, FIEMAP_EXTENT_NOT_ALIGNED will also be set. A program looking
for inline or tail-packed data can key on the specific flag. Software
which simply wants to avoid operating on non-aligned extents,
however, can just key on FIEMAP_EXTENT_NOT_ALIGNED, and not have to
worry about all present and future flags which might imply unaligned
data. Note that the opposite is not true - it would be valid for
FIEMAP_EXTENT_NOT_ALIGNED to appear alone.

FIEMAP_EXTENT_LAST
  This is generally the last extent in the file. A mapping attempt past
  this extent may return nothing.
  Some implementations set this flag to
  indicate this extent is the last one in the range queried by the user
  (via fiemap->fm_length).

FIEMAP_EXTENT_UNKNOWN
  The location of this extent is currently unknown. This may indicate
  the data is stored on an inaccessible volume or that no storage has
  been allocated for the file yet.

FIEMAP_EXTENT_DELALLOC
  This will also set FIEMAP_EXTENT_UNKNOWN.

  Delayed allocation - while there is data for this extent, its
  physical location has not been allocated yet.

FIEMAP_EXTENT_ENCODED
  This extent does not consist of plain filesystem blocks but is
  encoded (e.g. encrypted or compressed). Reading the data in this
  extent via I/O to the block device will have undefined results.

Note that it is *always* undefined to try to update the data
in-place by writing to the indicated location without the
assistance of the filesystem, or to access the data using the
information returned by the FIEMAP interface while the filesystem
is mounted.
In other words, user applications may only read the
extent data via I/O to the block device while the filesystem is
unmounted, and then only if the FIEMAP_EXTENT_ENCODED flag is
clear; user applications must not try reading or writing to the
filesystem via the block device under any other circumstances.

FIEMAP_EXTENT_DATA_ENCRYPTED
  This will also set FIEMAP_EXTENT_ENCODED.

  The data in this extent has been encrypted by the file system.

FIEMAP_EXTENT_NOT_ALIGNED
  Extent offsets and length are not guaranteed to be block aligned.

FIEMAP_EXTENT_DATA_INLINE
  This will also set FIEMAP_EXTENT_NOT_ALIGNED.

  Data is located within a metadata block.

FIEMAP_EXTENT_DATA_TAIL
  This will also set FIEMAP_EXTENT_NOT_ALIGNED.

  Data is packed into a block with data from other files.

FIEMAP_EXTENT_UNWRITTEN
  Unwritten extent - the extent is allocated but its data has not been
  initialized. This indicates the extent's data will be all zero if read
  through the filesystem but the contents are undefined if read directly from
  the device.

FIEMAP_EXTENT_MERGED
  This will be set when a file does not support extents, i.e., it uses a block
  based addressing scheme. Since returning an extent for each block back to
  userspace would be highly inefficient, the kernel will try to merge most
  adjacent blocks into 'extents'.


VFS -> File System Implementation
---------------------------------

File systems wishing to support fiemap must implement a ->fiemap callback on
their inode_operations structure. The fs ->fiemap call is responsible for
defining its set of supported fiemap flags, and calling a helper function on
each discovered extent::

  struct inode_operations {
      ...

      int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
                    u64 len);
      ...
  };

->fiemap is passed struct fiemap_extent_info which describes the
fiemap request::

  struct fiemap_extent_info {
      unsigned int fi_flags;                  /* Flags as passed from user */
      unsigned int fi_extents_mapped;         /* Number of mapped extents */
      unsigned int fi_extents_max;            /* Size of fiemap_extent array */
      struct fiemap_extent *fi_extents_start; /* Start of fiemap_extent array */
  };

It is intended that the file system should not need to access any of this
structure directly. Filesystem handlers should be tolerant of signals and
return EINTR once a fatal signal is received.


Flag checking should be done at the beginning of the ->fiemap callback via the
fiemap_check_flags() helper::

  int fiemap_check_flags(struct fiemap_extent_info *fieinfo, u32 fs_flags);

The struct fieinfo should be passed in as received from ioctl_fiemap(). The
set of fiemap flags which the fs understands should be passed via fs_flags.
If fiemap_check_flags() finds invalid user flags, it will place the bad
values in fieinfo->fi_flags and return -EBADR. If the file system gets
-EBADR from fiemap_check_flags(), it should immediately exit, returning
that error back to ioctl_fiemap().


For each extent in the request range, the file system should call
the helper function, fiemap_fill_next_extent()::

  int fiemap_fill_next_extent(struct fiemap_extent_info *info, u64 logical,
                              u64 phys, u64 len, u32 flags, u32 dev);

fiemap_fill_next_extent() will use the passed values to populate the
next free extent in the fm_extents array. 'General' extent flags will
automatically be set from specific flags on behalf of the calling file
system so that the userspace API is not broken.

fiemap_fill_next_extent() returns 0 on success, and 1 when the
user-supplied fm_extents array is full. If an error is encountered
while copying the extent to user memory, -EFAULT will be returned.
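
The control flow a ->fiemap callback is expected to follow (check flags,
then feed extents to the helper until it reports a full array or an
error) can be illustrated with a small userspace model. Everything
prefixed model_ or myfs_ below is a hypothetical stand-in written for
this example and is not kernel code: the model helpers only imitate the
return conventions described above, do not derive 'general' flags, do not
copy to user memory, and leave out the dev argument shown in the
prototype.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/fiemap.h>  /* struct fiemap_extent, FIEMAP_* constants */

/* Userspace stand-in for struct fiemap_extent_info (same field names). */
struct model_extent_info {
	unsigned int fi_flags;
	unsigned int fi_extents_mapped;
	unsigned int fi_extents_max;
	struct fiemap_extent *fi_extents_start;
};

/* Model of the flag check: report unsupported flags, fail with -EBADR. */
int model_check_flags(struct model_extent_info *info, __u32 fs_flags)
{
	__u32 bad = info->fi_flags & ~fs_flags;

	if (bad) {
		info->fi_flags = bad;
		return -EBADR;
	}
	return 0;
}

/* Model of the fill helper: 0 = stored with room left, 1 = array full. */
int model_fill_next_extent(struct model_extent_info *info, __u64 logical,
			   __u64 phys, __u64 len, __u32 flags)
{
	struct fiemap_extent *dst;

	if (info->fi_extents_mapped >= info->fi_extents_max)
		return 1;
	dst = &info->fi_extents_start[info->fi_extents_mapped++];
	dst->fe_logical = logical;
	dst->fe_physical = phys;
	dst->fe_length = len;
	dst->fe_flags = flags;
	return info->fi_extents_mapped == info->fi_extents_max ? 1 : 0;
}

/* Hypothetical file system: four contiguous 4096-byte extents per file. */
#define MYFS_EXTENT_BYTES 4096ULL
#define MYFS_NR_EXTENTS   4u
#define MYFS_PHYS_BASE    1048576ULL

int myfs_fiemap(struct model_extent_info *info, __u64 start, __u64 len)
{
	/* Clamp the end of the range so start + len cannot wrap around. */
	__u64 end = (len > ~0ULL - start) ? ~0ULL : start + len;
	unsigned int i;
	int ret;

	/* Step 1: reject request flags this file system does not support. */
	ret = model_check_flags(info, FIEMAP_FLAG_SYNC);
	if (ret)
		return ret;

	/* Step 2: hand every extent overlapping the range to the helper. */
	for (i = 0; i < MYFS_NR_EXTENTS; i++) {
		__u64 logical = i * MYFS_EXTENT_BYTES;
		__u32 flags = (i == MYFS_NR_EXTENTS - 1) ? FIEMAP_EXTENT_LAST : 0;

		if (logical + MYFS_EXTENT_BYTES <= start || logical >= end)
			continue;
		ret = model_fill_next_extent(info, logical,
					     MYFS_PHYS_BASE + logical,
					     MYFS_EXTENT_BYTES, flags);
		if (ret < 0)
			return ret;	/* propagate errors such as -EFAULT */
		if (ret == 1)
			break;		/* array full: stop, not an error */
	}
	return 0;
}
```

Treating a return value of 1 as a normal stop condition rather than an
error keeps a too-small array from failing the whole request, which
matches the userspace behaviour described under Extent Mapping.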