====================
Changes since 2.5.0:
====================

---

**recommended**

New helpers: sb_bread(), sb_getblk(), sb_find_get_block(), set_bh(),
sb_set_blocksize() and sb_min_blocksize().

Use them.

(sb_find_get_block() replaces 2.4's get_hash_table())

---

**recommended**

New methods: ->alloc_inode() and ->destroy_inode().

Remove inode->u.foo_inode_i

Declare::

	struct foo_inode_info {
		/* fs-private stuff */
		struct inode vfs_inode;
	};
	static inline struct foo_inode_info *FOO_I(struct inode *inode)
	{
		return list_entry(inode, struct foo_inode_info, vfs_inode);
	}

Use FOO_I(inode) instead of &inode->u.foo_inode_i;

Add foo_alloc_inode() and foo_destroy_inode() - the former should allocate
foo_inode_info and return the address of ->vfs_inode, the latter should free
FOO_I(inode) (see in-tree filesystems for examples).

Make them ->alloc_inode and ->destroy_inode in your super_operations.

Keep in mind that now you need explicit initialization of private data
typically between calling iget_locked() and unlocking the inode.

At some point that will become mandatory.

---

**mandatory**

Change of file_system_type method (->read_super to ->get_sb)

->read_super() is no more.  Ditto for DECLARE_FSTYPE and DECLARE_FSTYPE_DEV.

Turn your foo_read_super() into a function that would return 0 in case of
success and negative number in case of error (-EINVAL unless you have more
informative error value to report).  Call it foo_fill_super().  Now declare::

	int foo_get_sb(struct file_system_type *fs_type,
		int flags, const char *dev_name, void *data, struct vfsmount *mnt)
	{
		return get_sb_bdev(fs_type, flags, dev_name, data, foo_fill_super,
				   mnt);
	}

(or similar with s/bdev/nodev/ or s/bdev/single/, depending on the kind of
filesystem).

Replace DECLARE_FSTYPE... with explicit initializer and have ->get_sb set as
foo_get_sb.

---

**mandatory**

Locking change: ->s_vfs_rename_sem is taken only by cross-directory renames.
Most likely there is no need to change anything, but if you relied on
global exclusion between renames for some internal purpose - you need to
change your internal locking.  Otherwise exclusion guarantees remain the
same (i.e. parents and victim are locked, etc.).

---

**informational**

Now we have the exclusion between ->lookup() and directory removal (by
->rmdir() and ->rename()).  If you used to need that exclusion and provided
it by internal locking (most filesystems couldn't care less) - you
can relax your locking.

---

**mandatory**

->lookup(), ->truncate(), ->create(), ->unlink(), ->mknod(), ->mkdir(),
->rmdir(), ->link(), ->lseek(), ->symlink(), ->rename()
and ->readdir() are called without BKL now.  Grab it on entry, drop upon
return - that will guarantee the same locking you used to have.  If your
method or its parts do not need BKL - better yet, now you can shift
lock_kernel() and unlock_kernel() so that they would protect exactly what
needs to be protected.

---

**mandatory**

BKL is also moved from around sb operations.  BKL should have been shifted into
individual fs sb_op functions.  If you don't need it, remove it.

---

**informational**

Check for ->link() target not being a directory is done by callers.  Feel
free to drop it...

---

**informational**

->link() callers hold ->i_mutex on the object we are linking to.  Some of your
problems might be over...

---

**mandatory**

New file_system_type method - kill_sb(superblock).  If you are converting
an existing filesystem, set it according to ->fs_flags::

	FS_REQUIRES_DEV		-	kill_block_super
	FS_LITTER		-	kill_litter_super
	neither			-	kill_anon_super

FS_LITTER is gone - just remove it from fs_flags.

---

**mandatory**

FS_SINGLE is gone (actually, that had happened back when ->get_sb()
went in - and hadn't been documented ;-/).  Just remove it from fs_flags
(and see ->get_sb() entry for other actions).

---

**mandatory**

->setattr() is called without BKL now.  Caller _always_ holds ->i_mutex, so
watch for ->i_mutex-grabbing code that might be used by your ->setattr().
Callers of notify_change() need ->i_mutex now.

---

**recommended**

New super_block field ``struct export_operations *s_export_op`` for
explicit support for exporting, e.g. via NFS.  The structure is fully
documented at its declaration in include/linux/fs.h, and in
Documentation/filesystems/nfs/Exporting.

Briefly it allows for the definition of decode_fh and encode_fh operations
to encode and decode filehandles, and allows the filesystem to use
a standard helper function for decode_fh, and provide file-system specific
support for this helper, particularly get_parent.

It is planned that this will be required for exporting once the code
settles down a bit.

**mandatory**

s_export_op is now required for exporting a filesystem.
isofs, ext2, ext3, reiserfs, fat
can be used as examples of very different filesystems.

---

**mandatory**

iget4() and the read_inode2 callback have been superseded by iget5_locked()
which has the following prototype::

	struct inode *iget5_locked(struct super_block *sb, unsigned long ino,
				int (*test)(struct inode *, void *),
				int (*set)(struct inode *, void *),
				void *data);

'test' is an additional function that can be used when the inode
number is not sufficient to identify the actual file object. 'set'
should be a non-blocking function that initializes those parts of a
newly created inode to allow the test function to succeed. 'data' is
passed as an opaque value to both test and set functions.

When the inode has been created by iget5_locked(), it will be returned with the
I_NEW flag set and will still be locked.  The filesystem then needs to finalize
the initialization. Once the inode is initialized it must be unlocked by
calling unlock_new_inode().

The filesystem is responsible for setting (and possibly testing) i_ino
when appropriate. There is also a simpler iget_locked function that
just takes the superblock and inode number as arguments and does the
test and set for you.

e.g.::

	inode = iget_locked(sb, ino);
	if (!inode)
		return -ENOMEM;	/* allocation failed, nothing to clean up */
	if (inode->i_state & I_NEW) {
		err = read_inode_from_disk(inode);
		if (err < 0) {
			iget_failed(inode);
			return err;
		}
		unlock_new_inode(inode);
	}

Note that if the process of setting up a new inode fails, then iget_failed()
should be called on the inode to render it dead, and an appropriate error
should be passed back to the caller.

---

**recommended**

->getattr() finally getting used.  See instances in nfs, minix, etc.

---

**mandatory**

->revalidate() is gone.  If your filesystem had it - provide ->getattr()
and let it call whatever you had as ->revalidate() + (for symlinks that
had ->revalidate()) add calls in ->follow_link()/->readlink().

---

**mandatory**

->d_parent changes are not protected by BKL anymore.  Read access is safe
if at least one of the following is true:

	* filesystem has no cross-directory rename()
	* we know that parent had been locked (e.g. we are looking at
	  ->d_parent of ->lookup() argument).
	* we are called from ->rename().
	* the child's ->d_lock is held

Audit your code and add locking if needed.  Notice that any place that is
not protected by the conditions above is risky even in the old tree - you
had been relying on BKL and that's prone to screwups.  Old tree had quite
a few holes of that kind - unprotected access to ->d_parent leading to
anything from oops to silent memory corruption.

---

**mandatory**

FS_NOMOUNT is gone.  If you use it - just set SB_NOUSER in flags
(see rootfs for one kind of solution and bdev/socket/pipe for another).

---

**recommended**

Use bdev_read_only(bdev) instead of is_read_only(kdev).  The latter
is still alive, but only because of the mess in drivers/s390/block/dasd.c.
As soon as it gets fixed is_read_only() will die.

---

**mandatory**

->permission() is called without BKL now. Grab it on entry, drop upon
return - that will guarantee the same locking you used to have.  If
your method or its parts do not need BKL - better yet, now you can
shift lock_kernel() and unlock_kernel() so that they would protect
exactly what needs to be protected.

---

**mandatory**

->statfs() is now called without BKL held.  BKL should have been
shifted into individual fs sb_op functions where it's not clear that
it's safe to remove it.  If you don't need it, remove it.

---

**mandatory**

is_read_only() is gone; use bdev_read_only() instead.

---

**mandatory**

destroy_buffers() is gone; use invalidate_bdev().

---

**mandatory**

fsync_dev() is gone; use fsync_bdev().  NOTE: lvm breakage is
deliberate; as soon as struct block_device * is propagated in a reasonable
way by that code fixing will become trivial; until then nothing can be
done.

**mandatory**

Block truncation on error exit from ->write_begin, and ->direct_IO
moved from generic methods (block_write_begin, cont_write_begin,
nobh_write_begin, blockdev_direct_IO*) to callers.  Take a look at
ext2_write_failed and callers for an example.

**mandatory**

->truncate is gone.  The whole truncate sequence needs to be
implemented in ->setattr, which is now mandatory for filesystems
implementing on-disk size changes.  Start with a copy of the old inode_setattr
and vmtruncate, and then reorder the vmtruncate + foofs_vmtruncate sequence to
be in order of zeroing blocks using block_truncate_page or similar helpers,
size update and finally on-disk truncation which should not fail.
setattr_prepare (which used to be inode_change_ok) now includes the size checks
for ATTR_SIZE and must be called in the beginning of ->setattr unconditionally.

**mandatory**

->clear_inode() and ->delete_inode() are gone; ->evict_inode() should
be used instead.  It gets called whenever the inode is evicted, whether it has
remaining links or not.  Caller does *not* evict the pagecache or inode-associated
metadata buffers; the method has to use truncate_inode_pages_final() to get rid
of those. Caller makes sure async writeback cannot be running for the inode while
(or after) ->evict_inode() is called.

->drop_inode() returns int now; it's called on final iput() with
inode->i_lock held and it returns true if the filesystem wants the inode to be
dropped.  As before, generic_drop_inode() is still the default and it's been
updated appropriately.  generic_delete_inode() is also alive and it consists
simply of return 1.  Note that all actual eviction work is done by caller after
->drop_inode() returns.
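
The split between the ->drop_inode() decision and the eviction work can be
pictured with a small userspace model.  This is only a sketch: struct
model_inode and every model_* name are hypothetical stand-ins rather than
kernel types, and the "unlinked or unhashed" rule is an assumption about what
the default policy looks like, not a quote of the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct inode; not the kernel type. */
struct model_inode {
	unsigned int nlink;	/* remaining links, in the role of i_nlink */
	bool hashed;		/* still present in the inode hash */
	bool evicted;		/* set once the caller has evicted it */
};

/* Assumed shape of the default policy: drop when unlinked or unhashed. */
static int model_generic_drop_inode(const struct model_inode *inode)
{
	return inode->nlink == 0 || !inode->hashed;
}

/* The text above says this one consists simply of "return 1". */
static int model_generic_delete_inode(const struct model_inode *inode)
{
	(void)inode;
	return 1;
}

/* Caller side of a final iput(): ask the policy, then do the eviction. */
static bool model_final_iput(struct model_inode *inode,
			     int (*drop_inode)(const struct model_inode *))
{
	assert(inode != NULL && drop_inode != NULL);
	if (drop_inode(inode))
		inode->evicted = true;	/* the work happens here, not in the policy */
	return inode->evicted;
}
```

In this model the policy callbacks only answer yes or no; a linked, hashed
inode survives model_final_iput() under the default policy and is evicted under
the "always delete" one.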

As before, clear_inode() must be called exactly once on each call of
->evict_inode() (as it used to be for each call of ->delete_inode()).  Unlike
before, if you are using inode-associated metadata buffers (i.e.
mark_buffer_dirty_inode()), it's your responsibility to call
invalidate_inode_buffers() before clear_inode().

NOTE: checking i_nlink in the beginning of ->write_inode() and bailing out
if it's zero is not *and* *never* *had* *been* enough.  Final unlink() and iput()
may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly
free the on-disk inode, you may end up doing that while ->write_inode() is writing
to it.

---

**mandatory**

.d_delete() now only advises the dcache as to whether or not to cache
unreferenced dentries, and is now only called when the dentry refcount goes
to 0.  Even on 0 refcount transition, it must be able to tolerate being
called 0, 1, or more times (e.g. constant, idempotent).

---

**mandatory**

.d_compare() calling convention and locking rules are significantly
changed. Read updated documentation in Documentation/filesystems/vfs.rst (and
look at examples of other filesystems) for guidance.

---

**mandatory**

.d_hash() calling convention and locking rules are significantly
changed. Read updated documentation in Documentation/filesystems/vfs.rst (and
look at examples of other filesystems) for guidance.

---

**mandatory**

dcache_lock is gone, replaced by fine grained locks. See fs/dcache.c
for details of what locks to replace dcache_lock with in order to protect
particular things. Most of the time, a filesystem only needs ->d_lock, which
protects *all* the dcache state of a given dentry.

---

**mandatory**

Filesystems must RCU-free their inodes, if they can have been accessed
via rcu-walk path walk (basically, if the file can have had a path name in the
vfs namespace).

Even though i_dentry and i_rcu share storage in a union, we will
initialize the former in inode_init_always(), so just leave it alone in
the callback.  It used to be necessary to clean it there, but not anymore
(starting at 3.2).

---

**recommended**

vfs now tries to do path walking in "rcu-walk mode", which avoids
atomic operations and scalability hazards on dentries and inodes (see
Documentation/filesystems/path-lookup.txt). d_hash and d_compare changes
(above) are examples of the changes required to support this. For more complex
filesystem callbacks, the vfs drops out of rcu-walk mode before the fs call, so
no changes are required to the filesystem. However, this is costly and loses
the benefits of rcu-walk mode. We will begin to add filesystem callbacks that
are rcu-walk aware, shown below. Filesystems should take advantage of this
where possible.

---

**mandatory**

d_revalidate is a callback that is made on every path element (if
the filesystem provides it), which requires dropping out of rcu-walk mode.  This
may now be called in rcu-walk mode (nd->flags & LOOKUP_RCU).  -ECHILD should be
returned if the filesystem cannot handle rcu-walk. See
Documentation/filesystems/vfs.rst for more details.
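
A minimal userspace sketch of that -ECHILD convention follows.  Everything
named model_* or MODEL_* is hypothetical: the flag value is a local stand-in
for LOOKUP_RCU, the flags are passed directly for brevity, and the dentry state
is reduced to two booleans, so treat it as an illustration of the control flow
rather than as kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for LOOKUP_RCU; the real flag comes from the kernel headers. */
#define MODEL_LOOKUP_RCU 0x0040u

/* Hypothetical dentry state, reduced to what the decision needs. */
struct model_dentry {
	bool needs_blocking_check;	/* validation would have to sleep */
	bool still_valid;		/* result of the (possibly slow) check */
};

/*
 * Returns 1 if the dentry is valid, 0 if it is not, and -ECHILD when the
 * answer cannot be produced without blocking while in rcu-walk mode.
 */
static int model_d_revalidate(const struct model_dentry *dentry,
			      unsigned int flags)
{
	assert(dentry != NULL);
	if ((flags & MODEL_LOOKUP_RCU) && dentry->needs_blocking_check)
		return -ECHILD;	/* ask the caller to retry outside rcu-walk */
	return dentry->still_valid ? 1 : 0;
}
```

A filesystem that can validate without sleeping never needs the -ECHILD
branch; one that must, for example, talk to a server would take it whenever the
rcu-walk flag is set.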

permission is an inode permission check that is called on many or all
directory inodes on the way down a path walk (to check for exec permission). It
must now be rcu-walk aware (mask & MAY_NOT_BLOCK).  See
Documentation/filesystems/vfs.rst for more details.

---

**mandatory**

In ->fallocate() you must check the mode option passed in.  If your
filesystem does not support hole punching (deallocating space in the middle of a
file) you must return -EOPNOTSUPP if FALLOC_FL_PUNCH_HOLE is set in mode.
Currently you can only have FALLOC_FL_PUNCH_HOLE with FALLOC_FL_KEEP_SIZE set,
so the i_size should not change when hole punching, even when punching the end of
a file off.

---

**mandatory**

->get_sb() is gone.  Switch to use of ->mount().  Typically it's just
a matter of switching from calling ``get_sb_``... to ``mount_``... and changing
the function type.  If you were doing it manually, just switch from setting
->mnt_root to some pointer to returning that pointer.  On errors return
ERR_PTR(...).
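
Returning to the ->fallocate() entry above, the mode check can be sketched in
plain C.  The MODEL_* constants are local stand-ins for the FALLOC_FL_* flags,
model_fallocate_check_mode() is a hypothetical helper, and rejecting unknown
flags is a choice made for this sketch rather than something the entry
requires.

```c
#include <errno.h>
#include <stdbool.h>

/* Local stand-ins for the FALLOC_FL_* flags so the sketch builds anywhere. */
#define MODEL_FALLOC_FL_KEEP_SIZE	0x01
#define MODEL_FALLOC_FL_PUNCH_HOLE	0x02

/* Returns 0 if the mode is acceptable, -EOPNOTSUPP otherwise. */
static int model_fallocate_check_mode(int mode, bool supports_punch_hole)
{
	/* Sketch-only policy: refuse any flag this code does not know. */
	if (mode & ~(MODEL_FALLOC_FL_KEEP_SIZE | MODEL_FALLOC_FL_PUNCH_HOLE))
		return -EOPNOTSUPP;

	if (mode & MODEL_FALLOC_FL_PUNCH_HOLE) {
		/* No hole punching support: say so explicitly. */
		if (!supports_punch_hole)
			return -EOPNOTSUPP;
		/* Punching is only defined together with KEEP_SIZE. */
		if (!(mode & MODEL_FALLOC_FL_KEEP_SIZE))
			return -EOPNOTSUPP;
	}
	return 0;
}
```

The point of doing the check first is that an unsupported request fails before
any allocation state is touched.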

---

**mandatory**

->permission() and generic_permission() have lost flags
argument; instead of passing IPERM_FLAG_RCU we add MAY_NOT_BLOCK into mask.

generic_permission() has also lost the check_acl argument; ACL checking
has been taken to VFS and filesystems need to provide a non-NULL ->i_op->get_acl
to read an ACL from disk.

---

**mandatory**

If you implement your own ->llseek() you must handle SEEK_HOLE and
SEEK_DATA.  You can handle this by returning -EINVAL, but it would be nicer to
support it in some way.  The generic handler assumes that the entire file is
data and there is a virtual hole at the end of the file.  So if the provided
offset is less than i_size and SEEK_DATA is specified, return the same offset.
If the above is true for the offset and you are given SEEK_HOLE, return the end
of the file.  If the offset is i_size or greater return -ENXIO in either case.

**mandatory**

If you have your own ->fsync() you must make sure to call
filemap_write_and_wait_range() so that all dirty pages are synced out properly.
You must also keep in mind that ->fsync() is not called with i_mutex held
anymore, so if you require i_mutex locking you must make sure to take it and
release it yourself.

---

**mandatory**

d_alloc_root() is gone, along with a lot of bugs caused by code
misusing it.  Replacement: d_make_root(inode).  On success d_make_root(inode)
allocates and returns a new dentry instantiated with the passed in inode.
On failure NULL is returned and the passed in inode is dropped so the reference
to inode is consumed in all cases and failure handling need not do any cleanup
for the inode.  If d_make_root(inode) is passed a NULL inode it returns NULL
and also requires no further error handling. Typical usage is::

	inode = foofs_new_inode(....);
	s->s_root = d_make_root(inode);
	if (!s->s_root)
		/* Nothing needed for the inode cleanup */
		return -ENOMEM;
	...

---

**mandatory**

The witch is dead!  Well, 2/3 of it, anyway.  ->d_revalidate() and
->lookup() do *not* take struct nameidata anymore; just the flags.
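
For the ->llseek() entry above, the "whole file is data, one virtual hole at
the end" rule can be written down as a small function.  This is a userspace
sketch: enum model_whence and model_seek_hole_data() are hypothetical names,
and treating a negative offset like an out-of-range one is an assumption of
the sketch.

```c
#include <assert.h>
#include <errno.h>

/* Stand-ins for the two whence values; named to avoid clashing with libc. */
enum model_whence { MODEL_SEEK_DATA, MODEL_SEEK_HOLE };

/*
 * Treat the whole file as data with a single virtual hole at i_size.
 * Returns the resulting offset, or -ENXIO if offset is outside the file.
 */
static long long model_seek_hole_data(long long offset, long long i_size,
				      enum model_whence whence)
{
	assert(i_size >= 0);

	/* At or past EOF (or negative, by assumption): nothing to find. */
	if (offset < 0 || offset >= i_size)
		return -ENXIO;

	/* Inside the file: data starts right here, the hole starts at EOF. */
	return whence == MODEL_SEEK_DATA ? offset : i_size;
}
```

With a 100-byte file, an offset of 10 maps to 10 for the data case and to 100
for the hole case, while an offset of 100 or more yields -ENXIO for both.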

---

**mandatory**

->create() doesn't take ``struct nameidata *``; unlike the previous
two, it gets "is it an O_EXCL or equivalent?" boolean argument. Note that
local filesystems can ignore that argument - they are guaranteed that the
object doesn't exist. It's remote/distributed ones that might care...

---

**mandatory**

FS_REVAL_DOT is gone; if you used to have it, add ->d_weak_revalidate()
in your dentry operations instead.

---

**mandatory**

vfs_readdir() is gone; switch to iterate_dir() instead

---

**mandatory**

->readdir() is gone now; switch to ->iterate()

---

**mandatory**

vfs_follow_link has been removed. Filesystems must use nd_set_link
from ->follow_link for normal symlinks, or nd_jump_link for magic
/proc/<pid> style links.

---

**mandatory**

iget5_locked()/ilookup5()/ilookup5_nowait() test() callback used to be
called with both ->i_lock and inode_hash_lock held; the former is *not*
taken anymore, so verify that your callbacks do not rely on it (none
of the in-tree instances did). inode_hash_lock is still held,
of course, so they are still serialized wrt removal from inode hash,
as well as wrt set() callback of iget5_locked().

---

**mandatory**

d_materialise_unique() is gone; d_splice_alias() does everything you
need now. Remember that they have opposite orders of arguments ;-/

---

**mandatory**

f_dentry is gone; use f_path.dentry, or, better yet, see if you can avoid
it entirely.

---

**mandatory**

never call ->read() and ->write() directly; use __vfs_{read,write} or
wrappers; instead of checking for ->write or ->read being NULL, look for
FMODE_CAN_{WRITE,READ} in file->f_mode.

---

**mandatory**

do _not_ use new_sync_{read,write} for ->read/->write; leave it NULL
instead.

---

**mandatory**

->aio_read/->aio_write are gone. Use ->read_iter/->write_iter.

---

**recommended**

for embedded ("fast") symlinks just set inode->i_link to wherever the
symlink body is and use simple_follow_link() as ->follow_link().

---

**mandatory**

calling conventions for ->follow_link() have changed. Instead of returning
cookie and using nd_set_link() to store the body to traverse, we return
the body to traverse and store the cookie using explicit void ** argument.
nameidata isn't passed at all - nd_jump_link() doesn't need it and
nd_[gs]et_link() is gone.

---

**mandatory**

calling conventions for ->put_link() have changed. It gets inode instead of
dentry, it does not get nameidata at all and it gets called only when cookie
is non-NULL. Note that link body isn't available anymore, so if you need it,
store it as cookie.

---

**mandatory**

any symlink that might use page_follow_link_light/page_put_link() must
have inode_nohighmem(inode) called before anything might start playing with
its pagecache. No highmem pages should end up in the pagecache of such
symlinks. That includes any preseeding that might be done during symlink
creation. __page_symlink() will honour the mapping gfp flags, so once
you've done inode_nohighmem() it's safe to use, but if you allocate and
insert the page manually, make sure to use the right gfp flags.

---

**mandatory**

->follow_link() is replaced with ->get_link(); same API, except that

  * ->get_link() gets inode as a separate argument
  * ->get_link() may be called in RCU mode - in that case NULL
    dentry is passed

---

**mandatory**

->get_link() gets struct delayed_call ``*done`` now, and should do
set_delayed_call() where it used to set ``*cookie``.

->put_link() is gone - just give the destructor to set_delayed_call()
in ->get_link().

---

**mandatory**

->getxattr() and xattr_handler.get() get dentry and inode passed separately.
dentry might be yet to be attached to inode, so do _not_ use its ->d_inode
in the instances. Rationale: !@#!@# security_d_instantiate() needs to be
called before we attach dentry to inode.
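
The point about not touching ->d_inode can be shown with a small userspace
model. Everything here is a hypothetical stand-in: the two structures are
cut-down mock-ups, ``foo_xattr_get()`` is not the real handler prototype (the
handler and attribute name arguments are left out), and the "NULL buffer means
size query" convention is a simplification made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Cut-down, hypothetical stand-ins for the kernel structures. */
struct inode { const char *foo_label; };
struct dentry { struct inode *d_inode; };

/*
 * Sketch of a ->get() style instance. All data comes from the inode
 * argument; dentry->d_inode is never dereferenced, because the dentry
 * may not be attached to the inode yet when this runs.
 */
static int foo_xattr_get(struct dentry *dentry, struct inode *inode,
			 void *buffer, size_t size)
{
	size_t len;

	(void)dentry;			/* deliberately unused */
	assert(inode != NULL);

	len = strlen(inode->foo_label);
	if (buffer == NULL)
		return (int)len;	/* caller only asks for the size */
	if (size < len)
		return -ERANGE;		/* caller's buffer is too small */

	memcpy(buffer, inode->foo_label, len);
	return (int)len;
}
```

Calling it with a dentry whose ``d_inode`` is still NULL works, which is
exactly the situation the separate inode argument exists for.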

---

**mandatory**

symlinks are no longer the only inodes that do *not* have i_bdev/i_cdev/
i_pipe/i_link union zeroed out at inode eviction. As a result, you can't
assume that non-NULL value in ->i_link at ->destroy_inode() implies that
it's a symlink. Checking ->i_mode is really needed now. In-tree we had
to fix shmem_destroy_callback() that used to take that kind of shortcut;
watch out, since that shortcut is no longer valid.

---

**mandatory**

->i_mutex is replaced with ->i_rwsem now. inode_lock() et al. work as
they used to - they just take it exclusive. However, ->lookup() may be
called with parent locked shared. Its instances must not

  * use d_instantiate() and d_rehash() separately - use d_add() or
    d_splice_alias() instead.
  * use d_rehash() alone - call d_add(new_dentry, NULL) instead.
  * in the unlikely case when (read-only) access to filesystem
    data structures needs exclusion for some reason, arrange it
    yourself. None of the in-tree filesystems needed that.
  * rely on ->d_parent and ->d_name not changing after dentry has
    been fed to d_add() or d_splice_alias(). Again, none of the
    in-tree instances relied upon that.

We are guaranteed that lookups of the same name in the same directory
will not happen in parallel ("same" in the sense of your ->d_compare()).
Lookups on different names in the same directory can and do happen in
parallel now.

---

**recommended**

->iterate_shared() is added; it's a parallel variant of ->iterate().
Exclusion on struct file level is still provided (as well as that
between it and lseek on the same struct file), but if your directory
has been opened several times, you can get these called in parallel.
Exclusion between that method and all directory-modifying ones is
still provided, of course.

Often enough ->iterate() can serve as ->iterate_shared() without any
changes - it is a read-only operation, after all. If you have any
per-inode or per-dentry in-core data structures modified by ->iterate(),
you might need something to serialize the access to them.
If you do dcache pre-seeding, you'll need to switch to d_alloc_parallel() for
that; look for in-tree examples.

Old method is only used if the new one is absent; eventually it will
be removed. Switch while you still can; the old one won't stay.

---

**mandatory**

->atomic_open() calls without O_CREAT may happen in parallel.

---

**mandatory**

->setxattr() and xattr_handler.set() get dentry and inode passed separately.
dentry might be yet to be attached to inode, so do _not_ use its ->d_inode
in the instances. Rationale: !@#!@# security_d_instantiate() needs to be
called before we attach dentry to inode and !@#!@##!@$!$#!@#$!@$!@$ smack
->d_instantiate() uses not just ->getxattr() but ->setxattr() as well.

---

**mandatory**

->d_compare() doesn't get parent as a separate argument anymore. If you
used it for finding the struct super_block involved, dentry->d_sb will
work just as well; if it's something more complicated, use dentry->d_parent.
Just be careful not to assume that fetching it more than once will yield
the same value - in RCU mode it could change under you.

---

**mandatory**

->rename() has an added flags argument. Any flags not handled by the
filesystem should result in EINVAL being returned.

---

**recommended**

->readlink is optional for symlinks. Don't set, unless filesystem needs
to fake something for readlink(2).

---

**mandatory**

->getattr() is now passed a struct path rather than a vfsmount and
dentry separately, and it now has request_mask and query_flags arguments
to specify the fields and sync type requested by statx. Filesystems not
supporting any statx-specific features may ignore the new arguments.

---

**mandatory**

->atomic_open() calling conventions have changed. Gone is ``int *opened``,
along with FILE_OPENED/FILE_CREATED.
In place of those we have
FMODE_OPENED/FMODE_CREATED, set in file->f_mode. Additionally, return
value for 'called finish_no_open(), open it yourself' case has become
0, not 1. Since finish_no_open() itself is returning 0 now, that part
does not need any changes in ->atomic_open() instances.

---

**mandatory**

alloc_file() has become static now; two wrappers are to be used instead.
alloc_file_pseudo(inode, vfsmount, name, flags, ops) is for the cases
when dentry needs to be created; that's the majority of old alloc_file()
users. Calling conventions: on success a reference to new struct file
is returned and caller's reference to inode is subsumed by that. On
failure, ERR_PTR() is returned and no caller's references are affected,
so the caller needs to drop the inode reference it held.
alloc_file_clone(file, flags, ops) does not affect any caller's references.
On success you get a new struct file sharing the mount/dentry with the
original, on failure - ERR_PTR().

---

**mandatory**

->clone_file_range() and ->dedupe_file_range() have been replaced with
->remap_file_range().
See Documentation/filesystems/vfs.rst for more
information.

---

**recommended**

->lookup() instances doing an equivalent of::

	if (IS_ERR(inode))
		return ERR_CAST(inode);
	return d_splice_alias(inode, dentry);

don't need to bother with the check - d_splice_alias() will do the
right thing when given ERR_PTR(...) as inode. Moreover, passing NULL
inode to d_splice_alias() will also do the right thing (equivalent of
d_add(dentry, NULL); return NULL;), so that kind of special case
also doesn't need a separate treatment.

---

**strongly recommended**

take the RCU-delayed parts of ->destroy_inode() into a new method -
->free_inode(). If ->destroy_inode() becomes empty - all the better,
just get rid of it. Synchronous work (e.g. the stuff that can't
be done from an RCU callback, or any WARN_ON() where we want the
stack trace) *might* be movable to ->evict_inode(); however,
that goes only for the things that are not needed to balance something
done by ->alloc_inode().
IOW, if it's cleaning up the stuff that
might have accumulated over the life of in-core inode, ->evict_inode()
might be a fit.

Rules for inode destruction:

  * if ->destroy_inode() is non-NULL, it gets called
  * if ->free_inode() is non-NULL, it gets scheduled by call_rcu()
  * combination of NULL ->destroy_inode and NULL ->free_inode is
    treated as NULL/free_inode_nonrcu, to preserve the compatibility.

Note that the callback (be it via ->free_inode() or explicit call_rcu()
in ->destroy_inode()) is *NOT* ordered wrt superblock destruction;
as a matter of fact, the superblock and all associated structures
might be already gone. The filesystem driver is guaranteed to be still
there, but that's it. Freeing memory in the callback is fine; doing
more than that is possible, but requires a lot of care and is best
avoided.

---

**mandatory**

DCACHE_RCUACCESS is gone; having an RCU delay on dentry freeing is the
default. DCACHE_NORCU opts out, and only d_alloc_pseudo() has any
business doing so.
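
Going back to the rules for inode destruction listed under ->free_inode()
above, the three cases can be written down as a small decision function.
This is a userspace model of the rules as stated, not the VFS code itself;
``struct foo_super_ops``, ``foo_destruction_plan()`` and the ``FOO_*`` flags
are hypothetical names introduced only for the illustration.

```c
#include <stddef.h>

struct foo_inode;	/* opaque, hypothetical stand-in for struct inode */

/* Hypothetical stand-in holding just the two methods of interest. */
struct foo_super_ops {
	void (*destroy_inode)(struct foo_inode *);
	void (*free_inode)(struct foo_inode *);
};

#define FOO_CALL_DESTROY	0x1	/* ->destroy_inode() gets called */
#define FOO_RCU_FREE_INODE	0x2	/* ->free_inode() goes via call_rcu() */
#define FOO_RCU_DEFAULT_FREE	0x4	/* both NULL: default free, RCU-delayed */

/* Report which of the three rules apply for a given set of methods. */
static unsigned int foo_destruction_plan(const struct foo_super_ops *ops)
{
	unsigned int plan = 0;

	if (ops->destroy_inode)
		plan |= FOO_CALL_DESTROY;
	if (ops->free_inode)
		plan |= FOO_RCU_FREE_INODE;
	else if (!ops->destroy_inode)
		plan |= FOO_RCU_DEFAULT_FREE;
	return plan;
}

/* Do-nothing method, used only to fill in the table when trying this out. */
static void foo_noop(struct foo_inode *inode)
{
	(void)inode;
}
```

Note that a filesystem providing only ->destroy_inode() gets no RCU-delayed
step from the model at all, so any delay it needs has to come from its own
explicit call_rcu().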

---

**mandatory**

d_alloc_pseudo() is internal-only; uses outside of alloc_file_pseudo() are
very suspect (and won't work in modules). Such uses are very likely to
be misspelled d_alloc_anon().