.. SPDX-License-Identifier: GPL-2.0

=====================
Multigrain Timestamps
=====================

Introduction
============
Historically, the kernel has always used coarse time values to stamp inodes.
This value is updated every jiffy, so any change that happens within that jiffy
will end up with the same timestamp.

When the kernel goes to stamp an inode (due to a read or write), it first gets
the current time and then compares it to the existing timestamp(s) to see
whether anything will change. If nothing changed, then it can avoid updating
the inode's metadata.

Coarse timestamps are therefore good from a performance standpoint, since they
reduce the need for metadata updates, but bad from the standpoint of
determining whether anything has changed, since a lot of things can happen in a
jiffy.

They are particularly troublesome with NFSv3, where unchanging timestamps can
make it difficult to tell whether to invalidate caches. NFSv4 provides a
dedicated change attribute that should always show a visible change, but not
all filesystems implement this properly, causing the NFS server to substitute
the ctime in many cases.

Multigrain timestamps aim to remedy this by selectively using fine-grained
timestamps when a file has had its timestamps queried recently, and the current
coarse-grained time does not cause a change.

Inode Timestamps
================
There are currently 3 timestamps in the inode that are updated to the current
wallclock time on different activity:

ctime:
  The inode change time. This is stamped with the current time whenever
  the inode's metadata is changed. Note that this value is not settable
  from userland.

mtime:
  The inode modification time. This is stamped with the current time
  any time a file's contents change.

atime:
  The inode access time. This is stamped whenever an inode's contents are
  read. Widely considered to be a terrible mistake.
  Usually avoided with options like noatime or relatime.

Updating the mtime always implies a change to the ctime, but updating the
atime due to a read request does not.

Multigrain timestamps are only tracked for the ctime and the mtime. atimes are
not affected and always use the coarse-grained value (subject to the floor).

Inode Timestamp Ordering
========================

In addition to just providing info about changes to individual files, file
timestamps also serve an important purpose in applications like "make". These
programs measure timestamps in order to determine whether source files might be
newer than cached objects.

Userland applications like make can only determine ordering based on
operational boundaries. For a syscall those are the syscall entry and exit
points. For io_uring or nfsd operations, that's the request submission and
response. In the case of concurrent operations, userland can make no
determination about the order in which things will occur.

For instance, if a single thread modifies one file, and then another file in
sequence, the second file must show an equal or later mtime than the first. The
same is true if two threads are issuing similar operations that do not overlap
in time.

If, however, two threads have racing syscalls that overlap in time, then there
is no such guarantee, and the second file may appear to have been modified
before, after or at the same time as the first, regardless of which one was
submitted first.

Note that the above assumes that the system doesn't experience a backward jump
of the realtime clock. If that occurs at an inopportune time, then timestamps
can appear to go backward, even on a properly functioning system.
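
As a rough userland illustration of the single-threaded case above, the sketch
below compares the mtimes of two files using stat(). The helper names
timespec_cmp() and mtime_not_earlier() are made up for this example and are not
part of any kernel or libc interface; st_mtim is the POSIX.1-2008 field that
carries the mtime with nanosecond resolution.

```c
#define _POSIX_C_SOURCE 200809L

#include <sys/stat.h>
#include <time.h>

/* Three-way comparison of two timespec values: negative, zero or positive. */
static int timespec_cmp(const struct timespec *a, const struct timespec *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec < b->tv_sec ? -1 : 1;
    if (a->tv_nsec != b->tv_nsec)
        return a->tv_nsec < b->tv_nsec ? -1 : 1;
    return 0;
}

/*
 * Hypothetical helper: returns 1 if "second" shows an mtime equal to or
 * later than that of "first", 0 if it appears older, and -1 if either
 * stat() call fails.
 */
static int mtime_not_earlier(const char *first, const char *second)
{
    struct stat a, b;

    if (stat(first, &a) != 0 || stat(second, &b) != 0)
        return -1;
    return timespec_cmp(&b.st_mtim, &a.st_mtim) >= 0;
}
```

For two files written one after the other by the same thread,
mtime_not_earlier() called with the first and then the second path would be
expected to return 1, barring a backward jump of the realtime clock. For
writes that overlap in time, either result is possible.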

Multigrain Timestamp Implementation
===================================
Multigrain timestamps are aimed at ensuring that changes to a single file are
always recognizable, without violating the ordering guarantees when multiple
different files are modified. This affects the mtime and the ctime, but the
atime will always use coarse-grained timestamps.

The implementation uses an otherwise unused bit in the i_ctime_nsec field to
indicate whether the mtime or ctime has been queried. If either or both have,
then the kernel takes special care to ensure the next timestamp update will
display a visible change. This ensures tight cache coherency for use-cases like
NFS, without sacrificing the benefits of reduced metadata updates when files
aren't being watched.

The Ctime Floor Value
=====================
It's not sufficient to simply use fine or coarse-grained timestamps based on
whether the mtime or ctime has been queried. A file could get a fine-grained
timestamp, and then a second file modified later could get a coarse-grained one
that appears earlier than the first, which would break the kernel's timestamp
ordering guarantees.

To mitigate this problem, the kernel maintains a global floor value that
ensures that this can't happen. The two files in the above example may appear
to have been modified at the same time in such a case, but they will never show
the reverse order. To avoid problems with realtime clock jumps, the floor is
managed as a monotonic ktime_t, and the values are converted to realtime clock
values as needed.

Implementation Notes
====================
Multigrain timestamps are intended for use by local filesystems that get
ctime values from the local clock. This is in contrast to network filesystems
and the like that just mirror timestamp values from a server.
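
To make the queried flag and the floor described in the preceding sections more
concrete, here is a minimal single-threaded userspace model. It is only a
sketch and not the kernel's code: every name in it (model_clock, model_inode,
QUERIED_FLAG, TICK_NS and the helper functions) is hypothetical, the 4 ms tick
is an arbitrary choice, and a single simulated nanosecond counter stands in for
both the monotonic and realtime clocks. The kernel also has to handle
concurrent updates, which this model ignores.

```c
#include <stdint.h>

#define NSEC_PER_SEC  1000000000LL
#define TICK_NS       4000000LL     /* assumed coarse tick of 4 ms */
#define QUERIED_FLAG  (1u << 31)    /* spare bit: nanoseconds stay below 2^30 */

struct model_clock {
    int64_t now_ns;     /* simulated fine-grained current time */
    int64_t floor_ns;   /* latest fine-grained value handed out so far */
};

struct model_inode {
    int64_t  ctime_sec;
    uint32_t ctime_nsec;    /* nanoseconds, plus QUERIED_FLAG in the top bit */
};

/* Coarse time: rounded down to the tick, but never below the floor. */
static int64_t coarse_time(const struct model_clock *clk)
{
    int64_t coarse = clk->now_ns - (clk->now_ns % TICK_NS);

    return coarse > clk->floor_ns ? coarse : clk->floor_ns;
}

/* Fine-grained time: also raises the floor so later coarse values follow it. */
static int64_t fine_time(struct model_clock *clk)
{
    if (clk->now_ns > clk->floor_ns)
        clk->floor_ns = clk->now_ns;
    return clk->floor_ns;
}

/* Stored ctime in nanoseconds, with the flag bit masked off. */
static int64_t inode_ctime(const struct model_inode *inode)
{
    return inode->ctime_sec * NSEC_PER_SEC +
           (inode->ctime_nsec & ~QUERIED_FLAG);
}

/* getattr-like read: report the ctime and remember that it was seen. */
static int64_t query_ctime(struct model_inode *inode)
{
    inode->ctime_nsec |= QUERIED_FLAG;
    return inode_ctime(inode);
}

/* Stamp the inode and return the resulting ctime in nanoseconds. */
static int64_t update_ctime(struct model_clock *clk, struct model_inode *inode)
{
    int64_t now = coarse_time(clk);
    int64_t old = inode_ctime(inode);

    /* Only pay for a fine-grained stamp if someone looked and the
     * coarse value would not show a visible change. */
    if ((inode->ctime_nsec & QUERIED_FLAG) && now <= old)
        now = fine_time(clk);

    inode->ctime_sec = now / NSEC_PER_SEC;
    inode->ctime_nsec = (uint32_t)(now % NSEC_PER_SEC); /* clears the flag */
    return now;
}
```

In this model, an inode that was queried and then modified within the same tick
receives a fine-grained value, and a different inode stamped afterwards gets a
coarse value clamped to the floor, so it can show the same time as the first
inode but never an earlier one.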

For most filesystems, it's sufficient to just set the FS_MGTIME flag in the
fstype->fs_flags in order to opt in, provided that the ctime is only ever set
via inode_set_ctime_current(). If the filesystem has a ->getattr routine that
doesn't call generic_fillattr(), then it should call fill_mg_cmtime() to
fill those values. For setattr, it should use setattr_copy() to update the
timestamps, or otherwise mimic its behavior.