==========================================
Explicit volatile write back cache control
==========================================

Introduction
------------

Many storage devices, especially in the consumer market, come with volatile
write back caches. That means the devices signal I/O completion to the
operating system before the data has actually hit the non-volatile storage.
This behavior speeds up various workloads, but it means the operating system
needs to force data out to the non-volatile storage when it performs a data
integrity operation like fsync, sync or an unmount.

The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device. These mechanisms are
a forced cache flush, and the Force Unit Access (FUA) flag for requests.


Explicit cache flushes
----------------------

The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started. This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts. In addition, the REQ_PREFLUSH flag can
be set on an otherwise empty bio structure, which causes only an explicit cache
flush without any dependent I/O. It is recommended to use the
blkdev_issue_flush() helper for a pure cache flush.


Forced Unit Access
------------------

The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.
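
The difference between the two flags can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: struct
model_dev, model_flush(), model_write() and the MODEL_* flag values are
hypothetical names invented for this example and are not kernel definitions.
The model tracks, for each completed write, whether it has reached
non-volatile storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag values for this model only; NOT the kernel's definitions. */
#define MODEL_REQ_PREFLUSH (1u << 0)
#define MODEL_REQ_FUA      (1u << 1)

#define MODEL_MAX_WRITES 16

/* Hypothetical device: records which completed writes are durable. */
struct model_dev {
	bool durable[MODEL_MAX_WRITES];	/* true once write i is non-volatile */
	size_t nr_writes;		/* writes completed so far */
};

/* Flush the volatile cache: every completed write becomes durable. */
static void model_flush(struct model_dev *dev)
{
	for (size_t i = 0; i < dev->nr_writes; i++)
		dev->durable[i] = true;
}

/*
 * Complete one write with the given flags and return its index.
 * PREFLUSH: earlier completed writes are flushed before this write starts.
 * FUA:      this write is durable by the time it completes.
 * Neither:  the write completes into the volatile cache only.
 */
static size_t model_write(struct model_dev *dev, unsigned int flags)
{
	assert(dev->nr_writes < MODEL_MAX_WRITES);

	if (flags & MODEL_REQ_PREFLUSH)
		model_flush(dev);

	size_t idx = dev->nr_writes++;
	dev->durable[idx] = (flags & MODEL_REQ_FUA) != 0;
	return idx;
}
```

In this model a write flagged only with MODEL_REQ_PREFLUSH makes all earlier
writes durable but is itself still cached, while a write flagged only with
MODEL_REQ_FUA is durable on completion and leaves earlier cached writes
untouched.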


Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry about whether the underlying devices need any explicit cache flushing or
how the Forced Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.

Feature settings for block drivers
----------------------------------

For devices that do not support volatile write caches there is no driver
support required; the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.

For devices with volatile write caches the driver needs to tell the block layer
that it supports flushing caches by setting the

   BLK_FEAT_WRITE_CACHE

flag in the features field of the queue_limits structure. For devices that also
support the FUA bit the block layer needs to be told to pass on the REQ_FUA bit
by also setting the

   BLK_FEAT_FUA

flag in the features field of the queue_limits structure.

Implementation details for bio based block drivers
--------------------------------------------------

For bio based drivers the REQ_PREFLUSH and REQ_FUA bits are simply passed on to
the driver if the driver sets the BLK_FEAT_WRITE_CACHE flag, and the driver
needs to handle them.

*NOTE*: The REQ_FUA bit also gets passed on when the BLK_FEAT_FUA flag is
*not* set. Any bio based driver that sets BLK_FEAT_WRITE_CACHE also needs to
handle REQ_FUA.

For remapping drivers the REQ_FUA bits need to be propagated to underlying
devices, and a global flush needs to be implemented for bios with the
REQ_PREFLUSH bit set.
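
The rules above for what a bio based driver gets to see can be summarized in
a small user-space model. It is a sketch only: bio_model_submit(), struct
bio_model_result and the BIO_MODEL_* values are hypothetical names made up
for this example, and the block layer's real code is structured differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in values for this model only; NOT the kernel's definitions. */
#define BIO_MODEL_REQ_PREFLUSH		(1u << 0)
#define BIO_MODEL_REQ_FUA		(1u << 1)
#define BIO_MODEL_FEAT_WRITE_CACHE	(1u << 0)
#define BIO_MODEL_FEAT_FUA		(1u << 1)	/* deliberately not consulted */

struct bio_model_result {
	bool reaches_driver;		/* false if completed by the block layer */
	unsigned int driver_flags;	/* flags the driver sees, if reached */
};

/* Model what happens to one bio before it enters a bio based driver. */
static struct bio_model_result
bio_model_submit(unsigned int features, unsigned int flags, bool has_payload)
{
	struct bio_model_result res = {
		.reaches_driver = true,
		.driver_flags = flags,
	};

	/* In this model an empty bio is always a pure cache flush. */
	assert(has_payload || (flags & BIO_MODEL_REQ_PREFLUSH));

	if (!(features & BIO_MODEL_FEAT_WRITE_CACHE)) {
		/* No volatile cache: strip both bits from the request. */
		res.driver_flags &= ~(BIO_MODEL_REQ_PREFLUSH | BIO_MODEL_REQ_FUA);
		/* An empty flush is completed before entering the driver. */
		if (!has_payload)
			res.reaches_driver = false;
	}
	/*
	 * With a write cache both bits are passed on unchanged, whether or
	 * not BIO_MODEL_FEAT_FUA is set, so the driver must handle REQ_FUA.
	 */
	return res;
}
```

The function never checks BIO_MODEL_FEAT_FUA, which mirrors the note above:
for bio based drivers the FUA bit arrives as soon as the write cache feature
is advertised.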

Implementation details for blk-mq drivers
-----------------------------------------

When the BLK_FEAT_WRITE_CACHE flag is set, REQ_OP_WRITE | REQ_PREFLUSH requests
with a payload are automatically turned by the block layer into a sequence of
a REQ_OP_FLUSH request followed by the actual write.

When the BLK_FEAT_FUA flag is set, the REQ_FUA bit is simply passed on for the
REQ_OP_WRITE request; otherwise the block layer sends a REQ_OP_FLUSH request
after the write request completes for bio submissions with the REQ_FUA bit
set.
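
The resulting request sequences can be sketched as a small user-space model.
The names mq_model_plan(), enum mq_model_op and the MQ_MODEL_* values are
hypothetical and exist only in this example; the sketch assumes a write bio
with a payload and is not how the block layer is actually implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in values for this model only; NOT the kernel's definitions. */
#define MQ_MODEL_REQ_PREFLUSH		(1u << 0)
#define MQ_MODEL_REQ_FUA		(1u << 1)
#define MQ_MODEL_FEAT_WRITE_CACHE	(1u << 0)
#define MQ_MODEL_FEAT_FUA		(1u << 1)

enum mq_model_op {
	MQ_MODEL_OP_FLUSH,	/* stands in for a REQ_OP_FLUSH request */
	MQ_MODEL_OP_WRITE,	/* REQ_OP_WRITE without REQ_FUA */
	MQ_MODEL_OP_WRITE_FUA,	/* REQ_OP_WRITE with REQ_FUA passed on */
};

/*
 * Fill steps[] with the requests the driver would see for one write bio
 * with a payload, in order, and return how many there are (at most 3).
 */
static size_t mq_model_plan(unsigned int features, unsigned int flags,
			    enum mq_model_op steps[3])
{
	size_t n = 0;

	assert(steps != NULL);

	if (!(features & MQ_MODEL_FEAT_WRITE_CACHE)) {
		/* No volatile cache: both bits are stripped, plain write. */
		steps[n++] = MQ_MODEL_OP_WRITE;
		return n;
	}

	if (flags & MQ_MODEL_REQ_PREFLUSH)
		steps[n++] = MQ_MODEL_OP_FLUSH;		/* flush before the write */

	if ((flags & MQ_MODEL_REQ_FUA) && (features & MQ_MODEL_FEAT_FUA)) {
		steps[n++] = MQ_MODEL_OP_WRITE_FUA;	/* device handles FUA */
	} else {
		steps[n++] = MQ_MODEL_OP_WRITE;
		if (flags & MQ_MODEL_REQ_FUA)
			steps[n++] = MQ_MODEL_OP_FLUSH;	/* emulate FUA by a flush */
	}
	return n;
}
```

For a device with a write cache but without FUA support, a write carrying
both bits becomes three requests in this model: a flush, the write, and a
second flush once the write has completed.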