==============================
Deadline IO scheduler tunables
==============================

This little file attempts to document how the deadline io scheduler works.
In particular, it will clarify the meaning of the exposed tunables that may be
of interest to power users.

Selecting IO schedulers
-----------------------
Refer to Documentation/block/switching-sched.rst for information on
selecting an io scheduler on a per-device basis.

------------------------------------------------------------------------------

read_expire	(in ms)
-----------------------

The goal of the deadline io scheduler is to attempt to guarantee a start
service time for a request. As we focus mainly on read latencies, this is
tunable. When a read request first enters the io scheduler, it is assigned
a deadline that is the current time + the read_expire value in units of
milliseconds.
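As a sketch of that assignment (the 500 below is purely illustrative, not the
scheduler's actual default, which is read from sysfs):

```python
READ_EXPIRE_MS = 500  # illustrative only; the real value is the sysfs tunable

def assign_read_deadline(now_ms, read_expire_ms=READ_EXPIRE_MS):
    """Deadline stamped on a read request when it enters the scheduler:
    current time plus read_expire, both in milliseconds."""
    return now_ms + read_expire_ms

# A read arriving at t=10000 ms should be started by t=10500 ms.
assert assign_read_deadline(10_000) == 10_500
```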


write_expire	(in ms)
-----------------------

Similar to read_expire mentioned above, but for writes.


fifo_batch	(number of requests)
------------------------------------

Requests are grouped into ``batches`` of a particular data direction (read or
write) which are serviced in increasing sector order.  To limit extra seeking,
deadline expiries are only checked between batches.  fifo_batch controls the
maximum number of requests per batch.

This parameter tunes the balance between per-request latency and aggregate
throughput.  When low latency is the primary concern, smaller is better (where
a value of 1 yields first-come first-served behaviour).  Increasing fifo_batch
generally improves throughput, at the cost of latency variation.
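A toy model can illustrate the latency/throughput trade-off (this is not the
kernel's actual dispatch loop; direction handling and deadline checks are
omitted, and batches are simply cut from the FIFO):

```python
def dispatch_all(requests, fifo_batch):
    """Toy model only: take requests (sector numbers) in FIFO (arrival)
    order, fifo_batch at a time, and serve each batch in increasing
    sector order.  fifo_batch=1 degenerates to first-come first-served."""
    batches = []
    pending = list(requests)
    while pending:
        batch, pending = pending[:fifo_batch], pending[fifo_batch:]
        batches.append(sorted(batch))  # sector order within the batch
    return batches

# fifo_batch=1: pure FIFO, the lowest per-request latency variation
assert dispatch_all([7, 3, 9], 1) == [[7], [3], [9]]
# fifo_batch=2: fewer seeks inside each batch, but sector 1 waits longer
assert dispatch_all([7, 3, 9, 1], 2) == [[3, 7], [1, 9]]
```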


writes_starved	(number of dispatches)
--------------------------------------

When we have to move requests from the io scheduler queue to the block
device dispatch queue, we always give a preference to reads. However, we
don't want to starve writes indefinitely either. So writes_starved controls
how many times we give preference to reads over writes. When that has been
done writes_starved number of times, we dispatch some writes based on the
same criteria as reads.
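The bookkeeping can be sketched like this (a simplified toy model, not the
kernel's actual dispatch code):

```python
def choose_direction(starved, writes_starved, reads_waiting, writes_waiting):
    """Toy model of the dispatch preference.  'starved' counts how many
    consecutive times reads were preferred while writes sat waiting."""
    if reads_waiting and (not writes_waiting or starved < writes_starved):
        # Prefer reads; if writes are waiting, they just lost a turn.
        return "read", starved + (1 if writes_waiting else 0)
    if writes_waiting:
        return "write", 0  # writes finally dispatched; reset the counter
    return None, starved   # both queues are empty

# With writes_starved=2 and both queues busy: two reads, then a write.
starved, order = 0, []
for _ in range(4):
    direction, starved = choose_direction(starved, 2, True, True)
    order.append(direction)
assert order == ["read", "read", "write", "read"]
```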


front_merges	(bool)
----------------------

Sometimes it happens that a request enters the io scheduler that is contiguous
with a request that is already on the queue. Either it fits in the back of that
request, or it fits at the front. That is called either a back merge candidate
or a front merge candidate. Due to the way files are typically laid out,
back merges are much more common than front merges. For some workloads, you
may even know that attempting to front merge requests is a waste of time.
Setting front_merges to 0 disables this functionality. Front merges may still
occur due to the cached last_merge hint, but since that comes at basically 0
cost we leave that on. We simply disable the rbtree front sector lookup when
the io scheduler merge function is called.
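In sector terms, the two candidate types look like this (a hypothetical
sketch; ``Req`` stands in for the kernel's request structure):

```python
class Req:
    """Hypothetical stand-in for a queued request (not the kernel struct)."""
    def __init__(self, sector, nr_sectors):
        self.sector = sector          # first sector of the request
        self.nr_sectors = nr_sectors  # length in sectors

def merge_candidate(queued, new):
    """Classify a newly arrived request against one already queued."""
    if new.sector == queued.sector + queued.nr_sectors:
        return "back"   # new I/O starts right where the queued request ends
    if queued.sector == new.sector + new.nr_sectors:
        return "front"  # new I/O ends right where the queued request starts
    return None         # not contiguous, no merge possible

queued = Req(100, 8)                                    # covers sectors 100..107
assert merge_candidate(queued, Req(108, 8)) == "back"   # the common case
assert merge_candidate(queued, Req(92, 8)) == "front"   # lookup skipped if front_merges=0
assert merge_candidate(queued, Req(200, 8)) is None
```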


Nov 11 2002, Jens Axboe <jens.axboe@oracle.com>