===================
Block io priorities
===================


Intro
-----

The io priority feature enables users to io nice processes or process groups,
similar to what has been possible with cpu scheduling for ages. Support for io
priorities is io scheduler dependent and currently supported by bfq and
mq-deadline.

Scheduling classes
------------------

Three generic scheduling classes are implemented for io priorities that
determine how io is served for a process.

IOPRIO_CLASS_RT: This is the realtime io class. This scheduling class is given
higher priority than any other in the system; processes from this class are
given first access to the disk every time. It therefore needs to be used with
some care: one io RT process can starve the entire system. Within the RT class,
there are 8 levels of class data that determine exactly how much time this
process needs the disk for on each service. In the future this might change
to be more directly mappable to performance, by passing in a wanted data
rate instead.

IOPRIO_CLASS_BE: This is the best-effort scheduling class, which is the default
for any process that hasn't set a specific io priority. The class data
determines how much io bandwidth the process will get; it's directly mappable
to the cpu nice levels, just more coarsely implemented. 0 is the highest
BE prio level, 7 is the lowest. The mapping between cpu nice level and io
nice level is determined as: io_nice = (cpu_nice + 20) / 5.

IOPRIO_CLASS_IDLE: This is the idle scheduling class; processes running at this
level only get io time when no one else needs the disk. The idle class has no
class data, since it doesn't really apply here.

Tools
-----

See below for a sample ionice tool. Usage::

  # ionice -c<class> -n<level> -p<pid>

If pid isn't given, the current process is assumed. IO priority settings
are inherited on fork, so you can use ionice to start the process at a given
level::

  # ionice -c2 -n0 /bin/ls

will run ls at the best-effort scheduling class at the highest priority.
For a running process, you can give the pid instead::

  # ionice -c1 -n2 -p100

will change pid 100 to run at the realtime scheduling class, at priority 2.

ionice.c tool::

  #include <stdio.h>
  #include <stdlib.h>
  #include <errno.h>
  #include <getopt.h>
  #include <unistd.h>
  #include <sys/ptrace.h>
  #include <asm/unistd.h>

  extern int sys_ioprio_set(int, int, int);
  extern int sys_ioprio_get(int, int);

  #if defined(__i386__)
  #define __NR_ioprio_set         289
  #define __NR_ioprio_get         290
  #elif defined(__ppc__)
  #define __NR_ioprio_set         273
  #define __NR_ioprio_get         274
  #elif defined(__x86_64__)
  #define __NR_ioprio_set         251
  #define __NR_ioprio_get         252
  #else
  #error "Unsupported arch"
  #endif

  static inline int ioprio_set(int which, int who, int ioprio)
  {
          return syscall(__NR_ioprio_set, which, who, ioprio);
  }

  static inline int ioprio_get(int which, int who)
  {
          return syscall(__NR_ioprio_get, which, who);
  }

  enum {
          IOPRIO_CLASS_NONE,
          IOPRIO_CLASS_RT,
          IOPRIO_CLASS_BE,
          IOPRIO_CLASS_IDLE,
  };

  enum {
          IOPRIO_WHO_PROCESS = 1,
          IOPRIO_WHO_PGRP,
          IOPRIO_WHO_USER,
  };

  #define IOPRIO_CLASS_SHIFT      13

  const char *to_prio[] = { "none", "realtime", "best-effort", "idle", };

  int main(int argc, char *argv[])
  {
          int ioprio = 4, set = 0, ioprio_class = IOPRIO_CLASS_BE;
          int c, pid = 0;

          while ((c = getopt(argc, argv, "+n:c:p:")) != EOF) {
                  switch (c) {
                  case 'n':
                          ioprio = strtol(optarg, NULL, 10);
                          set = 1;
                          break;
                  case 'c':
                          ioprio_class = strtol(optarg, NULL, 10);
                          set = 1;
                          break;
                  case 'p':
                          pid = strtol(optarg, NULL, 10);
                          break;
                  }
          }

          switch (ioprio_class) {
                  case IOPRIO_CLASS_NONE:
                          ioprio_class = IOPRIO_CLASS_BE;
                          break;
                  case IOPRIO_CLASS_RT:
                  case IOPRIO_CLASS_BE:
                          break;
                  case IOPRIO_CLASS_IDLE:
                          ioprio = 7;
                          break;
                  default:
                          printf("bad prio class %d\n", ioprio_class);
                          return 1;
          }

          if (!set) {
                  if (!pid && argv[optind])
                          pid = strtol(argv[optind], NULL, 10);

                  ioprio = ioprio_get(IOPRIO_WHO_PROCESS, pid);

                  printf("pid=%d, %d\n", pid, ioprio);

                  if (ioprio == -1)
                          perror("ioprio_get");
                  else {
                          ioprio_class = ioprio >> IOPRIO_CLASS_SHIFT;
                          ioprio = ioprio & 0xff;
                          printf("%s: prio %d\n", to_prio[ioprio_class], ioprio);
                  }
          } else {
                  if (ioprio_set(IOPRIO_WHO_PROCESS, pid, ioprio | ioprio_class << IOPRIO_CLASS_SHIFT) == -1) {
                          perror("ioprio_set");
                          return 1;
                  }

                  if (argv[optind])
                          execvp(argv[optind], &argv[optind]);
          }

          return 0;
  }


March 11 2005, Jens Axboe <jens.axboe@oracle.com>
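
As a rough illustration of the arithmetic used in this document, the sketch
below restates the cpu nice to io nice mapping from the scheduling classes
section and the class/level packing that ionice.c passes to ioprio_set. The
helper names (nice_to_io_level, ioprio_pack, ioprio_class_of, ioprio_level_of)
are made up for this example and are not part of any kernel or libc interface;
the enum values, the shift of 13 and the 0xff mask are copied from the tool
above. It performs no system calls.

```c
#include <assert.h>

/* Class numbering and shift as used by ionice.c in this document. */
enum {
        IOPRIO_CLASS_NONE,
        IOPRIO_CLASS_RT,
        IOPRIO_CLASS_BE,
        IOPRIO_CLASS_IDLE,
};

#define IOPRIO_CLASS_SHIFT      13

/*
 * Hypothetical helper: the documented mapping
 * io_nice = (cpu_nice + 20) / 5, using integer division.
 * Assumes cpu_nice is in the usual range -20..19.
 */
static int nice_to_io_level(int cpu_nice)
{
        assert(cpu_nice >= -20 && cpu_nice <= 19);
        return (cpu_nice + 20) / 5;
}

/* Hypothetical helper: combine class and level as ionice.c does. */
static int ioprio_pack(int ioprio_class, int level)
{
        return level | (ioprio_class << IOPRIO_CLASS_SHIFT);
}

/* Hypothetical helpers: split a packed value the way ionice.c prints it. */
static int ioprio_class_of(int ioprio)
{
        return ioprio >> IOPRIO_CLASS_SHIFT;
}

static int ioprio_level_of(int ioprio)
{
        return ioprio & 0xff;
}
```

Under these assumptions, ``ionice -c2 -n0`` corresponds to
ioprio_pack(IOPRIO_CLASS_BE, 0), and ``ionice -c1 -n2`` to
ioprio_pack(IOPRIO_CLASS_RT, 2). A cpu nice value of 0 maps to io level 4,
which matches the default level the sample tool starts from.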