=================================================================
CPU Scheduler implementation hints for architecture specific code
=================================================================

	Nick Piggin, 2005

Context switch
==============
1. Runqueue locking
By default, the switch_to arch function is called with the runqueue
locked. This is usually not a problem unless switch_to may need to
take the runqueue lock, which is usually due to a wake-up operation
in the context switch.

To request the scheduler call switch_to with the runqueue unlocked,
you must `#define __ARCH_WANT_UNLOCKED_CTXSW` in a header file
(typically the one where switch_to is defined).

Unlocked context switches introduce only a very minor performance
penalty to the core scheduler implementation in the CONFIG_SMP case.

CPU idle
========
Your cpu_idle routines need to obey the following rules:

1. Preempt should now be disabled over idle routines. It should only
   be enabled to call schedule(), then disabled again.

2. need_resched/TIF_NEED_RESCHED is only ever set, and will never
   be cleared until the running task has called schedule(). Idle
   threads need only ever query need_resched, and may never set or
   clear it.

3. When cpu_idle finds (need_resched() == 'true'), it should call
   schedule(). It should not call schedule() otherwise.

4. The only time interrupts need to be disabled when checking
   need_resched is if we are about to sleep the processor until
   the next interrupt (this doesn't provide any protection of
   need_resched, it prevents losing an interrupt):

	4a. Common problem with this type of sleep appears to be::

	        local_irq_disable();
	        if (!need_resched()) {
	                local_irq_enable();
	                *** resched interrupt arrives here ***
	                __asm__("sleep until next interrupt");
	        }

5. TIF_POLLING_NRFLAG can be set by idle routines that do not
   need an interrupt to wake them up when need_resched goes high.
   In other words, they must be periodically polling need_resched,
   although it may be reasonable to do some background work or enter
   a low CPU priority.

      - 5a. If TIF_POLLING_NRFLAG is set, and we do decide to enter
	an interrupt sleep, it needs to be cleared then a memory
	barrier issued (followed by a test of need_resched with
	interrupts disabled, as explained in 4).

arch/x86/kernel/process.c has examples of both polling and
sleeping idle functions.


Possible arch/ problems
=======================

Possible arch problems I found (and either tried to fix or didn't):

sparc - IRQs on at this point(?), change local_irq_save to _disable.
      - TODO: needs secondary CPUs to disable preempt (see rule 1
        under "CPU idle")
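
Illustration of the rule 4a race
================================

The race in rule 4a of "CPU idle" can be easier to see in a small model.
The sketch below is userspace C, not kernel code: struct cpu_model and
every function in it are hypothetical names invented for this
illustration. It assumes that the handler for the resched interrupt sets
need_resched, and that the usual remedy is to make "enable interrupts"
and "sleep" one indivisible step, so that an interrupt arriving while
interrupts are masked stays pending and wakes the CPU immediately.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one CPU. None of these are kernel APIs. */
struct cpu_model {
    bool irqs_enabled;  /* models the local interrupt mask */
    bool irq_pending;   /* a resched interrupt is waiting for delivery */
    bool need_resched;  /* models TIF_NEED_RESCHED */
    bool lost_wakeup;   /* set if the CPU slept with need_resched already set */
};

/* Run the modelled resched handler if the interrupt can be delivered. */
static void deliver_pending(struct cpu_model *c)
{
    if (c->irqs_enabled && c->irq_pending) {
        c->irq_pending = false;
        c->need_resched = true;
    }
}

/* Models "sleep until next interrupt". */
static void sleep_until_interrupt(struct cpu_model *c)
{
    assert(c->irqs_enabled);       /* in this model, a masked sleep never wakes */
    if (c->irq_pending)
        return;                    /* a pending interrupt wakes the CPU at once */
    if (c->need_resched)
        c->lost_wakeup = true;     /* asleep although a reschedule is wanted */
}

/* The problematic sequence from rule 4a. */
static void idle_racy(struct cpu_model *c, bool irq_arrives)
{
    c->irqs_enabled = false;               /* local_irq_disable() */
    if (!c->need_resched) {
        c->irqs_enabled = true;            /* local_irq_enable() */
        if (irq_arrives)
            c->irq_pending = true;         /* resched interrupt arrives here */
        deliver_pending(c);                /* handled before the sleep starts */
        sleep_until_interrupt(c);          /* nothing left to wake the CPU */
    }
}

/* Assumed remedy: unmask and sleep as one step, with no window between. */
static void idle_safe(struct cpu_model *c, bool irq_arrives)
{
    c->irqs_enabled = false;               /* local_irq_disable() */
    if (!c->need_resched) {
        if (irq_arrives)
            c->irq_pending = true;         /* arrives while masked: stays pending */
        c->irqs_enabled = true;            /* unmask ... */
        sleep_until_interrupt(c);          /* ... and sleep, modelled as atomic */
        deliver_pending(c);                /* handler runs on wake-up */
    } else {
        c->irqs_enabled = true;            /* caller goes on to schedule() */
    }
}
```

With a zero-initialised model and irq_arrives set, idle_racy() ends with
lost_wakeup set, while idle_safe() ends with need_resched set and
lost_wakeup clear, so the idle loop can go on to call schedule() as rule
3 requires.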