/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
 * or http://www.opensolaris.org/os/licensing.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 */
/*
 * Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

#pragma ident	"%Z%%M% %I% %E% SMI"

/*
 * Big Theory Statement for mutual exclusion locking primitives.
 *
 * A mutex serializes multiple threads so that only one thread
 * (the "owner" of the mutex) is active at a time.  See mutex(9F)
 * for a full description of the interfaces and programming model.
 * The rest of this comment describes the implementation.
 *
 * Mutexes come in two flavors: adaptive and spin.  mutex_init(9F)
 * determines the type based solely on the iblock cookie (PIL) argument.
 * PIL > LOCK_LEVEL implies a spin lock; everything else is adaptive.
 *
 * Spin mutexes block interrupts and spin until the lock becomes available.
 * A thread may not sleep, or call any function that might sleep, while
 * holding a spin mutex.  With few exceptions, spin mutexes should only
 * be used to synchronize with interrupt handlers.
 *
 * Adaptive mutexes (the default type) spin if the owner is running on
 * another CPU and block otherwise.  This policy is based on the assumption
 * that mutex hold times are typically short enough that the time spent
 * spinning is less than the time it takes to block.  If you need mutual
 * exclusion semantics with long hold times, consider an rwlock(9F) as
 * RW_WRITER.  Better still, reconsider the algorithm: if it requires
 * mutual exclusion for long periods of time, it's probably not scalable.
 *
 * Adaptive mutexes are overwhelmingly more common than spin mutexes,
 * so mutex_enter() assumes that the lock is adaptive.  We get away
 * with this by structuring mutexes so that an attempt to acquire a
 * spin mutex as adaptive always fails.  When mutex_enter() fails
 * it punts to mutex_vector_enter(), which does all the hard stuff.
 *
 * mutex_vector_enter() first checks the type.  If it's a spin mutex,
 * we just call lock_set_spl() and return.  If it's an adaptive mutex,
 * we check to see what the owner is doing.  If the owner is running,
 * we spin until the lock becomes available; if not, we mark the lock
 * as having waiters and block.
 *
 * Blocking on a mutex is a surprisingly delicate dance because, for speed,
 * mutex_exit() doesn't use an atomic instruction.  Thus we have to work
 * a little harder in the (rarely-executed) blocking path to make sure
 * we don't block on a mutex that's just been released -- otherwise we
 * might never be woken up.
 *
 * The logic for synchronizing mutex_vector_enter() with mutex_exit()
 * in the face of preemption and relaxed memory ordering is as follows:
 *
 * (1) Preemption in the middle of mutex_exit() must cause mutex_exit()
 *     to restart.  Each platform must enforce this by checking the
 *     interrupted PC in the interrupt handler (or on return from trap --
 *     whichever is more convenient for the platform).  If the PC
 *     lies within the critical region of mutex_exit(), the interrupt
 *     handler must reset the PC back to the beginning of mutex_exit().
 *     The critical region consists of all instructions up to, but not
 *     including, the store that clears the lock (which, of course,
 *     must never be executed twice).
 *
 *     This ensures that the owner will always check for waiters after
 *     resuming from a previous preemption.
 *
 * (2) A thread resuming in mutex_exit() does (at least) the following:
 *
 *	when resuming:	set CPU_THREAD = owner
 *			membar #StoreLoad
 *
 *	in mutex_exit:	check waiters bit; do wakeup if set
 *			membar #LoadStore|#StoreStore
 *			clear owner
 *			(at this point, other threads may or may not grab
 *			the lock, and we may or may not reacquire it)
 *
 *	when blocking:	membar #StoreStore (due to disp_lock_enter())
 *			set CPU_THREAD = (possibly) someone else
 *
 * (3) A thread blocking in mutex_vector_enter() does the following:
 *
 *	set waiters bit
 *	membar #StoreLoad (via membar_enter())
 *	check CPU_THREAD for each CPU; abort if owner running
 *	membar #LoadLoad (via membar_consumer())
 *	check owner and waiters bit; abort if either changed
 *	block
 *
 * Thus the global memory orderings for (2) and (3) are as follows:
 *
 * (2M) mutex_exit() memory order:
 *
 *	STORE	CPU_THREAD = owner
 *	LOAD	waiters bit
 *	STORE	owner = NULL
 *	STORE	CPU_THREAD = (possibly) someone else
 *
 * (3M) mutex_vector_enter() memory order:
 *
 *	STORE	waiters bit = 1
 *	LOAD	CPU_THREAD for each CPU
 *	LOAD	owner and waiters bit
 *
 * It has been verified by exhaustive simulation that all possible global
 * memory orderings of (2M) interleaved with (3M) result in correct
 * behavior.  Moreover, these ordering constraints are minimal: changing
 * the ordering of anything in (2M) or (3M) breaks the algorithm, creating
 * windows for missed wakeups.  Note: the possibility that other threads
 * may grab the lock after the owner drops it can be factored out of the
 * memory ordering analysis because mutex_vector_enter() won't block
 * if the lock isn't still owned by the same thread.
 *
 * The only requirements of code outside the mutex implementation are
 * (1) mutex_exit() preemption fixup in interrupt handlers or trap return,
 * and (2) a membar #StoreLoad after setting CPU_THREAD in resume().
 * Note: idle threads cannot grab adaptive locks (since they cannot block),
 * so the membar may be safely omitted when resuming an idle thread.
 *
 * When a mutex has waiters, mutex_vector_exit() has several options:
 *
 * (1) Choose a waiter and make that thread the owner before waking it;
 *     this is known as "direct handoff" of ownership.
 *
 * (2) Drop the lock and wake one waiter.
 *
 * (3) Drop the lock, clear the waiters bit, and wake all waiters.
 *
 * In many ways (1) is the cleanest solution, but if a lock is moderately
 * contended it defeats the adaptive spin logic.  If we make some other
 * thread the owner, but it's not ONPROC yet, then all other threads on
 * other cpus that try to get the lock will conclude that the owner is
 * blocked, so they'll block too.  And so on -- it escalates quickly,
 * with every thread taking the blocking path rather than the spin path.
 * Thus, direct handoff is *not* a good idea for adaptive mutexes.
 *
 * Option (2) is the next most natural-seeming option, but it has several
 * annoying properties.  If there's more than one waiter, we must preserve
 * the waiters bit on an unheld lock.  On cas-capable platforms, where
 * the waiters bit is part of the lock word, this means that both 0x0
 * and 0x1 represent unheld locks, so we have to cas against *both*.
 * Priority inheritance also gets more complicated, because a lock can
 * have waiters but no owner to whom priority can be willed.  So while
 * it is possible to make option (2) work, it's surprisingly vile.
 *
 * Option (3), the least intuitive at first glance, is what we actually do.
 * It has the advantage that because you always wake all waiters, you
 * never have to preserve the waiters bit.  Waking all waiters seems like
 * begging for a thundering herd problem, but consider: under option (2),
 * every thread that grabs and drops the lock will wake one waiter -- so
 * if the lock is fairly active, all waiters will be awakened very quickly
 * anyway.  Moreover, this is how adaptive locks are *supposed* to work.
 * The blocking case is rare; the more common case (by 3-4 orders of
 * magnitude) is that one or more threads spin waiting to get the lock.
 * Only direct handoff can prevent the thundering herd problem, but as
 * mentioned earlier, that would tend to defeat the adaptive spin logic.
 * In practice, option (3) works well because the blocking case is rare.
 */

/*
 * Delayed lock retry with exponential delay for spin locks
 *
 * It is noted above that for both the spin locks and the adaptive locks,
 * spinning is the dominant mode of operation.  So long as there is only
 * one thread waiting on a lock, the naive spin loop works very well in
 * cache-based architectures.  The lock data structure is pulled into the
 * cache of the processor with the waiting/spinning thread and no further
 * memory traffic is generated until the lock is released.  Unfortunately,
 * once two or more threads are waiting on a lock, the naive spin has
 * the property of generating maximum memory traffic from each spinning
 * thread as the spinning threads contend for the lock data structure.
 *
 * By executing a delay loop before retrying a lock, a waiting thread
 * can reduce its memory traffic by a large factor, depending on the
 * size of the delay loop.  A large delay loop greatly reduces the memory
 * traffic, but has the drawback of having a period of time when
 * no thread is attempting to gain the lock even though several threads
 * might be waiting.  A small delay loop has the drawback of not
 * much reduction in memory traffic, but reduces the potential idle time.
 * The theory of the exponential delay code is to start with a short
 * delay loop and double the waiting time on each iteration, up to
 * a preselected maximum.  BACKOFF_BASE provides the equivalent of
 * 2 to 3 memory references of delay for US-III+ and US-IV architectures.
 * BACKOFF_CAP is the equivalent of 50 to 100 memory references of
 * time (less than 12 microseconds for a 1000 MHz system).
 *
 * To determine appropriate BACKOFF_BASE and BACKOFF_CAP values,
 * studies on US-III+ and US-IV systems using 1 to 66 threads were
 * done over a range of possible values.
 * Performance differences below 10 threads were not large.  For
 * systems with more threads, substantial increases in total lock
 * throughput were observed with the given values.  For cases where
 * more than 20 threads were waiting on the same lock, lock throughput
 * increased by a factor of 5 or more using the backoff algorithm.
 *
 * Some platforms may provide their own platform-specific delay code,
 * using plat_lock_delay(backoff).  If it is available, plat_lock_delay
 * is executed instead of the default delay code.
 */

#pragma weak plat_lock_delay

#include <sys/param.h>
#include <sys/time.h>
#include <sys/cpuvar.h>
#include <sys/thread.h>
#include <sys/debug.h>
#include <sys/cmn_err.h>
#include <sys/sobject.h>
#include <sys/turnstile.h>
#include <sys/systm.h>
#include <sys/mutex_impl.h>
#include <sys/spl.h>
#include <sys/lockstat.h>
#include <sys/atomic.h>
#include <sys/cpu.h>
#include <sys/stack.h>
#include <sys/archsystm.h>

#define	BACKOFF_BASE	50
#define	BACKOFF_CAP	1600

/*
 * The sobj_ops vector exports a set of functions needed when a thread
 * is asleep on a synchronization object of this type.
 */
static sobj_ops_t mutex_sobj_ops = {
	SOBJ_MUTEX, mutex_owner, turnstile_stay_asleep, turnstile_change_pri
};

/*
 * If the system panics on a mutex, save the address of the offending
 * mutex in panic_mutex_addr, and save the contents in panic_mutex.
 */
static mutex_impl_t panic_mutex;
static mutex_impl_t *panic_mutex_addr;

static void
mutex_panic(char *msg, mutex_impl_t *lp)
{
	if (panicstr)
		return;

	if (casptr(&panic_mutex_addr, NULL, lp) == NULL)
		panic_mutex = *lp;

	panic("%s, lp=%p owner=%p thread=%p",
	    msg, lp, MUTEX_OWNER(&panic_mutex), curthread);
}

/*
 * mutex_vector_enter() is called from the assembly mutex_enter() routine
 * if the lock is held or is not of type MUTEX_ADAPTIVE.
 */
void
mutex_vector_enter(mutex_impl_t *lp)
{
	kthread_id_t	owner;
	hrtime_t	sleep_time = 0;	/* how long we slept */
	uint_t		spin_count = 0;	/* how many times we spun */
	cpu_t		*cpup, *last_cpu;
	extern cpu_t	*cpu_list;
	turnstile_t	*ts;
	volatile mutex_impl_t *vlp = (volatile mutex_impl_t *)lp;
	int		backoff;	/* current backoff */
	int		backctr;	/* ctr for backoff */
	int		sleep_count = 0;

	ASSERT_STACK_ALIGNED();

	if (MUTEX_TYPE_SPIN(lp)) {
		lock_set_spl(&lp->m_spin.m_spinlock, lp->m_spin.m_minspl,
		    &lp->m_spin.m_oldspl);
		return;
	}

	if (!MUTEX_TYPE_ADAPTIVE(lp)) {
		mutex_panic("mutex_enter: bad mutex", lp);
		return;
	}

	/*
	 * Adaptive mutexes must not be acquired from above LOCK_LEVEL.
	 * We can migrate after loading CPU but before checking CPU_ON_INTR,
	 * so we must verify by disabling preemption and loading CPU again.
	 */
	cpup = CPU;
	if (CPU_ON_INTR(cpup) && !panicstr) {
		kpreempt_disable();
		if (CPU_ON_INTR(CPU))
			mutex_panic("mutex_enter: adaptive at high PIL", lp);
		kpreempt_enable();
	}

	CPU_STATS_ADDQ(cpup, sys, mutex_adenters, 1);

	if (&plat_lock_delay) {
		backoff = 0;
	} else {
		backoff = BACKOFF_BASE;
	}

	for (;;) {
spin:
		spin_count++;
		/*
		 * Add an exponential backoff delay before trying again
		 * to touch the mutex data structure.
		 * The spin_count test and call to nulldev are to prevent
		 * the compiler optimizer from eliminating the delay loop.
		 */
		if (&plat_lock_delay) {
			plat_lock_delay(&backoff);
		} else {
			for (backctr = backoff; backctr; backctr--) {
				if (!spin_count)
					(void) nulldev();
			}	/* delay */
			backoff = backoff << 1;	/* double it */
			if (backoff > BACKOFF_CAP) {
				backoff = BACKOFF_CAP;
			}

			SMT_PAUSE();
		}

		if (panicstr)
			return;

		if ((owner = MUTEX_OWNER(vlp)) == NULL) {
			if (mutex_adaptive_tryenter(lp))
				break;
			continue;
		}

		if (owner == curthread)
			mutex_panic("recursive mutex_enter", lp);

		/*
		 * If lock is held but owner is not yet set, spin.
		 * (Only relevant for platforms that don't have cas.)
		 */
		if (owner == MUTEX_NO_OWNER)
			continue;

		/*
		 * When searching the other CPUs, start with the one where
		 * we last saw the owner thread.  If owner is running, spin.
		 *
		 * We must disable preemption at this point to guarantee
		 * that the list doesn't change while we traverse it
		 * without the cpu_lock mutex.  While preemption is
		 * disabled, we must revalidate our cached cpu pointer.
		 */
		kpreempt_disable();
		if (cpup->cpu_next == NULL)
			cpup = cpu_list;
		last_cpu = cpup;	/* mark end of search */
		do {
			if (cpup->cpu_thread == owner) {
				kpreempt_enable();
				goto spin;
			}
		} while ((cpup = cpup->cpu_next) != last_cpu);
		kpreempt_enable();

		/*
		 * The owner appears not to be running, so block.
		 * See the Big Theory Statement for memory ordering issues.
		 */
		ts = turnstile_lookup(lp);
		MUTEX_SET_WAITERS(lp);
		membar_enter();

		/*
		 * Recheck whether owner is running after waiters bit hits
		 * global visibility (above).  If owner is running, spin.
		 *
		 * Since we are at ipl DISP_LEVEL, kernel preemption is
		 * disabled; however, we still need to revalidate our cached
		 * cpu pointer to make sure the cpu hasn't been deleted.
		 */
		if (cpup->cpu_next == NULL)
			last_cpu = cpup = cpu_list;
		do {
			if (cpup->cpu_thread == owner) {
				turnstile_exit(lp);
				goto spin;
			}
		} while ((cpup = cpup->cpu_next) != last_cpu);
		membar_consumer();

		/*
		 * If owner and waiters bit are unchanged, block.
		 */
		if (MUTEX_OWNER(vlp) == owner && MUTEX_HAS_WAITERS(vlp)) {
			sleep_time -= gethrtime();
			(void) turnstile_block(ts, TS_WRITER_Q, lp,
			    &mutex_sobj_ops, NULL, NULL);
			sleep_time += gethrtime();
			sleep_count++;
		} else {
			turnstile_exit(lp);
		}
	}

	ASSERT(MUTEX_OWNER(lp) == curthread);

	if (sleep_time != 0) {
		/*
		 * Note, sleep time is the sum of all the sleeping we did.
		 */
		LOCKSTAT_RECORD(LS_MUTEX_ENTER_BLOCK, lp, sleep_time);
	}

	/*
	 * We do not count a sleep as a spin.
	 */
	if (spin_count > sleep_count)
		LOCKSTAT_RECORD(LS_MUTEX_ENTER_SPIN, lp,
		    spin_count - sleep_count);

	LOCKSTAT_RECORD0(LS_MUTEX_ENTER_ACQUIRE, lp);
}

/*
 * mutex_vector_tryenter() is called from the assembly mutex_tryenter()
 * routine if the lock is held or is not of type MUTEX_ADAPTIVE.
 */
int
mutex_vector_tryenter(mutex_impl_t *lp)
{
	int s;

	if (MUTEX_TYPE_ADAPTIVE(lp))
		return (0);		/* we already tried in assembly */

	if (!MUTEX_TYPE_SPIN(lp)) {
		mutex_panic("mutex_tryenter: bad mutex", lp);
		return (0);
	}

	s = splr(lp->m_spin.m_minspl);
	if (lock_try(&lp->m_spin.m_spinlock)) {
		lp->m_spin.m_oldspl = (ushort_t)s;
		return (1);
	}
	splx(s);
	return (0);
}

/*
 * mutex_vector_exit() is called from mutex_exit() if the lock is not
 * adaptive, has waiters, or is not owned by the current thread (panic).
 */
void
mutex_vector_exit(mutex_impl_t *lp)
{
	turnstile_t *ts;

	if (MUTEX_TYPE_SPIN(lp)) {
		lock_clear_splx(&lp->m_spin.m_spinlock, lp->m_spin.m_oldspl);
		return;
	}

	if (MUTEX_OWNER(lp) != curthread) {
		mutex_panic("mutex_exit: not owner", lp);
		return;
	}

	ts = turnstile_lookup(lp);
	MUTEX_CLEAR_LOCK_AND_WAITERS(lp);
	if (ts == NULL)
		turnstile_exit(lp);
	else
		turnstile_wakeup(ts, TS_WRITER_Q, ts->ts_waiters, NULL);
	LOCKSTAT_RECORD0(LS_MUTEX_EXIT_RELEASE, lp);
}

int
mutex_owned(kmutex_t *mp)
{
	mutex_impl_t *lp = (mutex_impl_t *)mp;

	if (panicstr)
		return (1);

	if (MUTEX_TYPE_ADAPTIVE(lp))
		return (MUTEX_OWNER(lp) == curthread);
	return (LOCK_HELD(&lp->m_spin.m_spinlock));
}

kthread_t *
mutex_owner(kmutex_t *mp)
{
	mutex_impl_t *lp = (mutex_impl_t *)mp;
	kthread_id_t t;

	if (MUTEX_TYPE_ADAPTIVE(lp) && (t = MUTEX_OWNER(lp)) != MUTEX_NO_OWNER)
		return (t);
	return (NULL);
}

/*
 * The iblock cookie 'ibc' is the spl level associated with the lock;
 * this alone determines whether the lock will be ADAPTIVE or SPIN.
 *
 * Adaptive mutexes created in zeroed memory do not need to call
 * mutex_init(), as allocation in this fashion guarantees their
 * initialization -- e.g. adaptive mutexes declared static in BSS
 * or allocated by kmem_zalloc().
 */
/* ARGSUSED */
void
mutex_init(kmutex_t *mp, char *name, kmutex_type_t type, void *ibc)
{
	mutex_impl_t *lp = (mutex_impl_t *)mp;

	ASSERT(ibc < (void *)KERNELBASE);	/* see 1215173 */

	if ((intptr_t)ibc > ipltospl(LOCK_LEVEL) && ibc < (void *)KERNELBASE) {
		ASSERT(type != MUTEX_ADAPTIVE && type != MUTEX_DEFAULT);
		MUTEX_SET_TYPE(lp, MUTEX_SPIN);
		LOCK_INIT_CLEAR(&lp->m_spin.m_spinlock);
		LOCK_INIT_HELD(&lp->m_spin.m_dummylock);
		lp->m_spin.m_minspl = (int)(intptr_t)ibc;
	} else {
		ASSERT(type != MUTEX_SPIN);
		MUTEX_SET_TYPE(lp, MUTEX_ADAPTIVE);
		MUTEX_CLEAR_LOCK_AND_WAITERS(lp);
	}
}

void
mutex_destroy(kmutex_t *mp)
{
	mutex_impl_t *lp = (mutex_impl_t *)mp;

	if (lp->m_owner == 0 && !MUTEX_HAS_WAITERS(lp)) {
		MUTEX_DESTROY(lp);
	} else if (MUTEX_TYPE_SPIN(lp)) {
		LOCKSTAT_RECORD0(LS_MUTEX_DESTROY_RELEASE, lp);
		MUTEX_DESTROY(lp);
	} else if (MUTEX_TYPE_ADAPTIVE(lp)) {
		LOCKSTAT_RECORD0(LS_MUTEX_DESTROY_RELEASE, lp);
		if (MUTEX_OWNER(lp) != curthread)
			mutex_panic("mutex_destroy: not owner", lp);
		if (MUTEX_HAS_WAITERS(lp)) {
			turnstile_t *ts = turnstile_lookup(lp);
			turnstile_exit(lp);
			if (ts != NULL)
				mutex_panic("mutex_destroy: has waiters", lp);
		}
		MUTEX_DESTROY(lp);
	} else {
		mutex_panic("mutex_destroy: bad mutex", lp);
	}
}

/*
 * Simple C support for the cases where spin locks miss on the first try.
 */
void
lock_set_spin(lock_t *lp)
{
	int spin_count = 1;
	int backoff;	/* current backoff */
	int backctr;	/* ctr for backoff */

	if (panicstr)
		return;

	if (ncpus == 1)
		panic("lock_set: %p lock held and only one CPU", lp);

	if (&plat_lock_delay) {
		backoff = 0;
	} else {
		backoff = BACKOFF_BASE;
	}

	while (LOCK_HELD(lp) || !lock_spin_try(lp)) {
		if (panicstr)
			return;
		spin_count++;
		/*
		 * Add an exponential backoff delay before trying again
		 * to touch the mutex data structure.
		 * The spin_count test and call to nulldev are to prevent
		 * the compiler optimizer from eliminating the delay loop.
		 */
		if (&plat_lock_delay) {
			plat_lock_delay(&backoff);
		} else {
			/* delay */
			for (backctr = backoff; backctr; backctr--) {
				if (!spin_count)
					(void) nulldev();
			}

			backoff = backoff << 1;	/* double it */
			if (backoff > BACKOFF_CAP) {
				backoff = BACKOFF_CAP;
			}
			SMT_PAUSE();
		}
	}

	if (spin_count) {
		LOCKSTAT_RECORD(LS_LOCK_SET_SPIN, lp, spin_count);
	}

	LOCKSTAT_RECORD0(LS_LOCK_SET_ACQUIRE, lp);
}

void
lock_set_spl_spin(lock_t *lp, int new_pil, ushort_t *old_pil_addr, int old_pil)
{
	int spin_count = 1;
	int backoff;	/* current backoff */
	int backctr;	/* ctr for backoff */

	if (panicstr)
		return;

	if (ncpus == 1)
		panic("lock_set_spl: %p lock held and only one CPU", lp);

	ASSERT(new_pil > LOCK_LEVEL);

	if (&plat_lock_delay) {
		backoff = 0;
	} else {
		backoff = BACKOFF_BASE;
	}
	do {
		splx(old_pil);
		while (LOCK_HELD(lp)) {
			if (panicstr) {
				*old_pil_addr = (ushort_t)splr(new_pil);
				return;
			}
			spin_count++;
			/*
			 * Add an exponential backoff delay before trying again
			 * to touch the mutex data structure.
			 * The spin_count test and call to nulldev are to
			 * prevent the compiler optimizer from eliminating
			 * the delay loop.
			 */
			if (&plat_lock_delay) {
				plat_lock_delay(&backoff);
			} else {
				for (backctr = backoff; backctr; backctr--) {
					if (!spin_count)
						(void) nulldev();
				}
				backoff = backoff << 1;	/* double it */
				if (backoff > BACKOFF_CAP) {
					backoff = BACKOFF_CAP;
				}

				SMT_PAUSE();
			}
		}
		old_pil = splr(new_pil);
	} while (!lock_spin_try(lp));

	*old_pil_addr = (ushort_t)old_pil;

	if (spin_count) {
		LOCKSTAT_RECORD(LS_LOCK_SET_SPL_SPIN, lp, spin_count);
	}

	LOCKSTAT_RECORD(LS_LOCK_SET_SPL_ACQUIRE, lp, spin_count);
}