.\"
.\" Copyright (C) 2018 Matthew Macy <mmacy@FreeBSD.org>.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice(s), this list of conditions and the following disclaimer as
.\"    the first lines of this file unmodified other than the possible
.\"    addition of one or more copyright notices.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice(s), this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
.\" EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
.\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
.\" DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY
.\" DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
.\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
.\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
.\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
.\" DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd May 13, 2018
.Dt EPOCH 9
.Os
.Sh NAME
.Nm epoch ,
.Nm epoch_context ,
.Nm epoch_alloc ,
.Nm epoch_free ,
.Nm epoch_enter ,
.Nm epoch_exit ,
.Nm epoch_wait ,
.Nm epoch_call ,
.Nm in_epoch
.Nd kernel epoch based reclamation
.Sh SYNOPSIS
.In sys/param.h
.In sys/proc.h
.In sys/epoch.h
.Ft epoch_t
.Fn epoch_alloc "int flags"
.Ft void
.Fn epoch_free "epoch_t epoch"
.Ft void
.Fn epoch_enter "epoch_t epoch"
.Ft void
.Fn epoch_enter_preempt "epoch_t epoch"
.Ft void
.Fn epoch_exit "epoch_t epoch"
.Ft void
.Fn epoch_exit_preempt "epoch_t epoch"
.Ft void
.Fn epoch_wait "epoch_t epoch"
.Ft void
.Fn epoch_wait_preempt "epoch_t epoch"
.Ft void
.Fn epoch_call "epoch_t epoch" "epoch_context_t ctx" "void (*callback) (epoch_context_t)"
.Ft int
.Fn in_epoch "void"
.Sh DESCRIPTION
Epochs are used to guarantee liveness and immutability of data by
deferring reclamation and mutation until a grace period has elapsed.
Epochs do not have any lock ordering issues.
Entering and leaving an epoch section will never block.
.Pp
Epochs are allocated with
.Fn epoch_alloc
and freed with
.Fn epoch_free .
The flags passed to
.Fn epoch_alloc
determine whether preemption is allowed during a section,
as specified by EPOCH_PREEMPT, or not (the default).
Threads indicate the start of an epoch critical section by calling
.Fn epoch_enter .
The end of a critical section is indicated by calling
.Fn epoch_exit .
The _preempt variants can be used around code which requires preemption.
A thread can wait until a grace period has elapsed
since any threads have entered
the epoch by calling
.Fn epoch_wait
or
.Fn epoch_wait_preempt ,
depending on the epoch type.
The use of a default epoch type allows one to use
.Fn epoch_wait ,
which is guaranteed to have much shorter completion times since
none of the threads in an epoch section will be preempted
before completing their sections.
If the thread cannot sleep or is otherwise in a performance sensitive
path, it can ensure that a grace period has elapsed by calling
.Fn epoch_call
with a callback containing any work that needs to wait for an epoch to elapse.
Only non-sleepable locks can be acquired during a section protected by
.Fn epoch_enter_preempt
and
.Fn epoch_exit_preempt .
INVARIANTS can assert that a thread is in an epoch by using
.Fn in_epoch .
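.Pp
The following sketch (the foo_* names are hypothetical, not taken from
any existing subsystem) illustrates the expected life cycle of a
preemptible epoch:
.Bd -literal
epoch_t foo_epoch;

void
foo_init(void)
{

	/* EPOCH_PREEMPT: allow preemption within sections. */
	foo_epoch = epoch_alloc(EPOCH_PREEMPT);
}

void
foo_read(void)
{

	epoch_enter_preempt(foo_epoch);
	MPASS(in_epoch());	/* only checked with INVARIANTS */
	/* Read epoch protected data; only non-sleepable locks here. */
	epoch_exit_preempt(foo_epoch);
}

void
foo_fini(void)
{

	/* Wait out any in-progress sections before freeing. */
	epoch_wait_preempt(foo_epoch);
	epoch_free(foo_epoch);
}
.Ed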
.Pp
The epoch API currently does not support sleeping in epoch_preempt sections.
A caller cannot do an epoch_enter recursively on different preemptible epochs.
A caller should never call
.Fn epoch_wait
in the middle of an epoch section, as this will lead to a deadlock.
.Pp
Note that epochs are not a straight replacement for read locks.
Callers must use safe list and tailq traversal routines in an epoch
(see ck_queue).
When modifying a list referenced from an epoch section, safe removal
routines must be used and the caller can no longer modify a list entry
in place.
An item to be modified must be handled with copy on write
and frees must be deferred until after a grace period has elapsed.
.Sh RETURN VALUES
.Fn in_epoch
will return 1 if curthread is in an epoch, 0 otherwise.
.Sh CAVEATS
One must be cautious when using
.Fn epoch_wait_preempt .
Threads are pinned during epoch sections, so if a thread in a section is
preempted by a higher priority compute bound thread on that CPU, it can be
prevented from leaving the section.
Thus the wait time for the waiter is potentially unbounded.
.Sh EXAMPLES
Async free example:
.Pp
Thread 1:
.Bd -literal
int
in_pcbladdr(struct inpcb *inp, struct in_addr *faddr, struct in_laddr *laddr,
    struct ucred *cred)
{
	/* ... */
	epoch_enter(net_epoch);
	CK_STAILQ_FOREACH(ifa, &ifp->if_addrhead, ifa_link) {
		sa = ifa->ifa_addr;
		if (sa->sa_family != AF_INET)
			continue;
		sin = (struct sockaddr_in *)sa;
		if (prison_check_ip4(cred, &sin->sin_addr) == 0) {
			ia = (struct in_ifaddr *)ifa;
			break;
		}
	}
	epoch_exit(net_epoch);
	/* ... */
}
.Ed
.Pp
Thread 2:
.Bd -literal
void
ifa_free(struct ifaddr *ifa)
{

	if (refcount_release(&ifa->ifa_refcnt))
		epoch_call(net_epoch, &ifa->ifa_epoch_ctx, ifa_destroy);
}

void
if_purgeaddrs(struct ifnet *ifp)
{

	/* .... */
	IF_ADDR_WLOCK(ifp);
	CK_STAILQ_REMOVE(&ifp->if_addrhead, ifa, ifaddr, ifa_link);
	IF_ADDR_WUNLOCK(ifp);
	ifa_free(ifa);
}
.Ed
.Pp
Thread 1 traverses the ifaddr list in an epoch.
Thread 2 unlinks with the corresponding epoch safe macro, marks as
logically free, and then defers deletion.
More general mutation or a synchronous
free would have to follow a call to
.Fn epoch_wait .
.Sh ERRORS
None.
.Sh SEE ALSO
.Xr locking 9 ,
.Xr mtx_pool 9 ,
.Xr mutex 9 ,
.Xr rwlock 9 ,
.Xr sema 9 ,
.Xr sleep 9 ,
.Xr sx 9 ,
.Xr timeout 9