#
d0728d71 |
| 23-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Introduce and use a sysinit-based initialization scheme for virtual network stacks, VNET_SYSINIT:
- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will occur each time a network
Introduce and use a sysinit-based initialization scheme for virtual network stacks, VNET_SYSINIT:
- Add VNET_SYSINIT and VNET_SYSUNINIT macros to declare events that will occur each time a network stack is instantiated and destroyed. In the !VIMAGE case, these are simply mapped into regular SYSINIT/SYSUNINIT. For the VIMAGE case, we instead use SYSINIT's to track their order and properties on registration, using them for each vnet when created/ destroyed, or immediately on module load for already-started vnets. - Remove vnet_modinfo mechanism that existed to serve this purpose previously, as well as its dependency scheme: we now just use the SYSINIT ordering scheme. - Implement VNET_DOMAIN_SET() to allow protocol domains to declare that they want init functions to be called for each virtual network stack rather than just once at boot, compiling down to DOMAIN_SET() in the non-VIMAGE case. - Walk all virtualized kernel subsystems and make use of these instead of modinfo or DOMAIN_SET() for init/uninit events. In some cases, convert modular components from using modevent to using sysinit (where appropriate). In some cases, do minor rejuggling of SYSINIT ordering to make room for or better manage events.
Portions submitted by: jhb (VNET_SYSINIT), bz (cleanup) Discussed with: jhb, bz, julian, zec Reviewed by: bz Approved by: re (VIMAGE blanket)
show more ...
|
#
0a4747d4 |
| 20-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Garbage collect vnet module registrations that have neither constructors nor destructors, as there's no actual work to do.
In most cases, the constructors weren't needed because of the existing prot
Garbage collect vnet module registrations that have neither constructors nor destructors, as there's no actual work to do.
In most cases, the constructors weren't needed because of the existing protocol initialization functions run by net_init_domain() as part of VNET_MOD_NET, or they were eliminated when support for static initialization of virtualized globals was added.
Garbage collect dependency references to modules without constructors or destructors, notably VNET_MOD_INET and VNET_MOD_INET6.
Reviewed by: bz Approved by: re (vimage blanket)
show more ...
|
#
5ee847d3 |
| 19-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Reimplement and/or implement vnet list locking by replacing a mostly unused custom mutex/condvar-based sleep locks with two locks: an rwlock (for non-sleeping use) and sxlock (for sleeping use). Eit
Reimplement and/or implement vnet list locking by replacing a mostly unused custom mutex/condvar-based sleep locks with two locks: an rwlock (for non-sleeping use) and sxlock (for sleeping use). Either acquired for read is sufficient to stabilize the vnet list, but both must be acquired for write to modify the list.
Replace previous no-op read locking macros, used in various places in the stack, with actual locking to prevent race conditions. Callers must declare when they may perform unbounded sleeps or not when selecting how to lock.
Refactor vnet sysinits so that the vnet list and locks are initialized before kernel modules are linked, as the kernel linker will use them for modules loaded by the boot loader.
Update various consumers of these KPIs based on whether they may sleep or not.
Reviewed by: bz Approved by: re (kib)
show more ...
|
#
1e77c105 |
| 16-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references.
Discussed with: bz, julian Reviewed by: bz Approved b
Remove unused VNET_SET() and related macros; only VNET_GET() is ever actually used. Rename VNET_GET() to VNET() to shorten variable references.
Discussed with: bz, julian Reviewed by: bz Approved by: re (kensmith, kib)
show more ...
|
#
eddfbb76 |
| 15-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the alloca
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables.
Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker.
Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided.
This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS.
Bump __FreeBSD_version and update UPDATING.
Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
show more ...
|
#
09c817ba |
| 03-Jul-2009 |
Oleksandr Tymoshenko <gonzo@FreeBSD.org> |
- MFC
|
#
8c0fec80 |
| 23-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Modify most routines returning 'struct ifaddr *' to return references rather than pointers, requiring callers to properly dispose of those references. The following routines now return references:
Modify most routines returning 'struct ifaddr *' to return references rather than pointers, requiring callers to properly dispose of those references. The following routines now return references:
ifaddr_byindex ifa_ifwithaddr ifa_ifwithbroadaddr ifa_ifwithdstaddr ifa_ifwithnet ifaof_ifpforaddr ifa_ifwithroute ifa_ifwithroute_fib rt_getifa rt_getifa_fib IFP_TO_IA ip_rtaddr in6_ifawithifp in6ifa_ifpforlinklocal in6ifa_ifpwithaddr in6_ifadd carp_iamatch6 ip6_getdstifaddr
Remove unused macro which didn't have required referencing:
IFP_TO_IA6
This closes many small races in which changes to interface or address lists while an ifaddr was in use could lead to use of freed memory (etc). In a few cases, add missing if_addr_list locking required to safely acquire references.
Because of a lack of deep copying support, we accept a race in which an in6_ifaddr pointed to by mbuf tags and extracted with ip6_getdstifaddr() doesn't hold a reference while in transmit. Once we have mbuf tag deep copy support, this can be fixed.
Reviewed by: bz Obtained from: Apple, Inc. (portions) MFC after: 6 weeks (portions)
show more ...
|
#
5736e6fb |
| 23-Jun-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
After cleaning up rt_tables from vnet.h and cleaning up opt_route.h a lot of files no longer need route.h either. Garbage collect them. While here remove now unneeded vnet.h #includes as well.
|
#
7e857dd1 |
| 12-Jun-2009 |
Oleksandr Tymoshenko <gonzo@FreeBSD.org> |
- Merge from HEAD
|
#
8d8bc018 |
| 08-Jun-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
After r193232 rt_tables in vnet.h are no longer indirectly dependent on the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer dep
After r193232 rt_tables in vnet.h are no longer indirectly dependent on the ROUTETABLES kernel option thus there is no need to include opt_route.h anymore in all consumers of vnet.h and no longer depend on it for module builds.
Remove the hidden include in flowtable.h as well and leave the two explicit #includes in ip_input.c and ip_output.c.
show more ...
|
#
403f4aa0 |
| 06-Jun-2009 |
Marko Zec <zec@FreeBSD.org> |
Unbreak options VIMAGE build.
Submitted by: julian (mentor) Approved by: julian (mentor)
|
#
bcf11e8d |
| 05-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in du
Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC and used in a large number of files, but also because an increasing number of incorrect uses of MAC calls were sneaking in due to copy-and-paste of MAC-aware code without the associated opt_mac.h include.
Discussed with: pjd
show more ...
|
#
b01c90a3 |
| 01-Jun-2009 |
Bruce M Simpson <bms@FreeBSD.org> |
Merge fixes from p4: * Tighten v1 query input processing. * Borrow changes from MLDv2 for how general queries are processed. * Do address field validation upfront before accepting input. * Do
Merge fixes from p4: * Tighten v1 query input processing. * Borrow changes from MLDv2 for how general queries are processed. * Do address field validation upfront before accepting input. * Do NOT switch protocol version if old querier present timer active. * Always clear IGMPv3 state in igmp_v3_cancel_link_timers(). * Update comments.
Tested by: deeptech71 at gmail dot com
show more ...
|
#
d4b5cae4 |
| 01-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Reimplement the netisr framework in order to support parallel netisr threads:
- Support up to one netisr thread per CPU, each processings its own workstream, or set of per-protocol queues. Thread
Reimplement the netisr framework in order to support parallel netisr threads:
- Support up to one netisr thread per CPU, each processings its own workstream, or set of per-protocol queues. Threads may be bound to specific CPUs, or allowed to migrate, based on a global policy.
In the future it would be desirable to support topology-centric policies, such as "one netisr per package".
- Allow each protocol to advertise an ordering policy, which can currently be one of:
NETISR_POLICY_SOURCE: packets must maintain ordering with respect to an implicit or explicit source (such as an interface or socket).
NETISR_POLICY_FLOW: make use of mbuf flow identifiers to place work, as well as allowing protocols to provide a flow generation function for mbufs without flow identifers (m2flow). Falls back on NETISR_POLICY_SOURCE if now flow ID is available.
NETISR_POLICY_CPU: allow protocols to inspect and assign a CPU for each packet handled by netisr (m2cpuid).
- Provide utility functions for querying the number of workstreams being used, as well as a mapping function from workstream to CPU ID, which protocols may use in work placement decisions.
- Add explicit interfaces to get and set per-protocol queue limits, and get and clear drop counters, which query data or apply changes across all workstreams.
- Add a more extensible netisr registration interface, in which protocols declare 'struct netisr_handler' structures for each registered NETISR_ type. These include name, handler function, optional mbuf to flow ID function, optional mbuf to CPU ID function, queue limit, and ordering policy. Padding is present to allow these to be expanded in the future. If no queue limit is declared, then a default is used.
- Queue limits are now per-workstream, and raised from the previous IFQ_MAXLEN default of 50 to 256.
- All protocols are updated to use the new registration interface, and with the exception of netnatm, default queue limits. Most protocols register as NETISR_POLICY_SOURCE, except IPv4 and IPv6, which use NETISR_POLICY_FLOW, and will therefore take advantage of driver- generated flow IDs if present.
- Formalize a non-packet based interface between interface polling and the netisr, rather than having polling pretend to be two protocols. Provide two explicit hooks in the netisr worker for start and end events for runs: netisr_poll() and netisr_pollmore(), as well as a function, netisr_sched_poll(), to allow the polling code to schedule netisr execution. DEVICE_POLLING still embeds single-netisr assumptions in its implementation, so for now if it is compiled into the kernel, a single and un-bound netisr thread is enforced regardless of tunable configuration.
In the default configuration, the new netisr implementation maintains the same basic assumptions as the previous implementation: a single, un-bound worker thread processes all deferred work, and direct dispatch is enabled by default wherever possible.
Performance measurement shows a marginal performance improvement over the old implementation due to the use of batched dequeue.
An rmlock is used to synchronize use and registration/unregistration using the framework; currently, synchronized use is disabled (replicating current netisr policy) due to a measurable 3%-6% hit in ping-pong micro-benchmarking. It will be enabled once further rmlock optimization has taken place. However, in practice, netisrs are rarely registered or unregistered at runtime.
A new man page for netisr will follow, but since one doesn't currently exist, it hasn't been updated.
This change is not appropriate for MFC, although the polling shutdown handler should be merged to 7-STABLE.
Bump __FreeBSD_version.
Reviewed by: bz
show more ...
|
#
2e370a5c |
| 26-May-2009 |
Oleksandr Tymoshenko <gonzo@FreeBSD.org> |
Merge from HEAD
|
#
ddd50c34 |
| 08-May-2009 |
Marko Zec <zec@FreeBSD.org> |
Remove a bogus check that unintentionally slipped in r191816.
This change has no functional impact on nooptions VIMAGE builds. Submitted by: bz
|
#
e7153b25 |
| 07-May-2009 |
Oleksandr Tymoshenko <gonzo@FreeBSD.org> |
Merge from HEAD
|
#
94e9f5a1 |
| 06-May-2009 |
Marko Zec <zec@FreeBSD.org> |
Remove unnecessary CURVNET_SET() calls where curvnet context is (i.e. seems to be) already set.
This should reduce console noise due to curvnet recursion reports.
This change has no impact on noopt
Remove unnecessary CURVNET_SET() calls where curvnet context is (i.e. seems to be) already set.
This should reduce console noise due to curvnet recursion reports.
This change has no impact on nooptions VIMAGE builds. Approved by: julian (mentor)
show more ...
|
#
21ca7b57 |
| 05-May-2009 |
Marko Zec <zec@FreeBSD.org> |
Change the curvnet variable from a global const struct vnet *, previously always pointing to the default vnet context, to a dynamically changing thread-local one. The currvnet context should be set
Change the curvnet variable from a global const struct vnet *, previously always pointing to the default vnet context, to a dynamically changing thread-local one. The currvnet context should be set on entry to networking code via CURVNET_SET() macros, and reverted to previous state via CURVNET_RESTORE(). Recursions on curvnet are permitted, though strongly discuouraged.
This change should have no functional impact on nooptions VIMAGE kernel builds, where CURVNET_* macros expand to whitespace.
The curthread->td_vnet (aka curvnet) variable's purpose is to be an indicator of the vnet context in which the current network-related operation takes place, in case we cannot deduce the current vnet context from any other source, such as by looking at mbuf's m->m_pkthdr.rcvif->if_vnet, sockets's so->so_vnet etc. Moreover, so far curvnet has turned out to be an invaluable consistency checking aid: it helps to catch cases when sockets, ifnets or any other vnet-aware structures may have leaked from one vnet to another.
The exact placement of the CURVNET_SET() / CURVNET_RESTORE() macros was a result of an empirical iterative process, whith an aim to reduce recursions on CURVNET_SET() to a minimum, while still reducing the scope of CURVNET_SET() to networking only operations - the alternative would be calling CURVNET_SET() on each system call entry. In general, curvnet has to be set in three typicall cases: when processing socket-related requests from userspace or from within the kernel; when processing inbound traffic flowing from device drivers to upper layers of the networking stack, and when executing timer-driven networking functions.
This change also introduces a DDB subcommand to show the list of all vnet instances.
Approved by: julian (mentor)
show more ...
|
#
d7fcc528 |
| 02-May-2009 |
Marko Zec <zec@FreeBSD.org> |
Unbreak options VIMAGE + nooptions INVARIANTS kernel builds.
Submitted by: julian Approved by: julian (mentor)
|
Revision tags: release/7.2.0_cvs, release/7.2.0 |
|
#
31a3e65d |
| 29-Apr-2009 |
Bruce M Simpson <bms@FreeBSD.org> |
Fix a problem whereby enqueued IGMPv3 filter list changes would be incorrectly output, if the RB-tree enumeration happened to reuse the same chain for a mode switch: that is, both ALLOW and BLOCK rec
Fix a problem whereby enqueued IGMPv3 filter list changes would be incorrectly output, if the RB-tree enumeration happened to reuse the same chain for a mode switch: that is, both ALLOW and BLOCK records were appended for the same group, in the same mbuf packet chain.
This was introduced during an mbuf chain layout bug fix involving m_getptr(), which obviously cannot count from offset 0 on the second pass through the RB-tree when serializing the IGMPv3 group records into the pending mbuf chain.
Cut over to KTR_INET for IGMPv3 CTR usage.
show more ...
|
#
093f25f8 |
| 27-Apr-2009 |
Marko Zec <zec@FreeBSD.org> |
In preparation for turning on options VIMAGE in next commits, rearrange / replace / adjust several INIT_VNET_* initializer macros, all of which currently resolve to whitespace.
Reviewed by: bz (an o
In preparation for turning on options VIMAGE in next commits, rearrange / replace / adjust several INIT_VNET_* initializer macros, all of which currently resolve to whitespace.
Reviewed by: bz (an older version of the patch) Approved by: julian (mentor)
show more ...
|
#
b5fbc0b9 |
| 19-Apr-2009 |
Bruce M Simpson <bms@FreeBSD.org> |
Now that IFF_NEEDSGIANT has been removed from the network stack, catch up with this in IGMPv3 and remove dead code. This has the side-effect of not being back-portable to RELENG_7 w/o further changes.
|
#
9c797940 |
| 13-Apr-2009 |
Oleksandr Tymoshenko <gonzo@FreeBSD.org> |
- Merge from HEAD
|
#
bd88cce2 |
| 12-Apr-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Update stats in struct igmpstat using two new macros: IGMPSTAT_ADD() and IGMPSTAT_INC(), rather than directly manipulating the fields of the structure. This will make it easier to change the impleme
Update stats in struct igmpstat using two new macros: IGMPSTAT_ADD() and IGMPSTAT_INC(), rather than directly manipulating the fields of the structure. This will make it easier to change the implementation of these statistics, such as using per-CPU versions of the data structures.
MFC after: 3 days
show more ...
|