#
756368b6 |
| 17-Oct-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
igmp_v1v2_queue_report() doesn't require epoch.
|
#
8b3bc70a |
| 08-Oct-2019 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r352764 through r353315.
|
#
b8a6e03f |
| 08-Oct-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Widen NET_EPOCH coverage.
When epoch(9) was introduced to the network stack, it was basically dropped in place of the existing locking, which was mutexes and rwlocks. For the sake of performance, mutex covered areas were as small as possible, and so the epoch covered areas became small too.
However, epoch doesn't introduce any contention; it just delays memory reclaim. So there is no point in minimising epoch covered areas for the sake of performance. Meanwhile, entering and exiting an epoch has non-zero CPU cost, so doing this less often is a win.
Not least is code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside the network epoch. This makes coding both the input and output paths much easier.
On the output path we already enter the epoch quite early: in ip_output() and in ip6_output().
This patch does the same for the input path. All ISR processing, network related callouts, and other ways of injecting packets into the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts the epoch.
The tricky part is the configuration code paths: ioctls and sysctls. They also call into leaf functions, so some need to be changed.
This patch introduces more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note that, unlike a lock recursion, an epoch recursion is safe and just wastes a bit of resources.
Reviewed by: gallatin, hselasky, cy, adrian, kristof
Differential Revision: https://reviews.freebsd.org/D19111
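The pattern described in this commit (enter the epoch once at the entry point, assert it in leaf functions, tolerate recursion) can be illustrated with a toy userspace model. This is only a sketch: every `toy_` / `TOY_` name is hypothetical, the epoch is reduced to a nesting counter, and none of this is the kernel's epoch(9) API.

```c
#include <assert.h>

/* Toy userspace stand-ins; these are NOT the kernel's epoch(9) macros. */
static int toy_epoch_depth;		/* nesting depth of the toy epoch */

#define	TOY_EPOCH_ENTER()	(toy_epoch_depth++)
#define	TOY_EPOCH_EXIT()	(toy_epoch_depth--)
#define	TOY_EPOCH_ASSERT()	assert(toy_epoch_depth > 0)

/* Leaf function: walks configuration, so it only asserts the epoch. */
static int
toy_leaf_lookup(int key)
{
	TOY_EPOCH_ASSERT();
	return (key * 2);
}

/* Input entry point: enters the epoch once for the whole packet. */
static int
toy_input(int key)
{
	int result;

	TOY_EPOCH_ENTER();
	result = toy_leaf_lookup(key);
	TOY_EPOCH_EXIT();
	return (result);
}

/*
 * Configuration-style path. toy_input() stands in for any callee that
 * enters the epoch again; in this model the nested enter only raises
 * the depth counter to 2 and is otherwise harmless.
 */
static int
toy_ioctl(int key)
{
	int result;

	TOY_EPOCH_ENTER();
	result = toy_input(key);
	TOY_EPOCH_EXIT();
	return (result);
}
```

Calling `toy_leaf_lookup()` directly, outside any entered section, would trip the assertion, which mirrors the "leaf functions assert epoch" rule in the commit message.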
|
Revision tags: release/11.3.0 |
|
#
a68cc388 |
| 09-Jan-2019 |
Gleb Smirnoff <glebius@FreeBSD.org> |
Mechanical cleanup of epoch(9) usage in network stack.
- Remove macros that covertly create epoch_tracker on thread stack. Such macros are quite unsafe, e.g. they will produce buggy code if the same macro is used in embedded scopes. Always declare epoch_tracker explicitly.
- Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read locking macros to what they actually are - the net_epoch. Keeping them as they are is very misleading. They are all named FOO_RLOCK(), while they no longer have lock semantics. They now allow recursion and, more importantly, no longer guarantee protection against their companion WLOCK macros. Note: INP_HASH_RLOCK() has the same problems, but is not touched by this commit.
This is a non-functional mechanical change. The only functionally changed functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter the epoch recursively.
Discussed with: jtl, gallatin
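The explicit style asked for in the first bullet can be sketched as follows. The commit does not spell out how the hidden-tracker macros fail, so this sketch only shows the replacement style: each scope declares and names its own tracker. All `toy_` names are hypothetical userspace stand-ins, not the kernel's epoch(9) interface.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a per-section tracker; NOT struct epoch_tracker. */
struct toy_tracker {
	struct toy_tracker *prev;	/* tracker of the enclosing section */
};

static struct toy_tracker *toy_current;	/* innermost active tracker */

static void
toy_enter(struct toy_tracker *et)
{
	et->prev = toy_current;
	toy_current = et;
}

static void
toy_exit(struct toy_tracker *et)
{
	assert(toy_current == et);	/* exits must pair with enters */
	toy_current = et->prev;
}

/* Each scope declares its own tracker, so nothing is hidden or shadowed. */
static int
toy_nested_sections(void)
{
	struct toy_tracker outer;
	int depth = 0;

	toy_enter(&outer);
	depth++;
	{
		struct toy_tracker inner;

		toy_enter(&inner);
		depth++;
		toy_exit(&inner);
	}
	toy_exit(&outer);
	return (toy_current == NULL ? depth : -1);
}
```

Because the tracker is a visible local variable, a reader can see at a glance which enter pairs with which exit.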
|
Revision tags: release/12.0.0 |
|
#
14b841d4 |
| 11-Aug-2018 |
Kyle Evans <kevans@FreeBSD.org> |
MFH @ r337607, in preparation for boarding
|
#
5f901c92 |
| 24-Jul-2018 |
Andrew Turner <andrew@FreeBSD.org> |
Use the new VNET_DEFINE_STATIC macro when we are defining static VNET variables.
Reviewed by: bz
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D16147
|
Revision tags: release/11.2.0 |
|
#
4f6c66cc |
| 23-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
UDP: further performance improvements on tx
Cumulative throughput while running 64 netperf -H $DUT -t UDP_STREAM -- -m 1 on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps
Single stream throughput increases from 910kpps to 1.18Mpps
Baseline: https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg
- Protect read access to global ifnet list with epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg
- Protect short lived ifaddr references with epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg
- Convert if_afdata read lock path to epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg
A fix for the inpcbhash contention is pending sufficient time on a canary at LLNW.
Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15409
|
#
f6960e20 |
| 19-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
netinet silence warnings
|
#
d7c5a620 |
| 18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@
gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace.
When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see, using nstat -I mce0 1 before the patch:
InMpps  OMpps  InGbs  OGbs  err      TCP Est  %CPU   syscalls  csw      irq   GBfree
4.98    0.00   4.42   0.00  4235592  33       83.80  4720653   2149771  1235  247.32
4.73    0.00   4.20   0.00  4025260  33       82.99  4724900   2139833  1204  247.32
4.72    0.00   4.20   0.00  4035252  33       82.14  4719162   2132023  1264  247.32
4.71    0.00   4.21   0.00  4073206  33       83.68  4744973   2123317  1347  247.32
4.72    0.00   4.21   0.00  4061118  33       80.82  4713615   2188091  1490  247.32
4.72    0.00   4.21   0.00  4051675  33       85.29  4727399   2109011  1205  247.32
4.73    0.00   4.21   0.00  4039056  33       84.65  4724735   2102603  1053  247.32
After the patch
InMpps  OMpps  InGbs  OGbs  err      TCP Est  %CPU   syscalls  csw      irq   GBfree
5.43    0.00   4.20   0.00  3313143  33       84.96  5434214   1900162  2656  245.51
5.43    0.00   4.20   0.00  3308527  33       85.24  5439695   1809382  2521  245.51
5.42    0.00   4.19   0.00  3316778  33       87.54  5416028   1805835  2256  245.51
5.42    0.00   4.19   0.00  3317673  33       90.44  5426044   1763056  2332  245.51
5.42    0.00   4.19   0.00  3314839  33       88.11  5435732   1792218  2499  245.52
5.44    0.00   4.19   0.00  3293228  33       91.84  5426301   1668597  2121  245.52
Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch
Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15366
|
#
b6f6f880 |
| 06-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
r333175 introduced deferred deletion of multicast addresses in order to permit the driver ioctl to sleep on commands to the NIC when updating multicast filters. More generally this permitted drivers to use an sx as a softc lock. Unfortunately this change introduced a race whereby a multicast update would still be queued for deletion when ifconfig deleted the interface, thus calling down into _purgemaddrs and synchronously deleting _all_ of the multicast addresses on the interface.
Synchronously remove all external references to a multicast address before enqueueing for delete.
Reported by: lwhsu
Approved by: sbruno
|
#
f3e1324b |
| 02-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Separate list manipulation locking from state change in multicast
Multicast incorrectly calls into drivers with a mutex held, causing drivers to have to go through all manner of contortions to use a non-sleepable lock. Serialize multicast updates instead.
Submitted by: mmacy <mmacy@mattmacy.io>
Reviewed by: shurd, sbruno
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D14969
|
#
82725ba9 |
| 23-Nov-2017 |
Hans Petter Selasky <hselasky@FreeBSD.org> |
Merge ^/head r325999 through r326131.
|
#
51369649 |
| 20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.
The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known open source licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.
Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
Revision tags: release/10.4.0, release/11.1.0 |
|
#
40769242 |
| 14-Mar-2017 |
Eric van Gyzen <vangyzen@FreeBSD.org> |
Add some ntohl() love to r315277
inet_ntoa() and inet_ntoa_r() take the address in network byte-order. When I removed those calls, I should have replaced them with ntohl() to make the hex addresses slightly less unreadable. Here they are.
See r315277 regarding classic blunders.
vangyzen: you're deep in "no good deed" territory, it seems --badger
Reported by: ian
MFC after: 3 days
MFC when: I finally get it right
Sponsored by: Dell EMC
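The byte-order point in this commit can be shown with a small userspace sketch. `format_addr_hex()` is a hypothetical helper, not code from igmp.c; it assumes the address arrives in network byte order, as `struct in_addr` normally holds it, and converts with ntohl() before printing so the hex digits read in the same order as the dotted quad.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>

/* Hypothetical helper: format an IPv4 address as hex in host byte order. */
static int
format_addr_hex(struct in_addr addr, char *buf, size_t len)
{
	assert(buf != NULL);
	/* s_addr is in network byte order; ntohl() makes the hex read a.b.c.d. */
	return (snprintf(buf, len, "0x%08x",
	    (unsigned int)ntohl(addr.s_addr)));
}
```

For 192.0.2.1 this yields `0xc0000201` on any host, whereas printing `s_addr` directly would give a byte-swapped value on little-endian machines.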
|
#
47d803ea |
| 14-Mar-2017 |
Eric van Gyzen <vangyzen@FreeBSD.org> |
KTR: log IPv4 addresses in hex rather than dotted-quad
When I made the changes in r313821, I fell victim to one of the classic blunders, the most famous of which is: never get involved in a land war in Asia. But only slightly less well known is this: Keep your brain turned on and engaged when making a tedious, sweeping, mechanical change. KTR can correctly log the immediate integral values passed to it, as well as constant strings, but not non-constant strings, since they might change by the time ktrdump retrieves them.
Reported by: glebius
MFC after: 3 days
Sponsored by: Dell EMC
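Why integers and constant strings are safe to log while non-constant strings are not can be sketched with a toy deferred trace buffer. This is an assumption-laden model, not the kernel's KTR implementation: all `toy_` names are hypothetical, and the only property modelled is that a record stores the format pointer and the argument value and is formatted later.

```c
#include <assert.h>
#include <stdio.h>

/* Toy trace record; a stand-in, NOT the kernel's KTR data structure. */
struct toy_trace {
	const char *fmt;	/* constant format string, stored by pointer */
	unsigned long arg;	/* integer argument, stored by value */
};

static struct toy_trace toy_ring[8];
static unsigned int toy_next;

/* Record an event: nothing is formatted here, only saved for later. */
static void
toy_trace_log(const char *fmt, unsigned long arg)
{
	struct toy_trace *t = &toy_ring[toy_next++ % 8];

	t->fmt = fmt;
	t->arg = arg;
}

/* Format a saved record later, as a dump tool would. */
static int
toy_trace_dump(unsigned int idx, char *buf, size_t len)
{
	const struct toy_trace *t = &toy_ring[idx % 8];

	assert(t->fmt != NULL);
	return (snprintf(buf, len, t->fmt, t->arg));
}
```

An integer such as an IPv4 address is captured by value at log time, so the later dump sees exactly what was logged. A pointer to a stack buffer would also be captured, but the bytes it points to could have changed by the time the record is formatted.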
|
#
348238db |
| 01-Mar-2017 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r314420 through r314481.
|
#
fbbd9655 |
| 01-Mar-2017 |
Warner Losh <imp@FreeBSD.org> |
Renumber copyright clause 4
Renumber clause 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistence on keeping the same numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
|
#
a3906ca5 |
| 17-Feb-2017 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r313644 through r313895.
|
#
8144690a |
| 16-Feb-2017 |
Eric van Gyzen <vangyzen@FreeBSD.org> |
Use inet_ntoa_r() instead of inet_ntoa() throughout the kernel
inet_ntoa() cannot be used safely in a multithreaded environment because it uses a static local buffer. Instead, use inet_ntoa_r() with a buffer on the caller's stack.
Suggested by: glebius, emaste
Reviewed by: gnn
MFC after: 2 weeks
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9625
|
Revision tags: release/11.0.1, release/11.0.0 |
|
#
3d6d3da4 |
| 04-Sep-2016 |
Dimitry Andric <dim@FreeBSD.org> |
Merge ^/head r305361 through r305389.
|
#
6c01c0e0 |
| 04-Sep-2016 |
Dimitry Andric <dim@FreeBSD.org> |
With clang 3.9.0, compiling sys/netinet/igmp.c results in the following warning:
sys/netinet/igmp.c:546:21: error: implicit conversion from 'int' to 'char' changes value from 148 to -108 [-Werror,-Wconstant-conversion]
        p->ipopt_list[0] = IPOPT_RA; /* Router Alert Option */
                         ~ ^~~~~~~~
sys/netinet/ip.h:153:19: note: expanded from macro 'IPOPT_RA'
#define IPOPT_RA 148 /* router alert */
                 ^~~
This is because ipopt_list is an array of char, so IPOPT_RA is wrapped to a negative value. It would be nice to change ipopt_list to an array of u_char, but it changes the signature of the public struct ipoption, so add an explicit cast to suppress the warning.
Reviewed by: imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D7777
|
#
89856f7e |
| 21-Jun-2016 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Get closer to a VIMAGE network stack teardown from top to bottom rather than removing the network interfaces first. This change is rather large and convoluted as the ordering requirements cannot be separated.
Move the pfil(9) framework to SI_SUB_PROTO_PFIL, move Firewalls and related modules to their own SI_SUB_PROTO_FIREWALL. Move initialization of "physical" interfaces to SI_SUB_DRIVERS, move virtual (cloned) interfaces to SI_SUB_PSEUDO. Move Multicast to SI_SUB_PROTO_MC.
Re-work parts of multicast initialisation and teardown, not taking the huge amount of memory into account if used as a module yet.
For interface teardown we try to do as many of them as we can on SI_SUB_INIT_IF, but for some this makes no sense, e.g., when tunnelling over a higher layer protocol such as IP. In that case the interface has to go along with (or before) the higher layer protocol being shut down.
Kernel hhooks need to go last on teardown as they may be used at various higher layers and we cannot remove them before we cleaned up the higher layers.
For interface teardown there are multiple paths: (a) a cloned interface is destroyed (inside a VIMAGE or in the base system), (b) any interface is moved from a virtual network stack to a different network stack ("vmove"), or (c) a virtual network stack is being shut down. All code paths go through if_detach_internal() where we, depending on the vmove flag or the vnet state, make a decision on how much to shut down; in case we are destroying a VNET the individual protocol layers will clean up their own parts thus we cannot do so again for each interface as we end up with, e.g., double-frees, destroying locks twice or acquiring already destroyed locks. When calling into protocol cleanups we equally have to tell them whether they need to detach upper layer protocols ("ulp") or not (e.g., in6_ifdetach()).
Provide or enhance helper functions to do proper cleanup at a protocol rather than at an interface level.
Approved by: re (hrs)
Obtained from: projects/vnet
Reviewed by: gnn, jhb
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6747
|
#
b941cb8d |
| 07-Jun-2016 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Add a `show igi_list` command to DDB to debug IGMP state.
Obtained from: projects/vnet
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
|
#
a4641f4e |
| 03-May-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys/net*: minor spelling fixes.
No functional change.
|
#
0edd2576 |
| 16-Apr-2016 |
Glen Barber <gjb@FreeBSD.org> |
MFH
Sponsored by: The FreeBSD Foundation
|