#
4f6c66cc |
| 23-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
UDP: further performance improvements on tx
Cumulative throughput while running 64 netperf -H $DUT -t UDP_STREAM -- -m 1 on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps
Single stream throughput increases from 910kpps to 1.18Mpps
Baseline: https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg
- Protect read access to global ifnet list with epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg
- Protect short lived ifaddr references with epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg
- Convert if_afdata read lock path to epoch https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg
A fix for the inpcbhash contention is pending sufficient time on a canary at LLNW.
Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15409
|
#
f6cb0dea |
| 19-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
net: fix uninitialized variable warning
|
#
d7c5a620 |
| 18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@
gallatin: Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5 based ConnectX 4-LX NIC, I see an almost 12% improvement in received packet rate, and a larger improvement in bytes delivered all the way to userspace.
When the host is receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1, I see the following from nstat -I mce0 1 before the patch:
InMpps OMpps InGbs OGbs     err TCP Est  %CPU syscalls     csw  irq GBfree
  4.98  0.00  4.42 0.00 4235592      33 83.80  4720653 2149771 1235 247.32
  4.73  0.00  4.20 0.00 4025260      33 82.99  4724900 2139833 1204 247.32
  4.72  0.00  4.20 0.00 4035252      33 82.14  4719162 2132023 1264 247.32
  4.71  0.00  4.21 0.00 4073206      33 83.68  4744973 2123317 1347 247.32
  4.72  0.00  4.21 0.00 4061118      33 80.82  4713615 2188091 1490 247.32
  4.72  0.00  4.21 0.00 4051675      33 85.29  4727399 2109011 1205 247.32
  4.73  0.00  4.21 0.00 4039056      33 84.65  4724735 2102603 1053 247.32
After the patch:
InMpps OMpps InGbs OGbs     err TCP Est  %CPU syscalls     csw  irq GBfree
  5.43  0.00  4.20 0.00 3313143      33 84.96  5434214 1900162 2656 245.51
  5.43  0.00  4.20 0.00 3308527      33 85.24  5439695 1809382 2521 245.51
  5.42  0.00  4.19 0.00 3316778      33 87.54  5416028 1805835 2256 245.51
  5.42  0.00  4.19 0.00 3317673      33 90.44  5426044 1763056 2332 245.51
  5.42  0.00  4.19 0.00 3314839      33 88.11  5435732 1792218 2499 245.52
  5.44  0.00  4.19 0.00 3293228      33 91.84  5426301 1668597 2121 245.52
Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch
Reviewed by: gallatin
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15366
|
#
f2d19f98 |
| 18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
epoch(9): allocate net epochs earlier in boot
|
#
d71e30de |
| 18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
epoch: move epoch variables to read mostly section
|
#
70398c2f |
| 18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
epoch(9): Make epochs non-preemptible by default
There are risks associated with waiting on a preemptible epoch section. Change the name to make them not be the default and document the issue under CAVEATS.
Reported by: markj
|
#
5e68a3df |
| 18-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
epoch: add non-preemptible "critical" variant
adds:
- epoch_enter_critical() - can be called inside a different epoch; starts a section that will not acquire any MTX_DEF mutexes or do anything that might sleep.
- epoch_exit_critical() - corresponding exit call.
- epoch_wait_critical() - wait variant that is guaranteed that any threads in a section are running.
- epoch_global_critical - an epoch_wait_critical safe epoch instance.
Requested by: markj
Approved by: sbruno
|
#
5c30b378 |
| 11-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
Allow different bridge types to coexist
if_bridge has a lot of limitations that make it scale poorly to higher data rates. In my projects/VPC branch I leverage the bridge interface between layers for my high speed soft switch as well as for purposes of stacking in general.
Reviewed by: sbruno@
Approved by: sbruno@
Differential Revision: https://reviews.freebsd.org/D15344
|
#
20f8d7bc |
| 11-May-2018 |
Dag-Erling Smørgrav <des@FreeBSD.org> |
Slight cleanup of interface event logging.
Make if_printf() use vlog() instead of vprintf(). This means it can no longer return the number of characters printed, as it used to, but every single call to if_printf() in the entire kernel ignores the return value anyway; just return 0 so we don't have to change the prototype.
Consistently use if_printf() throughout sys/net/if.c, instead of a mixture of if_printf() and log().
In ifa_maintain_loopback_route(), don't needlessly log an error if we either failed to add a route because it already existed or failed to remove one because it did not. We still return an error code, though.
MFC after: 1 week
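The if_printf() change described above can be illustrated with a small userspace sketch. It assumes a hypothetical sink, sketch_vlog(), that takes a va_list and returns nothing (the real vlog() has a different signature), and a hypothetical buffer, sketch_log_line, standing in for the kernel log. The point is only the shape of the wrapper: once the sink returns void there is no character count to pass back, so the wrapper returns 0 and keeps its int prototype.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel log: holds the last formatted line. */
static char sketch_log_line[256];

/* Hypothetical vlog()-style sink: consumes a va_list and returns nothing. */
static void
sketch_vlog(const char *prefix, const char *fmt, va_list ap)
{
	size_t used;

	snprintf(sketch_log_line, sizeof(sketch_log_line), "%s: ", prefix);
	used = strlen(sketch_log_line);
	vsnprintf(sketch_log_line + used, sizeof(sketch_log_line) - used,
	    fmt, ap);
}

/*
 * Printf-style wrapper in the spirit of if_printf(). The sink reports no
 * character count, so the wrapper returns 0 rather than changing its
 * prototype.
 */
static int
sketch_if_printf(const char *ifname, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	sketch_vlog(ifname, fmt, ap);
	va_end(ap);
	return (0);
}
```

A caller that ignores the return value, as the commit message says every in-kernel caller does, is unaffected by the change.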
|
#
7bf272a6 |
| 10-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
Allocate epoch for networking at startup
Additionally add CK to include paths for modules
Approved by: sbruno@
|
#
b6f6f880 |
| 06-May-2018 |
Matt Macy <mmacy@FreeBSD.org> |
r333175 introduced deferred deletion of multicast addresses in order to permit the driver ioctl to sleep on commands to the NIC when updating multicast filters. More generally, this permitted drivers to use an sx as a softc lock. Unfortunately, this change introduced a race whereby a multicast update would still be queued for deletion when ifconfig deleted the interface, thus calling down into _purgemaddrs and synchronously deleting _all_ of the multicast addresses on the interface.
Synchronously remove all external references to a multicast address before enqueueing for delete.
Reported by: lwhsu
Approved by: sbruno
|
#
e5054602 |
| 06-May-2018 |
Mark Johnston <markj@FreeBSD.org> |
Import the netdump client code.
This is a component of a system which lets the kernel dump core to a remote host after a panic, rather than to a local storage device. The server component is available in the ports tree. netdump is particularly useful on diskless systems.
The netdump(4) man page contains some details describing the protocol. Support for configuring netdump will be added to dumpon(8) in a future commit. To use netdump, the kernel must have been compiled with the NETDUMP option.
The initial revision of netdump was written by Darrell Anderson and was integrated into Sandvine's OS, from which this version was derived.
Reviewed by: bdrewery, cem (earlier versions), julian, sbruno
MFC after: 1 month
X-MFC note: use a spare field in struct ifnet
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D15253
|
#
f3e1324b |
| 02-May-2018 |
Stephen Hurd <shurd@FreeBSD.org> |
Separate list manipulation locking from state change in multicast
Multicast incorrectly calls into drivers with a mutex held, forcing drivers to go through all manner of contortions to use a non-sleepable lock. Serialize multicast updates instead.
Submitted by: mmacy <mmacy@mattmacy.io>
Reviewed by: shurd, sbruno
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D14969
|
#
3edb7f4e |
| 25-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Translate 32-bit ifmedia requests into native ones.
We use transformation rather than accessors as virtually every driver implements SIOCGIFMEDIA and all would have to be touched.
Keep the code readable by always performing copies and (possibly no-op) transforms.
Reviewed by: jhb, kib
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14996
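The transformation approach can be sketched in userspace as a pair of copy functions: one widens a 32-bit request into the native layout before the driver sees it, and one copies the results back afterwards. The struct and function names below (sk_ifmediareq, sk_ifmediareq32, sk_ifmr_init, sk_ifmr_update) are hypothetical and only modeled on the shape of struct ifmediareq; this is not the code from the commit.

```c
#include <stdint.h>
#include <string.h>

#define SK_IFNAMSIZ 16

/* Native layout: ifm_ulist is a full-width pointer. */
struct sk_ifmediareq {
	char	ifm_name[SK_IFNAMSIZ];
	int	ifm_current;
	int	ifm_mask;
	int	ifm_status;
	int	ifm_active;
	int	ifm_count;
	int	*ifm_ulist;
};

/* 32-bit layout: same fields, but the pointer is a 32-bit integer. */
struct sk_ifmediareq32 {
	char		ifm_name[SK_IFNAMSIZ];
	int		ifm_current;
	int		ifm_mask;
	int		ifm_status;
	int		ifm_active;
	int		ifm_count;
	uint32_t	ifm_ulist;
};

/* Widen a 32-bit request into the native form handed to the driver. */
static void
sk_ifmr_init(struct sk_ifmediareq *ifmr, const struct sk_ifmediareq32 *ifmr32)
{
	memcpy(ifmr->ifm_name, ifmr32->ifm_name, sizeof(ifmr->ifm_name));
	ifmr->ifm_current = ifmr32->ifm_current;
	ifmr->ifm_mask = ifmr32->ifm_mask;
	ifmr->ifm_status = ifmr32->ifm_status;
	ifmr->ifm_active = ifmr32->ifm_active;
	ifmr->ifm_count = ifmr32->ifm_count;
	ifmr->ifm_ulist = (int *)(uintptr_t)ifmr32->ifm_ulist;
}

/* Copy the driver's results back; the caller's pointer is left untouched. */
static void
sk_ifmr_update(const struct sk_ifmediareq *ifmr, struct sk_ifmediareq32 *ifmr32)
{
	ifmr32->ifm_current = ifmr->ifm_current;
	ifmr32->ifm_mask = ifmr->ifm_mask;
	ifmr32->ifm_status = ifmr->ifm_status;
	ifmr32->ifm_active = ifmr->ifm_active;
	ifmr32->ifm_count = ifmr->ifm_count;
}
```

With this shape, drivers only ever see the native struct, which is why none of them need to be touched.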
|
#
3a4fc8a8 |
| 13-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Remove support for the Arcnet protocol.
While Arcnet has some continued deployment in industrial controls, the lack of drivers for any of the PCI, USB, or PCIe NICs on the market suggests such users aren't running FreeBSD.
Evidence in the PR database suggests that the cm(4) driver (our sole Arcnet NIC) was broken in 5.0 and has not worked since.
PR: 182297
Reviewed by: jhibbits, vangyzen
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15057
|
#
0437c8e3 |
| 11-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Remove support for FDDI networks.
Defines in net/if_media.h remain in case code copied from ifconfig is in use elsewhere (supporting a non-existent media type is harmless).
Reviewed by: kib, jhb
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15017
|
#
8a4a4a43 |
| 07-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Remove the thread argument from ifr_buffer_*() accessors.
They are always used in a context where curthread is the correct thread. This makes them more similar to the ifr_data_get_ptr() accessor.
|
#
e7fdc72e |
| 06-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
ifconf(): correct handling of sockaddrs smaller than struct sockaddr.
Portable programs that use SIOCGIFCONF (e.g. traceroute) assume that each pseudo ifreq is of length MAX(sizeof(struct ifreq), sizeof(ifr_name) + ifr_addr.sa_len). For short sockaddrs we copied too much from the source sockaddr resulting in a heap leak.
I believe only one such sockaddr exists (struct sockaddr_sco, which is 8 bytes) and it is unclear if such sockaddrs end up on interfaces in practice. If they did, the result would be an 8 byte heap leak on current architectures.
admbugs: 869
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 3 days
Security: kernel heap leak
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14981
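The record-length rule and the short-sockaddr fix can be sketched in userspace as follows. The types sk_sockaddr and sk_ifreq are hypothetical, reduced stand-ins for the kernel structures, and the sketch caps long addresses at the size of the embedded address for simplicity, which the real ifconf() does not do. What it illustrates is reading only sa_len bytes from the source, so an 8-byte address never drags adjacent memory into the output.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_IFNAMSIZ 16

/* Hypothetical BSD-style sockaddr with a leading length byte. */
struct sk_sockaddr {
	uint8_t	sa_len;		/* total length of this address in bytes */
	uint8_t	sa_family;
	char	sa_data[14];
};

/* Hypothetical, reduced ifreq: a name followed by one address. */
struct sk_ifreq {
	char			ifr_name[SK_IFNAMSIZ];
	struct sk_sockaddr	ifr_addr;
};

/* MAX(sizeof(struct ifreq), sizeof(ifr_name) + sa_len) for one record. */
static size_t
sk_ifconf_entry_len(uint8_t sa_len)
{
	size_t with_addr = SK_IFNAMSIZ + (size_t)sa_len;

	return (with_addr > sizeof(struct sk_ifreq) ?
	    with_addr : sizeof(struct sk_ifreq));
}

/*
 * Fill one record from a source address that may be shorter than
 * struct sk_sockaddr. Only sa_len bytes are read from the source (capped
 * at the destination size in this sketch); the rest of the record is zero.
 */
static void
sk_ifconf_fill(struct sk_ifreq *dst, const char *name, const void *src_sa)
{
	size_t len = *(const uint8_t *)src_sa;	/* sa_len is the first byte */

	if (len > sizeof(dst->ifr_addr))
		len = sizeof(dst->ifr_addr);
	memset(dst, 0, sizeof(*dst));
	strncpy(dst->ifr_name, name, sizeof(dst->ifr_name) - 1);
	memcpy(&dst->ifr_addr, src_sa, len);
}
```

For an 8-byte source address the record is still sizeof(struct sk_ifreq) bytes long, but bytes 8 and onward of the address field come from the memset rather than from whatever follows the source in memory.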
|
#
6469bdcd |
| 06-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network driver compatibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options.
Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensures opt_compat.h is created on all architectures.
Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files.
Reviewed by: kib, cem, jhb, jtl
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14941
|
#
756181b8 |
| 06-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Add 32-bit compat for ioctls that take struct ifgroupreq.
Use an accessor to access ifgr_group and ifgr_groups.
Use a macro CASE_IOC_IFGROUPREQ(cmd) in place of case statements such as "case SIOCAIFGROUP:". This avoids polluting the switch statements with large numbers of #ifdefs.
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14960
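The case-label macro technique can be sketched as below. The names SK_COMPAT32, SK_CASE_IOC_GROUPREQ and the command numbers are hypothetical; the actual CASE_IOC_IFGROUPREQ definition in the tree may be built differently. The idea shown is that a single #ifdef at the macro definition decides whether each use site expands to one case label or two, so the switch body itself stays free of conditionals.

```c
#define SK_COMPAT32 1	/* hypothetical stand-in for a kernel compat option */

/* Hypothetical command numbers: each native command has a 32-bit twin. */
enum sk_cmd {
	SK_ADDGROUP = 1,
	SK_ADDGROUP_32,
	SK_DELGROUP,
	SK_DELGROUP_32,
	SK_OTHER
};

#ifdef SK_COMPAT32
/* "case M(X):" becomes "case (X): case (X_32):", i.e. two labels. */
#define SK_CASE_IOC_GROUPREQ(cmd)	(cmd): case (cmd##_32)
#else
/* Without compat support the macro is just the native command. */
#define SK_CASE_IOC_GROUPREQ(cmd)	(cmd)
#endif

/* Returns 1 for add, 2 for delete, 0 for anything else. */
static int
sk_dispatch(enum sk_cmd cmd)
{
	switch (cmd) {
	case SK_CASE_IOC_GROUPREQ(SK_ADDGROUP):
		return (1);
	case SK_CASE_IOC_GROUPREQ(SK_DELGROUP):
		return (2);
	default:
		return (0);
	}
}
```

Removing the SK_COMPAT32 definition makes the 32-bit commands fall through to the default branch without any change to sk_dispatch().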
|
#
2443045f |
| 05-Apr-2018 |
Brooks Davis <brooks@FreeBSD.org> |
ifconf(): Always zero the whole struct ifreq.
The previous split of zeroing ifr_name and ifr_addr separately is safe on current architectures, but would be unsafe if pointers were larger than 8 bytes. Combining the zeroing adds no real cost (a few instructions) and makes the security property easier to verify.
Reviewed by: kib, emaste
Obtained from: CheriBSD
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14912
|
#
8708f1bd |
| 30-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Document and enforce assumptions about struct (in6_)ifreq.
- The two types must be type-punnable for shared members of ifr_ifru. This allows compatibility accessors to be shared.
- There must be no padding gap between ifr_name and ifr_ifru. This is assumed in tcpdump's use of SIOCGIFFLAGS output which attempts to be broadly portable. This is true for all current architectures, but very large (256-bit) fat-pointers could violate this invariant.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14910
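Layout assumptions of this kind can be enforced at compile time. The following is a minimal sketch using C11 _Static_assert on hypothetical, reduced versions of the two structures (sk_ifreq and sk_in6_ifreq); the assertions actually added by the commit may be worded differently. If either property stops holding on some architecture, the build fails instead of the assumption silently breaking.

```c
#include <stddef.h>

#define SK_IFNAMSIZ 16

/* Hypothetical, reduced struct ifreq. */
struct sk_ifreq {
	char	ifr_name[SK_IFNAMSIZ];
	union {
		unsigned char	ifru_addr[16];
		short		ifru_flags[2];
		int		ifru_metric;
		void		*ifru_data;
	} ifr_ifru;
};

/* Hypothetical, reduced struct in6_ifreq with a larger address member. */
struct sk_in6_ifreq {
	char	ifr_name[SK_IFNAMSIZ];
	union {
		unsigned char	ifru_addr[28];
		short		ifru_flags[2];
		int		ifru_metric;
		void		*ifru_data;
	} ifr_ifru;
};

/* No padding gap between the name and the union. */
_Static_assert(offsetof(struct sk_ifreq, ifr_ifru) ==
    sizeof(((struct sk_ifreq *)0)->ifr_name),
    "gap between ifr_name and ifr_ifru");

/* The union starts at the same offset in both request types. */
_Static_assert(offsetof(struct sk_ifreq, ifr_ifru) ==
    offsetof(struct sk_in6_ifreq, ifr_ifru),
    "ifr_ifru offsets differ between request types");
```

Because every union member starts at offset 0 of the union, matching union offsets are enough for members with identical types, such as ifru_flags, to be read through either structure.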
|
#
541d96aa |
| 30-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Use an accessor function to access ifr_data.
This fixes 32-bit compat (no ioctl command definitions are required as struct ifreq is the same size). This is believed to be sufficient to fully support ifconfig on 32-bit systems.
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14900
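An accessor of this kind might look like the sketch below. Everything here is an assumption for illustration: the reduced struct sk_ifreq, the ifru_data32 member, and the explicit caller_is_32bit flag (a kernel would derive that from the calling process rather than take it as a parameter). It shows why no new ioctl command is needed: the struct size is unchanged, and only the interpretation of the pointer bytes depends on the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_IFNAMSIZ 16

/* Hypothetical, reduced ifreq carrying only the data pointer. */
struct sk_ifreq {
	char	ifr_name[SK_IFNAMSIZ];
	union {
		void		*ifru_data;	/* native pointer */
		uint32_t	ifru_data32;	/* 32-bit caller's view */
	} ifr_ifru;
};

/*
 * Return the user data pointer. A 32-bit caller stored only 32 bits, so
 * read exactly those and widen them; a native caller stored a full pointer.
 */
static void *
sk_ifr_data_get_ptr(const struct sk_ifreq *ifr, bool caller_is_32bit)
{
	if (caller_is_32bit)
		return ((void *)(uintptr_t)ifr->ifr_ifru.ifru_data32);
	return (ifr->ifr_ifru.ifru_data);
}
```

Code that reads the field directly would pick up stale upper bytes for a 32-bit caller on a 64-bit kernel, which is the failure the accessor avoids.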
|
#
69f0fecb |
| 29-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Remove infrastructure for token-ring networks.
Reviewed by: cem, imp, jhb, jmallett
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14875
|
#
f8f65519 |
| 27-Mar-2018 |
Brooks Davis <brooks@FreeBSD.org> |
Fix a whitespace bug missed in refactoring prior to r331641.
MFC with: r331641
|