#
0650bf52 |
| 17-Sep-2021 |
Vladimir Oltean <vladimir.oltean@nxp.com> |
net: dsa: be compatible with masters which unregister on shutdown
Lino reports that on his system with bcmgenet as DSA master and KSZ9897 as a switch, rebooting or shutting down never works properly
net: dsa: be compatible with masters which unregister on shutdown
Lino reports that on his system with bcmgenet as DSA master and KSZ9897 as a switch, rebooting or shutting down never works properly.
What does the bcmgenet driver have special to trigger this, that other DSA masters do not? It has an implementation of ->shutdown which simply calls its ->remove implementation. Otherwise said, it unregisters its network interface on shutdown.
This message can be seen in a loop, and it hangs the reboot process there:
unregister_netdevice: waiting for eth0 to become free. Usage count = 3
So why 3?
A usage count of 1 is normal for a registered network interface, and any virtual interface which links itself as an upper of that will increment it via dev_hold. In the case of DSA, this is the call path:
dsa_slave_create -> netdev_upper_dev_link -> __netdev_upper_dev_link -> __netdev_adjacent_dev_insert -> dev_hold
So a DSA switch with 3 interfaces will result in a usage count elevated by two, and netdev_wait_allrefs will wait until they have gone away.
Other stacked interfaces, like VLAN, watch NETDEV_UNREGISTER events and delete themselves, but DSA cannot just vanish and go poof, at most it can unbind itself from the switch devices, but that must happen strictly earlier compared to when the DSA master unregisters its net_device, so reacting on the NETDEV_UNREGISTER event is way too late.
It seems that it is a pretty established pattern to have a driver's ->shutdown hook redirect to its ->remove hook, so the same code is executed regardless of whether the driver is unbound from the device, or the system is just shutting down. As Florian puts it, it is quite a big hammer for bcmgenet to unregister its net_device during shutdown, but having a common code path with the driver unbind helps ensure it is well tested.
So DSA, for better or for worse, has to live with that and engage in an arms race of implementing the ->shutdown hook too, from all individual drivers, and do something sane when paired with masters that unregister their net_device there. The only sane thing to do, of course, is to unlink from the master.
However, complications arise really quickly.
The pattern of redirecting ->shutdown to ->remove is not unique to bcmgenet or even to net_device drivers. In fact, SPI controllers do it too (see dspi_shutdown -> dspi_remove), and presumably, I2C controllers and MDIO controllers do it too (this is something I have not researched too deeply, but even if this is not the case today, it is certainly plausible to happen in the future, and must be taken into consideration).
Since DSA switches might be SPI devices, I2C devices, MDIO devices, the insane implication is that for the exact same DSA switch device, we might have both ->shutdown and ->remove getting called.
So we need to do something with that insane environment. The pattern I've come up with is "if this, then not that", so if either ->shutdown or ->remove gets called, we set the device's drvdata to NULL, and in the other hook, we check whether the drvdata is NULL and just do nothing. This is probably not necessary for platform devices, just for devices on buses, but I would really insist for consistency among drivers, because when code is copy-pasted, it is not always copy-pasted from the best sources.
So depending on whether the DSA switch's ->remove or ->shutdown will get called first, we cannot really guarantee even for the same driver if rebooting will result in the same code path on all platforms. But nonetheless, we need to do something minimally reasonable on ->shutdown too to fix the bug. Of course, the ->remove will do more (a full teardown of the tree, with all data structures freed, and this is why the bug was not caught for so long). The new ->shutdown method is kept separate from dsa_unregister_switch not because we couldn't have unregistered the switch, but simply in the interest of doing something quick and to the point.
The big question is: does the DSA switch's ->shutdown get called earlier than the DSA master's ->shutdown? If not, there is still a risk that we might still trigger the WARN_ON in unregister_netdevice that says we are attempting to unregister a net_device which has uppers. That's no good. Although the reference to the master net_device won't physically go away even if DSA's ->shutdown comes afterwards, remember we have a dev_hold on it.
The answer to that question lies in this comment above device_link_add:
* A side effect of the link creation is re-ordering of dpm_list and the * devices_kset list by moving the consumer device and all devices depending * on it to the ends of these lists (that does not happen to devices that have * not been registered when this function is called).
so the fact that DSA uses device_link_add towards its master is not exactly for nothing. device_shutdown() walks devices_kset from the back, so this is our guarantee that DSA's shutdown happens before the master's shutdown.
Fixes: 2f1e8ea726e9 ("net: dsa: link interfaces with the DSA master to get rid of lockdep warnings") Link: https://lore.kernel.org/netdev/20210909095324.12978-1-LinoSanfilippo@gmx.de/ Reported-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1 |
|
#
08987822 |
| 16-Sep-2019 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 5.4 merge window.
|
Revision tags: v5.3 |
|
#
d3f9990f |
| 14-Sep-2019 |
Takashi Iwai <tiwai@suse.de> |
Merge branch 'for-next' into for-linus
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
Revision tags: v5.3-rc8, v5.3-rc7, v5.3-rc6 |
|
#
75bf465f |
| 23-Aug-2019 |
Paul Mackerras <paulus@ozlabs.org> |
Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next
This merges in fixes for the XIVE interrupt controller which touch both generic powerpc and PPC KVM code. To avoid mer
Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next
This merges in fixes for the XIVE interrupt controller which touch both generic powerpc and PPC KVM code. To avoid merge conflicts, these commits will go upstream via the powerpc tree as well as the KVM tree.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
show more ...
|
Revision tags: v5.3-rc5 |
|
#
58e16d79 |
| 13-Aug-2019 |
Tony Lindgren <tony@atomide.com> |
Merge branch 'ti-sysc-fixes' into fixes
|
#
cbd32a1c |
| 12-Aug-2019 |
Thomas Gleixner <tglx@linutronix.de> |
Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/urgent
Pull a single EFI fix for v5.3 from Ard:
- Fix mixed mode breakage in EFI config table handling for
Merge tag 'efi-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/urgent
Pull a single EFI fix for v5.3 from Ard:
- Fix mixed mode breakage in EFI config table handling for TPM.
show more ...
|
#
4aa31b4b |
| 12-Aug-2019 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v5.3-rc4' into next
Sync up with mainline to bring in device_property_count_u32 andother newer APIs.
|
Revision tags: v5.3-rc4 |
|
#
3f61fd41 |
| 09-Aug-2019 |
Alex Deucher <alexander.deucher@amd.com> |
Merge tag 'v5.3-rc3' into drm-next-5.4
Linux 5.3-rc3
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
Revision tags: v5.3-rc3 |
|
#
ed32f8d4 |
| 29-Jul-2019 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next-queued
Catching up with 5.3-rc*
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
7a30bdd9 |
| 28-Jul-2019 |
Thomas Gleixner <tglx@linutronix.de> |
Merge branch master from git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Pick up the spectre documentation so the Grand Schemozzle can be added.
|
Revision tags: v5.3-rc2 |
|
#
27988c96 |
| 24-Jul-2019 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v5.3-rc1' into regulator-5.3
Linus 5.3-rc1
|
#
e27a2421 |
| 22-Jul-2019 |
Jonathan Corbet <corbet@lwn.net> |
Merge tag 'v5.3-rc1' into docs-next
Pull in all of the massive docs changes from elsewhere.
|
#
03b0f2ce |
| 22-Jul-2019 |
Maxime Ripard <maxime.ripard@bootlin.com> |
Merge v5.3-rc1 into drm-misc-next
Noralf needs some SPI patches in 5.3 to merge some work on tinydrm.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
|
#
3f98538c |
| 22-Jul-2019 |
Mauro Carvalho Chehab <mchehab+samsung@kernel.org> |
Merge tag 'v5.3-rc1' into patchwork
Linus 5.3-rc1
* tag 'v5.3-rc1': (12816 commits) Linus 5.3-rc1 iommu/amd: fix a crash in iova_magazine_free_pfns hexagon: switch to generic version of pte a
Merge tag 'v5.3-rc1' into patchwork
Linus 5.3-rc1
* tag 'v5.3-rc1': (12816 commits) Linus 5.3-rc1 iommu/amd: fix a crash in iova_magazine_free_pfns hexagon: switch to generic version of pte allocation typo fix: it's d_make_root, not d_make_inode... dt-bindings: pinctrl: stm32: Fix missing 'clocks' property in examples dt-bindings: iio: ad7124: Fix dtc warnings in example dt-bindings: iio: avia-hx711: Fix avdd-supply typo in example dt-bindings: pinctrl: aspeed: Fix AST2500 example errors dt-bindings: pinctrl: aspeed: Fix 'compatible' schema errors dt-bindings: riscv: Limit cpus schema to only check RiscV 'cpu' nodes dt-bindings: Ensure child nodes are of type 'object' x86/entry/64: Prevent clobbering of saved CR2 value smp: Warn on function calls from softirq context KVM: x86: Add fixed counters to PMU filter KVM: nVMX: do not use dangling shadow VMCS after guest reset KVM: VMX: dump VMCS on failed entry KVM: x86/vPMU: refine kvm_pmu err msg when event creation failed KVM: s390: Use kvm_vcpu_wake_up in kvm_s390_vcpu_wakeup KVM: Boost vCPUs that are delivering interrupts KVM: selftests: Remove superfluous define from vmx.c ...
show more ...
|
#
4df4888b |
| 22-Jul-2019 |
Takashi Iwai <tiwai@suse.de> |
Merge branch 'topic/hda-acomp-base' into for-next
Pull the support for AMD / Nvidia HD-audio compmonent notification
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
Revision tags: v5.3-rc1 |
|
#
237f83df |
| 11-Jul-2019 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller: "Some highlights from this development cycle:
1) Big refactoring of ipv6 route and
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller: "Some highlights from this development cycle:
1) Big refactoring of ipv6 route and neigh handling to support nexthop objects configurable as units from userspace. From David Ahern.
2) Convert explored_states in BPF verifier into a hash table, significantly decreased state held for programs with bpf2bpf calls, from Alexei Starovoitov.
3) Implement bpf_send_signal() helper, from Yonghong Song.
4) Various classifier enhancements to mvpp2 driver, from Maxime Chevallier.
5) Add aRFS support to hns3 driver, from Jian Shen.
6) Fix use after free in inet frags by allocating fqdirs dynamically and reworking how rhashtable dismantle occurs, from Eric Dumazet.
7) Add act_ctinfo packet classifier action, from Kevin Darbyshire-Bryant.
8) Add TFO key backup infrastructure, from Jason Baron.
9) Remove several old and unused ISDN drivers, from Arnd Bergmann.
10) Add devlink notifications for flash update status to mlxsw driver, from Jiri Pirko.
11) Lots of kTLS offload infrastructure fixes, from Jakub Kicinski.
12) Add support for mv88e6250 DSA chips, from Rasmus Villemoes.
13) Various enhancements to ipv6 flow label handling, from Eric Dumazet and Willem de Bruijn.
14) Support TLS offload in nfp driver, from Jakub Kicinski, Dirk van der Merwe, and others.
15) Various improvements to axienet driver including converting it to phylink, from Robert Hancock.
16) Add PTP support to sja1105 DSA driver, from Vladimir Oltean.
17) Add mqprio qdisc offload support to dpaa2-eth, from Ioana Radulescu.
18) Add devlink health reporting to mlx5, from Moshe Shemesh.
19) Convert stmmac over to phylink, from Jose Abreu.
20) Add PTP PHC (Physical Hardware Clock) support to mlxsw, from Shalom Toledo.
21) Add nftables SYNPROXY support, from Fernando Fernandez Mancera.
22) Convert tcp_fastopen over to use SipHash, from Ard Biesheuvel.
23) Track spill/fill of constants in BPF verifier, from Alexei Starovoitov.
24) Support bounded loops in BPF, from Alexei Starovoitov.
25) Various page_pool API fixes and improvements, from Jesper Dangaard Brouer.
26) Just like ipv4, support ref-countless ipv6 route handling. From Wei Wang.
27) Support VLAN offloading in aquantia driver, from Igor Russkikh.
28) Add AF_XDP zero-copy support to mlx5, from Maxim Mikityanskiy.
29) Add flower GRE encap/decap support to nfp driver, from Pieter Jansen van Vuuren.
30) Protect against stack overflow when using act_mirred, from John Hurley.
31) Allow devmap map lookups from eBPF, from Toke Høiland-Jørgensen.
32) Use page_pool API in netsec driver, Ilias Apalodimas.
33) Add Google gve network driver, from Catherine Sullivan.
34) More indirect call avoidance, from Paolo Abeni.
35) Add kTLS TX HW offload support to mlx5, from Tariq Toukan.
36) Add XDP_REDIRECT support to bnxt_en, from Andy Gospodarek.
37) Add MPLS manipulation actions to TC, from John Hurley.
38) Add sending a packet to connection tracking from TC actions, and then allow flower classifier matching on conntrack state. From Paul Blakey.
39) Netfilter hw offload support, from Pablo Neira Ayuso"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2080 commits) net/mlx5e: Return in default case statement in tx_post_resync_params mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync(). net: dsa: add support for BRIDGE_MROUTER attribute pkt_sched: Include const.h net: netsec: remove static declaration for netsec_set_tx_de() net: netsec: remove superfluous if statement netfilter: nf_tables: add hardware offload support net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload net: flow_offload: add flow_block_cb_is_busy() and use it net: sched: remove tcf block API drivers: net: use flow block API net: sched: use flow block API net: flow_offload: add flow_block_cb_{priv, incref, decref}() net: flow_offload: add list handling functions net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free() net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_* net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND net: flow_offload: add flow_block_cb_setup_simple() net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC net: hisilicon: Add an rx_desc to adapt HI13X1_GMAC ...
show more ...
|
Revision tags: v5.2 |
|
#
ad7b134f |
| 07-Jul-2019 |
David S. Miller <davem@davemloft.net> |
Merge branch 'net-dsa-Add-Vitesse-VSC73xx-parallel-mode'
Pawel Dembicki says:
==================== net: dsa: Add Vitesse VSC73xx parallel mode
Main goal of this patch series is to add support for
Merge branch 'net-dsa-Add-Vitesse-VSC73xx-parallel-mode'
Pawel Dembicki says:
==================== net: dsa: Add Vitesse VSC73xx parallel mode
Main goal of this patch series is to add support for CPU attached parallel bus in Vitesse VSC73xx switches. Existing driver supports only SPI mode.
Second change is needed for devices in unmanaged state.
V3: - fix commit messages and descriptions about memory-mapped I/O mode
V2: - drop changes in compatible strings - make changes less invasive - drop mutex in platform part and move mutex from core to spi part - fix indentation - fix devm_ioremap_resource result check - add cover letter ====================
Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
95711cd5 |
| 05-Jul-2019 |
Pawel Dembicki <paweldembicki@gmail.com> |
net: dsa: vsc73xx: Split vsc73xx driver
This driver (currently) only takes control of the switch chip over SPI and configures it to route packages around when connected to a CPU port. But Vitesse ch
net: dsa: vsc73xx: Split vsc73xx driver
This driver (currently) only takes control of the switch chip over SPI and configures it to route packages around when connected to a CPU port. But Vitesse chip support also parallel interface.
This patch split driver into two parts: core and spi. It is required for add support to another managing interface.
Tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|