| 29903edf | 28-Jan-2026 |
Cosmin Ratiu <cratiu@nvidia.com> |
devlink: Refactor devlink_rate_nodes_check
devlink_rate_nodes_check() was used to verify there are no devlink rate nodes created when switching the esw mode.
Rate management code is about to become
devlink: Refactor devlink_rate_nodes_check
devlink_rate_nodes_check() was used to verify there are no devlink rate nodes created when switching the esw mode.
Rate management code is about to become more complex, so refactor this function: - remove unused param 'mode'. - add a new 'rate_filter' param. - rename to devlink_rates_check(). - expose devlink_rate_is_node() to be used as a rate filter.
This makes it more usable from multiple places, so use it from those places as well.
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260128112544.1661250-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
| 2a367002 | 19-Nov-2025 |
Daniel Zahka <daniel.zahka@gmail.com> |
devlink: support default values for param-get and param-set
Support querying and resetting to default param values.
Introduce two new devlink netlink attrs: DEVLINK_ATTR_PARAM_VALUE_DEFAULT and DEV
devlink: support default values for param-get and param-set
Support querying and resetting to default param values.
Introduce two new devlink netlink attrs: DEVLINK_ATTR_PARAM_VALUE_DEFAULT and DEVLINK_ATTR_PARAM_RESET_DEFAULT. The former is used to contain an optional parameter value inside of the param_value nested attribute. The latter is used in param-set requests from userspace to indicate that the driver should reset the param to its default value.
To implement this, two new functions are added to the devlink driver api: devlink_param::get_default() and devlink_param::reset_default(). These callbacks allow drivers to implement default param actions for runtime and permanent cmodes. For driverinit params, the core latches the last value set by a driver via devl_param_driverinit_value_set(), and uses that as the default value for a param.
Because default parameter values are optional, it would be impossible to discern whether or not a param of type bool has default value of false or not provided if the default value is encoded using a netlink flag type. For this reason, when a DEVLINK_PARAM_TYPE_BOOL has an associated default value, the default value is encoded using a u8 type.
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20251119025038.651131-4-daniel.zahka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
| 17a42aa4 | 19-Nov-2025 |
Daniel Zahka <daniel.zahka@gmail.com> |
devlink: refactor devlink_nl_param_value_fill_one()
Lift the param type demux and value attr placement into a separate function. This new function, devlink_nl_param_put(), can be used to place addit
devlink: refactor devlink_nl_param_value_fill_one()
Lift the param type demux and value attr placement into a separate function. This new function, devlink_nl_param_put(), can be used to place additional types values in the value array, e.g., default, current, next values. This commit has no functional change.
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20251119025038.651131-3-daniel.zahka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
| da0e2197 | 24-Aug-2025 |
Shahar Shitrit <shshitrit@nvidia.com> |
devlink: Make health reporter burst period configurable
Enable configuration of the burst period — a time window starting from the first error recovery, during which the reporter allows recovery att
devlink: Make health reporter burst period configurable
Enable configuration of the burst period — a time window starting from the first error recovery, during which the reporter allows recovery attempts for each reported error.
This feature is helpful when a single underlying issue causes multiple errors, as it delays the start of the grace period to allow sufficient time for recovering all related errors. For example, if multiple TX queues time out simultaneously, a sufficient burst period could allow all affected TX queues to be recovered within that window. Without this period, only the first TX queue that reports a timeout will undergo recovery, while the remaining TX queues will be blocked once the grace period begins.
Configuration example: $ devlink health set pci/0000:00:09.0 reporter tx burst_period 500
Configuration example with ynl: ./tools/net/ynl/pyynl/cli.py \ --spec Documentation/netlink/specs/devlink.yaml \ --do health-reporter-set --json '{ "bus-name": "auxiliary", "dev-name": "mlx5_core.eth.0", "port-index": 65535, "health-reporter-name": "tx", "health-reporter-burst-period": 500 }'
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250824084354.533182-5-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
| 6a06d8c4 | 24-Aug-2025 |
Shahar Shitrit <shshitrit@nvidia.com> |
devlink: Introduce burst period for health reporter
Currently, the devlink health reporter starts the grace period immediately after handling an error, blocking any further recoveries until it finis
devlink: Introduce burst period for health reporter
Currently, the devlink health reporter starts the grace period immediately after handling an error, blocking any further recoveries until it finished.
However, when a single root cause triggers multiple errors in a short time frame, it is desirable to treat them as a bulk of errors and to allow their recoveries, avoiding premature blocking of subsequent related errors, and reducing the risk of inconsistent or incomplete error handling.
To address this, introduce a configurable burst period for devlink health reporter. Start this period when the first error is handled, and allow recovery attempts for reported errors during this window. Once burst period expires, begin the grace period to block further recoveries until it concludes.
Timeline summary:
----|--------|------------------------------/----------------------/-- error is error is burst period grace period reported recovered (recoveries allowed) (recoveries blocked)
For calculating the burst period duration, use the same last_recovery_ts as the grace period. Update it on recovery only when the burst period is inactive (either disabled or at the first error).
This patch implements the framework for the burst period and effectively sets its value to 0 at reporter creation, so the current behavior remains unchanged, which ensures backward compatibility.
A downstream patch will make the burst period configurable.
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250824084354.533182-4-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
| 20597fb9 | 24-Aug-2025 |
Shahar Shitrit <shshitrit@nvidia.com> |
devlink: Move health reporter recovery abort logic to a separate function
Extract the health reporter recovery abort logic into a separate function devlink_health_recover_abort(). The function encap
devlink: Move health reporter recovery abort logic to a separate function
Extract the health reporter recovery abort logic into a separate function devlink_health_recover_abort(). The function encapsulates the conditions for aborting recovery: - When auto-recovery is disabled - When previous error wasn't recovered - When within the grace period after last recovery
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250824084354.533182-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
| 41a6e8ab | 13-Aug-2025 |
Parav Pandit <parav@nvidia.com> |
devlink/port: Check attributes early and constify
Constify the devlink port attributes to indicate they are read only and does not depend on anything else. Therefore, validate it early before settin
devlink/port: Check attributes early and constify
Constify the devlink port attributes to indicate they are read only and does not depend on anything else. Therefore, validate it early before setting in the devlink port.
Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Parav Pandit <parav@nvidia.com> Link: https://patch.msgid.link/20250813094417.7269-3-parav@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|