9beebc2b | 28-Sep-2023 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: dev: add can_state_get_by_berr_counter() to return the CAN state based on the current error counters
Some CAN controllers do not have a register that contains the current CAN state, but only a
can: dev: add can_state_get_by_berr_counter() to return the CAN state based on the current error counters
Some CAN controllers do not have a register that contains the current CAN state, but only a register that contains the error counters.
Introduce a new function can_state_get_by_berr_counter() that returns the current TX and RX state depending on the provided CAN bit error counters.
Link: https://lore.kernel.org/all/20231005-at91_can-rx_offload-v2-1-9987d53600e0@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
6411959c | 29-Sep-2023 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: dev: can_put_echo_skb(): don't crash kernel if can_priv::echo_skb is accessed out of bounds
If the "struct can_priv::echoo_skb" is accessed out of bounds, this would cause a kernel crash. Inste
can: dev: can_put_echo_skb(): don't crash kernel if can_priv::echo_skb is accessed out of bounds
If the "struct can_priv::echoo_skb" is accessed out of bounds, this would cause a kernel crash. Instead, issue a meaningful warning message and return with an error.
Fixes: a6e4bc530403 ("can: make the number of echo skb's configurable") Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-5-91b5c1fd922c@pengutronix.de Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
f0e0c809 | 29-Sep-2023 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: dev: can_restart(): move debug message and stats after successful restart
Move the debug message "restarted" and the CAN restart stats_after_ the successful restart of the CAN device, because t
can: dev: can_restart(): move debug message and stats after successful restart
Move the debug message "restarted" and the CAN restart stats_after_ the successful restart of the CAN device, because the restart may fail.
While there update the error message from printing the error number to printing symbolic error names.
Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-4-91b5c1fd922c@pengutronix.de Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> [mkl: mention stats in subject and description, too] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
8f3ec204 | 29-Sep-2023 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: dev: can_restart(): reverse logic to remove need for goto
Reverse the logic in the if statement and eliminate the need for a goto to simplify code readability.
Link: https://lore.kernel.org/al
can: dev: can_restart(): reverse logic to remove need for goto
Reverse the logic in the if statement and eliminate the need for a goto to simplify code readability.
Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-3-91b5c1fd922c@pengutronix.de Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
6841cab8 | 29-Sep-2023 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: dev: can_restart(): fix race condition between controller restart and netif_carrier_on()
This race condition was discovered while updating the at91_can driver to use can_bus_off(). The followin
can: dev: can_restart(): fix race condition between controller restart and netif_carrier_on()
This race condition was discovered while updating the at91_can driver to use can_bus_off(). The following scenario describes how the converted at91_can driver would behave.
When a CAN device goes into BUS-OFF state, the driver usually stops/resets the CAN device and calls can_bus_off().
This function sets the netif carrier to off, and (if configured by user space) schedules a delayed work that calls can_restart() to restart the CAN device.
The can_restart() function first checks if the carrier is off and triggers an error message if the carrier is OK.
Then it calls the driver's do_set_mode() function to restart the device, then it sets the netif carrier to on. There is a race window between these two calls.
The at91 CAN controller (observed on the sama5d3, a single core 32 bit ARM CPU) has a hardware limitation. If the device goes into bus-off while sending a CAN frame, there is no way to abort the sending of this frame. After the controller is enabled again, another attempt is made to send it.
If the bus is still faulty, the device immediately goes back to the bus-off state. The driver calls can_bus_off(), the netif carrier is switched off and another can_restart is scheduled. This occurs within the race window before the original can_restart() handler marks the netif carrier as OK. This would cause the 2nd can_restart() to be called with an OK netif carrier, resulting in an error message.
The flow of the 1st can_restart() looks like this:
can_restart() // bail out if netif_carrier is OK
netif_carrier_ok(dev) priv->do_set_mode(dev, CAN_MODE_START) // enable CAN controller // sama5d3 restarts sending old message
// CAN devices goes into BUS_OFF, triggers IRQ
// IRQ handler start at91_irq() at91_irq_err_line() can_bus_off() netif_carrier_off() schedule_delayed_work() // IRQ handler end
netif_carrier_on()
The 2nd can_restart() will be called with an OK netif carrier and the error message will be printed.
To close the race window, first set the netif carrier to on, then restart the controller. In case the restart fails with an error code, roll back the netif carrier to off.
Fixes: 39549eef3587 ("can: CAN Network device driver and Netlink interface") Link: https://lore.kernel.org/all/20231005-can-dev-fix-can-restart-v2-2-91b5c1fd922c@pengutronix.de Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
8e0e2950 | 03-Jul-2023 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: rx-offload: add can_rx_offload_get_echo_skb_queue_tail()
Add can_rx_offload_get_echo_skb_queue_tail(). This function addds the echo skb at the end of rx-offload the queue. This is intended for
can: rx-offload: add can_rx_offload_get_echo_skb_queue_tail()
Add can_rx_offload_get_echo_skb_queue_tail(). This function addds the echo skb at the end of rx-offload the queue. This is intended for devices without timestamp support.
Link: https://lore.kernel.org/all/20230718-gs_usb-rx-offload-v2-2-716e542d14d5@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
80a2fbce | 11-Jun-2023 |
Vincent Mailhol <mailhol.vincent@wanadoo.fr> |
can: length: refactor frame lengths definition to add size in bits
Introduce a method to calculate the exact size in bits of a CAN(-FD) frame with or without dynamic bitstuffing.
These are all the
can: length: refactor frame lengths definition to add size in bits
Introduce a method to calculate the exact size in bits of a CAN(-FD) frame with or without dynamic bitstuffing.
These are all the possible combinations taken into account:
- Classical CAN or CAN-FD - Standard or Extended frame format - CAN-FD CRC17 or CRC21 - Include or not intermission
Instead of doing several individual macro definitions, declare the can_frame_bits() function-like macro. To this extent, do a full refactoring of the length definitions.
In addition add the can_frame_bytes(). This function-like macro replaces the existing macro:
- CAN_FRAME_OVERHEAD_SFF: can_frame_bytes(false, false, 0) - CAN_FRAME_OVERHEAD_EFF: can_frame_bytes(false, true, 0) - CANFD_FRAME_OVERHEAD_SFF: can_frame_bytes(true, false, 0) - CANFD_FRAME_OVERHEAD_EFF: can_frame_bytes(true, true, 0)
Function-like macros were chosen over inline functions because they can be used to initialize const struct fields.
The different maximum frame lengths (maximum data length, including intermission) are as follow:
Frame type bits bytes ------------------------------------------------------- Classic CAN SFF no bitstuffing 111 14 Classic CAN EFF no bitstuffing 131 17 Classic CAN SFF bitstuffing 135 17 Classic CAN EFF bitstuffing 160 20 CAN-FD SFF no bitstuffing 579 73 CAN-FD EFF no bitstuffing 598 75 CAN-FD SFF bitstuffing 712 89 CAN-FD EFF bitstuffing 736 92
The macro CAN_FRAME_LEN_MAX and CANFD_FRAME_LEN_MAX are kept as an alias to, respectively, can_frame_bytes(false, true, CAN_MAX_DLEN) and can_frame_bytes(true, true, CANFD_MAX_DLEN).
In addition to the above:
- Use ISO 11898-1:2015 definitions for the names of the CAN frame fields. - Include linux/bits.h for use of BITS_PER_BYTE. - Include linux/math.h for use of mult_frac() and DIV_ROUND_UP(). N.B: the use of DIV_ROUND_UP() is not new to this patch, but the include was previously omitted. - Add copyright 2023 for myself.
Suggested-by: Thomas Kopp <Thomas.Kopp@microchip.com> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Reviewed-by: Thomas Kopp <Thomas.Kopp@microchip.com> Link: https://lore.kernel.org/all/20230611025728.450837-4-mailhol.vincent@wanadoo.fr Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
6d793471 | 01-Feb-2023 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: bittiming: can_validate_bitrate(): report error via netlink
Report an error to user space via netlink if the requested bit rate is not supported by the device.
Link: https://lore.kernel.org/al
can: bittiming: can_validate_bitrate(): report error via netlink
Report an error to user space via netlink if the requested bit rate is not supported by the device.
Link: https://lore.kernel.org/all/20230202110854.2318594-18-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
06742086 | 31-Jan-2023 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: bittiming: can_calc_bittiming(): convert from netdev_err() to NL_SET_ERR_MSG_FMT()
Replace the netdev_err() by NL_SET_ERR_MSG_FMT() to better inform the user about the problem. While there, use
can: bittiming: can_calc_bittiming(): convert from netdev_err() to NL_SET_ERR_MSG_FMT()
Replace the netdev_err() by NL_SET_ERR_MSG_FMT() to better inform the user about the problem. While there, use %u to print unsigned values and improve error message a bit.
In case of an error, return -EINVAL instead of -EDOM, this corresponds better to the actual meaning of the error value.
Link: https://lore.kernel.org/all/20230202110854.2318594-17-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
c7650728 | 06-Sep-2022 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: bittiming: can_calc_bittiming(): clean up SJW handling
In the current code, if the user configures a bitrate, a default SJW value of 1 is used. If the user configures both a bitrate and a SJW v
can: bittiming: can_calc_bittiming(): clean up SJW handling
In the current code, if the user configures a bitrate, a default SJW value of 1 is used. If the user configures both a bitrate and a SJW value, can_calc_bittiming() silently limits the SJW value to SJW max and TSEG2.
We came to the conclusion that if the user provided an invalid SJW value, it's best to bail out and inform the user [1].
[1] https://lore.kernel.org/all/CAMZ6RqKqhmTgUZiwe5uqUjBDnhhC2iOjZ791+Y845btJYwVDKg@mail.gmail.com
Further the ISO 11898-1:2015 standard mandates that "SJW shall be less than or equal to the minimum of these two items: Phase_Seg1 and Phase_Seg2." [2] The current code is missing that check.
[2] https://lore.kernel.org/all/BL3PR11MB64844E3FC13C55433CDD0B3DFB449@BL3PR11MB6484.namprd11.prod.outlook.com
The previous patches introduced 1) can_sjw_set_default() - sets a default value for SJW if unset 2) can_sjw_check() - implements a SJW check against SJW max, Phase Seg1 and Phase Seg2. In the error case this function reports the error to user space via netlink.
Replace both the open-coded SJW default setting and the open-coded and insufficient checks of SJW with the helper functions can_sjw_set_default() and can_sjw_check().
Link: https://lore.kernel.org/all/20230202110854.2318594-16-mkl@pengutronix.de Link: https://lore.kernel.org/all/CAMZ6RqKqhmTgUZiwe5uqUjBDnhhC2iOjZ791+Y845btJYwVDKg@mail.gmail.com Link: https://lore.kernel.org/all/BL3PR11MB64844E3FC13C55433CDD0B3DFB449@BL3PR11MB6484.namprd11.prod.outlook.com Suggested-by: Thomas Kopp <Thomas.Kopp@microchip.com> Suggested-by: Vincent Mailhol <vincent.mailhol@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
80bcf5ec | 06-Sep-2022 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: bittiming: can_sjw_set_default(): use Phase Seg2 / 2 as default for SJW
"The (Re-)Synchronization Jump Width (SJW) defines how far a resynchronization may move the Sample Point inside the limi
can: bittiming: can_sjw_set_default(): use Phase Seg2 / 2 as default for SJW
"The (Re-)Synchronization Jump Width (SJW) defines how far a resynchronization may move the Sample Point inside the limits defined by the Phase Buffer Segments to compensate for edge phase errors." [1]
In other words, this means that the SJW parameter controls the tolerance of the CAN controller to frequency errors compared to other CAN controllers.
If the user space does not provide an SJW parameter, the kernel chooses a default value of 1. This has proven to be a good default value for classic CAN controllers, but no longer for modern CAN-FD controllers.
In the past there were CAN controllers like the sja1000 with a rather limited range of bit timing parameters. For the standard bit rates this results in the following bit timing parameters:
| Bit timing parameters for sja1000 with 8.000000 MHz ref clock | _----+--------------=> tseg1: 1 … 16 | / / _---------=> tseg2: 1 … 8 | | | / _-----=> sjw: 1 … 4 | | | | / _-=> brp: 1 … 64 (inc: 1) | | | | | / | nominal | | | | | real Bitrt nom real SampP | Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error BTR0 BTR1 | 1000000 125 2 3 2 1 1 1000000 0.0% 75.0% 75.0% 0.0% 0x00 0x14 | 800000 125 3 4 2 1 1 800000 0.0% 80.0% 80.0% 0.0% 0x00 0x16 | 666666 125 4 4 3 1 1 666666 0.0% 80.0% 75.0% 6.2% 0x00 0x27 | 500000 125 6 7 2 1 1 500000 0.0% 87.5% 87.5% 0.0% 0x00 0x1c | 250000 250 6 7 2 1 2 250000 0.0% 87.5% 87.5% 0.0% 0x01 0x1c | 125000 500 6 7 2 1 4 125000 0.0% 87.5% 87.5% 0.0% 0x03 0x1c | 100000 625 6 7 2 1 5 100000 0.0% 87.5% 87.5% 0.0% 0x04 0x1c | 83333 750 6 7 2 1 6 83333 0.0% 87.5% 87.5% 0.0% 0x05 0x1c | 50000 1250 6 7 2 1 10 50000 0.0% 87.5% 87.5% 0.0% 0x09 0x1c | 33333 1875 6 7 2 1 15 33333 0.0% 87.5% 87.5% 0.0% 0x0e 0x1c | 20000 3125 6 7 2 1 25 20000 0.0% 87.5% 87.5% 0.0% 0x18 0x1c | 10000 6250 6 7 2 1 50 10000 0.0% 87.5% 87.5% 0.0% 0x31 0x1c
The attentive reader will notice that the SJW is 1 in most cases, while the Seg2 phase is 2. Both values are given in TQ units, which in turn is a duration in nanoseconds.
For example the 500 kbit/s configuration:
| nominal real Bitrt nom real SampP | Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error BTR0 BTR1 | 500000 125 6 7 2 1 1 500000 0.0% 87.5% 87.5% 0.0% 0x00 0x1c
the TQ is 125ns, the Phase Seg2 is "2" (== 250ns), the SJW is "1" (== 125 ns).
Looking at a more modern CAN controller like a mcp2518fd, it has wider bit timing registers.
| Bit timing parameters for mcp251xfd with 40.000000 MHz ref clock | _----+--------------=> tseg1: 2 … 256 | / / _---------=> tseg2: 1 … 128 | | | / _-----=> sjw: 1 … 128 | | | | / _-=> brp: 1 … 256 (inc: 1) | | | | | / | nominal | | | | | real Bitrt nom real SampP | Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error NBTCFG | 500000 25 34 35 10 1 1 500000 0.0% 87.5% 87.5% 0.0% 0x00440900
The TQ is 25ns, the Phase Seg 2 is "10" (== 250ns), the SJW is "1" (== 25ns).
Since the kernel chooses a default SJW of 1 regardless of the TQ, this leads to a much smaller SJW and thus much smaller tolerances to frequency errors.
To maintain the same oscillator tolerances on controllers with wide bit timing registers, select a default SJW value of Phase Seg2 / 2 unless Phase Seg 1 is less. This results in the following bit timing parameters:
| Bit timing parameters for mcp251xfd with 40.000000 MHz ref clock | _----+--------------=> tseg1: 2 … 256 | / / _---------=> tseg2: 1 … 128 | | | / _-----=> sjw: 1 … 128 | | | | / _-=> brp: 1 … 256 (inc: 1) | | | | | / | nominal | | | | | real Bitrt nom real SampP | Bitrate TQ[ns] PrS PhS1 PhS2 SJW BRP Bitrate Error SampP SampP Error NBTCFG | 500000 25 34 35 10 5 1 500000 0.0% 87.5% 87.5% 0.0% 0x00440904
The TQ is 25ns, the Phase Seg 2 is "10" (== 250ns), the SJW is "5" (== 125ns). Which is the same as on the sja1000 controller.
[1] http://web.archive.org/http://www.oertel-halle.de/files/cia99paper.pdf
Link: https://lore.kernel.org/all/20230202110854.2318594-15-mkl@pengutronix.de Cc: Mark Bath <mark@baggywrinkle.co.uk> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|
b5a3d086 | 06-Sep-2022 |
Marc Kleine-Budde <mkl@pengutronix.de> |
can: bittiming: can_sjw_check(): check that SJW is not longer than either Phase Buffer Segment
According to "The Configuration of the CAN Bit Timing" [1] the SJW "may not be longer than either Phase
can: bittiming: can_sjw_check(): check that SJW is not longer than either Phase Buffer Segment
According to "The Configuration of the CAN Bit Timing" [1] the SJW "may not be longer than either Phase Buffer Segment".
Check SJW against length of both Phase buffers. In case the SJW is greater, report an error via netlink to user space and bail out.
[1] http://web.archive.org/http://www.oertel-halle.de/files/cia99paper.pdf
Link: https://lore.kernel.org/all/20230202110854.2318594-14-mkl@pengutronix.de Suggested-by: Vincent Mailhol <vincent.mailhol@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
show more ...
|