CAn controller

Timons
Posts: 2
Joined: Thu Jan 25, 2018 1:01 pm

CAn controller

Postby Timons » Tue Mar 13, 2018 3:17 pm

The CAN controller for the ESP32 works great.
But when I send a couple of messages on a CAN bus with multiple nodes, it gives me a warning and eventually goes BUS OFF.
When I communicate with only one other node, nothing goes wrong.

When I get into BUS OFF, I restart the CAN controller and it reports no more errors, but I still can't communicate with my other node.
The settings stayed the same.

Or is there another way to restart the CAN controller?

spintec
Posts: 2
Joined: Thu Mar 15, 2018 3:27 pm

Re: CAn controller

Postby spintec » Thu Mar 15, 2018 3:48 pm

Hi,

I'm facing similar problems with bus-off.
The only workaround I have found so far is to read the TX error counter and, if it exceeds half of the maximum value, start a CPU-based bus-off recovery before the controller actually enters bus-off.
My modified CAN.c is attached. Try it and let me know if it works for you.
Attachment: CAN.c (8.62 KiB)
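
For readers without the attachment, the idea of the workaround can be sketched roughly as below. This is not the code from the attached CAN.c. The helper name and the threshold of 128 are assumptions: "half of the maximum value" is read here as half of the 256 count at which a CAN controller goes bus-off.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: "half of max" is taken as 128, half of the 256 count at
 * which a CAN controller enters bus-off. */
#define TX_ERR_EARLY_RECOVERY_THRESHOLD 128u

/* Hypothetical helper: given one reading of the TX error counter, decide
 * whether to start a software recovery before bus-off is reached. */
bool can_should_recover_early(uint8_t tx_err_count)
{
    return tx_err_count > TX_ERR_EARLY_RECOVERY_THRESHOLD;
}
```

On the ESP32 the reading would presumably come from the controller's TXERR register, and a true result would trigger whatever restart sequence the application uses.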

jcsbanks
Posts: 33
Joined: Tue Mar 28, 2017 8:03 pm

Re: CAn controller

Postby jcsbanks » Fri Apr 13, 2018 1:18 pm

Interesting. Ignore tx_error_count below, as it is unused. This is the output of a task that prints these values whenever TXERR.B (the transmit error counter) changes. It missed some samples because I had to put a vTaskDelay(1) at the end of its loop to keep the watchdog fed, but I see SR.ES change to 1 when TXERR.B reaches 96 and revert to 0 when it drops below 96.

This was in response to sending a CAN frame without another device on the bus, then connecting the other device and sending further frames.

I'm trying to work out which scenarios would produce a bus-off, since it seems that the transmission is retried. I'm also thinking about how to handle an always-on device that may sometimes have nothing else on the CAN bus, or a sudden power-down of another device on the bus.

TXERR.B 64, tx_error_count 0, SR.BS 0, SR.ES0
TXERR.B 96, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 128, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 126, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 124, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 122, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 120, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 119, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 118, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 116, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 114, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 112, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 110, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 108, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 106, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 105, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 102, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 100, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 98, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 96, tx_error_count 0, SR.BS 0, SR.ES1
TXERR.B 94, tx_error_count 0, SR.BS 0, SR.ES0
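
The thresholds visible in this trace can be summarised in a small sketch. The enum and function names are made up for illustration. The limit of 96 is what the trace shows for SR.ES, and 128 and 256 are the error-passive and bus-off limits from the CAN error confinement rules.

```c
#include <stdint.h>

/* Hypothetical names; only the numeric limits come from the trace and
 * the CAN error confinement rules. */
typedef enum {
    CAN_STATE_ERROR_ACTIVE,   /* counter below the warning limit      */
    CAN_STATE_ERROR_WARNING,  /* counter >= 96, SR.ES reads 1         */
    CAN_STATE_ERROR_PASSIVE,  /* counter >= 128                       */
    CAN_STATE_BUS_OFF         /* counter has exceeded 255             */
} can_error_state_t;

/* Takes the counter as unsigned so that the bus-off case (above 255),
 * which does not fit in the 8-bit register, can still be expressed. */
can_error_state_t can_classify_tx_errors(unsigned tx_err_count)
{
    if (tx_err_count > 255u) {
        return CAN_STATE_BUS_OFF;
    }
    if (tx_err_count >= 128u) {
        return CAN_STATE_ERROR_PASSIVE;
    }
    if (tx_err_count >= 96u) {
        return CAN_STATE_ERROR_WARNING;
    }
    return CAN_STATE_ERROR_ACTIVE;
}
```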

Markus Becker
Posts: 12
Joined: Fri Mar 02, 2018 3:24 pm

Re: CAn controller

Postby Markus Becker » Fri Apr 13, 2018 8:47 pm

Hi Timons,

First of all, the bus-off condition should not be reached under normal circumstances. It might be caused by a hardware issue such as invalid bus termination, or you might have multiple nodes trying to arbitrate the bus with the same message ID at the same time.

I'd recommend looking closely at the SJA1000 data sheet. It has a built-in error recovery procedure that meets the CAN specification. It would never recover if there were lots of dominant bits on the bus...

To be able to recover and retry under any circumstances (open bus, no termination, faulty nodes, broken cables, ...), I've used the following procedure without issues so far:

1. Wait a long time (a second).
2. Enter reset mode manually (MODULE_CAN->MOD.B.RM = 1;).
3. Uninstall the ISR (esp_intr_free(CAN_cfg.intr_handle);).
4. Wait a long time again (another second).
5. Initialise again (CAN_init();).
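
The five steps above could be wrapped roughly like this. It is only a sketch: the ops struct, can_recover() and the stub functions are hypothetical names, and the hardware calls sit behind function pointers so the call order can be tried on a PC. On the ESP32 the hooks would wrap a task delay, the MOD.B.RM write, esp_intr_free() and CAN_init(). It is assumed here that the init hook returns 0 on success.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hooks wrapping the platform calls named in the list. */
typedef struct {
    void (*delay_ms)(uint32_t ms);
    void (*enter_reset_mode)(void);
    void (*free_isr)(void);
    int  (*init)(void);            /* assumed: returns 0 on success */
} can_recovery_ops_t;

/* Runs the five steps in order and returns the result of init(). */
int can_recover(const can_recovery_ops_t *ops, uint32_t settle_ms)
{
    ops->delay_ms(settle_ms);      /* 1. wait                  */
    ops->enter_reset_mode();       /* 2. enter reset mode      */
    ops->free_isr();               /* 3. uninstall the ISR     */
    ops->delay_ms(settle_ms);      /* 4. wait again            */
    return ops->init();            /* 5. initialise again      */
}

/* Host-side stubs that only record the call order, one letter per call. */
static char call_trace[16];

static void record(char c)
{
    size_t n = strlen(call_trace);
    if (n + 1 < sizeof call_trace) {
        call_trace[n] = c;
        call_trace[n + 1] = '\0';
    }
}

static void stub_delay(uint32_t ms) { (void)ms; record('D'); }
static void stub_reset(void)        { record('R'); }
static void stub_free_isr(void)     { record('F'); }
static int  stub_init(void)         { record('I'); return 0; }

static const can_recovery_ops_t stub_ops = {
    stub_delay, stub_reset, stub_free_isr, stub_init
};
```

Calling can_recover(&stub_ops, 1000) with the stubs leaves one letter per step in call_trace, which makes it easy to check the order before wiring in the real calls.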

Best,
Markus

jcsbanks
Posts: 33
Joined: Tue Mar 28, 2017 8:03 pm

Re: CAn controller

Postby jcsbanks » Fri Apr 13, 2018 10:35 pm

Markus, I'm wondering from my trace whether the error counter simply stops increasing once the controller goes error passive at TXERR.B 128. It does not go higher, and when I reconnect the second node the message is then received, so it does not reach bus-off. I could not see from the SJA1000 manual what would happen with retransmission (except in single-shot mode). So far I am thinking of not recovering from bus-off and only reporting an error.

jcsbanks
Posts: 33
Joined: Tue Mar 28, 2017 8:03 pm

Re: CAn controller

Postby jcsbanks » Sat Apr 14, 2018 8:22 am

Kvaser explain it:

Q: What happens if a node is alone on the bus and tries to transmit?
A: The node will, of course, win the arbitration and happily proceed with the message transmission. But when the time comes for acknowledging… no node will send a dominant bit during the ACK slot, so the transmitter will sense an ACK error, send an error flag, increase its transmit error counter by 8 and start a retransmission. This will happen 16 times; then the transmitter will go error passive. By a special rule in the error confinement algorithm, the transmit error counter is not further increased if the node is error passive and the error is an ACK error. So the node will continue to transmit forever, at least until someone acknowledges the message.
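
The arithmetic in that answer can be checked with a tiny simulation. This is a sketch of the rule as quoted, not controller code, and the function name is made up. Each unacknowledged attempt adds 8 until the counter reaches the error-passive limit of 128, after which ACK errors no longer count.

```c
/* Hypothetical helper: transmit error counter of a node that is alone
 * on the bus after a given number of unacknowledged attempts. */
unsigned lone_node_tec_after(unsigned attempts)
{
    unsigned tec = 0;

    for (unsigned i = 0; i < attempts; i++) {
        if (tec < 128u) {
            tec += 8u;   /* error active: every ACK error adds 8 */
        }
        /* error passive: ACK errors leave the counter unchanged */
    }
    return tec;
}
```

With this rule the counter reaches 128 after 16 attempts and stays there, which matches the trace above topping out at TXERR.B 128.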

Markus Becker
Posts: 12
Joined: Fri Mar 02, 2018 3:24 pm

Re: CAn controller

Postby Markus Becker » Sat Apr 14, 2018 4:10 pm

Hi jcsbanks,
Exactly. In the simple case (node disconnected), the controller enters the error-passive state and keeps transmitting the frame without incrementing the TX error counter until the frame is acked by a node. In other cases, on a massively disturbed bus, bus-off will be reached.
That said, a frame that sits in the controller for a long time will likely become useless from the application's point of view. Also, the node(s) that finally ack the frame might not include the node that should receive it (it could still be rebooting).
Many networks become perfectly usable again after some time, for example once the power supply stabilizes after an overload.
This is why I think an always-on device should recover even from bus-off after some (longer) time. The CAN specification requires (and I think the SJA1000 ensures) a pause of 128 * 11 bits, but I would recommend a much longer time. Then, if the ratio of successful communications to restarts gets too low, a good CAN node could stop CAN and report "CAN ready" ;)
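
To put numbers on the 128 * 11 bit pause and on the ratio idea, here is a rough sketch. Both function names and the min_ok_per_restart parameter are made up for illustration, and the give-up policy is just one possible reading of the "ratio gets too low" suggestion.

```c
#include <stdbool.h>
#include <stdint.h>

/* 128 occurrences of 11 recessive bits, as required for bus-off recovery. */
#define CAN_BUSOFF_RECOVERY_BITS (128u * 11u)

/* Minimum bus-off pause in microseconds for a given bit rate.
 * Returns 0 for an invalid bit rate of 0. */
uint32_t can_min_busoff_recovery_us(uint32_t bitrate_bps)
{
    if (bitrate_bps == 0u) {
        return 0u;
    }
    return (uint32_t)(((uint64_t)CAN_BUSOFF_RECOVERY_BITS * 1000000u)
                      / bitrate_bps);
}

/* Hypothetical policy: stop restarting once fewer than
 * min_ok_per_restart frames succeed per restart on average. */
bool can_should_give_up(uint32_t ok_frames, uint32_t restarts,
                        uint32_t min_ok_per_restart)
{
    if (restarts == 0u) {
        return false;   /* never restarted, nothing to judge yet */
    }
    return (ok_frames / restarts) < min_ok_per_restart;
}
```

At 500 kbit/s the minimum pause works out to 1408 bit times, or about 2.8 ms, so a recovery delay of a second or more is far above what the specification demands.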

Markus

jcsbanks
Posts: 33
Joined: Tue Mar 28, 2017 8:03 pm

Re: CAn controller

Postby jcsbanks » Sat Apr 14, 2018 9:21 pm

Good advice, thanks Markus.
