ESP32-PICO-D4 Chip revision and pSRAM limitations

PaulFreund
Posts: 45
Joined: Wed Nov 15, 2017 9:07 pm

Postby PaulFreund » Mon Nov 09, 2020 12:35 pm

I have a project that recently migrated from ESP32-WROOM to ESP32-PICO-D4 with external pSRAM, and since then I have been seeing sporadic random crashes without meaningful backtraces. Because of that I investigated a little and came up with some questions.

1. Which "ECO" revision is the PICO-D4? On the release/v4.1 branch the application ends up in a boot loop if I set the minimum ESP32 chip revision to anything above 1, and the esp_chip_info() function also returns a revision of 1. Is this correct?

2. If I disable WiFi and LWIP memory in external pSRAM, the crash frequency is reduced significantly. Is this to be expected currently?

3. My application allocates external memory only explicitly with heap_caps_malloc(), and most execution happens in a single FreeRTOS task on Core 0. I do not touch the external memory from ISRs. The only "problematic" use I can see is that I have one FreeRTOS task on Core 1 that also works with memory allocated in external pSRAM (for slow processing and data transfer out of the system). Are there any known bugs in this regard?

4. As written in this ticket https://github.com/espressif/esp-idf/issues/6093, with release v4.2 and above I get boot loops without the application starting up, even if external pSRAM is disabled. Are there any known changes/limitations/bugs in that regard that I overlooked?
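
For reference, the minimum chip revision and the WiFi/LWIP-in-pSRAM settings mentioned above should correspond to sdkconfig entries roughly like the following. This is only a sketch: the option names are the ones I understand ESP-IDF v4.x to use and may differ in other releases, so they are best checked in menuconfig.

```
# Minimum supported chip revision set to 1 (assumed v4.x option names)
CONFIG_ESP32_REV_MIN_1=y
CONFIG_ESP32_REV_MIN=1

# Do not try to place WiFi and LWIP buffers in external pSRAM
# CONFIG_SPIRAM_TRY_ALLOCATE_WIFI_LWIP is not set
```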

Best regards,

Paul

ESP_Sprite
Posts: 4108
Joined: Thu Nov 26, 2015 4:08 am

Postby ESP_Sprite » Tue Nov 10, 2020 9:30 am

There's both a V1 and a V3 for the Pico; if esp_chip_info returns 1, your version is 1. The expected behaviour wrt crashes on the PicoD4 with psram is that it shouldn't crash at all (given proper code and otherwise proper hardware). Can't say too much about your situation otherwise.

Postby PaulFreund » Tue Nov 10, 2020 9:44 am

Hi Sprite,

thank you very much for your answers. Since, as you say, there are different revisions of the Pico and I have revision 1, I expect I have to enable SPIRAM_CACHE_WORKAROUND in the sdkconfig, correct? My chips were purchased from Espressif very recently. I expect I can leave this option enabled even if I get revision 3 chips later.
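
In sdkconfig terms, I assume this comes down to something like the fragment below. The option names are what I believe ESP-IDF v4.x uses for the ESP32 (they may be named differently in other releases), so treat it as a sketch rather than a verified configuration.

```
# External pSRAM enabled, with the cache workaround for older chip revisions
CONFIG_ESP32_SPIRAM_SUPPORT=y
CONFIG_SPIRAM_CACHE_WORKAROUND=y
```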

Postby PaulFreund » Tue Nov 10, 2020 11:35 am

In addition, this is in the v4.2-beta release notes:

"The esp-2020r2 toolchain has been added in this release which includes some fixes for C++ exceptions, a fix for -O2 optimization of volatile word accesses, and an improved PSRAM fix. Extended stress testing of Wi-Fi & BT coexistence with heavy PSRAM use show there is a difficult to reproduce issue with the updated PSRAM fix. It is possible to roll back to the esp-2020r1 toolchain release by running git revert --no-edit 298f23c9 in the ESP-IDF directory, then re-run the install.sh/install.bat/install.ps1 and . export.sh/export.bat/export.ps1 steps as described in the Get Started guide. This issue will be fixed in the v4.2 final release."

which is part of why I expected there could be problems.

rudi ;-)
Posts: 1556
Joined: Fri Nov 13, 2015 3:25 pm

Postby rudi ;-) » Thu Nov 12, 2020 3:13 pm

Hi Paul

are you using the Pico SoC in your own design or in a ready-made kit with the SoC?
If it is a ready-made kit, which version is printed on the back side (for example, for the Pico Kit from Espressif: ESP32-PICO-KIT_V4.0 or ESP32-PICO-KIT_V4.1)? If it is your own design, can you share how you set up and connected the SoC to the psram?

cheers
-------------------------------------
love it, change it or leave it.
-------------------------------------
Greetings flying out to friends all over the world, Rudi

Postby PaulFreund » Thu Nov 12, 2020 3:39 pm

Hi rudi,

thanks for following up. This is our own design and unfortunately I cannot easily share the schematics without an NDA. What I can say is that since I upgraded to the current master and enabled SPIRAM_CACHE_WORKAROUND it seems to be more stable. In general the application (which has now been ported to the PICO-D4) runs for months or even years without a reboot, so something must have "changed". I thought it might be the pSRAM, but it could also be the switch to netif, the chip, etc.

Currently the application runs for multiple hours, sometimes up to a day, before the board reboots, and this happens in different environments. Unfortunately I had not been able to get my serial-connected board to crash until now, and the result puzzles me a little. Every minute I record the boot count and the last reset reason in InfluxDB + Grafana, so I can monitor this quite well. What has puzzled me so far is that the reset reason was very often ESP_RST_WDT, sometimes ESP_RST_TASK_WDT even with task watchdogs disabled, and sometimes the power reset.
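
The per-minute record described above can be sketched roughly as follows. This is not the actual application code: `reset_reason_name` and `format_boot_record` are hypothetical helpers, and the enum in the `#else` branch is only a host-side stand-in that mirrors the order of `esp_reset_reason_t` in `esp_system.h` so the snippet also compiles off-target.

```c
#include <stdio.h>

#ifdef ESP_PLATFORM
#include "esp_system.h" /* real esp_reset_reason_t and esp_reset_reason() */
#else
/* Host-side stand-in (assumption): same enumerator order as esp_system.h. */
typedef enum {
    ESP_RST_UNKNOWN,
    ESP_RST_POWERON,
    ESP_RST_EXT,
    ESP_RST_SW,
    ESP_RST_PANIC,
    ESP_RST_INT_WDT,
    ESP_RST_TASK_WDT,
    ESP_RST_WDT,
    ESP_RST_DEEPSLEEP,
    ESP_RST_BROWNOUT,
    ESP_RST_SDIO,
} esp_reset_reason_t;
#endif

/* Hypothetical helper: map a reset reason to a short, loggable name. */
const char *reset_reason_name(esp_reset_reason_t reason)
{
    switch (reason) {
    case ESP_RST_POWERON:   return "POWERON";
    case ESP_RST_EXT:       return "EXT";
    case ESP_RST_SW:        return "SW";
    case ESP_RST_PANIC:     return "PANIC";
    case ESP_RST_INT_WDT:   return "INT_WDT";
    case ESP_RST_TASK_WDT:  return "TASK_WDT";
    case ESP_RST_WDT:       return "WDT";
    case ESP_RST_DEEPSLEEP: return "DEEPSLEEP";
    case ESP_RST_BROWNOUT:  return "BROWNOUT";
    case ESP_RST_SDIO:      return "SDIO";
    default:                return "UNKNOWN";
    }
}

/* Hypothetical helper: format one record for a time-series log.
 * Returns the snprintf() result (characters that would be written). */
int format_boot_record(char *buf, size_t len, unsigned boot_count,
                       esp_reset_reason_t reason)
{
    return snprintf(buf, len, "boot_count=%u reset_reason=%s",
                    boot_count, reset_reason_name(reason));
}
```

On the target, the reason would come from `esp_reset_reason()` once after boot, e.g. `format_boot_record(buf, sizeof buf, boot_count, esp_reset_reason())`, where `boot_count` is whatever persistent counter the application keeps.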

What I found out just now is that there seems to be a stack overflow in a task on CPU1 which uses the esp http client if the WiFi disconnects (I unplugged the router) and then the ESP bootloops with the following output:
W (23856231) wifi:<ba-del>idx
W (23856231) DriverESP32Wifi: STA disconnected (200)
E (23856241) TRANS_TCP: [sock=50] delayed connect error: Software caused connection abort

***ERROR*** A stack overflow in task LogWriterTask has been detected.

Backtrace:0x4008bac9:0x3fff2050 0x4008c1a9:0x3fff2070 0x4008eca9:0x3fff2090 0x4008cea6:0x3fff2110 0x4008e160:0x3fff2150 0x4008e112:0x4000bff0 |<-CORRUPTED


ELF file SHA256: 43401c931bee235b

Rebooting...
Re-enable cpu cache.

***ERROR*** A stack overflow in task LogWriterTask has been detected.

Backtrace:0x4011ef47:0x3fff1e40 0x4008c0e9:0x3fff1e80 0x4008bc93:0x3fff1ea0 0x4008bdf9:0x3fff1f20 0x4008c0d9:0x3fff1f70 0x40081d6a:0x3fff1f90 0x4008bac9:0x3fff2050 0x4008c1a9:0x3fff2070 0x4008eca9:0x3fff2090 0x4008cea6:0x3fff2110 0x4008e160:0x3fff2150 0x4008e112:0x4000bff0 |<-CORRUPTED


ELF file SHA256: 43401c931bee235b

Rebooting...
until the router had booted up again and the WiFi was able to reconnect. This means the "reboot" did not stop the task on CPU1 and also did not stop the WiFi core. Is this documented somewhere, or is it not intended at all? If the devices sporadically lose WiFi and then reboot, it could explain the whole set of problems.
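
To get from backtrace lines like the ones above to function names, the PC:SP pairs first have to be pulled out of the log. Below is a minimal host-side sketch in plain C; `bt_frame_t` and `parse_backtrace` are hypothetical names and not part of ESP-IDF.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One PC:SP pair from a panic backtrace line. */
typedef struct {
    unsigned long pc; /* program counter */
    unsigned long sp; /* stack pointer */
} bt_frame_t;

/*
 * Parse a line such as
 *   "Backtrace:0x4008bac9:0x3fff2050 0x4008c1a9:0x3fff2070 |<-CORRUPTED"
 * into frames. Parsing stops at the first token that is not a PC:SP pair
 * (for example the "|<-CORRUPTED" marker) or when max_frames is reached.
 * Returns the number of frames stored, or 0 if the line does not start
 * with "Backtrace:".
 */
size_t parse_backtrace(const char *line, bt_frame_t *frames, size_t max_frames)
{
    static const char prefix[] = "Backtrace:";
    size_t count = 0;

    if (strncmp(line, prefix, sizeof(prefix) - 1) != 0) {
        return 0;
    }
    const char *p = line + sizeof(prefix) - 1;

    while (count < max_frames) {
        char *end;

        while (*p == ' ') {
            p++; /* skip separators between pairs */
        }
        unsigned long pc = strtoul(p, &end, 16); /* base 16 accepts "0x" */
        if (end == p || *end != ':') {
            break; /* not a PC:SP pair */
        }
        p = end + 1;
        unsigned long sp = strtoul(p, &end, 16);
        if (end == p) {
            break; /* PC without a stack pointer */
        }
        frames[count].pc = pc;
        frames[count].sp = sp;
        count++;
        p = end;
    }
    return count;
}
```

The extracted `pc` values can then be passed to `xtensa-esp32-elf-addr2line -pfiaC -e <app>.elf`, using the ELF file that matches the SHA256 printed in the log. Since the tail of the trace is flagged as corrupted, the last frames should be read with some caution.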

Postby rudi ;-) » Thu Nov 12, 2020 4:26 pm

Hi Paul

you are welcome. I will come back to this a little later today (I'm still on my way home) and edit this post; I have read your post. In the meantime, can you check which Pico SoC version you see this "error" with (V0, V1, V3)? (You wrote "I have a version 1", OK.)
Also, can you share which psram you use? Is your psram connected to the Pico flash with a shared CLK and only an extra CS pin, or does your design use the preview variant, in which the psram has its own CLK and CS pins and shares only the data pins with the flash?
Which mode do you run with the data pins shared with the flash, and is the clock in menuconfig set to 40 or 80 MHz?
For the version check, if you use esptool, please use the latest release version (3.0). One further thing: how big is your psram, 16 Mbit (2 MB), 32 Mbit (4 MB) or 64 Mbit (8 MB), and which size did you set in menuconfig (auto, the exact size you use, or "2MB")?
Can you also check in menuconfig that the flash size is set to the right size (4 MB)?

One further thing:
Have you moved LWIP parts to psram via malloc in your code?
How did you set up the task for it (how much memory did you give your task)?

cheers

Postby PaulFreund » Thu Nov 12, 2020 5:30 pm

Hi rudi, thank you very much for your tireless support :D

rudi ;-) wrote: In the meantime, can you check which Pico SoC version you see this "error" with (V0, V1, V3)? (You wrote "I have a version 1", OK.)
I think we ordered a "sample quantity" directly from Espressif, which might explain why we got revision 1.

rudi ;-) wrote: Also, can you share which psram you use? Is your psram connected to the Pico flash with a shared CLK and only an extra CS pin, or does your design use the preview variant, in which the psram has its own CLK and CS pins and shares only the data pins with the flash?
We ordered the ESP-PSRAM64H directly from Espressif and it also has the logo on it. The connection is:

CE - SD3 (pin 29)
SO/SIO[1] - IO17 (pin 27)
SIO[2] - SD0 (pin 32)
VSS - GND
VDD - 3V3
SIO[3] - CMD (pin 30)
SCLK - CLK (pin 31)
SO/SIO[0] - SD1 (pin 33)

rudi ;-) wrote: Which mode do you run with the data pins shared with the flash, and is the clock in menuconfig set to 40 or 80 MHz?
The current pSRAM config is 40 MHz, as there is no other configuration available.

rudi ;-) wrote: For the version check, if you use esptool, please use the latest release version (3.0). One further thing: how big is your psram, 16 Mbit (2 MB), 32 Mbit (4 MB) or 64 Mbit (8 MB), and which size did you set in menuconfig (auto, the exact size you use, or "2MB")? Can you also check in menuconfig that the flash size is set to the right size (4 MB)?
The pSRAM size is set to auto, and the flash size is set via CONFIG_ESPTOOLPY_FLASHSIZE_4MB=y.

rudi ;-) wrote: Have you moved LWIP parts to psram via malloc in your code? How did you set up the task for it (how much memory did you give your task)?
I tested both and it happens in both situations. I left the allocation size at the default in the sdkconfig.

In general I am no longer sure that the pSRAM is the cause (see my last reply). I am also testing a version with pSRAM disabled right now and am waiting to see whether it reboots. At the moment I am more interested in how the CPU can reset without WiFi/CPU1 being properly reset.

Postby PaulFreund » Fri Nov 13, 2020 7:10 am

After flashing a firmware with pSRAM completely disabled in menuconfig and all external mallocs replaced with internal ones, I could see that two out of four boards rebooted again overnight, with ESP_RST_WDT as the reset reason. So I assume that the pSRAM is not at fault.

Postby rudi ;-) » Fri Nov 13, 2020 9:30 am

Hi Paul

did you get a stack overflow in task LogWriterTask again, or was it a different one?
Can you check the "unconnect func" (the disconnect handling): what happens to the socket pointer (is it freed?), and how do you then create a new one (is there enough memory?)? Do you check that the socket handle (pointer) is valid before using it?

W (23856231) wifi:<ba-del>idx
W (23856231) DriverESP32Wifi: STA disconnected (200)
E (23856241) TRANS_TCP: [sock=50] delayed connect error: Software caused connection abort

***ERROR*** A stack overflow in task LogWriterTask has been detected.

Is the psram still wired up on this module, or did you also remove it physically after disabling psram in the firmware and switching to internal malloc?

best wishes
rudi ;-)
