OTA Update Fails on First Attempt After Custom Code Deployment, Succeeds After Reboot – Heap/Stack Issue?
Posted: Tue May 13, 2025 12:14 pm
Hello ESP RainMaker Community,
I’m facing a persistent issue with OTA updates on custom ESP32 devices using ESP RainMaker, and I’m hoping for some guidance or suggestions from the community.
Issue Summary
When deploying a new device with our custom firmware, the first OTA update attempt almost always fails.
After rebooting the device, OTA updates typically succeed on the next attempt.
We monitor heap status using logs like:
I (251986) MEM: Free heap: 70520, Largest block: 31744
We have already increased the task stack size, but the issue persists.
Our custom code allocates data on both the heap and the stack at runtime, which might be related.
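To make the two numbers in that MEM log line easier to reason about, here is a minimal sketch in plain C that runs on a host with the logged values passed in. The type and function names (heap_snapshot_t, fragmentation_pct, can_allocate) are hypothetical helpers for illustration, not ESP-IDF APIs, and the fragmentation measure is just one simple way to express the gap between free heap and largest block.

```c
#include <stdbool.h>
#include <stddef.h>

/* Snapshot of the two values printed by the MEM log line (hypothetical type). */
typedef struct {
    size_t free_bytes;     /* total free heap, in bytes */
    size_t largest_block;  /* largest contiguous free block, in bytes */
} heap_snapshot_t;

/* Percentage of the free heap that lies outside the largest block.
 * 0 means all free memory is one contiguous block. */
static unsigned fragmentation_pct(heap_snapshot_t s)
{
    if (s.free_bytes == 0 || s.largest_block > s.free_bytes) {
        return 0;
    }
    return (unsigned)(100 - (s.largest_block * 100) / s.free_bytes);
}

/* A single allocation can only succeed if one contiguous block is large
 * enough, no matter how much total heap is free. */
static bool can_allocate(heap_snapshot_t s, size_t request)
{
    return request <= s.largest_block;
}
```

With the values from the log line (70520 free, 31744 largest), fragmentation_pct gives 55, and a 16749-byte request would still fit at that moment. If that reasoning holds, the largest block must have dropped below 16749 bytes by the time the allocation failed, so logging the largest block right at the failure is probably more telling than the total free heap.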
Error Logs and Behavior
Here’s what we observe during the failed OTA attempt:
OTA starts, firmware download is initiated.
Heap size appears stable and sufficient before and during OTA.
The OTA process fails with errors such as:
E (335568) Dynamic Impl: alloc(16749 bytes) failed
E (335568) esp-tls-mbedtls: read error :-0x7F00:
E (335568) transport_base: esp_tls_conn_read error, errno=Success
E (335578) HTTP_CLIENT: transport_read: error - -1 | ESP_FAIL
E (335588) esp-tls-mbedtls: read error :-0x7200:
E (335588) transport_base: esp_tls_conn_read error, errno=Success
E (335598) HTTP_CLIENT: transport_read: error - -1 | ESP_FAIL
E (335608) esp_https_ota: data read -1, errno 0
E (335608) esp_rmaker_ota: ESP_HTTPS_OTA upgrade failed ESP_FAIL
I (335618) esp_rmaker_ota: Reporting failed: OTA failed: Error ESP_FAIL
I (335618) esp_rmaker_mqtt: (D)CONFIG_ESP_RMAKER_MQTT_USE_BASIC_INGEST_TOPICS
and
E (291656) esp-tls-mbedtls: read error :-0x7F00:
E (291736) esp_rmaker_ota: ESP_HTTPS_OTA upgrade failed ESP_FAIL
E (292046) esp_image: Checksum failed. Calculated 0x5e read 0xf7
E (292056) esp_rmaker_ota: Image validation failed, image is corrupted
The device reports OTA failure and does not update.
After a manual reboot, the next OTA attempt usually works without any issues.
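For reference, the two mbedTLS codes in the logs above can be decoded by hand. This is a minimal sketch, assuming the error values defined in mbedTLS's ssl.h header. The function name tls_error_name is made up for illustration and is not part of ESP-IDF or mbedTLS.

```c
/* Map the mbedTLS SSL error codes seen in the logs to their symbolic names.
 * Values are assumed from mbedTLS's ssl.h; only the two codes that appear
 * in this post are covered. */
static const char *tls_error_name(int code)
{
    switch (code) {
    case -0x7F00:
        return "MBEDTLS_ERR_SSL_ALLOC_FAILED";   /* memory allocation failed */
    case -0x7200:
        return "MBEDTLS_ERR_SSL_INVALID_RECORD"; /* invalid SSL record received */
    default:
        return "unknown";
    }
}
```

If this mapping is right, the first failure (-0x7F00) is an out-of-memory condition inside the TLS layer, which matches the "alloc(16749 bytes) failed" line just before it. The -0x7200 that follows could then be a knock-on effect of the interrupted read rather than a separate problem.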
What We've Tried
Increased task stack size for OTA and main tasks.
Verified that heap remains stable and not fragmented during OTA.
Ensured the OTA binary is built correctly with an incremented version and matching project name.
Used the recommended API: esp_rmaker_ota_enable_default() for enabling OTA.
Suspected Cause
It appears that something in our custom code (possibly heap/stack usage or memory fragmentation) affects the OTA process after initial device provisioning. After a reboot, the memory is cleaned up, and OTA works as expected. This suggests some initialization or resource allocation issue that only resolves after a reboot.
Any insights, suggestions, or references to similar issues would be greatly appreciated!
Thank you,