Platform:
ESP32-WROOM-32E-N8 (8MB flash, ESP32-D0WD-V3)
Arduino-ESP32 3.1.3 via PlatformIO (pioarduino 53.03.13), ESP-IDF 5.3.2
Commercial hotel room automation controller
Project Context:
This is commercial firmware running on an ESP32-WROOM-32E with 8 MB flash. We use almost every peripheral: MCP23017 I2C GPIO expander, RS485 Modbus RTU (up to 128 slave devices), TLS MQTT over WiFi, BLE (NimBLE 2.5), WS2812 LEDs, Modbus panel controllers, DAC (MCP4728), LittleFS on a custom partition, and a dedicated log-structured flash partition for offline light state storage. The device runs dual-core FreeRTOS: Core 1 handles Modbus polling, Core 0 handles WiFi/MQTT/BLE. The units are deployed in hotel rooms, so a permanent boot loop (no remote recovery, physical power cycle required) is a production-critical issue.
ESP32 Permanent Boot Loop (TG1WDT_SYS_RESET → invalid header 0xFFFFFFFF) — Root Cause & Fix for Flash Sector Erase
Attachments:
- device-monitor-260402-100700.log.txt (277.7 KiB)
Re: ESP32 Permanent Boot Loop (TG1WDT_SYS_RESET → invalid header 0xFFFFFFFF) — Root Cause & Fix for Flash Sector Erase
Hard to say from the logs alone. Does the thing recover after you power-cycle it?
Re: ESP32 Permanent Boot Loop (TG1WDT_SYS_RESET → invalid header 0xFFFFFFFF) — Root Cause & Fix for Flash Sector Erase
Yes, after powering OFF and ON again, this issue is completely resolved, and the ESP32 controller runs perfectly until the problem occurs again.
However, it is not resolved by pressing the EN (reset) button. When the issue occurs and I press the EN button, the logs change and look like this:
Please find attached logs_samples.txt.
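For anyone comparing captures like these, a small host-side scan can separate the two log shapes: watchdog resets versus the flash header reading back as all ones. The following is only a sketch, not the poster's tooling. The names `BootLogSummary`, `hexAfter`, `summarizeBootLog` and `looksLikeUnreadableFlash` are hypothetical, and the line formats `rst:0x… (NAME)` and `invalid header: 0x…` are assumed to match the usual ESP32 ROM bootloader output, since the attached logs are not reproduced in the thread. A header of 0xffffffff is what erased flash reads back as, and typically also what a flash chip that is not responding reads back as.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Counts of interesting lines in one serial-monitor capture.
// (Hypothetical helper type for this sketch.)
struct BootLogSummary {
    int resetLines = 0;          // lines containing "rst:0x"
    int tg1wdtResets = 0;        // reset lines that name TG1WDT_SYS_RESET
    int invalidHeaderLines = 0;  // lines containing "invalid header: "
    int erasedHeaderLines = 0;   // invalid-header lines whose value is 0xffffffff
};

// Parses the hex number that follows `key` in `line`.
// Returns false if `key` is absent or no number follows it.
bool hexAfter(const std::string& line, const std::string& key,
              unsigned long long& out) {
    const std::string::size_type pos = line.find(key);
    if (pos == std::string::npos) return false;
    try {
        // Base 16 accepts an optional "0x" prefix.
        out = std::stoull(line.substr(pos + key.size()), nullptr, 16);
        return true;
    } catch (const std::logic_error&) {  // invalid_argument or out_of_range
        return false;
    }
}

// Scans a whole capture line by line and tallies the markers above.
BootLogSummary summarizeBootLog(const std::string& log) {
    BootLogSummary s;
    std::istringstream in(log);
    std::string line;
    while (std::getline(in, line)) {
        if (line.find("rst:0x") != std::string::npos) {
            ++s.resetLines;
            if (line.find("TG1WDT_SYS_RESET") != std::string::npos) {
                ++s.tg1wdtResets;
            }
        }
        unsigned long long header = 0;
        if (hexAfter(line, "invalid header: ", header)) {
            ++s.invalidHeaderLines;
            if (header == 0xFFFFFFFFull) ++s.erasedHeaderLines;
        }
    }
    return s;
}

// Heuristic (an assumption of this sketch): every invalid header seen is
// all ones, i.e. the bootloader never read real data from flash.
bool looksLikeUnreadableFlash(const BootLogSummary& s) {
    return s.erasedHeaderLines > 0 &&
           s.erasedHeaderLines == s.invalidHeaderLines;
}

// Usage on a hand-written excerpt (illustrative, not copied from the attachments).
inline void demoStuckCapture() {
    const BootLogSummary s = summarizeBootLog(
        "rst:0x8 (TG1WDT_SYS_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)\n"
        "invalid header: 0xffffffff\n");
    assert(looksLikeUnreadableFlash(s));
}
```

Running this over each board's capture gives a quick count of how often the all-ones header follows a watchdog reset, which is easier to compare across boards than reading the raw logs.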
Attachments:
- logs_samples.txt (2.5 KiB)
Re: ESP32 Permanent Boot Loop (TG1WDT_SYS_RESET → invalid header 0xFFFFFFFF) — Root Cause & Fix for Flash Sector Erase
Feels like the flash got in some weird state; pulling EN resets the ESP32 fully (to the point that it's the same as powering it down and up again) but not the flash.
Are you sure that with so much going on externally, you're not seeing some EMF effect getting into the WROOM module? I can imagine e.g. long signal lines picking something up, or even a button or connector connected to the ESP being zapped by a statically charged human. Given the fact that the flash is entirely out to lunch, but recovers after a powerdown, that would be my guess.
Re: ESP32 Permanent Boot Loop (TG1WDT_SYS_RESET → invalid header 0xFFFFFFFF) — Root Cause & Fix for Flash Sector Erase
Our hardware has successfully passed EMC/EMF testing conducted by a certified European laboratory, and this issue was not observed during those tests. During those tests all available interfaces were activated in parallel.
At present, we are testing 17 boards running the same firmware. The issue occurs randomly: sometimes there are 4 days without any crash, and then about 1 crash per day, on different boards. The time of the crash is also random: on some controllers it happens 5 minutes after start, on others after 2 weeks without a reboot. No external devices (such as Modbus interfaces or relay shields) are connected during testing, although the connectors are present on the board. Despite this minimal setup, the issue still occurs, including overnight when no one is near the boards.
The boot-stuck issue occurs while the ESP32 is otherwise idle, i.e. MQTT (TLS 1.2) is connected and Modbus polling is running, but no external device is connected.
We are also using a PSRAM-enabled variant: ESP32-WROOM-32E-N8R2 (8MB flash, ESP32-D0WDR2-V3), and we are observing a similar issue on this module as well.
We have already monitored & checked all strapping pins and power supply and haven't found any concerning values.
Additionally, we have observed that after pulling the EN pin low (reset), the boot logs appear after some time, and the device eventually boots successfully. This is our latest observation.
We have attached a snapshot/document capturing the controller failure logs for your review. Kindly check.
Attachments:
- Screenshot 2026-04-23 175349.png (90.83 KiB)
