Patterns for heap corruption resolution ...

Postby kolban » Fri Nov 24, 2017 5:20 am

Imagine I am suffering from memory corruption issues. Using the latest ESP-IDF, I switch on "Comprehensive" heap memory debugging.

I then inject a call to:

heap_caps_check_integrity_all()

This detects a problem ... and reports:

CORRUPT HEAP: Invalid data at 0x3ffd0a90. Expected 0xfefefefe got 0xfefefefd

My interpretation is that memory I previously allocated and then freed is still being written to by the code. So what do I know? I know the address where the invalid data was detected: in this example, 0x3ffd0a90. My puzzle now is to go deeper and work out what this memory was used for when it was allocated, and when and by whom it was freed. And this is where the puzzles start.

I understand that I can perform a heap trace and record allocations and frees. Since I'm not looking for a leak (allocations without frees) but for an allocation followed by a free followed by continued use, I have to trace all the records. If I log all the records (there are thousands of them) and then search for my address, I don't find it. Why? I suspect that if I allocate 100 bytes, free that storage, and then something "tinkers" with byte 20 within that range, the address reported as corrupt will not match the start address of any previous allocate/free record.
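To make that suspicion concrete, here is a small host-side sketch of the effect. It is not ESP-IDF's implementation. It only assumes that comprehensive mode fills freed memory with 0xfefefefe words and later checks that the fill is intact. The helper names first_corrupt_offset and demo_stale_write are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill word that comprehensive heap debugging writes into freed memory. */
#define FREE_FILL_WORD 0xfefefefeu

/* Hypothetical helper: returns the byte offset of the first word that no
 * longer holds the fill pattern, or -1 if the region is still intact. */
static long first_corrupt_offset(const uint32_t *region, size_t n_words)
{
    for (size_t i = 0; i < n_words; i++) {
        if (region[i] != FREE_FILL_WORD) {
            return (long)(i * sizeof(uint32_t));
        }
    }
    return -1;
}

/* Simulates a freed 100-byte block in which a stale pointer writes byte 20. */
static long demo_stale_write(void)
{
    uint32_t block[25];                      /* 25 words = 100 bytes */
    for (size_t i = 0; i < 25; i++) {
        block[i] = FREE_FILL_WORD;           /* state left behind by the free */
    }
    ((uint8_t *)block)[20] = 0xfd;           /* the use-after-free write */
    return first_corrupt_offset(block, 25);
}
```

The offset that comes back points into the middle of the block, which is why searching the trace for an exact address match finds nothing.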

Since an allocation record holds the start address and size, I should be able to find my target address (x) inside one of the records such that

record_start <= x < record_start + record_size
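A minimal sketch of that range test in C follows. The trace_rec_t struct and find_owner function are hypothetical stand-ins: I am assuming each captured trace record can be reduced to a start address and a size, and that the array is in chronological order. The upper bound is treated as exclusive, since the last byte of a block sits at record_start + record_size - 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of one captured trace record. */
typedef struct {
    uintptr_t address;   /* start of the allocation */
    size_t    size;      /* requested size in bytes */
} trace_rec_t;

/* Returns the most recent record whose range contains x, or NULL if none.
 * Assumes recs[] is ordered oldest to newest, so the last match wins. */
static const trace_rec_t *find_owner(const trace_rec_t *recs, size_t n,
                                     uintptr_t x)
{
    const trace_rec_t *hit = NULL;
    for (size_t i = 0; i < n; i++) {
        /* x - address < size avoids overflow in address + size */
        if (x >= recs[i].address && x - recs[i].address < recs[i].size) {
            hit = &recs[i];
        }
    }
    return hit;
}
```

Keeping the last match rather than the first matters because the same address range is likely to be handed out many times over the life of the program, and the most recent owner is the one of interest.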

Since there are thousands of records, I can't do this by eyeball and instead have to code something to do it. And therein lies the next puzzle: when I call heap_caps_check_integrity_all() I get back a boolean indicating whether integrity is good or bad. If it is bad, the failing address is written to the console. That doesn't help me, as I need the failing address in my app so that I can perform arithmetic against my array of captured trace records.
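One possible workaround, sketched here as an assumption rather than a recommended technique, is to capture the console output and pull the address out of the message text on the host side. The function parse_corrupt_addr is hypothetical and relies on the message having exactly the wording shown in the log line above. It is not an ESP-IDF API.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: extracts the address from a captured
 * "CORRUPT HEAP: Invalid data at 0x..." console line.
 * Returns 1 and stores the address on success, 0 if the line does not match. */
static int parse_corrupt_addr(const char *line, uint32_t *addr_out)
{
    unsigned int addr = 0;
    if (sscanf(line, "CORRUPT HEAP: Invalid data at 0x%x", &addr) == 1) {
        *addr_out = (uint32_t)addr;
        return 1;
    }
    return 0;
}
```

The parsed value can then be compared against the logged trace records offline, which sidesteps the need to get the address back inside the running app.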

All this is starting to feel complex and strange, which makes me think I am not properly understanding the whole story. Is there a better technique that I should be using? If we find that a piece of heap has become corrupted and we suspect that it was previously allocated and released but is still being used, how can I work backwards to find out "To whom was this storage previously allocated, so that I can check whether that owner recognizes it was released?"
Free book on ESP32 available here: https://leanpub.com/kolban-ESP32
