RMT state machine problems

DrMickeyLauer
Posts: 209
Joined: Sun May 22, 2022 2:42 pm

RMT state machine problems

Postby DrMickeyLauer » Wed Jun 25, 2025 9:09 am

I'm using the RMT to send and receive the SAE J2716 SENT protocol, which works great, but I'm getting spurious errors:

Code:

rmt: rmt_receive(411): channel not in enable state

when calling rmt_receive, although I'm 100% sure I'm not calling rmt_disable anywhere. It seems to happen when I have uxTaskGetSystemState running. If I disable my SysInfo, I never get this error. That is surprising, because I call SysInfo from the task with the lowest priority.

Looking at the RMT driver's state machine made me wonder whether I'm allowed to call rmt_receive from ISR context at all. Here's my ISR for reference:

Code:

IRAM_ATTR bool RMTReceiverTask::rxDoneCallback(rmt_channel_handle_t channel, const rmt_rx_done_event_data_t* edata, void* user_data) {
    auto* self = static_cast<RMTReceiverTask*>(user_data);
    portBASE_TYPE higherPriorityTaskWoken = pdFALSE;

    bool isFull = false;

    for (auto i = 0; i < edata->num_symbols; i++) {
        if (!isFull) {
            if (!self->symbol_queue.push_ISR(edata->received_symbols[i], higherPriorityTaskWoken)) {
                self->dropped_symbols_count += (edata->num_symbols - i);
                isFull = true;
            }
        }
        if (edata->received_symbols[i].duration0 == 0 || edata->received_symbols[i].duration1 == 0) {
            self->startReceive();
        }
    }
    if (higherPriorityTaskWoken) {
        portYIELD_FROM_ISR();
    }
    return true;
}

Since I do not want to push symbols with a duration of 0 into the queue (they are irrelevant for SENT, as they only reflect SYNC or PAUSE nibbles), I have to restart receiving afterwards, because a 0-duration symbol stops the RMT.

Is this a potential problem, or do you have an idea what else could be wrong? Note that this approach works fine for hours as long as the SysInfo code is not running in parallel. For reference, here's what I'm doing in the low-priority (1) task:

Code:

esp_err_t SystemInfo::updateStats(TickType_t probeIntervalInMs) {
#ifdef CONFIG_FREERTOS_GENERATE_RUN_TIME_STATS
    TaskStatus_t *start_array = NULL, *end_array = NULL;
    UBaseType_t start_array_size, end_array_size;
    uint32_t start_run_time, end_run_time;

    // Allocate array to store current task states
    start_array_size = uxTaskGetNumberOfTasks() + ARRAY_SIZE_OFFSET;
    start_array = static_cast<TaskStatus_t*>(malloc(sizeof(TaskStatus_t) * start_array_size));
    if (start_array == NULL) { return ESP_ERR_NO_MEM; }

    // Get current task states
    start_array_size = uxTaskGetSystemState(start_array, start_array_size, &start_run_time);
    if (start_array_size == 0) {
        free(start_array);
        return ESP_ERR_INVALID_SIZE;
    }

    vTaskDelay(pdMS_TO_TICKS(probeIntervalInMs));

    // Allocate array to store tasks states post delay
    end_array_size = uxTaskGetNumberOfTasks() + ARRAY_SIZE_OFFSET;
    end_array = static_cast<TaskStatus_t*>(malloc(sizeof(TaskStatus_t) * end_array_size));
    if (end_array == NULL) {
        free(start_array);
        return ESP_ERR_NO_MEM;
    }
    // Get post delay task states
    end_array_size = uxTaskGetSystemState(end_array, end_array_size, &end_run_time);
    if (end_array_size == 0) {
        free(start_array);
        free(end_array);
        return ESP_ERR_INVALID_SIZE;
    }
    // Calculate total_elapsed_time in units of run time stats clock period.
    uint32_t total_elapsed_time = (end_run_time - start_run_time);
    if (total_elapsed_time == 0) {
        free(start_array);
        free(end_array);
        return ESP_ERR_INVALID_STATE;
    }
#ifdef CONFIG_SYSINFO_OUTPUT
    ESP_LOGD(LOG, "+----------------------+-----+-----------+-------+-------+");
    ESP_LOGD(LOG, "| %-20s | %-3s | %-9s | %-5s | %-5s |", "Task", "PRI", "Run Time", "Load", "Stack");
    ESP_LOGD(LOG, "+----------------------+-----+-----------+-------+-------+");
#endif

    uint32_t accumulatedIdlePercentage = 0;

    // Match each task in start_array to those in the end_array
    for (int i = 0; i < start_array_size; i++) {
        int k = -1;
        for (int j = 0; j < end_array_size; j++) {
            if (start_array[i].xHandle == end_array[j].xHandle) {
                k = j;
                // Mark both entries as matched by overwriting their handles
                start_array[i].xHandle = NULL;
                end_array[j].xHandle = NULL;
                break;
            }
        }
        // Check if matching task found
        if (k >= 0) {
            uint32_t task_elapsed_time = end_array[k].ulRunTimeCounter - start_array[i].ulRunTimeCounter;
            uint32_t percentage_time = (task_elapsed_time * 100UL) / (total_elapsed_time * portNUM_PROCESSORS);
#ifdef CONFIG_SYSINFO_OUTPUT
            ESP_LOGD(LOG, "| %-20s | %3" PRIu8 " | %9" PRIu32 " | %4" PRIu32 "%% | %5" PRIu32 " |",
                start_array[i].pcTaskName,
                start_array[i].uxCurrentPriority,
                task_elapsed_time,
                percentage_time,
                start_array[i].usStackHighWaterMark
            );
#endif
            // Store "IDLE" load
            if (strncmp(start_array[i].pcTaskName, "IDLE", 4) == 0) {
                accumulatedIdlePercentage += percentage_time;
            }
        }
    }

#ifdef CONFIG_SYSINFO_OUTPUT
    // Print unmatched tasks
    for (int i = 0; i < start_array_size; i++) {
        if (start_array[i].xHandle != NULL) {
            ESP_LOGD(LOG, "| %-20s | Deleted", start_array[i].pcTaskName);
        }
    }
    for (int i = 0; i < end_array_size; i++) {
        if (end_array[i].xHandle != NULL) {
            ESP_LOGD(LOG, "| %-20s | Created", end_array[i].pcTaskName);
        }
    }
    ESP_LOGD(LOG, "+----------------------+-----+-----------+-------+-------+");
#endif
    // Compute accumulated CPU load and memory pressure
    cpuConsumption = 100 - accumulatedIdlePercentage;
    if (cpuConsumption > maxCpuConsumption) {
        maxCpuConsumption = cpuConsumption;
    }
    // Free resources
    free(start_array);
    free(end_array);
#else
    ESP_LOGD(LOG, "CONFIG_FREERTOS_GENERATE_RUN_TIME_STATS not enabled. Task Statistics are not available.");
#endif
    return ESP_OK;
}

MicroController
Posts: 2669
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: RMT state machine problems

Postby MicroController » Wed Jun 25, 2025 12:34 pm

Hmm. The ISR unconditionally sets the "enable" state right before running the callback. Not sure what would change that before rmt_receive() does its check...

DrMickeyLauer
Posts: 209
Joined: Sun May 22, 2022 2:42 pm

Re: RMT state machine problems

Postby DrMickeyLauer » Fri Jun 27, 2025 2:44 pm

Here's how the logic analyzer sees it:
Screenshot 2025-06-27 at 16.38.44.png
The RMT is configured with a pulse range of [1 µs, 97 µs]. The IRQ callback is called fairly frequently and does its work. At some point, though, it seems the IRQ (or the callback?) can't be served quickly enough, so the RMT buffer fills up. At this point it looks like the RMT driver auto-disables the channel, and the next rmt_receive fails.

It's just that I can't back up this theory with the RMT driver code. I will try giving the RMT IRQ a higher priority; perhaps that improves things. Like I said, it works perfectly for hours when the system is idle.

MicroController
Posts: 2669
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: RMT state machine problems

Postby MicroController » Fri Jun 27, 2025 3:52 pm

My €0,02 are:
1) The ISR sets the state to "enable", then calls the callback
2) The callback calls rmt_receive() which checks the state and fails because it's not "enable"
3) The "state" is just a variable in RAM, i.e. it cannot be modified directly by the peripheral
-> Whatever changes the "state" must be in the software
4) Only the driver code 'legally' accesses the "state" variable
-> Options:
a) concurrent calls to rmt_receive() or some other RMT function while the callback is executing?
b) corruption of either the RMT "handle" or the object it points to?
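
One way to test option (a) is to wrap every place that issues a receive in a small guard that records whether a receive is already pending. The following is only a host-side sketch: `ReceiveGuard` and its methods are hypothetical names, not part of the RMT driver, and the actual `rmt_receive()` call is left out so the logic can be checked off-target.

```cpp
#include <atomic>

// Hypothetical guard that tracks whether a receive is already pending.
class ReceiveGuard {
public:
    // Returns true if the caller may issue rmt_receive();
    // returns false (and counts the attempt) if one is already pending.
    bool tryArm() {
        bool expected = false;
        if (armed_.compare_exchange_strong(expected, true)) {
            return true;
        }
        ++rejected_;
        return false;
    }

    // To be called once the driver reports the transaction as finished.
    void disarm() { armed_.store(false); }

    // Number of attempts that arrived while a receive was still pending.
    int rejected() const { return rejected_.load(); }

private:
    std::atomic<bool> armed_{false};
    std::atomic<int> rejected_{0};
};
```

A non-zero `rejected()` count would point towards concurrent or repeated restart attempts rather than memory corruption.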

(Likely unrelated, but I believe higherPriorityTaskWoken needs to be the boolean OR of multiple queue operations - haven't checked the FreeRTOS source, but generally I expect that a waiting "higher priority task" can only be "woken up" once, and not necessarily by the last queue operation in a row.)

DrMickeyLauer
Posts: 209
Joined: Sun May 22, 2022 2:42 pm

Re: RMT state machine problems

Postby DrMickeyLauer » Mon Jun 30, 2025 3:53 pm

Thanks for your response, which prompted me to look deeper into my code. The major problem was that I sometimes did call rmt_receive() multiple times, because the array I got from the RMT driver contained several symbols with a duration of 0. I adjusted this so that only the first of them triggers a restart. I still have the same problem (rmt_receive called while the state is RUN), but less often now. So I dug deeper into the driver to learn how it actually works, but failed ;-)
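
The "restart only once per callback" change described above could look roughly like this host-side sketch. `Symbol`, `MockReceiver`, and `handleSymbols` are hypothetical stand-ins for `rmt_symbol_word_t`, the `RMTReceiverTask` object, and the body of `rxDoneCallback`; the real queue and the `rmt_receive()` call are replaced by a vector and a counter, which would not be suitable inside an actual ISR. The sketch also assumes that zero-duration symbols should be skipped rather than queued.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for rmt_symbol_word_t (only the two durations matter here).
struct Symbol {
    uint16_t duration0;
    uint16_t duration1;
};

// Hypothetical stand-in for the receiver; counters replace the real queue and driver.
struct MockReceiver {
    std::vector<Symbol> queued;  // stands in for symbol_queue
    int restart_requests = 0;    // how often startReceive() would have been called

    void startReceive() { ++restart_requests; }
};

// Processes one callback's worth of symbols and restarts reception at most once.
inline void handleSymbols(MockReceiver& rx, const Symbol* symbols, std::size_t count) {
    bool sawTerminator = false;
    for (std::size_t i = 0; i < count; ++i) {
        const Symbol& s = symbols[i];
        if (s.duration0 == 0 || s.duration1 == 0) {
            sawTerminator = true;  // remember it, but do not restart inside the loop
            continue;              // zero-duration symbols are not queued
        }
        rx.queued.push_back(s);
    }
    if (sawTerminator) {
        rx.startReceive();         // a single restart, after the loop
    }
}
```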

As far as I understand it, the RMT driver may call my callback up to three(!) times in a row for every IRQ it receives from the peripheral:
1. From within rmt_isr_handle_rx_threshold (since I have enabled partial receives): https://github.com/espressif/esp-idf/bl ... _rx.c#L701
2. From within rmt_isr_handle_rx_done: https://github.com/espressif/esp-idf/bl ... _rx.c#L616
3. Again, from within rmt_isr_handle_rx_done: https://github.com/espressif/esp-idf/bl ... _rx.c#L668

Now I don't understand that. Is the callback invoked with the same set of symbols every time or not? Why is it called more than once in the first place? I guess I need to check edata->flags.is_last to avoid restarting rmt_receive too early (and more than once).
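
Gating the restart on the last chunk could be sketched as follows, assuming that with partial receive enabled the driver sets `is_last` only on the final chunk of a transaction. `RxDoneEvent` is a hypothetical mirror of just two fields of `rmt_rx_done_event_data_t`, and `RestartGate` is an illustrative name; in real code the counted restart would be the `rmt_receive()` call.

```cpp
#include <cstddef>

// Hypothetical mirror of the two rmt_rx_done_event_data_t fields used here.
struct RxDoneEvent {
    std::size_t num_symbols;
    struct {
        unsigned is_last : 1;  // assumed to be set on the final chunk only
    } flags;
};

// Hypothetical restart gate; counters replace the real driver call.
struct RestartGate {
    int restarts = 0;
    int chunks_seen = 0;

    void onRxDone(const RxDoneEvent& ev) {
        ++chunks_seen;
        if (!ev.flags.is_last) {
            return;  // partial chunk: the transaction is still running, do not re-arm
        }
        ++restarts;  // final chunk: re-arm exactly once
    }
};
```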

(As far as I know, higherPriorityTaskWoken is supposed to be cumulative on its own, but perhaps I misread that.)
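
The difference between the two readings can be illustrated with a small host-side model. It assumes that the underlying "FromISR" send only ever sets the flag and never clears it, which is how FreeRTOS queue functions are documented to behave; whether a custom wrapper such as `push_ISR` passes the flag through unchanged is not known here. All function names below are hypothetical.

```cpp
// Hypothetical model of a "...FromISR" send: the flag is only ever set, never cleared.
inline bool mockSendFromISR(bool unblocksHigherPriorityTask, bool& woken) {
    if (unblocksHigherPriorityTask) {
        woken = true;
    }
    return true;  // pretend the queue had room
}

// Passing the same flag to every call accumulates wake-ups across the loop.
inline bool pushAll(const bool* wakes, int count) {
    bool woken = false;
    for (int i = 0; i < count; ++i) {
        mockSendFromISR(wakes[i], woken);
    }
    return woken;
}

// Counter-example: a wrapper that assigns the flag loses earlier wake-ups.
inline bool pushAllOverwriting(const bool* wakes, int count) {
    bool woken = false;
    for (int i = 0; i < count; ++i) {
        bool local = false;
        mockSendFromISR(wakes[i], local);
        woken = local;  // bug: overwrites instead of OR-ing
    }
    return woken;
}
```

If the wrapper behaves like the second variant, an explicit OR in the callback would be needed; otherwise the flag is cumulative by itself.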

DrMickeyLauer
Posts: 209
Joined: Sun May 22, 2022 2:42 pm

Re: RMT state machine problems

Postby DrMickeyLauer » Tue Jul 01, 2025 1:30 pm

I'm now thinking of revamping my code to not use partial receive. Since SENT sensors may work with any tick interval (from 3 µs to 90 µs), my code has two phases:

1. A calibration phase in which the RMT is configured to receive symbols within [1, 10000] µs.
For this phase, we have to use partial receives, otherwise all buffers fill up too quickly.

2. A normal operation mode in which the RMT is configured to receive symbols within [base tick, base tick * 27 * 1.2].
For this phase I used to use partial receives as well, but I will now try disabling them in order to simplify the logic.
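
The two ranges can be computed as in the sketch below. `SignalRangeNs`, `calibrationRange`, and `operatingRange` are hypothetical helpers; the results are meant as candidates for the `signal_range_min_ns` and `signal_range_max_ns` fields of `rmt_receive_config_t`, and the driver may reject values the hardware filter cannot represent, so the limits for the specific chip still need checking.

```cpp
#include <cstdint>

// Hypothetical holder for the values intended for
// rmt_receive_config_t::signal_range_min_ns / signal_range_max_ns.
struct SignalRangeNs {
    uint32_t min_ns;
    uint32_t max_ns;
};

// Calibration phase: accept anything within [1, 10000] µs.
inline SignalRangeNs calibrationRange() {
    return {1u * 1000u, 10000u * 1000u};
}

// Normal operation: [base tick, base tick * 27 * 1.2], in integer nanoseconds.
inline SignalRangeNs operatingRange(uint32_t baseTickUs) {
    const uint32_t base_ns = baseTickUs * 1000u;
    // 27 * 1.2 = 32.4, written as 324 / 10 to stay in integer arithmetic.
    const uint64_t max_ns = (static_cast<uint64_t>(base_ns) * 324u) / 10u;
    return {base_ns, static_cast<uint32_t>(max_ns)};
}
```

For a 3 µs base tick this yields an upper bound of 97.2 µs, which is in line with the roughly 97 µs limit mentioned earlier in the thread.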

Will keep you posted.

mcSensor
Posts: 3
Joined: Tue Oct 10, 2023 1:53 pm

Re: RMT state machine problems

Postby mcSensor » Mon Sep 22, 2025 12:45 pm

Have you been successful with this approach?
