SPI Behaves Strangely Under RTOS
Posted: Sat Jan 24, 2026 9:30 am
Hello All,
I'm pretty sure this is a strange one, so I'll keep it as short as possible.
I wrote my own SPI driver in order to accelerate the JTAG transfers of the CMSIS-DAP firmware.
However, what I see is this:
1) In a while(1) loop under the app_main task, everything is fine. Transfers are super fast, and data is driven and sampled correctly.
2) Under the RTOS, I mean after creating a task and running the same code there, the SPI master and slave sometimes drive the line with a value different from the one I provide.
I am attaching, in this order, the code for the SPI full-duplex transfer, the SPI slave buffer enqueuing, and the while(1) loop under app_main that behaves correctly.
Here is the list of what I tried:
1) I put the code in a critical section: NO HELP.
2) I moved the task to core1 while moving ALL other tasks to core0: NO HELP.
By the way, TMS is driven by the SPI slave, TDI is driven by the master, and TDO is sampled by the master. TRST corresponds to the CS line between them.
Code:
void SPI_SendReceive(uint8_t *sendBuff, uint8_t *readBuff, uint32_t len)
{
    portENTER_CRITICAL(&spi_spinlock);

    // len is in bits. tempLen (bytes) is only used by the commented-out memcpy calls.
    uint32_t tempLen = len;
    if (tempLen < 32)
    {
        tempLen = 4;
    }
    else if (tempLen == 512)
    {
        tempLen = 64;
    }
    else
    {
        tempLen = len / 8 + (len % 8 ? 1 : 0);
    }

    // 1. Wait for idle
    while (SPI_HW->cmd.usr);

    // 2. Reset the "Async FIFO" (AFIFO) pointers
    SPI_HW->dma_conf.buf_afifo_rst = 1;
    while (SPI_HW->dma_conf.buf_afifo_rst);
    SPI_HW->dma_conf.rx_afifo_rst = 1;
    while (SPI_HW->dma_conf.rx_afifo_rst);

    // 3. Fill buffer (using 32-bit access fix)
    //memcpy(&(SPI_HW->data_buf[0]), sendBuff, tempLen);
    // Use 32-bit access. memcpy is unsafe for APB registers.
    uint32_t byte_len = (len + 7) / 8;
    uint32_t words = (byte_len + 3) / 4;
    volatile uint32_t *fifo = (volatile uint32_t *)SPI_HW->data_buf;
    int i;
    for (int j = 0; j < 4; j++)    // the same fill is repeated 4 times
    {
        for (i = 0; i < words; i++) {
            uint32_t word = 0;
            // Cast before shifting so a byte >= 0x80 is never shifted as a signed int
            if (i*4 < byte_len)   word |= (uint32_t)sendBuff[i*4];
            if (i*4+1 < byte_len) word |= ((uint32_t)sendBuff[i*4+1] << 8);
            if (i*4+2 < byte_len) word |= ((uint32_t)sendBuff[i*4+2] << 16);
            if (i*4+3 < byte_len) word |= ((uint32_t)sendBuff[i*4+3] << 24);
            fifo[i] = word;
        }
    }
    // Zero the remaining words (16 words in total)
    memset(&fifo[i], 0, (16 - words) * 4);

    // 4. Set bit length (bits - 1)
    SPI_HW->ms_dlen.ms_data_bitlen = len - 1;
    __asm__ volatile("memw");

    // 5. Update configuration (vital for S3)
    SPI_HW->cmd.update = 1;
    while (SPI_HW->cmd.update);

    // 6. Execute
    SPI_HW->cmd.usr = 1;
    while (SPI_HW->cmd.usr);

    // 7. Read data back (if needed), again with 32-bit access
    //memcpy(readBuff, &(SPI_HW->data_buf[0]), tempLen);
    if (readBuff) {
        for (int i = 0; i < words; i++) {
            uint32_t word = fifo[i];
            if (i*4 < byte_len)   readBuff[i*4]   = word & 0xFF;
            if (i*4+1 < byte_len) readBuff[i*4+1] = (word >> 8) & 0xFF;
            if (i*4+2 < byte_len) readBuff[i*4+2] = (word >> 16) & 0xFF;
            if (i*4+3 < byte_len) readBuff[i*4+3] = (word >> 24) & 0xFF;
        }
    }

    portEXIT_CRITICAL(&spi_spinlock);
}
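One detail in the function above that may be worth ruling out: the fill loop uses 32-bit stores because memcpy is considered unsafe for the APB registers, but the tail of the buffer is then cleared with memset, which gives no guarantee about access width either. Below is a minimal sketch of a fill that writes every word, payload and zero padding alike, with one 32-bit store each. The helper name fill_words and its parameters are hypothetical and not part of the driver. It packs whole bytes only and does not mask unused bits in the last byte. Whether this has anything to do with the stray bit is only a guess.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the original driver.
 * Packs ceil(bit_len / 8) bytes from src into little-endian 32-bit words and
 * writes exactly total_words words to dst. Words (and bytes) past the payload
 * are written as zero, so no separate memset on the destination is needed. */
static void fill_words(volatile uint32_t *dst, size_t total_words,
                       const uint8_t *src, uint32_t bit_len)
{
    assert(dst != NULL && src != NULL);

    uint32_t byte_len = (bit_len + 7u) / 8u;   /* round bits up to bytes */

    for (size_t w = 0; w < total_words; w++) {
        uint32_t word = 0;
        for (uint32_t b = 0; b < 4u; b++) {
            uint32_t idx = (uint32_t)w * 4u + b;
            if (idx < byte_len) {
                /* Cast first so the shift is done on an unsigned 32-bit value */
                word |= (uint32_t)src[idx] << (8u * b);
            }
        }
        dst[w] = word;   /* one 32-bit store per word, payload or padding */
    }
}
```

In the driver this would be called roughly as fill_words(fifo, 16, sendBuff, len), assuming the data buffer really is 16 words as the memset size suggests.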
Code:
static portMUX_TYPE spi_spinlock = portMUX_INITIALIZER_UNLOCKED;

void SPI_SlaveQueue(uint8_t *sendBuff, uint32_t len)
{
    portENTER_CRITICAL(&spi_spinlock);

    // len is in bits. tempLen is the number of bytes copied into the data buffer.
    uint32_t tempLen = len;
    if (tempLen < 32)
    {
        tempLen = 4;
    }
    else if (tempLen == 512)
    {
        tempLen = 64;
    }
    else
    {
        tempLen = len / 8 + (len % 8 ? 1 : 0);
    }

    // 1. Wait for idle
    while (SPI_SLAVE_HW->cmd.usr);

    // 2. Fill buffer (still memcpy here, not the 32-bit access used in SPI_SendReceive)
    for (int j = 0; j < 4; j++)    // the same copy is repeated 4 times
    {
        memcpy(&(SPI_SLAVE_HW->data_buf[0]), sendBuff, tempLen);
    }

    // 3. Soft reset
    SPI_SLAVE_HW->slave.soft_reset = 1;
    while (SPI_SLAVE_HW->slave.soft_reset);

    // 4. Reset the "Async FIFO" (AFIFO) pointers
    SPI_SLAVE_HW->dma_conf.buf_afifo_rst = 1;
    while (SPI_SLAVE_HW->dma_conf.buf_afifo_rst);
    SPI_SLAVE_HW->dma_conf.rx_afifo_rst = 1;
    while (SPI_SLAVE_HW->dma_conf.rx_afifo_rst);

    // 5. Update configuration
    SPI_SLAVE_HW->cmd.update = 1;
    while (SPI_SLAVE_HW->cmd.update);

    // 6. Execute
    SPI_SLAVE_HW->cmd.usr = 1;
    __asm__ volatile("memw");

    portEXIT_CRITICAL(&spi_spinlock);
}
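The length handling at the top of both functions can be checked in isolation. The sketch below reproduces the tempLen logic next to a plain round-up conversion. Both helper names are hypothetical. It suggests two things: the len == 512 branch gives the same result as the general formula, and for len < 32 the code always copies 4 bytes, so memcpy may read past the end of sendBuff if the caller passes a shorter buffer. This is offered as something to check, not as the explanation for the glitch.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of the original driver. */

/* Bytes needed to hold bit_len bits, rounded up. */
static inline uint32_t bits_to_bytes(uint32_t bit_len)
{
    return (bit_len + 7u) / 8u;
}

/* Byte count produced by the tempLen logic in the posted functions. */
static inline uint32_t original_temp_len(uint32_t bit_len)
{
    if (bit_len < 32u) {
        return 4u;                 /* always a full word, even for 1 bit */
    }
    if (bit_len == 512u) {
        return 64u;                /* same value the general formula gives */
    }
    return bit_len / 8u + ((bit_len % 8u) ? 1u : 0u);
}
```

For the 128-bit transfers in the test loop both helpers return 16, so the difference only shows up for transfers shorter than 32 bits.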
Code:
while (1)
{
    volatile uint32_t use_tms_proper = 0, use_tdi_proper = 0;
    int64_t start = esp_timer_get_time();
    if (use_tms_proper)
    {
        SPI_Transmit_TMS(tms, 128);
    }
    else
    {
        SPI_SlaveQueue(tms, 128);
    }
    if (use_tdi_proper)
    {
        SPI_Transmit_TDI_TDO(tdi, tdo, 128);
    }
    else
    {
        SPI_SendReceive(tdi, tdo, 128);
    }
    // Change bytes 0..12 of both buffers on every iteration
    for (int k = 0; k <= 12; k++)
    {
        tms[k]++;
        tdi[k]++;
    }
    //printf("write successful\n");
    sleep(0.02);
}
Please see the screenshot below. This transaction happens under a FreeRTOS task. You will see that at bit 17 there is a 1 which was not supposed to be there, as the 3rd byte of the TDI buffer is completely zero.

The transaction below happens under the app_main task. As you can see, the line is driven correctly.
Why do you think this strange behavior occurs?
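A side note on the test loop, in case timing matters here: sleep() is declared with an unsigned whole-second argument, so sleep(0.02) is converted to sleep(0) and the loop probably runs back to back with no real pause. The sketch below only illustrates that conversion, plus a round-up conversion from milliseconds to ticks. Both function names are hypothetical. In a FreeRTOS task the usual call would be vTaskDelay(pdMS_TO_TICKS(20)), keeping in mind that pdMS_TO_TICKS rounds down.

```c
#include <stdint.h>

/* Hypothetical illustration, not from the original code.
 * Mirrors the implicit double-to-unsigned conversion applied to the argument
 * of a function declared like: unsigned sleep(unsigned seconds); */
static unsigned seconds_seen_by_sleep(double requested)
{
    return (unsigned)requested;   /* the fractional part is discarded */
}

/* Hypothetical helper: milliseconds to RTOS ticks, rounded up so that a
 * short non-zero delay never collapses to zero ticks. */
static uint32_t ms_to_ticks_ceil(uint32_t ms, uint32_t tick_rate_hz)
{
    return (uint32_t)(((uint64_t)ms * tick_rate_hz + 999u) / 1000u);
}
```

With a 100 Hz tick, ms_to_ticks_ceil(20, 100) gives 2 ticks, while seconds_seen_by_sleep(0.02) gives 0.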