BUG: Using portMAX_DELAY with select() results in no delay
Posted: Thu Mar 13, 2025 12:42 am
Issue
While implementing UDP socket listening on multiple ports, I discovered a bug when trying to make select() wait indefinitely.
I passed portMAX_DELAY/1000 (0xFFFFFFFF/1000) as the tv_sec value in struct timeval, expecting it to wait for a very long time. Instead, the function returned immediately without waiting.
Root Cause
Looking at the LwIP implementation of select(), I found that internally it converts the timeout to a signed long:
```c
long msecs_long = ((timeout->tv_sec * 1000) + ((timeout->tv_usec + 500) / 1000));
```
On ESP32, long is a 32-bit integer (int32_t). With a large value such as portMAX_DELAY/1000, the millisecond result exceeds the 32-bit signed limit and becomes negative (4294967 * 1000 = 4294967000, which in practice wraps to -296), causing LwIP to set a minimal timeout (1 ms) instead of a maximum one.
The maximum allowed timeout value is 0x7FFFFFFF milliseconds (INT32_MAX, approximately 24.85 days).
Solution
To make select() wait indefinitely, pass NULL as the timeout parameter:
```c
// This will wait forever
int ret = select(sock + 1, &rfds, NULL, NULL, NULL);
```
Suggestions for Improvement
Add explicit documentation to the select() function noting the valid timeout range and recommending NULL for indefinite waits.
Modify the implementation to gracefully handle the case where tv_sec is too large, either by:
- Clamping large values to the maximum safe timeout
- Converting indefinite-looking timeouts (like UINT32_MAX) to NULL automatically
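Until something like this exists in the library, the two ideas above can be approximated on the caller side. The following is only a rough sketch, not LwIP code: msecs_as_32bit_long() and safe_select_timeout() are hypothetical names, the first merely imitates the conversion for a 32-bit long so the wrap can be checked on any host, and treating tv_sec >= UINT32_MAX/1000 as "wait forever" is an assumption based on the portMAX_DELAY/1000 value used in this report. It also assumes tv_usec is already normalized to 0..999999.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical helper: imitates the timeout conversion on a target where
 * long is 32 bits. The sum is computed in 64 bits, then reduced to the
 * low 32 bits and reinterpreted as a signed value. */
static int32_t msecs_as_32bit_long(int64_t tv_sec, int64_t tv_usec)
{
    int64_t wide = tv_sec * 1000 + (tv_usec + 500) / 1000;
    uint32_t low = (uint32_t)wide;               /* keep the low 32 bits */
    if (low <= (uint32_t)INT32_MAX) {
        return (int32_t)low;                     /* fits, no wrap */
    }
    return (int32_t)(low - 0x80000000u) + INT32_MIN;  /* wrapped, negative */
}

/* Hypothetical helper: returns the pointer to hand to select().
 * - NULL stays NULL (wait forever).
 * - Assumption: tv_sec >= UINT32_MAX/1000 is meant as "forever", so NULL.
 * - Anything else above INT32_MAX milliseconds is clamped in place.
 * - Smaller (and negative) timeouts are passed through unchanged. */
static struct timeval *safe_select_timeout(struct timeval *tv)
{
    if (tv == NULL) {
        return NULL;
    }
    if ((int64_t)tv->tv_sec >= (int64_t)(UINT32_MAX / 1000)) {
        return NULL;
    }
    int64_t ms = (int64_t)tv->tv_sec * 1000
               + ((int64_t)tv->tv_usec + 500) / 1000;
    if (ms > INT32_MAX) {
        tv->tv_sec = INT32_MAX / 1000;           /* 2147483 s, about 24.85 days */
        tv->tv_usec = 0;
    }
    return tv;
}
```

With this in place, a call would look like select(sock + 1, &rfds, NULL, NULL, safe_select_timeout(&tv)). Passing NULL directly is still the simplest option when an indefinite wait is what you want.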