ESP-MQTT Event Handler Failure

rsimpsonbusa
Posts: 60
Joined: Tue May 17, 2016 8:12 pm

ESP-MQTT Event Handler Failure

Postby rsimpsonbusa » Mon Oct 21, 2019 12:00 am

Hi everybody. Long time.

Subject: ESP-MQTT and mqtt_event_handler.

Using espidf version 3.3 downloaded 1 hour ago, using example of wifi/getting-started and added the esp-mqtt example emitter-client for mqtt handling, broker is Cloudmqtt.

Local wifi connects and then I starts a cycle(about 10-15 secs) of [connect-sendmsg-disconnect(start-publish-stop)]. The result is as follows in the mqtt-event-handler.

1) Start->handler gets called and shows Connected.
2) esp_mqtt_client_publish(clientCloud,....) is called, the message is received in the broker(seen from a monitoring station) BUT no PUBLISHED event is called.
3)esp_stop is called BUT NO event is called.
4) There is an mqtt event #7 (not documented in the APIs)

If I change the menuconfig and disable mqtt 3.1.1 events are sent to the handler but network failure happens all the time and actually DISCONNECT gets called.


  • A partial log file with the cycle can be seen.
    Got Ip, so we are connected to the network
    Task is waiting of X event to start mqtt message delivery
    Gets the Event and
    "Starting mqtt client"--> funny gets event 7 (not documented) then another event 1 Connection and Gets the Connect event
    "Sending status...publish"--> gets no EVENT whatsoever but the messages is received by the Broker
    "Stop Mqtt...disconnect"--> gets no EVENT
    and the Waiting Queue starts the cycle again
    cycle starts again, all messages are received.
So why do I need help? Fear mostly. Why are the events not being reported? Bad programming? Bad Menuconfig? Bad mqtt settings? Bad broker? Lots of fear.

The way its implemented is via a semaphore but the EVENT_MANAGER itself should be the trigger for next phase of the cycle like:
Start->CONNECT->send msg->PUBLISHED->Stop->DISCONNECT

Any help welcomed.

App is included, BTW it manages an gpio_isr which is the event trigger (at X interrupt count)

Many thanks.

RSN
Attachments
Screen Shot 2019-10-20 at 6.39.16 PM.png
Screen Shot 2019-10-20 at 6.39.16 PM.png (109.78 KiB) Viewed 903 times
WiFi station Example.zip
(4.45 KiB) Downloaded 47 times

ESP-Marius
Posts: 9
Joined: Wed Oct 23, 2019 1:49 am

Re: ESP-MQTT Event Handler Failure

Postby ESP-Marius » Wed Oct 23, 2019 2:35 am

2) esp_mqtt_client_publish(clientCloud,....) is called, the message is received in the broker(seen from a monitoring station) BUT no PUBLISHED event is called.

MQTT_EVENT_PUBLISHED is only posted when a message is confirmed published by the broker/server, i.e. when the MQTT-client receives a PUBACK/PUBCOMP. The broker will only acknowledge message with these if you are publishing with Quality of Service 1 or 2, and since you are publishing with QoS=0 you won't get any acknowledgements.
3)esp_stop is called BUT NO event is called.

When calling esp_stop you are shutting down the whole client task, when the function returns the client is shut down, so there is no need for an event here.

What you are thinking of is probably MQTT_EVENT_DISCONNECTED. This event is posted to notify the user that something has gone wrong and that the client is aborting the connection. Could be an error during sending, the broker not responding to pings etc.
4) There is an mqtt event #7 (not documented in the APIs)
This is the MQTT_EVENT_BEFORE_CONNECT event which is posted before a connection is initiated. As far as I can see it is already mentioned in the API reference, is there any documentation you feel it is missing from?

rsimpsonbusa
Posts: 60
Joined: Tue May 17, 2016 8:12 pm

Re: ESP-MQTT Event Handler Failure

Postby rsimpsonbusa » Thu Oct 24, 2019 12:14 am

Hi ESP-Marius

YES I just read this a couple of hours ago. QOS 0 is fire and FORGET :lol:

About the STOP I thought this was the case but its not "fully" documented what it means. When you START you get a CONNECT(actually a PRECONNECT as you state here) and it seemed that the STOP will give u a DISCONNECT which could be use to shutdown your setup. I intended to use the mqtt event handler as a sort of event loop controller so NOW i know that disconnect will not be set unless some major error gets set (tcp,wifi,or Broker ).

I'm certain the documentation is there and I didn't searched properly.

One last reflexion. In your experience is the esp-mqtt stable without memory leaks. I've been out of action for about 1 year and last heard espidf (ver <3 I think) it had big leaks.

Thanks for your help.

RSN

ESP-Marius
Posts: 9
Joined: Wed Oct 23, 2019 1:49 am

Re: ESP-MQTT Event Handler Failure

Postby ESP-Marius » Thu Oct 24, 2019 12:59 am

Happy to help.

We are not aware of any memory leaks in the latest release of the MQTT library. There might appear to be some leaks due to TCP sockets not releasing all the memory at once when you close them (due to the TIME_WAIT state), but over time the memory usages should stabilize without any leaks.

rsimpsonbusa
Posts: 60
Joined: Tue May 17, 2016 8:12 pm

Re: ESP-MQTT Event Handler Failure

Postby rsimpsonbusa » Sat Oct 26, 2019 5:28 pm

Hi ESP-Marius

I found what appears to be a Task leak, not a memory one.

As I explained, I will be Starting-Sending-Stopping A LOT. The test is like every 5 seconds. Should be no problem.

I left the project running all night and a monitor with logging capabilities to record any problem. Several printfs are around.

Eventually the app fails in the Start routine with

Code: Select all

ESP_LOGE(TAG, "Error create mqtt task")
Sure enough if you review the mqtt_client.c you can see that Start creates a task, pinned or not, to a cpu. Seems standard. The problem is that when Stop is executed the Task is NOT DELETED, and therefore repeat Starts will fail creating the task due to resources allocation. Interesting enough it is not by lack of memory/heap, plenty to spare.

I honestly did not get into tracking what the specific error is, just that conceptually it is wrong(to me) to Create a Task with Start and not delete it with Stop, so I added a

Code: Select all

vTaskDelete(client->task_handle)
in the Stop routine. It solved the problem. Mind you it never crashed just was not able to connect to the network, probably the tcip control blocks where exhausted or something related.

Also to note that the esp_mqtt_destroy_config() does not delete the task either. Hopefully the system internally is protecting itself.

Now I have a none standard mqtt_client.c. Not good. Hope to remember to make the change in new versions if not fixed in the git.

Regards.

ESP-Marius
Posts: 9
Joined: Wed Oct 23, 2019 1:49 am

Re: ESP-MQTT Event Handler Failure

Postby ESP-Marius » Mon Oct 28, 2019 2:17 am

The task should clean up after itself, it's the last thing that happens in esp_mqtt_task(void *pv). So adding an additional vTaskDelete shouldn't be necessary.

rsimpsonbusa
Posts: 60
Joined: Tue May 17, 2016 8:12 pm

Re: ESP-MQTT Event Handler Failure

Postby rsimpsonbusa » Wed Oct 30, 2019 2:06 am

Yes I did see that eventually. Sorry.

Yet, the system fails ALWAYS after some time.

The system eventually collapses giving the error "Error create mqtt task" or 'Transport Error" or "Tcp read poll" in certain sequence.

I have made many programming variations and still I get to this points eventually. I hope it's a programming error but I dont think so because of the simplicity of the application.

I will attach the version where I use a task to send Messages every 2 secs and starts failing at message about 1581. It starts failing first with a "Error transport connect", sends a couple of messages, then again same error and finally a " TRANS_TCP: tcp_poll_read select error 113, errno = Software caused connection abort,", it does not recover from this error. Have to reboot and starts working again.

I do not know of a limitation in esp32 regarding start/stop the Mqtt client or it is not documented. The Broker does not seem to be the root cause, since it works immediately upon reboot and we start again a 1000 or more message cycle before failure.

This test is extremely import since it validates (or not) the ESP32 as a device for industrial/long term/unattended device with stable behavior. As of now, this is a toy to test devices,learning basic networking programming or hobby use. The scope of this device will be more than 1MM of them connecting to a Broker and sending telemetry info. It MUST connect-send-disconnect to be viable, one cannot have 1 mm connections to a Broker 10,000 at a time, so thats why the Connect-Send/RX-Disconnect scenario.

Anyway, attached you will find the app and a log of the failing process. The app has the SSID and Password as well as the MQTT server and client/passwords as XXXX. To make it work change them.

I hope its a programming error and I can count on ESP32 as the device for the project.

Many thanks for your patience and help.

ps: I did change the mqtt_client.c to have a STOP event. Attached also the source code.

Regards.
Attachments
mqtt_client.c.zip
(10.8 KiB) Downloaded 25 times
mqttApp.zip
(3.49 KiB) Downloaded 28 times
2019-10-26 18-01-27 OlmedoClone.log.zip
(179.64 KiB) Downloaded 30 times

ESP-Marius
Posts: 9
Joined: Wed Oct 23, 2019 1:49 am

Re: ESP-MQTT Event Handler Failure

Postby ESP-Marius » Thu Oct 31, 2019 2:08 am

If you are still getting "Error create mqtt task" then that sounds like you are running out of heap memory, try using esp_get_free_heap_size() to see if the heap is getting smaller and smaller.

The "Error transport connect" is from the transport layer, and not the mqtt client, you can active MQTT_SUPPORTED_FEATURE_TRANSPORT_ERR_REPORTING and look at the event data from the error to try to find out what the problem is.

rsimpsonbusa
Posts: 60
Joined: Tue May 17, 2016 8:12 pm

Re: ESP-MQTT Event Handler Failure

Postby rsimpsonbusa » Thu Oct 31, 2019 10:21 pm

Thanks, I dont know how to active the MQTT_SUPPORTED_FEATURE_TRANSPORT_ERR_REPORTING feature. Just #define MQTT_SUPPORTED_FEATURE_TRANSPORT_ERR_REPORTING 1 or how?

Heap is stable around 205000 during connect-send-disconnect cycle.

Regards.

Who is online

Users browsing this forum: No registered users and 20 guests