MQTT (originally MQ Telemetry Transport) is one of the most widely deployed messaging protocols in IoT. Designed in 1999 by Andy Stanford-Clark at IBM and Arlen Nipper for monitoring oil pipelines over satellite links, it was purpose-built for constrained devices, unreliable networks, and low bandwidth, which is precisely the IoT operating environment. Despite the protocol’s age, MQTT 5.0 (published in 2019) is thoroughly modern, and its design (publish/subscribe decoupling, compact binary framing, flexible QoS) makes it the right default choice for device-to-cloud communication in the vast majority of IoT applications. This guide goes deep: not just what MQTT does, but why the design decisions were made and how to apply them correctly in production.
MQTT Fundamentals: The Publish/Subscribe Model
MQTT is a publish/subscribe protocol, which means producers (devices) and consumers (backends) are decoupled in time and space. A device publishes a message to a topic on a broker. The broker delivers that message to every client that has subscribed to matching topics. Publisher and subscriber never communicate directly, and neither needs to know the other exists.
Contrast this with HTTP’s request/response model: in HTTP, a device must know the server’s address, initiate a TCP connection, and wait for a response. In MQTT, the device connects once to the broker, maintains that persistent connection, and can publish without awaiting a response. This matters for constrained devices where TCP handshake overhead is meaningful and where the application doesn’t need confirmation from the server for every message.
The three entities in every MQTT interaction:
- Client (publisher or subscriber): a device, backend service, or application
- Broker: the message routing server (Mosquitto, EMQX, HiveMQ, AWS IoT Core, etc.)
- Topic: a UTF-8 string used to route messages, structured like a filesystem path: sensors/building_a/floor_2/temp
A single client can be both publisher and subscriber simultaneously. Backend services typically subscribe to device topics to ingest telemetry and publish to command topics to send instructions down to devices.
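To make the decoupling concrete, here is a minimal in-process sketch of what a broker does. `TinyBroker` is a hypothetical toy class, not any real broker's API: it supports exact-match topics only, with no network, QoS, or wildcards.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, bytes], None]


class TinyBroker:
    """Toy broker: routes each message to every handler on the same topic."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver to all subscribers of this topic; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = TinyBroker()
received = []

# The backend subscribes without knowing which device will publish.
broker.subscribe("sensors/building_a/floor_2/temp",
                 lambda topic, payload: received.append((topic, payload)))

# The device publishes without knowing who (if anyone) is listening.
delivered = broker.publish("sensors/building_a/floor_2/temp", b"21.5")
```

Neither side holds a reference to the other; the topic string is the only thing they share.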
Topic Design: The Art of Good Hierarchy
Topic design is the single most important architectural decision in an MQTT system. Bad topic hierarchies create routing nightmares, make access control difficult, and limit your ability to use wildcard subscriptions efficiently.
Best practice topic structure:
{product}/{tenantId}/{deviceId}/{dataType}
Examples:
- sensors/acme-corp/device-00a3/telemetry
- sensors/acme-corp/device-00a3/events
- sensors/acme-corp/+/status (wildcard: subscribe to status of all ACME devices)
- sensors/# (multi-level wildcard: subscribe to everything under sensors)
Wildcard characters:
- + (single-level wildcard): matches exactly one topic level. sensors/+/temperature matches sensors/device-1/temperature but not sensors/building-a/device-1/temperature.
- # (multi-level wildcard): matches any number of levels. Must appear only at the end. sensors/# matches everything under sensors/. Use sparingly in production, because it can generate enormous subscription fan-out.
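The matching rules above can be sketched as a small function. `topic_matches` is an illustrative name, not a library call; real brokers typically use a trie rather than a per-filter comparison like this.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if a concrete topic matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # Topics starting with '$' (such as $SYS) are not matched by a
    # wildcard in the first level of the filter.
    if topic.startswith("$") and filter_levels[0] in ("+", "#"):
        return False

    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' is only valid as the last level; it matches the parent
            # level itself and everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    # No '#' seen: both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)
```

For example, `topic_matches("sensors/+/temperature", "sensors/device-1/temperature")` is true, while the same filter against `sensors/building-a/device-1/temperature` is false because `+` covers exactly one level.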
Avoid these common mistakes:
- Leading slashes (/sensors/data): creates an empty first level, wastes a hierarchy level
- Spaces in topic names: technically valid but cause bugs in many client implementations
- Encoding device state in topic levels (sensors/device-1/online): use retained messages instead
- Overly deep hierarchies (8+ levels): makes wildcard subscriptions unwieldy
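Some of these mistakes can be caught mechanically before a topic ever reaches the broker. The following is a rough sketch of such a check; `lint_topic` and `MAX_LEVELS` are hypothetical names, and the depth limit of 7 is an assumption that mirrors the "8+ levels" warning above. It also rejects wildcards, which are not allowed in publish topics.

```python
from typing import List

MAX_LEVELS = 7  # assumption: flag topics with 8 or more levels


def lint_topic(topic: str) -> List[str]:
    """Return a list of problems found in a publish topic (empty if none)."""
    problems = []
    if topic.startswith("/"):
        problems.append("leading slash creates an empty first level")
    if " " in topic:
        problems.append("contains a space")
    if "+" in topic or "#" in topic:
        problems.append("wildcards are not allowed in publish topics")
    if len(topic.split("/")) > MAX_LEVELS:
        problems.append(f"more than {MAX_LEVELS} levels")
    return problems
```

A check like this fits naturally in a unit test or a provisioning script, where a bad topic is cheap to fix.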
For multi-tenant systems, placing tenantId early in the hierarchy enables per-tenant ACL rules at the broker level: a tenant’s backend can be granted permission to subscribe to sensors/{tenantId}/# without being able to see other tenants’ data.
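As a rough illustration of why an early tenantId makes this rule simple, the check below only has to inspect the first two levels of a requested subscription filter. `tenant_may_subscribe` is a hypothetical helper, not a broker API, and it assumes tenant IDs never contain `/`, `+`, or `#`.

```python
def tenant_may_subscribe(tenant_id: str, topic_filter: str,
                         product: str = "sensors") -> bool:
    """Allow only filters that stay inside {product}/{tenant_id}/...

    The first two levels must be literal matches. A wildcard in either
    position (for example sensors/+/status or sensors/#) would reach
    into other tenants' topics, so it is rejected.
    """
    levels = topic_filter.split("/")
    return len(levels) >= 2 and levels[0] == product and levels[1] == tenant_id
```

Wildcards deeper in the filter, such as `sensors/acme-corp/+/status`, remain allowed because they cannot escape the tenant's subtree.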
QoS Levels: Exactly What They Mean
MQTT defines three Quality of Service levels. Understanding what they actually guarantee (and what they don’t) is essential for correct system design.
QoS 0 — At Most Once (“fire and forget”)
The broker delivers the message to subscribers zero or one times. The receiver sends no acknowledgment, and the sender never retries. If the network drops the message or the subscriber is temporarily disconnected, the message is lost. This is appropriate when:
- Data loss is acceptable (e.g., a sensor sending readings every 5 seconds—a missed one doesn’t matter)
- You’re optimizing for minimum overhead
- Network reliability is high (local network, not cellular)
QoS 1 — At Least Once
The publisher sends the message and waits for a PUBACK acknowledgment from the broker. If no PUBACK arrives within the timeout, the publisher resends (with the DUP flag set). The message may be delivered more than once if the publisher sends a duplicate before the PUBACK arrives. Subscribers must be idempotent (capable of handling duplicate messages). This is the most common production choice: reasonable reliability with manageable overhead.
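One common way to make a subscriber idempotent is to deduplicate on an application-level message ID, since MQTT packet identifiers are reused over time. The sketch below assumes each payload is JSON carrying a `msg_id` field; that field name and the `DedupingHandler` class are hypothetical, not part of MQTT.

```python
import json
from collections import OrderedDict


class DedupingHandler:
    """Process each msg_id at most once, remembering a bounded number of IDs."""

    def __init__(self, max_remembered: int = 1000) -> None:
        self._seen: "OrderedDict[str, None]" = OrderedDict()
        self._max = max_remembered
        self.processed = []  # stands in for the real side effect

    def on_message(self, payload: bytes) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        message = json.loads(payload)
        msg_id = message["msg_id"]
        if msg_id in self._seen:
            return False  # redelivery of something already handled
        self._seen[msg_id] = None
        if len(self._seen) > self._max:
            self._seen.popitem(last=False)  # forget the oldest ID
        self.processed.append(message)
        return True


handler = DedupingHandler()
reading = b'{"msg_id": "a1", "temp": 21.5}'
first = handler.on_message(reading)
second = handler.on_message(reading)  # simulated QoS 1 redelivery
```

The bounded memory is a trade-off: a duplicate that arrives after its ID has been evicted would be processed again, so size the window for your worst-case redelivery delay.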
QoS 2 — Exactly Once
A four-part handshake (PUBLISH → PUBREC → PUBREL → PUBCOMP) ensures the message is delivered exactly once. This is the safest but most expensive QoS level in terms of latency and message exchange overhead. Use QoS 2 only when duplicate delivery is genuinely harmful—financial transactions, actuator commands that must fire exactly once, safety-critical operations.
Important nuance: QoS applies to the leg between publisher and broker, and separately to the leg between broker and subscriber. The broker delivers each message at the lower of the publish QoS and the QoS granted for the subscription, so a device publishing at QoS 1 to a subscriber that subscribed at QoS 0 means the broker receives the message reliably but delivers it to the subscriber with no guarantee. Configure both legs appropriately.
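The interaction between the two legs comes down to a minimum: the broker forwards a message at the lower of the QoS it was published with and the QoS granted for the subscription. A small sketch, with `delivered_qos` as an illustrative name:

```python
def delivered_qos(publish_qos: int, subscription_qos: int) -> int:
    """QoS used on the broker-to-subscriber leg."""
    for qos in (publish_qos, subscription_qos):
        if qos not in (0, 1, 2):
            raise ValueError(f"invalid QoS level: {qos}")
    return min(publish_qos, subscription_qos)
```

So a QoS 1 publish to a QoS 0 subscription is delivered at QoS 0, and a QoS 0 publish stays at QoS 0 no matter what the subscriber requested.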

Retained Messages: Device State at Subscription Time
A retained message is a special MQTT message with the RETAIN flag set. The broker stores the last retained message for each topic. When a new subscriber subscribes to a topic, it immediately receives the last retained message for that topic, even if no new message has been published since.
This solves a fundamental IoT problem: how does a new subscriber learn the current state of a device?
Without retained messages, a backend service that restarts must wait for the device to publish its next regular telemetry before knowing its state. With retained messages, the broker immediately delivers the last known state to the reconnecting backend.
Common uses of retained messages:
- Device online/offline status: publish online with RETAIN=true to devices/{id}/status, use Last Will to publish offline with RETAIN=true on disconnect
- Current firmware version: publish after OTA update
- Device configuration acknowledgment: device publishes its current config with RETAIN=true so dashboards always have the latest
To clear a retained message, publish a zero-byte payload with RETAIN=true to that topic.
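The broker-side behavior described in this section can be modeled in a few lines. `RetainedStore` is a hypothetical toy, limited to exact-match topics, that shows the three rules: keep only the last retained payload per topic, hand it to new subscribers, and delete it on a zero-byte retained publish.

```python
from typing import Dict, Optional


class RetainedStore:
    """Toy model of a broker's retained-message table."""

    def __init__(self) -> None:
        self._retained: Dict[str, bytes] = {}

    def on_publish(self, topic: str, payload: bytes, retain: bool) -> None:
        if not retain:
            return  # non-retained publishes never touch the table
        if payload == b"":
            self._retained.pop(topic, None)  # zero-byte payload clears the entry
        else:
            self._retained[topic] = payload  # newer message replaces the older one

    def on_subscribe(self, topic: str) -> Optional[bytes]:
        """Payload a new subscriber receives immediately, or None."""
        return self._retained.get(topic)


store = RetainedStore()
store.on_publish("devices/device-001/status", b"online", retain=True)
state_for_new_subscriber = store.on_subscribe("devices/device-001/status")
```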
Last Will and Testament: Detecting Ungraceful Disconnects
The Last Will and Testament (LWT) mechanism lets a client specify a message that the broker publishes on its behalf when the client disconnects ungracefully (network failure, crash, power loss). It’s set at connection time in the CONNECT packet.
Example pattern for device presence:
Connect with LWT:
  Topic: devices/device-001/status
  Payload: {"status": "offline", "reason": "unexpected_disconnect"}
  QoS: 1
  RETAIN: true
On successful connect, publish:
  Topic: devices/device-001/status
  Payload: {"status": "online"}
  QoS: 1
  RETAIN: true
This gives your backend a reliable way to track device connectivity without polling. The LWT fires only on ungraceful disconnect (not on a clean DISCONNECT packet), so you should also publish an offline status message before intentional shutdown.
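The difference between the two disconnect paths can be simulated without a network. In this sketch, `Will` and `WillTracker` are hypothetical names that model only the broker's decision: publish the stored will on an ungraceful disconnect, discard it on a clean one.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Will:
    topic: str
    payload: str
    qos: int = 1
    retain: bool = True


class WillTracker:
    """Toy model of how a broker handles Last Will messages."""

    def __init__(self) -> None:
        self._wills: Dict[str, Will] = {}
        self.published: List[Tuple[str, str]] = []

    def connect(self, client_id: str, will: Optional[Will]) -> None:
        # The will arrives in the CONNECT packet and is stored for the session.
        if will is None:
            self._wills.pop(client_id, None)
        else:
            self._wills[client_id] = will

    def disconnect(self, client_id: str, clean: bool) -> None:
        will = self._wills.pop(client_id, None)
        if will is not None and not clean:
            # Connection lost without a DISCONNECT packet: publish the will.
            self.published.append((will.topic, will.payload))


tracker = WillTracker()
offline = Will("devices/device-001/status", '{"status": "offline"}')

tracker.connect("device-001", offline)
tracker.disconnect("device-001", clean=True)   # graceful: will is discarded
after_clean = list(tracker.published)

tracker.connect("device-001", offline)
tracker.disconnect("device-001", clean=False)  # crash or network loss
after_crash = list(tracker.published)
```

After the clean disconnect nothing is published, which is why the device itself needs to publish its offline status before an intentional shutdown.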
Persistent Sessions and Clean Sessions
The cleanSession flag (MQTT 3.1.1) or cleanStart + sessionExpiryInterval (MQTT 5.0) controls whether the broker retains subscription state and queued messages between connections.
With clean session / cleanStart=true: on each connection, the broker creates a fresh session with no remembered subscriptions and no queued messages. Simple and stateless, but the client must re-subscribe every connection and misses messages published while disconnected (even at QoS 1/2).
With a persistent session (cleanSession=false in MQTT 3.1.1, or cleanStart=false together with a non-zero sessionExpiryInterval in MQTT 5.0): the broker remembers the client’s subscriptions and queues QoS 1/2 messages while the client is offline. When the client reconnects with the same client ID, queued messages are delivered. This is essential for devices that sleep between readings, because they won’t miss downlink commands while sleeping.
Persistent sessions increase broker memory usage proportional to the number of persistent sessions and queued messages. Size your broker accordingly and set a sessionExpiryInterval (MQTT 5.0) to auto-expire sessions for devices that haven’t connected in a long time.
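The queueing behavior, and the memory cost that comes with it, can be seen in a toy model. `SessionStore` is a hypothetical class that assumes the client is already subscribed and tracks only what happens to messages routed to it while it is offline.

```python
from typing import Dict, List, Set


class SessionStore:
    """Toy model of per-client session state on a broker."""

    def __init__(self) -> None:
        self._online: Set[str] = set()
        self._persistent: Set[str] = set()
        self._queues: Dict[str, List[bytes]] = {}

    def connect(self, client_id: str, persistent: bool) -> List[bytes]:
        """Connect a client and return the backlog queued while it was offline."""
        if persistent:
            self._persistent.add(client_id)
        else:
            # Clean session: discard any state left from earlier connections.
            self._persistent.discard(client_id)
            self._queues.pop(client_id, None)
        self._online.add(client_id)
        return self._queues.pop(client_id, [])

    def disconnect(self, client_id: str) -> None:
        self._online.discard(client_id)

    def route(self, client_id: str, payload: bytes, qos: int) -> bool:
        """Return True if delivered immediately; otherwise queue or drop."""
        if client_id in self._online:
            return True
        if client_id in self._persistent and qos >= 1:
            self._queues.setdefault(client_id, []).append(payload)
        return False  # QoS 0 or non-persistent: the message is dropped


store = SessionStore()
store.connect("device-001", persistent=True)
store.disconnect("device-001")              # device goes to sleep
store.route("device-001", b"reboot", qos=1)  # queued
store.route("device-001", b"ping", qos=0)    # dropped
backlog = store.connect("device-001", persistent=True)
```

Every entry in `_queues` is memory the broker holds for a device that may never return, which is what a session expiry interval bounds.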
MQTT 5.0: Key Improvements
MQTT 5.0 (see HiveMQ’s excellent spec guide) added several features critical for production IoT:
- Reason codes: Every CONNACK, PUBACK, SUBACK, etc. now includes a reason code indicating success or the specific failure reason, which is far more informative than the coarse success/fail signalling of 3.1.1
- User properties: Arbitrary key-value pairs can be added to most MQTT packets, which is useful for tracing IDs, device firmware versions, and routing metadata without parsing the payload
- Message expiry interval: Set a TTL on messages; the broker discards stale messages that haven’t been delivered within the interval
- Shared subscriptions: Multiple subscribers share a subscription, enabling load balancing across backend consumers without the fan-out of regular subscriptions
- Topic aliases: Replace frequently used long topic strings with a short integer alias, reducing overhead per message
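To illustrate shared subscriptions from the list above: a client opts in by subscribing to `$share/{group}/{filter}`, and the broker hands each matching message to one member of the group. The sketch below parses that form and uses round-robin selection, which is an assumption for illustration; the specification leaves the distribution strategy to the broker. `parse_shared_filter` and `SharedGroup` are hypothetical names.

```python
from itertools import cycle
from typing import Iterable, Tuple


def parse_shared_filter(subscription: str) -> Tuple[str, str]:
    """Split '$share/{group}/{filter}' into (group, filter)."""
    parts = subscription.split("/", 2)
    if len(parts) != 3 or parts[0] != "$share" or not parts[1] or not parts[2]:
        raise ValueError(f"not a shared subscription: {subscription!r}")
    return parts[1], parts[2]


class SharedGroup:
    """Pick one group member per message (round-robin, by assumption)."""

    def __init__(self, members: Iterable[str]) -> None:
        self._next_member = cycle(list(members))

    def pick(self) -> str:
        return next(self._next_member)


group_name, topic_filter = parse_shared_filter("$share/ingest/sensors/+/telemetry")
workers = SharedGroup(["worker-a", "worker-b"])
assignments = [workers.pick() for _ in range(4)]
```

With two workers in the `ingest` group, each one sees roughly half of the telemetry instead of every message.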
For new projects, use MQTT 5.0. All major brokers (Mosquitto 2.0+, EMQX, HiveMQ) support it. Most device SDKs (Paho, AWS IoT SDK, Azure IoT SDK) have MQTT 5.0 support.
Security Best Practices
MQTT’s built-in security is minimal: a username/password field in the CONNECT packet, with encryption and certificate-based authentication left to TLS underneath. Production systems must layer security carefully:
- Always use TLS (mqtts:// on port 8883). Certificate pinning on devices helps prevent man-in-the-middle attacks.
- Unique per-device credentials: Each device should have a unique client certificate (X.509) or unique username/password. Never share credentials across devices.
- Enforce ACLs at the broker: A device should only be permitted to publish to devices/{its-own-id}/# and subscribe to commands/{its-own-id}/#. Use the broker’s ACL system (Mosquitto’s acl file, EMQX’s authorization rules, HiveMQ’s extension SDK) to enforce this.
- Validate client IDs: Many brokers can enforce that a device’s MQTT client ID matches its certificate CN/SAN, preventing credential reuse.
- Rate limit publishes: A compromised device could flood your broker. Implement per-client publish rate limits.
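Per-client rate limiting is often implemented as a token bucket, which allows short bursts while capping the sustained rate. Most brokers expose this as configuration rather than code; the sketch below only shows the idea. `TokenBucket` is a hypothetical class, and the clock is injectable so the behavior can be tested deterministically.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `burst` publishes at once, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate_per_sec
        self._capacity = float(burst)
        self._tokens = float(burst)
        self._clock = clock
        self._last = clock()

    def allow(self) -> bool:
        """Return True if one more publish is permitted right now."""
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self._capacity,
                           self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


# One bucket per client ID; a fake clock makes the example deterministic.
fake_now = [0.0]
bucket = TokenBucket(rate_per_sec=1.0, burst=2, clock=lambda: fake_now[0])
burst_results = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0  # one second later, one token has been refilled
after_refill = [bucket.allow(), bucket.allow()]
```

What to do with a rejected publish (drop it, delay it, or disconnect the client) is a policy choice that depends on the broker and the application.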
Our IoT Security Best Practices article covers the full security stack in depth.
For a complete guide to the protocols that complement MQTT (HTTP, CoAP, AMQP) and platform integration, visit our IoT Connectivity Integration services page.
Conclusion
MQTT’s elegance lies in its simplicity and its fitness for IoT constraints: a persistent TCP connection, tiny packet overhead, flexible QoS, and publish/subscribe decoupling make it the right default protocol for device-to-cloud communication. The nuances matter enormously in production: topic hierarchy determines your security model and subscription efficiency; QoS level must match your reliability requirements without over-engineering; retained messages and LWT give you device state visibility without polling; persistent sessions ensure devices don’t miss commands while sleeping; MQTT 5.0 adds the operational tooling that production systems need. Master these mechanisms, and MQTT becomes a robust foundation for IoT systems of any scale.