
EMQX

Streaming & Message Queues

Scalable MQTT broker. Connect 100M+ IoT devices in one single cluster, move and process real-time IoT data with 1M msg/s throughput at 1ms latency.

Erlang Latest e5.10.4 · 1d ago Security brief →

Features

  • Full MQTT v5.0, v3.1.1, and v3.1 support with additional IoT protocols (LwM2M, CoAP, MQTT‑SN, QUIC)
  • Massive scalability – connect >100 million concurrent clients and process millions of messages per second with sub‑millisecond latency
  • Powerful SQL‑based rule engine for real‑time data processing, transformation, and filtering
  • Seamless integration with 50+ backend systems (Kafka, RabbitMQ, PostgreSQL, MySQL, MongoDB, Redis, ClickHouse, InfluxDB, AWS Kinesis, GCP Pub/Sub, Azure Event Hub, etc.)

Recent releases

Review required
e5.10.4 Breaking risk
Auth RBAC Dependencies

Breaking changes — review before upgrading.

e5.8.10 Bug fix
Notable features
  • Kafka polling waits for data instead of returning empty batches
  • RabbitMQ Connector self-recovery without manual restart
  • Azure Blob Storage health check optimization for large containers
Full changelog

Enhancements

Observability and Performance

  • #16746 Configured os_mon to collect only system-wide memory statistics by default, reducing per-process memory scanning overhead.

  • #16911 Reduced the overhead of Prometheus metrics collection by avoiding accidental repeated queries of Mria statistics.

Data Integration

  • #16961 Improved Kafka source polling behavior by ensuring fetch requests wait briefly for data instead of returning empty batches immediately when no records are available. This reduces unnecessary polling delays and helps Kafka consumers receive new records more consistently.

Licensing

  • #16853 Made the v5 license parser forward-compatible with v6 license keys.

Bug Fixes

Clustering

  • #16729 Improved recovery time of a cluster after a simultaneous restart of all nodes.

    The built-in Mria database management system no longer waits for a full sync of an internal table used to generate transaction synchronization events.

Data Integration

  • #16507 Previously, when an MQTT Source's Connector recovered after losing its connection, topics would not be re-subscribed and the Source would stop working until the Connector itself was restarted. Now, the Source will re-subscribe upon reconnect.

  • #16618 The Kafka request timeout is now automatically set to at least twice the metadata request timeout, with a minimum of 30 seconds, reducing unnecessary reconnections and retries when metadata requests take longer than expected.

    This is especially beneficial when metadata request timeout is configured to a small value.
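
As a rough sketch, the rule in #16618 can be read as taking the largest of three values. The function name, the use of seconds, and the idea of a separately configured request timeout are assumptions made for illustration; this is not EMQX's internal code.

```python
def effective_request_timeout(configured_s: float, metadata_timeout_s: float) -> float:
    """Return a request timeout that never drops below twice the
    metadata request timeout or below 30 seconds (assumed reading)."""
    return max(configured_s, 2 * metadata_timeout_s, 30.0)
```

Under this reading, a small metadata request timeout (for example 2 seconds) still yields a 30-second request timeout.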

  • #16724 Fixed an issue with RabbitMQ Connector/Action/Source where, if some connection or channel processes died unexpectedly, the Connector/Action/Source would be reported as disconnected and never recover by itself without restarting it.

  • #16935 Fixed an issue where the health check of an Azure Blob Storage Action in aggregate mode could timeout if the container contained too many blobs.

  • #16971 HTTP and GCP PubSub Actions were patched to treat transient connection errors with reason closing as recoverable errors, reducing log noise.

Gateway

  • #16606 Fixed the CoAP Gateway in connection mode over DTLS.

  • #17030, #17042 Fixed CoAP client takeover handling for both UDP and DTLS connections.

    These changes improve takeover routing and token validation for reconnected clients, and keep the DTLS token takeover grace period aligned with the configured keepalive window.

Operations

  • #16732 Fixed a crash in emqx ctl subscriptions list that could happen when shared subscriptions were present.

    Before this fix, listing subscriptions could fail for some clients and return no output.

    After this fix, emqx ctl subscriptions list works reliably with both regular and shared subscriptions.

Security

  • #16690 Fixed a CRL cache regression where emqx_crl_cache:evict/1 did not fully clear internal URL state.

    After eviction, the same CRL URL now re-registers correctly on next use, its refresh timer is restored, and repeated HTTP fetches per connection are avoided.

  • #17012 Fixed password-based authentication backends to let the auth chain continue when the CONNECT packet has no password, instead of rejecting the connection immediately.

    Previously, if a client connected without a password, the first password-based authenticator (built-in database, MySQL, PostgreSQL, MongoDB, Redis, or LDAP) in the chain would return an error, blocking any subsequent authenticators (such as HTTP) from being tried.

Observability

  • #16672 Ensured that the Erlang PID is printed as a log data field.

  • #16699 Previously, under certain race conditions, long and cryptic logs like the following could be printed:

    2026-02-03T13:53:54.576326+00:00 [error] Generic server <0.11323236.0> terminating. Reason: {{badkey,'actions.success'},
    [{erlang,map_get,['actions.success',#{}],[{error_info,#{module => erl_erts_errors}}]},{emqx_metrics_worker,idx_metric,4,
    [{file,"emqx_metrics_worker.erl"},{line,683}]},{emqx_metrics_worker,inc,4,...
    

    EMQX now prints more meaningful information to help debug the issue.

6.2.0 Breaking risk
Breaking changes
  • Empty jq program now errors; use '.' instead
  • String indices use code points instead of byte indices
  • tonumber() rejects leading/trailing whitespace; use trim() first
Notable features
  • Agent-to-Agent Card Registry for autonomous AI agent discovery
  • MQTT subscription filters using User-Property expressions
  • GCP Workload Identity Federation authentication support
Full changelog

Enhancements

AI Interoperability

  • #16840 Implemented Agent-to-Agent (A2A) Card Registry. This feature enables autonomous AI agents to discover and collaborate through a standardized, event-driven MQTT 5.0 mechanism.

  • #16958 Added focused /api-spec.md endpoint and /api-spec.html to support drill-down discovery of EMQX HTTP API context, especially for AI agents and other tools that benefit from fetching only the relevant API slices instead of a single bloated spec.

Core MQTT Functionalities

  • #16612 Introduced the emqx_setopts app for $SETOPTS server-side option updates, including keepalive control topics and warning+suppression for unknown $SETOPTS/* publishes.

  • #16887 Added optional MQTT subscription message filters controlled by mqtt.subscription_message_filter.

    When enabled, clients can subscribe with a ? suffix such as sensor/+/temperature?location=roomA&value>25 and EMQX will deliver only the messages whose MQTT 5 User-Property entries satisfy the filter expression. When disabled, ? remains part of the topic filter text and no extra filtering is applied.

    Messages dropped by subscription-filter mismatch are reported through the existing delivery.dropped event with reason subscription_filter and counted by the new delivery.dropped.filter metric.
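
The filter idea can be illustrated with a small Python sketch. It is not EMQX's implementation: the helper names are hypothetical, only the two operators shown in the example above (= and >) are handled, and it assumes that & means logical AND and that > compares values numerically.

```python
import re

# Only '=' and '>' appear in the release note's example; others are not modeled.
_CONDITION = re.compile(r"^([^=>]+)(=|>)(.*)$")


def split_subscription(topic: str):
    """Split 'filter?expr' into the topic filter and a list of (key, op, value)."""
    topic_filter, _, expr = topic.partition("?")
    conditions = []
    for part in filter(None, expr.split("&")):
        match = _CONDITION.match(part)
        if match is None:
            raise ValueError(f"unsupported condition: {part!r}")
        conditions.append(match.groups())
    return topic_filter, conditions


def matches(user_properties: dict, conditions) -> bool:
    """Assumed semantics: every condition must hold for the message to pass."""
    for key, op, expected in conditions:
        actual = user_properties.get(key)
        if actual is None:
            return False
        if op == "=":
            if actual != expected:
                return False
        else:  # op == ">": compare numerically, reject non-numeric values
            try:
                if not float(actual) > float(expected):
                    return False
            except ValueError:
                return False
    return True
```

With the example subscription, a message carrying location=roomA and value=30 would pass this sketch, while value=20 would be dropped.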

  • #16929 Two new limiter kinds are introduced: delivery_messages and delivery_bytes. In contrast to the existing messages and bytes limiters, which limit messages published by a single client, the new limiters throttle messages received by a single client from any source. If the limit is hit, QoS 0 messages are dropped, QoS > 0 messages are queued internally, and a retry is scheduled. The retry time is derived from the limiter's configuration.

    The new limiters are only supported for memory sessions (durable_sessions.enable = false).

    If unspecified, the default values are unlimited, thus keeping backwards compatibility.

  • #16779 Improved handling of malformed first packets by classifying them as invalid CONNECT packets and adding better protocol hints in logs.

Data Integration

  • #16589 Updated jq library used in the Rule Engine runtime to version 1.8.1.

    Note that the jq 1.8.1 language contains several subtle breaking changes compared to 1.6.1.

    • Providing empty string as jq program is now considered an error: use "." instead. (jq#2790)
    • String functions now use code point indices: indices/1, index/1, and rindex/1 functions now use code point indices instead of byte indices; use utf8bytelength/0 to get byte index if needed. (jq#3065)
    • tonumber/0 rejects numbers with leading or trailing whitespace: use trim/0 before calling tonumber/0. (jq#3055, jq#3195)
    • last(empty) behavior changed: last(empty) now yields no output values, consistent with first(empty). (jq#3179)
    • limit/2 errors on negative count, instead of silently accepting it. (jq#3181)
    • Tcl-style multiline comments supported: this may subtly affect parsing of existing code. (jq#2989)
    • Decimal number conversion changed: decimal numbers are now converted to binary64 (double) instead of decimal64. (jq#2949)
    • nth/2 emits empty on index out of range, instead of erroring. (jq#2674)
    • String multiplication by 0 or less than 1 now emits an empty string instead of the original string. (jq#2142)
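
The difference between code point indices and byte indices (jq#3065) matters only for non-ASCII strings. The sketch below uses Python rather than jq, purely to show how the two indices diverge; the function names are made up for this example.

```python
def code_point_index(haystack: str, needle: str) -> int:
    """Position counted in Unicode code points (what jq 1.8.1 now reports)."""
    return haystack.find(needle)


def utf8_byte_index(haystack: str, needle: str) -> int:
    """Position counted in UTF-8 bytes (the older jq behavior)."""
    return haystack.encode("utf-8").find(needle.encode("utf-8"))


text = "héllo wörld"
print(code_point_index(text, "w"), utf8_byte_index(text, "w"))  # 6 7
```

Rules that slice payload strings using a previously computed index may need review if the payloads can contain multi-byte characters.
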
  • #16634 Added support for GET requests in external HTTP schema validation by allowing schema registry entries to specify the HTTP method (POST remains the default).

  • #16647 Now, in GreptimeDB and EMQX Tables Actions, integer values that are not suffixed with i or u are automatically cast to float (float64) values before being sent to the database.

    In InfluxDB Write Syntax, float is the default numeric type, and integers must be annotated. Previously, when EMQX encountered a non-annotated integer, it would interpret it as a one-character string, and insertion would fail if the column was of type float.
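
The typing rule described above can be sketched as follows. This is an illustration of the suffix convention only, with a hypothetical function name; it does not reproduce the actual parser used by the Actions.

```python
import re


def classify_numeric(literal: str):
    """Classify a numeric literal: 'i' suffix = integer, 'u' suffix =
    unsigned integer, anything else numeric = float."""
    if literal.endswith("i") and re.fullmatch(r"-?[0-9]+", literal[:-1]):
        return ("int", int(literal[:-1]))
    if literal.endswith("u") and re.fullmatch(r"[0-9]+", literal[:-1]):
        return ("uint", int(literal[:-1]))
    # Non-annotated numbers, including plain integers, become floats.
    return ("float", float(literal))
```

For example, classify_numeric("42") yields a float 42.0, whereas classify_numeric("42i") keeps the integer type.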

  • #16707 Added a Data Integration to consume from and publish messages to Azure Event Grid.

  • #16750 Added support for using Workload Identity Federation (WIF) authentication with GCP Connectors (GCP PubSub Producer and Consumer, BigQuery), via Service Account Impersonation. At this point, only OIDC workload identity pool providers using Client Credentials grant type are supported.

  • #16773 Now, when using MQTT Connector with SSL enabled, if unset, the Server Name Indication (SNI) field will be automatically filled with the server's hostname.

  • #16893 Added a new Connector and Action that appends data to QuasarDB.

  • #16962 Improved Kafka source polling behavior by ensuring fetch requests wait briefly for data instead of returning empty batches immediately when no records are available. This reduces unnecessary polling delays and helps Kafka consumers receive new records more consistently.

Access Control

  • #16597 In MySQL and PostgreSQL authentication and authorization, improved the handling of unallowed and quoted variables in the SQL template.

  • #16616 Added new configurations to SSO OIDC backend to allow specifying jq expressions to extract the desired role and namespace when creating new dashboard users.

  • #16759 Added new functions timestamp_s and timestamp_ms to retrieve system time in variform expressions (used e.g. to populate additional client attributes on connection).

  • #16817 Added REST API endpoints to reset authentication and authorization metrics counters.

    • POST /authentication/:id/metrics/reset resets counters for a specific authenticator.
    • POST /authorization/sources/:type/metrics/reset resets counters for a specific authorization source.
  • #16849 Added cookie-based authentication fallback for plugin API endpoints.

    Plugin UI iframes served by the dashboard can now authenticate via the emqx_auth cookie when no Authorization header is present. This only applies to /api/v5/plugin_api/... paths.

Management

  • #16958 Added emqx ctl api_keys CLI commands to list, show, add, delete, enable, and disable API keys from the command line.

Gateway

  • #16734 Added ordered token, nkey, and jwt internal authentication methods to the NATS Gateway to reduce the authentication feature gap with NATS Server.

Deployment and Security

  • #16653 Made Erlang distribution listener address configurable via node.dist_bind_address.

    For example: node.dist_bind_address = "10.0.1.5".

    Previously required configuration in vm.args as -kernel inet_dist_use_interface {10,0,1,5}.
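
When migrating an existing vm.args setting, the two notations describe the same address. The small helper below (a hypothetical name, written in Python only for illustration) shows how the dotted IPv4 form maps to the Erlang tuple form.

```python
import ipaddress


def erlang_ipv4_tuple(address: str) -> str:
    """Render a dotted IPv4 address in Erlang tuple notation."""
    octets = ipaddress.IPv4Address(address).packed
    return "{" + ",".join(str(octet) for octet in octets) + "}"


print(erlang_ipv4_tuple("10.0.1.5"))  # {10,0,1,5}
```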

  • #16888 Refreshed the default TLS certificate bundle shipped with EMQX packages for local development and testing.

    The new server certificate is issued for localhost and loopback addresses only (localhost, 127.0.0.1, ::1).

    These default certificates are intended for test and local deployment scenarios only and must not be used in production.

  • #16916 Now, the emqx_cert_expiry_at Prometheus metric takes into account the expiry date of certificates that belong to managed certificate bundles, when they are used in MQTT listeners.

Performance

  • #16500 Optimize idle memory usage and reduce the cost of maintaining rate-based metrics.

    Note that various 5-minute average rate metrics exposed via APIs are no longer exact averages over the last 300 samples, but are instead EWMAs (Exponentially Weighted Moving Averages) that approximate them closely.
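
For readers unfamiliar with EWMAs, the general technique can be sketched in a few lines of Python. The smoothing factor used by EMQX is not stated in the release note, so alpha here is a free parameter and the function names are illustrative.

```python
def ewma_update(previous: float, sample: float, alpha: float) -> float:
    """Move the running average a fraction alpha toward the new sample."""
    return previous + alpha * (sample - previous)


def ewma(samples, alpha: float) -> float:
    """Fold a sequence of samples into a single EWMA value."""
    iterator = iter(samples)
    value = float(next(iterator))
    for sample in iterator:
        value = ewma_update(value, sample, alpha)
    return value
```

The memory benefit comes from keeping one number per metric instead of a window of 300 samples; the trade-off is that the result approximates the windowed average rather than matching it exactly.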

  • #16547 Disable TLS 1.2 session reuse by default to reduce TLS handshake overhead.

    The TLS 1.2 session cache size is limited to 1000 entries, and the cache is local to each node.

    This makes the reuse rate very low, especially when large numbers of connections connect to a large cluster.

  • #16794 Enabled node-level authentication and authorization caches by default.

    This reduces repeated backend lookups for repeated client checks out of the box, improving authentication and authorization performance in common deployments.

  • #16829 Optimized the NATS gateway publish hot path to reduce per-message overhead in frame parsing, subject/topic handling, metrics updates, and ACK/message build steps.

  • #16911 Reduce the overhead of Prometheus metrics collection by avoiding repeated queries of Mria statistics.

  • #16550 Stop caching subscribe ACL check results.

    MQTT subscription is mostly done once per connection life cycle. Holding the subscribe ACL check result in cache is most of the time a waste of RAM.

Bug Fixes

Core MQTT Functionalities

  • #16721 Fixed QoS 2 duplicate handling when await_rel_timeout has expired.

    Previously, if a client retried a QoS 2 PUBLISH with DUP=1 after the broker had expired the pending PUBREL state (default 300 seconds), the message could be published to subscribers again. EMQX now treats this retransmission as a duplicate handshake packet and returns PUBREC without re-delivering the application message.

  • #16725 Disabled TCP connection congestion alarm by default by setting conn_congestion.enable_alarm = false in the default zone/global configuration.

  • #16781 Fixed CONNECT validation when retained messages are unavailable.

    When mqtt.retain_available is set to false, CONNECT packets with Will Retain set are now correctly rejected with CONNACK reason Retain not supported (0x9A).

  • #16783 Fixed MQTT v5 SUBSCRIBE validation for Subscription-Identifier upper bound.

    EMQX now accepts 268435455 (0x0FFFFFFF), which is the maximum valid Subscription Identifier value defined by the MQTT spec.
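
The upper bound follows from the MQTT Variable Byte Integer encoding, which uses at most four bytes with seven payload bits each. The Python sketch below illustrates that encoding; it is not EMQX code, and the names are chosen for this example.

```python
MAX_VARIABLE_BYTE_INTEGER = 0x0FFFFFFF  # 268435455, i.e. 2**28 - 1


def encode_variable_byte_integer(value: int) -> bytes:
    """Encode an integer as an MQTT Variable Byte Integer (1 to 4 bytes)."""
    if not 0 <= value <= MAX_VARIABLE_BYTE_INTEGER:
        raise ValueError("value does not fit in a Variable Byte Integer")
    encoded = bytearray()
    while True:
        digit = value % 128
        value //= 128
        if value > 0:
            digit |= 0x80  # continuation bit: more bytes follow
        encoded.append(digit)
        if value == 0:
            return bytes(encoded)


print(encode_variable_byte_integer(268435455).hex())  # ffffff7f
```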

  • #16974 In EMQX 6.1.1, when a session was subscribed to a topic filter containing retained messages and was later taken over or resumed without re-subscribing to the same topic filter, it would receive the retained messages again. Now the previous behavior is restored: upon session resumption or takeover without explicit re-subscription, retained message iteration ceases.

  • #16876 Changed log message msg_publish_not_allowed to msg_not_routed_to_subscribers.

Data Integration

  • #16803 Improved error reporting when configuring batch operations for MySQL actions.

  • #16796 Fixed handling of multiline SQL statements in connector actions.

  • #16936 Fixed an issue where the health check of an Azure Blob Storage Action in aggregate mode could timeout if the container contained too many blobs.

  • #16955 Eliminated false health check warning logs from the Kafka producer action.

    Previously, if a Kafka producer idled for too long, Kafka could close the connection (the default idle timeout is typically 10 minutes). If a Kafka producer action health check happened to run around the same moment, a false warning with the message "not_all_kafka_partitions_connected" could be logged.

  • #16972 HTTP and GCP PubSub Actions were patched to treat transient connection errors with reason closing as recoverable errors, reducing log noise.

  • #16863 Added a warning log when an async reply is received for an already-expired request in async actions.

  • #16847 Fixed a crash when non-ASCII unicode string is used in message transformation expression.

  • #16979 MQTT ingress bridges now support consuming from remote message queues $queue/{name}/{bind-filter}.

  • #16999 Fixed an issue where MQTT source failed to receive messages from $queue/ subscriptions when the remote broker has the Message Queue (mq) feature enabled. The MQ message delivery was missing the MQTT v5 Subscription-Identifier property in PUBLISH packets, which the MQTT bridge ingress relies on to route messages from queue subscriptions.

Access Control

  • #16780 Fixed an issue in authorization source validation where requests missing the type field could trigger an internal error.

    Now EMQX returns a clear BAD_REQUEST validation error for this case.

  • #16805 Added support for authz hook results to opt out of authorization cache storage for dynamic ACL decisions.

  • #16865 Added cert_common_name and cert_subject aliases for mqtt.client_attrs_init expressions, alongside the existing cn and dn variables.

  • #16868 Improved REST API authentication error messages to guide programmatic clients toward using API keys (Basic auth) instead of repeatedly logging in for bearer tokens. Error responses now mention the api_key.bootstrap_file configuration option and the POST /api_key endpoint for creating persistent API keys.

  • #16928 Dashboard-created REST API keys are now generated randomly instead of being derived from the API key name.

  • #16939 Fixed the built-in database authenticator so it no longer logs a warning when the default bootstrap file path is configured but the file does not exist.

  • #16993 Fixed an issue where an error response from an OIDC SSO provider would result in a 500 error. Now a more user-friendly result is returned.

Durable Storage

  • #16874 Fixed a rare issue where Durable Storage backed by DS Raft could stop accepting new messages after a sequence of quick cluster leadership changes, requiring a node restart to recover.

Clustering

  • #16534 Lowered the default net_ticktime from 2 minutes to 1 minute to improve cluster node failure detection.

    In the event of a network outage or abrupt node termination, remaining nodes will detect the down node sooner, reducing the time before failover mechanisms activate and improving overall cluster resilience and user experience.

Plugins

  • #16842 Reduced noisy plugin config warning logs when no peer node has the plugin config yet.

    Previously, when a node tried to fetch plugin config from peer nodes during startup, it would log a warning even when all peers simply didn't have the config (e.g., first node to load the plugin). Now this benign case is logged at debug level, and only genuine errors (RPC failures, timeouts) remain as warnings.

  • #16843 Fixed an issue where HTTP headers and query string parameters were not passed through to plugin API handlers, causing plugins to receive empty headers and missing query parameters.

  • #16904 Prevent enabling or starting multiple versions of the same plugin at once. When a newer version is enabled, older configured versions of that plugin are automatically disabled, and management API actions now return a clear error instead of reporting success while another version is still active.

Gateway

  • #16536 Fixed the CoAP Gateway when running in DTLS connection mode.

  • #16996 Fixed CoAP DTLS connection-mode to keep sessions available after sock_closed and support reconnect takeover with the same clientid and valid token.

Observability

  • #16879 Added log.audit.cache_size as the primary config key for the audit log DB cache size, while keeping log.audit.max_filter_size for backward compatibility.

Deployment and Security

  • #16683 Added support for HTTPS CRL Distribution Point URLs in the CRL cache, so CRLs fetched from https:// endpoints are now cached and refreshed correctly.

  • #16901 Fixed RPM package OpenSSL dependency for RHEL 9.6 LTS: pinned openssl >= 3.5.1 for RHEL >= 9.7 and openssl >= 3.0.7 for older RHEL 9 versions.

ExHook

  • #16890 Fixed an ExHook issue where successful reconnect reloads could duplicate the same server name in the running list and trigger repeated callback dispatches.

Licensing

  • #16764 Refined license customer tier handling by introducing STANDARD and VIP tiers in enforcement logic and reducing the official-license STANDARD expiry grace period from 90 days to 15 days before new sessions are restricted.
6.1.1 Breaking risk
Breaking changes
  • Message Stream prefix changed from $s to $stream with required name
  • Message Queue prefix changed to $queue with required name
  • Stream subscriptions require $stream/name/topic_filter syntax
Notable features
  • Retained message iteration resumes from last confirmed delivery
  • CoAP Block-Wise Transfer protocol support
  • JT/T 808 protocol 2019 with GBK character encoding
Full changelog

Enhancements

Core MQTT Functionalities

  • #16637 Previously, if a session was taken over while in the middle of receiving several retained messages from a wildcard topic subscription, iteration over those retained messages would start over for the new client, repeating already delivered retained messages. Now, the new client will resume iteration from the last confirmed delivered message from the last session, reducing the number of duplicated retained messages.

Durable Storage

  • #16704 Prevent RocksDB storage backing Durable Storage shards from preallocating large chunks of disk space by default.

    Previously, each shard consumed a significant amount of disk space immediately, which compounded due to multiple Durable Storage databases now being created by default (each consisting of 16 shards).

Message Queue and Streams

  • #16551, #16714 Refined Message Stream and Message Queue interfaces.

    For stream subscriptions, the $stream prefix is now used. Streams are now named, and the name should be specified on subscribe: SUBSCRIBE $stream/<name>/<topic_filter> (or SUBSCRIBE $stream/<name> if the stream is known to exist). The starting point for stream consumption is specified using the stream-offset user subscription property.

    For message queue subscriptions, the $queue prefix is used. Message queues are also named, and the name should be specified on subscribe: SUBSCRIBE $queue/<name>/<topic_filter> (or SUBSCRIBE $queue/<name> if the queue is known to exist).

    Notes:

    • Stream and queue names may contain only alphanumeric characters, underscores, hyphens, and dots.
    • Previously created unnamed streams and queues obtain a name derived from their topic filter: the name is the topic filter with / prepended.
    • The legacy $q queue interface (introduced in 6.0.0) and $s stream interface (introduced in 6.1.0) are kept for compatibility, but their use is discouraged.
    • If Message Queues are enabled, $queue prefix cannot be used for subscribing to shared subscriptions anymore.
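
A client-side helper for building these subscription topics might look like the sketch below. The function names are hypothetical, and "alphanumeric" is assumed to mean ASCII letters and digits; the broker remains the authority on what it accepts.

```python
import re
from typing import Optional

# Allowed per the notes above: alphanumerics, underscores, hyphens, and dots.
_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def _named_topic(prefix: str, name: str, topic_filter: Optional[str]) -> str:
    if not _NAME.fullmatch(name):
        raise ValueError(f"invalid name: {name!r}")
    if topic_filter is None:
        # Short form, usable when the stream or queue is known to exist.
        return f"{prefix}/{name}"
    return f"{prefix}/{name}/{topic_filter}"


def stream_topic(name: str, topic_filter: Optional[str] = None) -> str:
    """Build a $stream/<name>[/<topic_filter>] subscription topic."""
    return _named_topic("$stream", name, topic_filter)


def queue_topic(name: str, topic_filter: Optional[str] = None) -> str:
    """Build a $queue/<name>[/<topic_filter>] subscription topic."""
    return _named_topic("$queue", name, topic_filter)
```

For example, stream_topic("orders", "t/#") produces $stream/orders/t/#.
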
  • #16820 Added shorter API path aliases /queues/* and /streams/* for the Message Queue and Message Stream management APIs.

    The previous /message_queues/* and /message_streams/* paths remain functional for backward compatibility but are no longer shown in the API documentation.

Gateway

  • #16719 Added Block-Wise Transfer support for CoAP and LwM2M gateways.

    • Added block-wise settings: enable, max_block_size, max_body_size, and exchange_lifetime.
    • Improved POST /gateways/coap/clients/:clientid/request and LwM2M downlink handling for large block-wise messages.
  • #16736

    • Added the jt808.frame.parse_unknown_message option, enabling the JT808 gateway to transparently forward unknown messages.

    • Added JT/T 808 protocol 2019 support.

    • Added GBK character encoding support for JT/T 808 gateway.

      The JT/T 808 protocol specifies GBK encoding for STRING type fields. A new frame.string_encoding configuration option is added:

      • utf8 (default): Pass through strings as-is (backward-compatible)
      • gbk: Convert GBK-encoded strings from devices to UTF-8 for MQTT, and UTF-8 from MQTT to GBK for devices

      This affects both uplink parsing (GBK to UTF-8) and downlink serialization (UTF-8 to GBK), including string fields such as license plates, driver names, text messages, area names, and client parameters.
      MQTT payloads always use UTF-8 encoding regardless of this setting.
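
The two conversion directions can be illustrated with Python's built-in gbk codec. This is only a sketch of the encoding round trip with made-up function names and a made-up sample string; it is not the gateway's code.

```python
def uplink_to_utf8(device_bytes: bytes) -> str:
    """Device to MQTT: decode a GBK STRING field into text (sent on as UTF-8)."""
    return device_bytes.decode("gbk")


def downlink_to_gbk(text: str) -> bytes:
    """MQTT to device: encode text as GBK for a STRING field."""
    return text.encode("gbk")


plate = "京A12345"  # sample value for illustration only
gbk_bytes = downlink_to_gbk(plate)
# GBK uses 2 bytes for the Chinese character; UTF-8 uses 3.
print(len(gbk_bytes), len(plate.encode("utf-8")))  # 8 9
```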

    • Added support for custom msg_sn in JT/T 808 gateway downlink messages.

      When a downlink MQTT message payload contains a msg_sn field in the header, the gateway will use that value instead of the auto-generated channel sequence number. This allows external systems to control message sequencing for specific use cases.

    • Fixed JT/T 808 gateway parameter setting (0x8103) and query response (0x0104) message handling for CAN bus ID parameters (0x0110~0x01FF), which should use BYTE[8] data type with base64 encoding in JSON instead of string type.

    • Fixed JT/T 808 0x0702 driver identity report message parsing.

Security

  • #16447 Added a new force_delete query parameter to the following HTTP APIs for managing certificates:

    • DELETE /certs/global/name/:name
    • DELETE /certs/ns/:ns/name/:name

    When omitted or false, configurations in all namespaces are checked for references to the managed bundle being deleted, and the deletion fails if any reference is found.

  • #16461 Support TLS 1.3 session ticket resumption.

    EMQX now supports TLS 1.3 session resumption using stateless session tickets, allowing clients to resume TLS sessions without server-side session state storage.

    Node-level configuration: node.tls_stateless_tickets_seed is the secret key seed for generating TLS 1.3 stateless session tickets.
    Listener-level configuration: listeners.ssl.<name>.ssl_options.session_tickets enables TLS 1.3 session resumption using stateless session tickets.
    Possible values are disabled (default), stateless, and stateless_with_cert (includes certificate information).

    Session tickets are only generated when node.tls_stateless_tickets_seed is configured (non-empty) and session_tickets is enabled in listener SSL options.
    If session_tickets is enabled but node.tls_stateless_tickets_seed is empty, session tickets will not be generated and an error log will be emitted when starting the listener.

Access Control

  • #16504 Added a new option to parameterize the data source from which to construct the dashboard username when creating a new user via OIDC SSO.

  • #16741 Added configuration options idp_signs_envelopes and idp_signs_assertions to SAML SSO backend to control signature verification behavior.

    Previously, SAML signature verification was not working correctly because the IdP certificate fingerprint was not being extracted from metadata and passed to esaml for verification.

    Both options default to false for backwards compatibility with existing configurations. Users who want to enable signature verification should explicitly set these to true when their IdP is configured to sign SAML responses.

  • #16684 Enabled mqtt.client_attrs_init expressions to make use of password (for example, by feeding it to jwt_value) to initialize client attributes.

  • #16730 Redis authorization now supports a compatibility mode for EMQX 4.x ACL data.
    Set compatibility_mode = v4 to enable legacy %u/%c placeholder conversion and legacy ACL access values 1|2|3 (mapped to subscribe/publish/all).
    By default, compatibility mode remains disabled, so existing Redis authz behavior is unchanged.
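
The conversion can be pictured with the following Python sketch. The 1|2|3 mapping comes from the note above; the target placeholders ${username} and ${clientid} are an assumption based on the EMQX 5.x placeholder syntax, and the function names are hypothetical.

```python
# Legacy access values as described in the release note.
_LEGACY_ACCESS = {"1": "subscribe", "2": "publish", "3": "all"}

# Assumed targets: EMQX 5.x style placeholders.
_LEGACY_PLACEHOLDERS = {"%u": "${username}", "%c": "${clientid}"}


def convert_legacy_topic(topic: str) -> str:
    """Rewrite EMQX 4.x %u/%c placeholders into the assumed newer form."""
    for old, new in _LEGACY_PLACEHOLDERS.items():
        topic = topic.replace(old, new)
    return topic


def convert_legacy_access(value: str) -> str:
    """Map a legacy ACL access value (1, 2, or 3) to an action name."""
    try:
        return _LEGACY_ACCESS[value]
    except KeyError:
        raise ValueError(f"unknown legacy access value: {value!r}") from None
```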

Data Integration

  • #16511 Supported the IoTDB Table Model in the data integration.

  • #16516 Added two new Action metrics: aggregated_upload.success and aggregated_upload.failure. These are only relevant for Aggregated Upload Actions (S3, Azure Blob Storage, Snowflake and S3Tables) and are incremented when an aggregated delivery succeeds or fails, respectively.

  • #16658 Previously, when the server port was omitted in an EMQX Tables Connector, the port would default to 80. Now, it defaults to 4001.

    A more intelligible error message is returned when an EMQX Tables Connector is configured with SSL enabled but cacertfile, certfile or keyfile configurations are missing.

Rule Engine

  • #16524 Enhanced base64 encoding and decoding functions in rule engine SQL with support for padding and URL-safe options.

    The base64_encode and base64_decode functions now support optional parameters to control encoding behavior:

    • no_padding: Encode or decode without padding characters (=). Useful when you need to remove padding from encoded strings or decode strings that don't have padding.
    • urlsafe: Use URL-safe base64 encoding/decoding. Replaces + with - and / with _, making the encoded string safe to use in URLs without encoding.

    You can use these options individually or combine them. When combining options, the order doesn't matter.

    Examples in rule SQL:

    Encode without padding:

    SELECT base64_encode(payload, 'no_padding') as encoded FROM "t/#"
    

    Encode with URL-safe characters:

    SELECT base64_encode(payload, 'urlsafe') as encoded FROM "t/#"
    

    Encode with both options (no padding and URL-safe):

    SELECT base64_encode(payload, 'no_padding', 'urlsafe') as encoded FROM "t/#"
    

    Decode URL-safe base64:

    SELECT base64_decode(payload, 'urlsafe') as decoded FROM "t/#"
    

    Decode unpadded URL-safe base64:

    SELECT base64_decode(payload, 'urlsafe', 'no_padding') as decoded FROM "t/#"
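
    To check expected values outside EMQX, the same two options can be approximated in Python. This sketch mirrors the described behavior with the standard base64 module; it is not the rule engine's implementation, and the function names are made up.

```python
import base64


def b64_encode(data: bytes, *, urlsafe: bool = False, no_padding: bool = False) -> str:
    """Encode bytes, optionally URL-safe and/or without '=' padding."""
    raw = base64.urlsafe_b64encode(data) if urlsafe else base64.b64encode(data)
    text = raw.decode("ascii")
    return text.rstrip("=") if no_padding else text


def b64_decode(text: str, *, urlsafe: bool = False, no_padding: bool = False) -> bytes:
    """Decode text, restoring padding first when it was stripped."""
    if no_padding:
        text += "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text) if urlsafe else base64.b64decode(text)


print(b64_encode(b"\xfb\xff"))                                 # +/8=
print(b64_encode(b"\xfb\xff", urlsafe=True, no_padding=True))  # -_8
```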
    
  • #16533 Added two new Variform expression helper functions json_value and jwt_value to extract values from JSON data and JWT tokens using dot-separated key paths.

    The json_value function extracts values from JSON binary strings using a dot-separated path to navigate nested structures.
    The jwt_value function decodes JWT token payloads and extracts claim values using the same path syntax.

    For example, if username is a JSON object, you can access a nested field with json_value(username, 'shop.floor');
    if password is a JWT with a custom claim, you can access the nested value with jwt_value(password, 'client_attrs.unitid').
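
The dot-separated path lookup can be sketched in Python as follows. These functions only imitate the described behavior: returning None for a missing key is an assumption, and the JWT payload is decoded without verifying the signature.

```python
import base64
import json


def json_value(document, path: str):
    """Walk a dot-separated path through a JSON document (str or bytes)."""
    node = json.loads(document)
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # assumed behavior for a missing key
        node = node[key]
    return node


def jwt_value(token: str, path: str):
    """Decode the JWT payload segment (no signature check) and look up a claim."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload_b64 = parts[1] + "=" * (-len(parts[1]) % 4)  # restore padding
    return json_value(base64.urlsafe_b64decode(payload_b64), path)


print(json_value('{"shop": {"floor": "A1"}}', "shop.floor"))  # A1
```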

  • #16539 Added support for keeping track of metric aliases when utilizing the spb_decode Rule Engine function.

    Now, after a device or edge of network (EoN) node publishes its DBIRTH/NBIRTH messages, alias mappings in said message will be stored and used when the client later uses spb_decode on a message matching the DDATA/NDATA topic patterns. The original names of the metrics will be added to the output of spb_decode.

    Note: when executing fallback actions, the mapping is not available in the environment they run in. This means that, if a fallback action republishes the undecoded DDATA/NDATA payload to a Sparkplug B DDATA/NDATA topic, the metric name fields will not be populated by the alias mapping.

  • #16581 Added a new Rule SQL function: spb_zip_kvs.

    Given an already decoded, valid Sparkplug B message, it'll go through the metrics and "zip" each property name and its value together.

    • properties (and any nested PropertySet values) have their keys and values fields
      removed and the values of the two former fields zipped together and merged with the
      original map. Values that have the PropertySet or PropertySetList types are
      recursively transformed like this.

    • Values of PropertySetList type have their propertyset field removed and replaced by
      an array of PropertySets, transformed following the above item's description.

    • If present, dataset_value field is transformed in a similar fashion: its columns and
      rows fields are removed and their values zipped together in an object merged with the
      original object. types and num_of_columns fields are removed from output.

    • Other values/fields are untouched.

    For example, given this input decoded Sparkplug B message:

    {
      "metrics": [
        {
          "properties": {
            "values": [
              {"int_value": 99},
              {
                "propertyset_value": {
                  "values": [{"int_value": 999}],
                  "keys": ["inner"]
                }
              },
              {
                "propertysets_value": {
                  "propertyset": [
                    {
                      "values": [{"int_value": 1}],
                      "keys": ["inner1"]
                    },
                    {
                      "values": [{"int_value": 2}],
                      "keys": ["inner2"]
                    }
                  ]
                }
              }
            ],
            "keys": [
              "leaf",
              "nested_prop",
              "nested_prop_list"
            ]
          }
        },
        {
          "dataset_value": {
            "num_of_columns": 2,
            "types": [7, 12],
            "rows": [
              {
                "elements": [
                  {"int_value": 3},
                  {"string_value": "3"}
                ]
              },
              {
                "elements": [
                  {"int_value": 4},
                  {"string_value": "4"}
                ]
              }
            ],
            "columns": ["col1", "col2"]
          }
        }
      ]
    }
    

    Then, the output of spb_zip_kvs will be:

    {
      "metrics": [
        {
          "properties": {
            "nested_prop_list": {
              "propertysets_value": [
                {"inner1": {"int_value": 1}},
                {"inner2": {"int_value": 2}}
              ]
            },
            "nested_prop": {
              "propertyset_value": {"inner": {"int_value": 999}}
            },
            "leaf": {"int_value": 99}
          }
        },
        {
          "dataset_value": {
            "col2": {"elements": [{"int_value": 4}, {"string_value": "4"}]},
            "col1": {"elements": [{"int_value": 3}, {"string_value": "3"}]}
          }
        }
      ]
    }
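
The transformation can be approximated with the Python sketch below. It is written only from the description and example above, so the function names and the handling of inputs not covered by the example are assumptions rather than the actual Rule Engine code.

```python
def zip_property_set(pset):
    """Replace parallel 'keys'/'values' lists with a single key -> value map."""
    zipped = {k: v for k, v in pset.items() if k not in ("keys", "values")}
    for key, value in zip(pset.get("keys", []), pset.get("values", [])):
        zipped[key] = zip_property_value(value)
    return zipped


def zip_property_value(value):
    """Recurse into PropertySet and PropertySetList values; leave others as-is."""
    out = dict(value)
    if "propertyset_value" in out:
        out["propertyset_value"] = zip_property_set(out["propertyset_value"])
    if "propertysets_value" in out:
        sets = out["propertysets_value"].get("propertyset", [])
        out["propertysets_value"] = [zip_property_set(s) for s in sets]
    return out


def zip_dataset(dataset):
    """Zip 'columns' with 'rows' and drop 'types' and 'num_of_columns'."""
    dropped = ("columns", "rows", "types", "num_of_columns")
    out = {k: v for k, v in dataset.items() if k not in dropped}
    out.update(zip(dataset.get("columns", []), dataset.get("rows", [])))
    return out


def zip_metric(metric):
    out = dict(metric)
    if "properties" in out:
        out["properties"] = zip_property_set(out["properties"])
    if "dataset_value" in out:
        out["dataset_value"] = zip_dataset(out["dataset_value"])
    return out


def spb_zip_kvs(message):
    """Apply the zipping to every metric of an already decoded message."""
    out = dict(message)
    if "metrics" in out:
        out["metrics"] = [zip_metric(m) for m in out["metrics"]]
    return out
```

Note that, as in the example output, each column name is paired with the row at the same position; the sketch does not regroup elements column by column.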
    

REST API

  • #16718 Improve REST API Swagger spec.

    Previously, summaries and descriptions of spec fields were mixed together. Now, summaries are brief, simple and punctuation-free, while descriptions provide all the details.

  • #16735 EMQX now supports plugin-defined HTTP API callbacks under /api/v5/plugin_api/{plugin}/....

    This allows plugin authors to expose plugin-specific API endpoints through the dashboard API service, with consistent authentication and HTTP error handling.

Observability

  • #16656 Made system monitor reports such as busy_port and long_schedule more informative by including process labels for easier troubleshooting.

  • #16744 Supports end-to-end tracing of messages published via HTTP API.

Performance

  • #16413 Improve subscription handling performance.

  • #16492 Slightly improve idle system memory usage.

  • #16757 Set os_mon to collect only system-wide memory statistics by default, reducing per-process memory scanning overhead.

Bug Fixes

Core MQTT Functionalities

  • #16480 Fixed an issue where WebSocket connections could crash after the peer closed the connection, typically observed under moderate load.

    crasher: initial call: cowboy_tls:connection_process/4,
    error: {{case_clause,{error,closed}},[
    {cowboy_websocket_linger,websocket_send_close,2,[{file,"cowboy_websocket_linger.erl"},{line,752}]},
    {cowboy_websocket_linger,websocket_close,3,[{file,"cowboy_websocket_linger.erl"},{line,743}]},
    {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}
    ]}
    messages: [
    {ssl,{sslsocket,{gen_tcp,#Port<...>,...},[...]},<<130,130,27,93,145,101,251,93>>},
    {ssl_closed,{sslsocket,{gen_tcp,#Port<...>,...},[...]}}
    ], ...
    
  • #16515 Fixed a bug that caused WebSocket connections to crash when receiving broker messages larger than the client's advertised Maximum-Packet-Size.

  • #16553 Fixed an issue where not all retained messages would be delivered if a subscriber hit the retained message dispatch rate limit.

    If the dispatch rate limit is reached while iterating over retained topics, then the client process will retry the iteration at a later time with exponential back-off (minimum 300 ms, maximum 10 s).

    The retainer.flow_control.batch_deliver_number configuration has been deprecated.
    The retainer.flow_control.batch_read_number no longer supports being set to 0 to mean read all remaining retained messages at once. If set to 0, it'll default to 1000 messages.
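
To give a feel for the retry schedule, the Python sketch below generates back-off delays between the stated 300 ms and 10 s bounds. The doubling factor is an assumption (the changelog only gives the minimum and maximum), and both helper names are hypothetical.

```python
def retry_delays_ms(attempts, minimum_ms=300, maximum_ms=10_000, factor=2):
    """Return the delay before each retry, growing geometrically up to the cap."""
    delays = []
    delay = minimum_ms
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * factor, maximum_ms)
    return delays


def effective_batch_read_number(configured):
    """0 no longer means 'read everything'; it falls back to 1000 messages."""
    return 1000 if configured == 0 else configured
```

Under the doubling assumption, a client that keeps hitting the limit would reach the 10 s ceiling after six retries.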

  • #16569 Fixed a rare race condition that could cause the supporting emqx_flapping process for flapping detection to crash under high system load.

  • #16651 Fixed a rare connection process crash during shutdown caused by operating on an already closed socket, typically under high system stress.
    Prior to this fix, such a race condition typically resulted in an error-level log containing {badmatch,{ok,{sock_error,closed}....

  • #16675 Fixed timestamp ordering issue where disconnected_at could be later than connected_at during session takeover or discard scenarios.

    Previously, disconnected_at was recorded too late (in ensure_disconnected), after the new session's connected_at was already set. This caused a race condition where disconnected_at > connected_at, making it difficult to track client presence state externally.

    The fix records disconnected_at immediately when takeover begins or when discard is received, ensuring it's always earlier than the new session's connected_at. This ensures correct timestamp ordering for external presence state tracking systems.

  • #16715 Fixed an issue where retained $SYS messages (for example, broker/node identity topics) were stored without expiry, which could leave stale node identifiers visible in Dashboard views after StatefulSet rotation.

    Now, newly published retained $SYS messages include Message-Expiry-Interval = 3600 (1 hour).

    For already existing stale retained $SYS entries created before this change, you can manually clear them by publishing an empty retained message to the stale topic:

    emqx eval 'emqx:publish(emqx_message:set_flag(retain, true, emqx_message:make(emqx_sys, <<"$SYS/brokers/[email protected]/sysdescr">>, <<>>))).'
    

    Replace the topic in the command with the stale $SYS/... topic you want to remove.

  • #16731 Fixed a crash in emqx ctl subscriptions list that could happen when shared subscriptions were present.

    Before this fix, listing subscriptions could fail for some clients and return no output.

    After this fix, emqx ctl subscriptions list works reliably with both regular and shared subscriptions.

  • #16782 Fixed MQTT v5 protocol handling for invalid PUBLISH properties.

    If a client sends a PUBLISH packet containing Subscription-Identifier, EMQX now treats it as a protocol error and disconnects the client.
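
A minimal Python sketch of such a check is shown below. The property identifier 0x0B and the Protocol Error reason code 0x82 come from the MQTT v5.0 specification; whether EMQX reports exactly this reason code, and the function name itself, are assumptions.

```python
from typing import Optional

SUBSCRIPTION_IDENTIFIER = 0x0B  # MQTT v5.0 property identifier
PROTOCOL_ERROR = 0x82           # MQTT v5.0 reason code "Protocol Error"


def check_inbound_publish(properties: dict) -> Optional[int]:
    """Return a DISCONNECT reason code if a client PUBLISH is invalid, else None."""
    if SUBSCRIPTION_IDENTIFIER in properties:
        return PROTOCOL_ERROR
    return None
```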

Gateway

  • #16603 Fixed the CoAP Gateway when running in DTLS connection mode.

  • #16670 NATS gateway now enforces the max publish payload, honors the echo option (no local delivery), and improves publish/subscribe subject handling and related error messages.

Access Control

  • #16423 Added support for verifying the 'aud' (audience) claim in JWT authentication.

    When the 'aud' claim is configured in verify_claims, the JWT token must include a valid 'aud' claim. The verification supports both string and array formats:

    • If 'aud' is a string, it must exactly match the expected value.
    • If 'aud' is an array, at least one element in the array must match the expected value.
    • Empty string or empty array will fail verification.
    • Missing 'aud' claim will fail verification when it's configured in verify_claims.
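
The four rules above can be expressed as a short Python sketch. The function name and the shape of the claims dictionary are hypothetical; this is not the EMQX implementation.

```python
def verify_aud(claims: dict, expected: str) -> bool:
    """Check the 'aud' claim against the expected audience."""
    aud = claims.get("aud")
    if isinstance(aud, str):
        # An empty string always fails; otherwise require an exact match.
        return aud != "" and aud == expected
    if isinstance(aud, list):
        # An empty list fails because no element can match.
        return expected in aud
    return False  # missing claim or unsupported type
```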

  • #16459 Fixed an issue in the SCRAM authentication HTTP API where the user creation call returned an incorrect user ID for the created user.

Data Integration

  • #16507 Previously, when an MQTT Source's Connector recovered after losing its connection, topics would not be re-subscribed and the Source would stop working until the Connector itself was restarted. Now, the Source will re-subscribe upon reconnect.

  • #16542 Fixed an issue where Kafka producer connections could disconnect prematurely when Kafka was overloaded, causing excessive produce request retries.
    The request timeout is now automatically set to at least twice the metadata request timeout (with a minimum of 30 seconds),
    reducing unnecessary reconnections and retries when metadata requests take longer than expected.
    This is especially beneficial when metadata request timeout is configured to a small value.
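
Read literally, the rule amounts to taking the larger of twice the metadata timeout and 30 seconds. The Python sketch below encodes that reading; how an explicitly configured request timeout interacts with it is not stated here, so it is left out, and the function name is hypothetical.

```python
def effective_request_timeout_s(metadata_request_timeout_s: float) -> float:
    """At least twice the metadata request timeout, and never below 30 seconds."""
    return max(2 * metadata_request_timeout_s, 30)
```

For a metadata request timeout of 5 seconds this gives 30 seconds, and for 20 seconds it gives 40 seconds.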

  • #16622 Fixed an issue where, if an Action used async query mode and its Connector was disconnected after more than one health check, its Fallback Actions could be triggered twice.

  • #16657 Fixed an issue where, when importing configuration from an older node version into a newer one, values would not be upgraded according to newer code, leading to strange behavior.

    One such example is importing an MQTT Connector with static clientids from 5.10.0 into 6.0.0. In 5.10.0, usernames and passwords could not be associated with particular static clientids, and this was represented internally in a certain way. Later versions added the capability of creating those associations, with a different internal representation. The conversion between these internal representations was missing when importing such configurations in previous EMQX versions.

  • #16659 When using an older MQTT Connector configuration with static clientids (from 5.10.0 and earlier) on later EMQX versions, the username and password at the root of the configuration were ignored. This could cause trouble when upgrading and keeping the same configuration, as the MQTT clients would stop using the credentials.

    Now, if there are username and/or password fields in the root Connector, those credentials are merged with any specific ones specified per clientid, the latter taking precedence.
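
The precedence rule can be pictured with the following Python sketch. The dictionary layout is a hypothetical stand-in for the Connector configuration, not its real schema.

```python
def effective_credentials(root: dict, per_clientid: dict) -> dict:
    """Merge root credentials into each clientid entry; specific values win."""
    base = {k: root[k] for k in ("username", "password") if k in root}
    return {cid: {**base, **specific} for cid, specific in per_clientid.items()}
```

A clientid without its own credentials inherits the root username and password, while one that sets only a username keeps it and inherits just the root password.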

  • #16723 Fixed an issue with RabbitMQ Connector/Action/Source where, if some connection or channel processes died unexpectedly, the Connector/Action/Source would be reported as disconnected and never recover by itself without restarting it.

  • #16742 Fixed an issue with GreptimeDB TLS connection failures.

Durable Storage

  • #16512 Improve handling of recoverable errors in the durable session.
    Durable sessions will now retry creation of durable storage iterators when that operation fails due to a network issue.
    Previously, the whole session would get disconnected.

    Fix problem with the retry mechanism in the emqx_ds_client component.
    Previously, the number of retry attempts on recoverable errors was limited.

    Fix problems with the shared subscriptions:

    • Fix problem with shared subscription leader not coming up after node restart.
    • Shared subscription leader no longer advertises streams that reached the end of replay to the clients.
    • Make shared subscription leader state checkpoint transaction options configurable.
  • #16614 Improvements and bug fixes related to durable storage feature.

    • Improved handling of configuration inconsistencies between the nodes.

      Previously, when a durable storage was created in a cluster where
      nodes had different initial durable storage configuration, the
      replicas would not converge. This change addresses this problem by
      replicating the configuration of the shard leader node to the
      replicas during initialization of the storage and subsequent
      configuration changes.

      Warning: this change is not backward-compatible. During a rolling
      cluster upgrade the shards will pause until the majority of their
      replicas are upgraded to the new version of EMQX, after which
      downgrade to the previous versions of EMQX will become impossible.

    • Fixed an issue in the durable storage subscription mechanism.

      Previously, a durable subscription created with a fresh iterator
      could miss a stored message whose timestamp precisely matched the
      timestamp of the iterator.

  • #16770 Improve stability of durable sessions during takeover and garbage collection.

Clustering

  • #16393 Improved the stability of the Cluster Link route replication under unstable network conditions.

  • #16465 Upgraded gen_rpc to 3.5.1.

    Prior to the gen_rpc upgrade, EMQX could produce a long tail of crash logs due to connection timeouts when a peer node was unreachable.
    The new version of gen_rpc no longer has the long tail and converts crash logs to more readable error logs,
    and the frequent "failed_to_connect_server" log is also throttled to avoid log spamming.

  • #16544 Improve robustness of cluster autoclean procedure.

    Previously, if the autoclean feature was disabled when the node first started, it would never activate after a configuration change.
    This fix resolves the issue.

  • #16739 Improved recovery time of a cluster after a simultaneous restart of all nodes.

    The built-in Mria database management system no longer waits for a full sync of an internal table used to generate transaction synchronization events.

Observability

  • #16537 Fixed formatter crash when logging gen_rpc errors.

    Prior to this fix, EMQX could report "FORMATTER CRASH" errors when gen_rpc logged certain error messages (e.g., transmission timeout errors).
    The formatter now handles these error messages correctly without crashing.

  • #16661 Improve topic_metrics and cluster_rpc logging when invalid topic is requested.

  • #16674 Ensure Erlang pid is printed as a log data field.

  • #16699 Previously, under certain race conditions, long and cryptic logs like the following could be printed:

    2026-02-03T13:53:54.576326+00:00 [error] Generic server <0.11323236.0> terminating. Reason: {{badkey,'actions.success'},[{erlang,map_get,['actions.success',#{}],[{error_info,#{module => erl_erts_errors}}]},{emqx_metrics_worker,idx_metric,4,[{file,"emqx_metrics_worker.erl"},{line,683}]},{emqx_metrics_worker,inc,4,[{file,"emqx_metrics_worker.erl"},{line,322}]},{emqx_rule_runtime,do_eval_action_reply_t...
    

    Now, we print more meaningful information to help debug the issue.

Security

  • #16545 Fixed node.cookie handling of # character. Previously, if the cookie contained #, only the prefix before # would take effect.
    For example, if abc#d was configured, only abc was used as the cookie.

    Added validation to reject problematic characters: backslash, single quote, double quote, and space.
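
The validation rule can be sketched in Python as below. The function name is hypothetical, and the sketch only covers the four characters listed above; any other constraints EMQX places on the cookie are not modeled.

```python
# Characters rejected in node.cookie: backslash, single quote, double quote, space.
FORBIDDEN_COOKIE_CHARS = {"\\", "'", '"', " "}


def is_valid_cookie(cookie: str) -> bool:
    """Return True when the cookie contains none of the rejected characters."""
    return not (set(cookie) & FORBIDDEN_COOKIE_CHARS)
```

Under this rule a cookie such as abc#d is accepted, since # is no longer treated specially.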

  • #16664 Previously, it was possible to upload managed certificate files associated with non-existent managed namespaces. Now, namespace existence is checked before accepting the upload.

  • #16692 Fixed a CRL cache regression where emqx_crl_cache:evict/1 did not fully clear internal URL state.
    After eviction, the same CRL URL now re-registers correctly on next use, restores its refresh timer, and avoids repeated HTTP fetches per connection.

Plugin

  • #16784 Reduced noisy plugin startup warnings in single-node deployments.

    EMQX no longer tries to fetch plugin config from the local node during cluster config sync, avoiding repeated config_not_found_on_node warnings at startup.

  • #16823 Fixed a Dashboard plugin management issue for preinstalled plugins.

    When a plugin package is unpacked into plugins/ before node startup, starting it from the Dashboard no longer causes Plugin Config Not Found on the plugin config page.

Miscellaneous

  • #16620 Fix CRC32C dynamic library load issue on aarch64.
e5.10.3 Breaking risk
Notable features
  • TLS 1.3 stateless session ticket resumption with optional certificates
  • SAML IdP signature verification configuration options
  • IoTDB Table Model data integration support
Full changelog

Enhancements

Deployment

  • #16491 Start releasing packages for macOS 15 (Sequoia)

Observability

  • #16135 Added two new metrics and corresponding rates for the GET /monitor_current HTTP API: rules_matched and actions_executed. They track the number of rules that matched and action execution rate (i.e., success + failure), respectively.

  • #16324 Added support for end-to-end tracing of messages published via HTTP API.

Security

  • #16625 Added configuration options idp_signs_envelopes and idp_signs_assertions to SAML SSO backend to control signature verification behavior.
    Previously, SAML signature verification was not working correctly because the IdP certificate fingerprint was not being extracted from metadata and passed to esaml for verification.

    Both options default to false for backwards compatibility with existing configurations. Users who want to enable signature verification should explicitly set these to true when their IdP is configured to sign SAML responses.

  • #16456 Added support for TLS 1.3 session ticket resumption.

    EMQX now supports TLS 1.3 session resumption using stateless session tickets, allowing clients to resume TLS sessions without server-side session state storage.

    Node-level configuration: node.tls_stateless_tickets_seed is the secret key seed for generating TLS 1.3 stateless session tickets. Listener-level configuration: listeners.ssl.<name>.ssl_options.session_tickets enables TLS 1.3 session resumption using stateless session tickets.
    Possible values are disabled (default), stateless, and stateless_with_cert (includes certificate information).

    Session tickets are only generated when node.tls_stateless_tickets_seed is configured (non-empty) and session_tickets is enabled in listener SSL options.
    If session_tickets is enabled but node.tls_stateless_tickets_seed is empty, session tickets will not be generated and an error log will be emitted when starting the listener.
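
The interaction between the two settings can be summarized with a small Python decision function. The function name and the returned labels are hypothetical; only the option values and the conditions come from the description above.

```python
ENABLED_MODES = ("stateless", "stateless_with_cert")


def session_ticket_status(seed: str, session_tickets: str = "disabled") -> str:
    """Classify a listener given the node seed and its session_tickets option."""
    if session_tickets not in ENABLED_MODES:
        return "disabled"
    if seed == "":
        # Enabled on the listener but no seed: no tickets, error log at start.
        return "misconfigured"
    return "tickets_generated"
```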

Gateway

  • #16220 Added the jt808.frame.parse_unknown_message option, enabling the JT808 gateway to transparently forward unknown messages.

  • #16596 Added support for JT/T 808 protocol 2019.

  • #16627 Add GBK character encoding support for JT/T 808 gateway.

    The JT/T 808 protocol specifies GBK encoding for STRING type fields. A new frame.string_encoding configuration option is added:

    • utf8 (default): Pass through strings as-is (backward-compatible)
    • gbk: Convert GBK-encoded strings from devices to UTF-8 for MQTT, and UTF-8 from MQTT to GBK for devices

    This affects string fields including license plates, driver names, text messages, area names, and client parameters.
    MQTT payloads always use UTF-8 encoding regardless of this setting.
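
The two conversion directions can be illustrated with Python's built-in codecs. This is a sketch of the encoding step only; the function names are hypothetical and the gateway's actual frame handling is not shown.

```python
def device_to_mqtt(raw: bytes, string_encoding: str = "utf8") -> bytes:
    """Convert a STRING field received from a device into UTF-8 for MQTT."""
    if string_encoding == "gbk":
        return raw.decode("gbk").encode("utf-8")
    return raw  # utf8: pass through as-is


def mqtt_to_device(text: bytes, string_encoding: str = "utf8") -> bytes:
    """Convert a UTF-8 string from MQTT into the encoding the device expects."""
    if string_encoding == "gbk":
        return text.decode("utf-8").encode("gbk")
    return text
```

A license plate such as 京A12345 occupies 8 bytes in GBK and 9 bytes in UTF-8, so the byte length of a field changes as it crosses the gateway.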

Data Integration

  • #16511 Added support for the IoTDB Table Model in the data integration.

Bug Fixes

Core MQTT Functionalities

  • #16349 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.

  • #16514 Fixed a bug that caused WebSocket connections to crash when receiving broker messages larger than the client's advertised Maximum-Packet-Size.

Rule Engine

  • #16489 Fixed an issue where the following rule functions always returned undefined:
    msgid/0, qos/0, topic/0, topic/1, flags/0, flag/1,
    clientid/0, username/0, peerhost/0, payload/0, payload/1.

    Note: This is a backward compatibility fix for EMQX v4. These functions are not documented in EMQX v5 and later. The encouraged usage is to directly reference fields from the rule evaluation context. For example, SELECT clientid ... instead of SELECT clientid().

Data Integration

  • #16263 Previously, the Kafka consumer connector performed health checks by verifying partition leader connectivity for all partitions.
    In a clustered deployment, each EMQX node is assigned only a subset of partitions, causing leader connections for unassigned partitions to remain idle.
    Since Kafka closes idle connections after a timeout (10 minutes by default), this behavior could trigger false connectivity alarms.

    The health check now verifies leader connectivity only for the partitions assigned to the current EMQX node, preventing unnecessary idle connections and false alarms.

  • #16336 Fixed a race condition that could cause a timeout when testing connectivity or stopping a connector from the dashboard.

  • #16383 Previously, when using IoTDB Connector with its REST API driver, credentials would not be checked during health checks. Now, we send a no-op query during IoTDB connector health check. This enables early detection of misconfigured client credentials.

  • #16415 Upgraded Apache Pulsar client to 2.1.2.

    When Pulsar producer action's batch_size is configured to 1, the producer will now encode single messages instead of single-element batches.
    This enables consumers to share load using Key Share strategy.

  • #16507 Previously, when an MQTT Source's Connector recovered after losing its connection, topics would not be re-subscribed and the Source would stop working until the Connector itself was restarted. Now, the Source will re-subscribe upon reconnect.

  • #16585 Fixed an issue with GreptimeDB TLS connection failures.

  • #16618 The Kafka request timeout is now automatically set to at least twice the metadata request timeout (with a minimum of 30 seconds),
    reducing unnecessary reconnections and retries when metadata requests take longer than expected.
    This is especially beneficial when metadata request timeout is configured to a small value.

  • #16622 Fixed an issue where, if an Action used async query mode and its Connector was disconnected after more than one health check, its Fallback Actions could be triggered twice.

Clustering

  • #16269 Fixed an issue in the Cluster Link route replication protocol recovery sequence where re-bootstrapping was incorrectly skipped even though the remote side needed it.

  • #16317 Fixed an issue in Cluster Link garbage-collection logic that could accidentally remove live routes from the internal routing table in the process of cleaning up stale route replication state. This problem occurred only when multiple independent Cluster Links were set up, and some of these links went down for relatively long periods of time.

  • #16452 Upgraded gen_rpc to 3.5.1.

    Prior to the gen_rpc upgrade, EMQX may experience a long tail of crash logs due to connection timeout if a peer node is unreachable.
    The new version of gen_rpc no longer has the long tail and converts crash logs to more readable error logs,
    and the frequent log "failed_to_connect_server" is also throttled to avoid log spamming.

  • #16543 Improved robustness of cluster autoclean procedure.

    Previously, if the autoclean feature was disabled when the node first started, it would never activate after a configuration change.
    This fix resolves the issue.

Access Control

  • #16304 Fixed an issue where Multi-Factor Authentication (MFA) could not be enabled after upgrading EMQX from versions earlier than 5.3.0 due to incompatible login-user database records.

  • #16541 Fixed an issue where OIDC issuer URLs were automatically normalized with a trailing slash when saved to the configuration file, causing issuer mismatch errors when the OIDC provider's discovery document returned the issuer without a trailing slash.

Observability

  • #16418 Reduced the volume of logs generated when a resource exception occurs (resource_exception). These logs are now throttled, and some potentially large terms are redacted from them.

  • #16535 Fixed formatter crash when logging gen_rpc errors.

    Prior to this fix, EMQX would crash with "FORMATTER CRASH" errors when gen_rpc logged certain error messages (e.g., transmission timeout errors). The formatter now handles these error messages correctly without crashing.

Gateway

  • #16609 Fixed JT/T 808 gateway parameter setting (0x8103) and query response (0x0104) message handling for CAN bus ID parameters (0x0110~0x01FF), which should use BYTE[8] data type with base64 encoding in JSON instead of string type.
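
As an illustration of the representation, the Python sketch below base64-encodes the 8 raw bytes of a CAN bus ID parameter for use in JSON. The function names are hypothetical, and the choice of the standard (not URL-safe) base64 alphabet is an assumption.

```python
import base64


def is_can_bus_id_param(param_id: int) -> bool:
    """CAN bus ID parameters occupy the range 0x0110 to 0x01FF."""
    return 0x0110 <= param_id <= 0x01FF


def can_param_to_json(raw: bytes) -> str:
    """Encode a BYTE[8] parameter value as a base64 string."""
    if len(raw) != 8:
        raise ValueError("CAN bus ID parameter must be exactly 8 bytes")
    return base64.b64encode(raw).decode("ascii")


def can_param_from_json(encoded: str) -> bytes:
    """Decode the base64 string back into the 8 raw bytes."""
    raw = base64.b64decode(encoded)
    if len(raw) != 8:
        raise ValueError("CAN bus ID parameter must be exactly 8 bytes")
    return raw
```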

  • #16606 Fixed CoAP Gateway working in connection mode over DTLS.

Breaking Changes

Deployment

  • #16491 Stop releasing packages for macOS 13 (Ventura)

About

Stars
16,341
Forks
2,509
Languages
Erlang Shell Elixir
