EMQX releases - releaseport

Config change

6.2.2 Breaking risk 23d

Auth RBAC

Clustering, Multi‑tenancy, Access Control, Data Integration, Observability

Config change

6.1.3 Breaking risk 25d

Auth Breaking upgrade

License gating + namespace quotas + TLS configs + session

Review required

e5.8.11 Breaking risk 1mo

Auth RBAC RCE / SSRF +3 more

Security hardening + connector & cluster bugfixes

Config change

6.0.3 Breaking risk 1mo

Auth RBAC RCE / SSRF +1 more

Breaking changes — review before upgrading.

Config change

6.2.1 Breaking risk 1mo

Auth Breaking upgrade

Routine maintenance and dependency updates.

Review required

6.1.2 Breaking risk 1mo

Auth RBAC RCE / SSRF

Breaking changes — review before upgrading.

Review required

e5.10.4 Breaking risk 1mo

Auth RBAC Dependencies

Breaking changes — review before upgrading.

e5.8.10 Bug fix 3mo

Notable features

Kafka polling waits for data instead of returning empty batches
RabbitMQ Connector self-recovery without manual restart
Azure Blob Storage health check optimization for large containers

Full changelog

Enhancements

Observability and Performance

#16746 Configured os_mon to collect only system-wide memory statistics by default, reducing per-process memory scanning overhead.
#16911 Reduced the overhead of Prometheus metrics collection by avoiding accidental repeated queries of Mria statistics.

Data Integration

#16961 Improved Kafka source polling behavior by ensuring fetch requests wait briefly for data instead of returning empty batches immediately when no records are available. This reduces unnecessary polling delays and helps Kafka consumers receive new records more consistently.

Licensing

#16853 Made the v5 license parser forward-compatible with v6 license keys.

Bug Fixes

Clustering

#16729 Improved recovery time of a cluster after a simultaneous restart of all nodes.

The built-in Mria database management system no longer waits for a full sync of an internal table used to generate transaction synchronization events.

Data Integration

#16507 Previously, when an MQTT Source's Connector recovered after losing its connection, topics would not be re-subscribed and the Source would stop working until the Connector itself was restarted. Now, the Source will re-subscribe upon reconnect.
#16618 The Kafka request timeout is now automatically set to at least twice the metadata request timeout, with a minimum of 30 seconds, reducing unnecessary reconnections and retries when metadata requests take longer than expected.

This is especially beneficial when metadata request timeout is configured to a small value.
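
The timeout rule in #16618 can be read as a simple floor calculation. The following Python sketch models one reading of that sentence; the function name and the millisecond unit are illustrative assumptions, not EMQX identifiers.

```python
MIN_REQUEST_TIMEOUT_MS = 30_000  # the 30-second minimum mentioned above


def derived_request_timeout_ms(metadata_timeout_ms: int) -> int:
    """Return a request timeout that is at least twice the metadata
    request timeout and never below 30 seconds (hypothetical helper)."""
    return max(2 * metadata_timeout_ms, MIN_REQUEST_TIMEOUT_MS)


# A small metadata timeout is lifted to the 30-second minimum,
# while a large one is simply doubled.
print(derived_request_timeout_ms(5_000))
print(derived_request_timeout_ms(20_000))
```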
#16724 Fixed an issue with RabbitMQ Connector/Action/Source where, if some connection or channel processes died unexpectedly, the Connector/Action/Source would be reported as disconnected and never recover by itself without restarting it.
#16935 Fixed an issue where the health check of an Azure Blob Storage Action in aggregate mode could timeout if the container contained too many blobs.
#16971 HTTP and GCP PubSub Actions were patched to treat transient connection errors with reason closing as recoverable errors, reducing log noise.

Gateway

#16606 Fixed the CoAP Gateway in connection mode over DTLS.
#17030, #17042 Fixed CoAP client takeover handling for both UDP and DTLS connections.

These changes improve takeover routing and token validation for reconnected clients, and keep the DTLS token takeover grace period aligned with the configured keepalive window.

Operations

#16732 Fixed a crash in emqx ctl subscriptions list that could happen when shared subscriptions were present.

Before this fix, listing subscriptions could fail for some clients and return no output.

After this fix, emqx ctl subscriptions list works reliably with both regular and shared subscriptions.

Security

#16690 Fixed a CRL cache regression where emqx_crl_cache:evict/1 did not fully clear internal URL state.

After eviction, the same CRL URL now re-registers correctly on next use, its refresh timer is restored, and repeated HTTP fetches per connection are avoided.
#17012 Fixed password-based authentication backends to let the auth chain continue when the CONNECT packet has no password, instead of rejecting the connection immediately.

Previously, if a client connected without a password, the first password-based authenticator (built-in database, MySQL, PostgreSQL, MongoDB, Redis, or LDAP) in the chain would return an error, blocking any subsequent authenticators (such as HTTP) from being tried.

Observability

#16672 Ensured that the Erlang PID is printed as a log data field.

#16699 Previously, under certain race conditions, long and cryptic logs like the following could be printed:

2026-02-03T13:53:54.576326+00:00 [error] Generic server <0.11323236.0> terminating. Reason: {{badkey,'actions.success'},
[{erlang,map_get,['actions.success',#{}],[{error_info,#{module => erl_erts_errors}}]},{emqx_metrics_worker,idx_metric,4,
[{file,"emqx_metrics_worker.erl"},{line,683}]},{emqx_metrics_worker,inc,4,...

EMQX now prints more meaningful information to help debug the issue.

6.2.0 Breaking risk 3mo

Breaking changes

Empty jq program now errors; use '.' instead
String indices use code points instead of byte indices
tonumber() rejects leading/trailing whitespace; use trim() first

Notable features

Agent-to-Agent Card Registry for autonomous AI agent discovery
MQTT subscription filters using User-Property expressions
GCP Workload Identity Federation authentication support

Full changelog

Enhancements

AI Interoperability

#16840 Implemented Agent-to-Agent (A2A) Card Registry. This feature enables autonomous AI agents to discover and collaborate through a standardized, event-driven MQTT 5.0 mechanism.
#16958 Added focused /api-spec.md endpoint and /api-spec.html to support drill-down discovery of EMQX HTTP API context, especially for AI agents and other tools that benefit from fetching only the relevant API slices instead of a single bloated spec.

Core MQTT Functionalities

#16612 Introduced the emqx_setopts app for $SETOPTS server-side option updates, including keepalive control topics and warning+suppression for unknown $SETOPTS/* publishes.
#16887 Added optional MQTT subscription message filters controlled by mqtt.subscription_message_filter.

When enabled, clients can subscribe with a ? suffix such as sensor/+/temperature?location=roomA&value>25 and EMQX will deliver only the messages whose MQTT 5 User-Property entries satisfy the filter expression. When disabled, ? remains part of the topic filter text and no extra filtering is applied.

Messages dropped by subscription-filter mismatch are reported through the existing delivery.dropped event with reason subscription_filter and counted by the new delivery.dropped.filter metric.
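
To make the filter idea concrete, here is a rough Python model of how a subscription such as sensor/+/temperature?location=roomA&value>25 could be split and evaluated against a message's User-Property pairs. Only the two operators visible in the example (= and >) are handled; the grammar, the handling of repeated keys, and every function name here are assumptions for illustration, not the EMQX implementation.

```python
import re
from typing import List, Optional, Tuple

_CONDITION = re.compile(r"([^=>]+)(=|>)(.*)")


def split_subscription(topic_filter: str) -> Tuple[str, Optional[str]]:
    """Split 'topic?expr' into the topic part and the filter expression."""
    topic, sep, expr = topic_filter.partition("?")
    return topic, (expr if sep else None)


def matches(expr: str, user_properties: List[Tuple[str, str]]) -> bool:
    """Return True when every '&'-joined condition holds (assumed semantics)."""
    props = dict(user_properties)  # assumption: the last value wins for repeated keys
    for condition in expr.split("&"):
        m = _CONDITION.fullmatch(condition)
        if m is None:
            return False
        key, op, expected = m.groups()
        actual = props.get(key)
        if actual is None:
            return False
        if op == "=":
            if actual != expected:
                return False
        else:  # op == ">": compare numerically, reject non-numeric values
            try:
                if not float(actual) > float(expected):
                    return False
            except ValueError:
                return False
    return True


topic, expr = split_subscription("sensor/+/temperature?location=roomA&value>25")
print(topic, expr)
print(matches(expr, [("location", "roomA"), ("value", "30")]))
```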
#16929 Two new limiter kinds are introduced: delivery_messages and delivery_bytes. In contrast to the existing messages and bytes limiters, which limit messages published by a single client, the new limiters throttle messages received by a single client from any source. If the limit is hit, QoS 0 messages are dropped, QoS > 0 messages are queued internally, and a retry is scheduled. The retry time is derived from the limiter's configuration.

The new limiters are only supported for memory sessions (durable_sessions.enable = false).

If unspecified, the default values are unlimited, thus keeping backwards compatibility.
#16779 Improved handling of malformed first packets by classifying them as invalid CONNECT packets and adding better protocol hints in logs.

Data Integration

#16589 Updated jq library used in the Rule Engine runtime to version 1.8.1.

Note that the jq 1.8.1 language contains several subtle breaking changes compared to 1.6.1.
- Providing empty string as jq program is now considered an error: use "." instead. (jq#2790)
- String functions now use code point indices: indices/1, index/1, and rindex/1 functions now use code point indices instead of byte indices; use utf8bytelength/0 to get byte index if needed. (jq#3065)
- tonumber/0 rejects numbers with leading or trailing whitespace: use trim/0 before calling tonumber/0. (jq#3055, jq#3195)
- last(empty) behavior changed: last(empty) now yields no output values, consistent with first(empty). (jq#3179)
- limit/2 errors on negative count, instead of silently accepting it. (jq#3181)
- Tcl-style multiline comments supported: this may subtly affect parsing of existing code. (jq#2989)
- Decimal number conversion changed: decimal numbers are now converted to binary64 (double) instead of decimal64. (jq#2949)
- nth/2 emits empty on index out of range, instead of erroring. (jq#2674)
- String multiplication by 0 or less than 1 now emits an empty string instead of the original string. (jq#2142)
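
The code point versus byte index change is the one most likely to affect rules that handle non-ASCII text. The Python snippet below is only an analogy (it does not run jq): it shows how the two kinds of index diverge as soon as a multi-byte character precedes the match.

```python
def codepoint_index(text: str, needle: str) -> int:
    """Index counted in Unicode code points (how jq 1.8.1 now counts)."""
    return text.index(needle)


def byte_index(text: str, needle: str) -> int:
    """Index counted in UTF-8 bytes (the older behavior described above)."""
    return text.encode("utf-8").index(needle.encode("utf-8"))


sample = "héllo wörld"
# 'é' occupies two bytes in UTF-8, so the byte index of 'w' is one larger.
print(codepoint_index(sample, "w"))  # 6
print(byte_index(sample, "w"))       # 7
```

Rules that slice payloads using the result of index/1 should be reviewed for this difference before upgrading.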
#16634 Added support for GET requests in external HTTP schema validation by allowing schema registry entries to specify the HTTP method (POST remains the default).
#16647 Now, in GreptimeDB and EMQX Tables Actions, integer values that are not suffixed with i or u are automatically cast to float (float64) values before being sent to the database.

In InfluxDB Write Syntax, float is the default numeric type, and integers must be annotated. Previously, when EMQX encountered a non-annotated integer, it would interpret it as a one-character string, and insertion would fail if the column was of type float.
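
As a rough model of the casting rule in #16647, the following Python sketch classifies a numeric literal the way the paragraph above describes: an i or u suffix keeps it an integer, a bare integer becomes a float. The function and type labels are hypothetical and do not mirror EMQX internals.

```python
import re

_ANNOTATED = re.compile(r"(-?\d+)([iu])")
_BARE_INT = re.compile(r"-?\d+")


def cast_numeric_literal(raw: str):
    """Return a (type_label, value) pair for a write-syntax literal."""
    annotated = _ANNOTATED.fullmatch(raw)
    if annotated:
        digits, suffix = annotated.groups()
        label = "integer" if suffix == "i" else "unsigned"
        return label, int(digits)
    if _BARE_INT.fullmatch(raw):
        # Non-annotated integers are sent as float64 values.
        return "float", float(raw)
    return "other", raw


print(cast_numeric_literal("42"))
print(cast_numeric_literal("42i"))
print(cast_numeric_literal("7u"))
```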
#16707 Added a Data Integration to consume from and publish messages to Azure Event Grid.
#16750 Added support for using Workload Identity Federation (WIF) authentication with GCP Connectors (GCP PubSub Producer and Consumer, BigQuery), via Service Account Impersonation. At this point, only OIDC workload identity pool providers using Client Credentials grant type are supported.
#16773 Now, when using MQTT Connector with SSL enabled, if unset, the Server Name Indication (SNI) field will be automatically filled with the server's hostname.
#16893 Added a new Connector and Action that appends data to QuasarDB.
#16962 Improved Kafka source polling behavior by ensuring fetch requests wait briefly for data instead of returning empty batches immediately when no records are available. This reduces unnecessary polling delays and helps Kafka consumers receive new records more consistently.

Access Control

#16597 In MySQL and PostgreSQL authentication and authorization, improved the handling of unallowed and quoted variables in the SQL template.
#16616 Added new configurations to SSO OIDC backend to allow specifying jq expressions to extract the desired role and namespace when creating new dashboard users.
#16759 Added new functions timestamp_s and timestamp_ms to retrieve system time in variform expressions (used e.g. to populate additional client attributes on connection).
#16817 Added REST API endpoints to reset authentication and authorization metrics counters.
- POST /authentication/:id/metrics/reset resets counters for a specific authenticator.
- POST /authorization/sources/:type/metrics/reset resets counters for a specific authorization source.
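
A minimal Python sketch of calling the two reset endpoints is shown below. It only builds the requests; the base URL (http://localhost:18083/api/v5), the Basic-auth API key scheme, and the example authenticator ID and source type are assumptions about a typical deployment and should be adjusted to yours.

```python
import base64
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:18083/api/v5"  # assumption: default local API listener


def _post(path: str, api_key: str, api_secret: str) -> urllib.request.Request:
    """Build an authenticated POST request without sending it."""
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    return urllib.request.Request(
        BASE_URL + path,
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


def reset_authenticator_metrics(authenticator_id: str, key: str, secret: str):
    # Authenticator IDs may contain ':' and must be percent-encoded in the path.
    encoded = urllib.parse.quote(authenticator_id, safe="")
    return _post(f"/authentication/{encoded}/metrics/reset", key, secret)


def reset_authz_source_metrics(source_type: str, key: str, secret: str):
    return _post(f"/authorization/sources/{source_type}/metrics/reset", key, secret)


req = reset_authenticator_metrics("password_based:built_in_database", "my_key", "my_secret")
print(req.get_method(), req.full_url)
# To execute the call: urllib.request.urlopen(req)
```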
#16849 Added cookie-based authentication fallback for plugin API endpoints.

Plugin UI iframes served by the dashboard can now authenticate via the emqx_auth cookie when no Authorization header is present. This only applies to /api/v5/plugin_api/... paths.

Management

#16958 Added emqx ctl api_keys CLI commands to list, show, add, delete, enable, and disable API keys from the command line.

Gateway

#16734 Added ordered token, nkey, and jwt internal authentication methods to the NATS Gateway to reduce the authentication feature gap with NATS Server.

Deployment and Security

#16653 Made Erlang distribution listener address configurable via node.dist_bind_address.

For example: node.dist_bind_address = "10.0.1.5".

Previously, this required configuring vm.args with -kernel inet_dist_use_interface {10,0,1,5}.
#16888 Refreshed the default TLS certificate bundle shipped with EMQX packages for local development and testing.

The new server certificate is issued for localhost and loopback addresses only (localhost, 127.0.0.1, ::1).

These default certificates are intended for test and local deployment scenarios only and must not be used in production.
#16916 Now, the emqx_cert_expiry_at Prometheus metric takes into account the expiry date of certificates that belong to managed certificate bundles, when they are used in MQTT listeners.

Performance

#16500 Optimize idle memory usage and reduce the cost of maintaining rate-based metrics.

Note that various 5-minute average rate metrics exposed via APIs are no longer exact averages over the last 300 samples, but are instead EWMAs (Exponentially Weighted Moving Averages) that approximate them closely.
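
For readers unfamiliar with EWMAs, the sketch below shows the general technique in Python. The smoothing factor derived from a 300-second window is an assumption chosen for illustration; the release notes do not state the exact parameters EMQX uses.

```python
import math


def ewma_update(previous: float, sample: float,
                interval_s: float, window_s: float = 300.0) -> float:
    """Blend a new rate sample into the running average.

    alpha grows with the sampling interval, so older samples decay
    exponentially with a time constant of window_s seconds.
    """
    alpha = 1.0 - math.exp(-interval_s / window_s)
    return previous + alpha * (sample - previous)


# Starting from zero, a steady 100 msg/s input pulls the average toward 100
# without storing the last 300 samples.
rate = 0.0
for _ in range(3000):
    rate = ewma_update(rate, 100.0, interval_s=1.0)
print(round(rate, 2))
```

Only one number per metric is kept in memory, which is where the idle memory saving comes from; the trade-off is that the reported value approximates, rather than equals, the windowed average.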
#16547 Disable TLS 1.2 session reuse by default to reduce TLS handshake overhead.

The TLS 1.2 session cache size is limited to 1000 entries, and the cache is local to each node.

This makes the reuse rate very low, especially when large numbers of connections connect to a large cluster.
#16794 Enabled node-level authentication and authorization caches by default.

This reduces repeated backend lookups for repeated client checks out of the box, improving authentication and authorization performance in common deployments.
#16829 Optimized the NATS gateway publish hot path to reduce per-message overhead in frame parsing, subject/topic handling, metrics updates, and ACK/message build steps.
#16911 Reduce the overhead of Prometheus metrics collection by avoiding repeated queries of Mria statistics.
#16550 Stop caching subscribe ACL check results.

MQTT subscription is mostly done once per connection life cycle. Holding the subscribe ACL check result in cache is most of the time a waste of RAM.

Bug Fixes

Core MQTT Functionalities

#16721 Fixed QoS 2 duplicate handling when await_rel_timeout has expired.

Previously, if a client retried a QoS 2 PUBLISH with DUP=1 after the broker had expired the pending PUBREL state (default 300 seconds), the message could be published to subscribers again. EMQX now treats this retransmission as a duplicate handshake packet and returns PUBREC without re-delivering the application message.
#16725 Disabled TCP connection congestion alarm by default by setting conn_congestion.enable_alarm = false in the default zone/global configuration.
#16781 Fixed CONNECT validation when retained messages are unavailable.

When mqtt.retain_available is set to false, CONNECT packets with Will Retain set are now correctly rejected with CONNACK reason Retain not supported (0x9A).
#16783 Fixed MQTT v5 SUBSCRIBE validation for Subscription-Identifier upper bound.

EMQX now accepts 268435455 (0x0FFFFFFF), which is the maximum valid Subscription Identifier value defined by the MQTT spec.
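
The upper bound comes from the MQTT Variable Byte Integer encoding, which uses at most four bytes with seven payload bits each. A short Python sketch of that encoding (the function name is illustrative) shows that 268435455 is the largest value that still fits:

```python
MAX_VARIABLE_BYTE_INTEGER = 268_435_455  # 0x0FFFFFFF, four bytes of 7 bits each


def encode_variable_byte_integer(value: int) -> bytes:
    """Encode an integer as an MQTT Variable Byte Integer."""
    if not 0 <= value <= MAX_VARIABLE_BYTE_INTEGER:
        raise ValueError("value does not fit in a 4-byte Variable Byte Integer")
    out = bytearray()
    while True:
        value, digit = divmod(value, 128)
        if value > 0:
            digit |= 0x80  # continuation bit: more bytes follow
        out.append(digit)
        if value == 0:
            return bytes(out)


print(encode_variable_byte_integer(268_435_455).hex())  # ffffff7f
```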
#16974 In EMQX 6.1.1, when a session was subscribed to a topic filter matching retained messages and was later taken over or resumed without re-subscribing to the same topic filter, it would receive the already received retained messages again. Now, the previous behavior is restored: upon session resumption or takeover without explicit re-subscription, retained message iteration ceases.
#16876 Changed log message msg_publish_not_allowed to msg_not_routed_to_subscribers.

Data Integration

#16803 Improved error reporting when configuring batch operations for MySQL actions.
#16796 Fixed handling of multiline SQL statements in connector actions.
#16936 Fixed an issue where the health check of an Azure Blob Storage Action in aggregate mode could timeout if the container contained too many blobs.
#16955 Eliminate Kafka producer action false health check warning logs.

Previously, if a Kafka producer idled for too long, Kafka could close the connection (the default idle timeout is typically 10 minutes). If a Kafka producer action health check happened to run around the same moment, a false warning with the message "not_all_kafka_partitions_connected" could be logged.
#16972 HTTP and GCP PubSub Actions were patched to treat transient connection errors with reason closing as recoverable errors, reducing log noise.
#16863 Added a warning log when an async reply is received for an already-expired request in async actions.
#16847 Fixed a crash when non-ASCII unicode string is used in message transformation expression.
#16979 MQTT ingress bridges now support consuming from remote message queues $queue/{name}/{bind-filter}.
#16999 Fixed an issue where MQTT source failed to receive messages from $queue/ subscriptions when the remote broker has the Message Queue (mq) feature enabled. The MQ message delivery was missing the MQTT v5 Subscription-Identifier property in PUBLISH packets, which the MQTT bridge ingress relies on to route messages from queue subscriptions.

Access Control

#16780 Fixed an issue in authorization source validation where requests missing the type field could trigger an internal error.

Now EMQX returns a clear BAD_REQUEST validation error for this case.
#16805 Added support for authz hook results to opt out of authorization cache storage for dynamic ACL decisions.
#16865 Added cert_common_name and cert_subject aliases for mqtt.client_attrs_init expressions, alongside the existing cn and dn variables.
#16868 Improved REST API authentication error messages to guide programmatic clients toward using API keys (Basic auth) instead of repeatedly logging in for bearer tokens. Error responses now mention the api_key.bootstrap_file configuration option and the POST /api_key endpoint for creating persistent API keys.
#16928 Dashboard-created REST API keys are now generated randomly instead of being derived from the API key name.
#16939 Fixed the built-in database authenticator so it no longer logs a warning when the default bootstrap file path is configured but the file does not exist.
#16993 Fixed an issue where an error response from an OIDC SSO provider would result in a 500 error. Now a more user-friendly result is returned.

Durable Storage

#16874 Fixed a rare issue where Durable Storage backed by DS Raft could stop accepting new messages after a sequence of quick cluster leadership changes, requiring a node restart to recover.

Clustering

#16534 Lowered the default net_ticktime from 2 minutes to 1 minute to improve cluster node failure detection.

In the event of a network outage or abrupt node termination, remaining nodes will detect the down node sooner, reducing the time before failover mechanisms activate and improving overall cluster resilience and user experience.

Plugins

#16842 Reduced noisy plugin config warning logs when no peer node has the plugin config yet.

Previously, when a node tried to fetch plugin config from peer nodes during startup, it would log a warning even when all peers simply didn't have the config (e.g., first node to load the plugin). Now this benign case is logged at debug level, and only genuine errors (RPC failures, timeouts) remain as warnings.
#16843 Fixed an issue where HTTP headers and query string parameters were not passed through to plugin API handlers, causing plugins to receive empty headers and missing query parameters.
#16904 Prevent enabling or starting multiple versions of the same plugin at once. When a newer version is enabled, older configured versions of that plugin are automatically disabled, and management API actions now return a clear error instead of reporting success while another version is still active.

Gateway

#16536 Fixed the CoAP Gateway when running in DTLS connection mode.
#16996 Fixed CoAP DTLS connection-mode to keep sessions available after sock_closed and support reconnect takeover with the same clientid and valid token.

Observability

#16879 Added log.audit.cache_size as the primary config key for the audit log DB cache size, while keeping log.audit.max_filter_size for backward compatibility.

Deployment and Security

#16683 Added support for HTTPS CRL Distribution Point URLs in the CRL cache, so CRLs fetched from https:// endpoints are now cached and refreshed correctly.
#16901 Fixed RPM package OpenSSL dependency for RHEL 9.6 LTS: pinned openssl >= 3.5.1 for RHEL >= 9.7 and openssl >= 3.0.7 for older RHEL 9 versions.

ExHook

#16890 Fixed an ExHook issue where successful reconnect reloads could duplicate the same server name in the running list and trigger repeated callback dispatches.

Licensing

#16764 Refined license customer tier handling by introducing STANDARD and VIP tiers in enforcement logic and reducing the official-license STANDARD expiry grace period from 90 days to 15 days before new sessions are restricted.

6.1.1 Breaking risk 5mo

Breaking changes

Message Stream prefix changed from $s to $stream with required name
Message Queue prefix changed to $queue with required name
Stream subscriptions require $stream/name/topic_filter syntax

Notable features

Retained message iteration resumes from last confirmed delivery
CoAP Block-Wise Transfer protocol support
JT/T 808 protocol 2019 with GBK character encoding

Full changelog

Enhancements

Core MQTT Functionalities

#16637 Previously, if a session was taken over while in the middle of receiving several retained messages from a wildcard topic subscription, iteration over those retained messages would start over for the new client, repeating already delivered retained messages. Now, the new client will resume iteration from the last confirmed delivered message from the last session, reducing the number of duplicated retained messages.

Durable Storage

#16704 Prevent RocksDB storage backing Durable Storage shards from preallocating large chunks of disk space by default.

Previously, each shard consumed a significant amount of disk space immediately, which compounded due to multiple Durable Storage databases now being created by default (each consisting of 16 shards).

Message Queue and Streams

#16551, #16714 Refined Message Stream and Message Queue interfaces.

For stream subscriptions, the $stream prefix is now used. Streams are now named, and the name should be specified on subscribe: SUBSCRIBE $stream/<name>/<topic_filter> (or SUBSCRIBE $stream/<name> if the stream is known to exist). The starting point for stream consumption is specified using the stream-offset user subscription property.

For message queue subscriptions, the $queue prefix is used. Message queues are also named, and the name should be specified on subscribe: SUBSCRIBE $queue/<name>/<topic_filter> (or SUBSCRIBE $queue/<name> if the queue is known to exist).

Notes:
- Stream and queue names may contain only alphanumeric characters, underscores, hyphens, and dots.
- Previously created unnamed streams and queues obtain a name derived from their topic filter: the topic filter with a / prepended.
- The legacy $q queue interface (introduced in 6.0.0) and $s stream interface (introduced in 6.1.0) are kept for compatibility, but their use is discouraged.
- If Message Queues are enabled, $queue prefix cannot be used for subscribing to shared subscriptions anymore.
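
The naming rules above can be sketched as a small Python helper that builds the subscription topic and validates the name. The helper is hypothetical, and treating "alphanumeric" as ASCII letters and digits is an assumption.

```python
import re
from typing import Optional

# Alphanumerics, underscores, hyphens, and dots (ASCII assumed).
_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def named_subscription(prefix: str, name: str,
                       topic_filter: Optional[str] = None) -> str:
    """Build '$stream/<name>/<topic_filter>' or '$queue/<name>/<topic_filter>'."""
    if prefix not in ("$stream", "$queue"):
        raise ValueError("prefix must be $stream or $queue")
    if not _NAME.fullmatch(name):
        raise ValueError(f"invalid name: {name!r}")
    if topic_filter is None:
        # Short form, usable when the stream or queue is known to exist.
        return f"{prefix}/{name}"
    return f"{prefix}/{name}/{topic_filter}"


print(named_subscription("$stream", "orders", "shop/+/orders"))
print(named_subscription("$queue", "jobs.v1"))
```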
#16820 Added shorter API path aliases /queues/* and /streams/* for the Message Queue and Message Stream management APIs.

The previous /message_queues/* and /message_streams/* paths remain functional for backward compatibility but are no longer shown in the API documentation.

Gateway

#16719 Added Block-Wise Transfer support for CoAP and LwM2M gateways.
- Added block-wise settings: enable, max_block_size, max_body_size, and exchange_lifetime.
- Improved POST /gateways/coap/clients/:clientid/request and LwM2M downlink handling for large block-wise messages.
#16736
- Added the jt808.frame.parse_unknown_message option, enabling the JT808 gateway to transparently forward unknown messages.
- Added JT/T 808 protocol 2019 support.
- Added GBK character encoding support for JT/T 808 gateway.
  
  The JT/T 808 protocol specifies GBK encoding for STRING type fields. A new frame.string_encoding configuration option is added:
  - utf8 (default): Pass through strings as-is (backward-compatible)
  - gbk: Convert GBK-encoded strings from devices to UTF-8 for MQTT, and UTF-8 from MQTT to GBK for devices
  This affects both uplink parsing (GBK to UTF-8) and downlink serialization (UTF-8 to GBK), including string fields such as license plates, driver names, text messages, area names, and client parameters.
  MQTT payloads always use UTF-8 encoding regardless of this setting.
- Added support for custom msg_sn in JT/T 808 gateway downlink messages.
  
  When a downlink MQTT message payload contains a msg_sn field in the header, the gateway will use that value instead of the auto-generated channel sequence number. This allows external systems to control message sequencing for specific use cases.
- Fixed JT/T 808 gateway parameter setting (0x8103) and query response (0x0104) message handling for CAN bus ID parameters (0x0110~0x01FF), which should use BYTE[8] data type with base64 encoding in JSON instead of string type.
- Fixed JT/T 808 0x0702 driver identity report message parsing.
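
To illustrate the frame.string_encoding option described above, the following Python sketch models the two conversion directions for a STRING field. The function names are hypothetical; only the direction of the conversions is taken from the release note.

```python
def uplink_string(raw: bytes, string_encoding: str = "utf8") -> str:
    """Device to MQTT: decode a STRING field into text for the UTF-8 payload."""
    codec = "gbk" if string_encoding == "gbk" else "utf-8"
    return raw.decode(codec)


def downlink_string(text: str, string_encoding: str = "utf8") -> bytes:
    """MQTT to device: encode text into the byte form the device expects."""
    codec = "gbk" if string_encoding == "gbk" else "utf-8"
    return text.encode(codec)


plate = "京A12345"  # example license plate with one Chinese character
device_bytes = downlink_string(plate, "gbk")
# GBK uses two bytes for the Chinese character, UTF-8 uses three.
print(len(device_bytes), len(plate.encode("utf-8")))
print(uplink_string(device_bytes, "gbk") == plate)
```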

Security

#16447 Added a new force_delete query parameter to the following HTTP APIs for managing certificates:
- DELETE /certs/global/name/:name
- DELETE /certs/ns/:ns/name/:name
When omitted or false, configurations in all namespaces are checked for references to the managed bundle being deleted, and the deletion fails if any reference is found.
#16461 Support TLS 1.3 session ticket resumption.

EMQX now supports TLS 1.3 session resumption using stateless session tickets, allowing clients to resume TLS sessions without server-side session state storage.

Node-level configuration: node.tls_stateless_tickets_seed is the secret key seed for generating TLS 1.3 stateless session tickets.
Listener-level configuration: listeners.ssl.<name>.ssl_options.session_tickets enables TLS 1.3 session resumption using stateless session tickets.
Possible values are disabled (default), stateless, and stateless_with_cert (includes certificate information).

Session tickets are only generated when node.tls_stateless_tickets_seed is configured (non-empty) and session_tickets is enabled in listener SSL options.
If session_tickets is enabled but node.tls_stateless_tickets_seed is empty, session tickets will not be generated and an error log will be emitted when starting the listener.

Access Control

#16504 Added a new option to parameterize the data source from which to construct the dashboard username when creating a new user via OIDC SSO.
#16741 Added configuration options idp_signs_envelopes and idp_signs_assertions to the SAML SSO backend to control signature verification behavior.

Previously, SAML signature verification was not working correctly because the IdP certificate fingerprint was not being extracted from metadata and passed to esaml for verification.

Both options default to false for backwards compatibility with existing configurations. Users who want to enable signature verification should explicitly set these to true when their IdP is configured to sign SAML responses.
#16684 Enabled mqtt.client_attrs_init expressions to make use of password (for example, by feeding it to jwt_value) to initialize client attributes.
#16730 Redis authorization now supports a compatibility mode for EMQX 4.x ACL data.
Set compatibility_mode = v4 to enable legacy %u/%c placeholder conversion and legacy ACL access values 1|2|3 (mapped to subscribe/publish/all).
By default, compatibility mode remains disabled, so existing Redis authz behavior is unchanged.
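
The conversion that compatibility mode performs can be pictured with the Python sketch below. The access mapping comes from the note above; mapping %u and %c to ${username} and ${clientid} is an assumption based on the placeholder syntax used by EMQX 5 and later, and the function is purely illustrative.

```python
# Legacy 4.x access values and their meaning, as listed above.
LEGACY_ACCESS = {"1": "subscribe", "2": "publish", "3": "all"}

# Assumed target placeholders for the legacy %u / %c forms.
LEGACY_PLACEHOLDERS = {"%u": "${username}", "%c": "${clientid}"}


def convert_v4_rule(topic: str, access: str):
    """Translate one legacy (topic, access) pair into the newer form."""
    for old, new in LEGACY_PLACEHOLDERS.items():
        topic = topic.replace(old, new)
    return topic, LEGACY_ACCESS[access]


print(convert_v4_rule("sensors/%u/%c/#", "3"))
```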

Data Integration

#16511 Supported the IoTDB Table Model in the data integration.
#16516 Added two new Action metrics: aggregated_upload.success and aggregated_upload.failure. These are only relevant for Aggregated Upload Actions (S3, Azure Blob Storage, Snowflake and S3Tables) and are incremented when an aggregated delivery succeeds or fails, respectively.
#16658 Previously, when the server port was omitted in an EMQX Tables Connector, the port would default to 80. Now, it defaults to 4001.

A more intelligible error message is returned when an EMQX Tables Connector is configured with SSL enabled but cacertfile, certfile or keyfile configurations are missing.

Rule Engine

#16524 Enhanced base64 encoding and decoding functions in rule engine SQL with support for padding and URL-safe options.

The base64_encode and base64_decode functions now support optional parameters to control encoding behavior:
- no_padding: Encode or decode without padding characters (=). Useful when you need to remove padding from encoded strings or decode strings that don't have padding.
- urlsafe: Use URL-safe base64 encoding/decoding. Replaces + with - and / with _, making the encoded string safe to use in URLs without encoding.
You can use these options individually or combine them. When combining options, the order doesn't matter.

Examples in rule SQL:

Encode without padding:
```
SELECT base64_encode(payload, 'no_padding') as encoded FROM "t/#"
```
Encode with URL-safe characters:
```
SELECT base64_encode(payload, 'urlsafe') as encoded FROM "t/#"
```
Encode with both options (no padding and URL-safe):
```
SELECT base64_encode(payload, 'no_padding', 'urlsafe') as encoded FROM "t/#"
```
Decode URL-safe base64:
```
SELECT base64_decode(payload, 'urlsafe') as decoded FROM "t/#"
```
Decode unpadded URL-safe base64:
```
SELECT base64_decode(payload, 'urlsafe', 'no_padding') as decoded FROM "t/#"
```
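
For comparison outside the Rule Engine, the same two options can be reproduced with Python's standard base64 module. This is a sketch of the general technique, with hypothetical wrapper names, and is handy for preparing test payloads or checking rule output.

```python
import base64


def b64_encode(data: bytes, *options: str) -> str:
    """Encode with optional 'urlsafe' and 'no_padding' behavior."""
    encoder = base64.urlsafe_b64encode if "urlsafe" in options else base64.b64encode
    encoded = encoder(data).decode("ascii")
    return encoded.rstrip("=") if "no_padding" in options else encoded


def b64_decode(text: str, *options: str) -> bytes:
    """Decode, restoring stripped padding when 'no_padding' is given."""
    if "no_padding" in options:
        text += "=" * (-len(text) % 4)
    decoder = base64.urlsafe_b64decode if "urlsafe" in options else base64.b64decode
    return decoder(text)


raw = b"\xfb\xff"
print(b64_encode(raw))                           # +/8=
print(b64_encode(raw, "urlsafe"))                # -_8=
print(b64_encode(raw, "no_padding", "urlsafe"))  # -_8
print(b64_decode("-_8", "urlsafe", "no_padding") == raw)
```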
#16533 Added two new Variform expression helper functions json_value and jwt_value to extract values from JSON data and JWT tokens using dot-separated key paths.

The json_value function extracts values from JSON binary strings using a dot-separated path to navigate nested structures.
The jwt_value function decodes JWT token payloads and extracts claim values using the same path syntax.

For example, if username is a JSON object, you can access a nested field with json_value(username, 'shop.floor');
if password is a JWT with a custom claim, you can access the nested value with jwt_value(password, 'client_attrs.unitid').
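
The dot-separated path lookup can be modeled in a few lines of Python. This sketch reuses the function names only for readability; it does not verify JWT signatures, and returning None for a missing path is an assumption, since the release note does not say what the Variform functions return in that case.

```python
import base64
import json


def _lookup(data, path: str):
    """Walk nested dicts following a dot-separated key path."""
    for key in path.split("."):
        if not isinstance(data, dict) or key not in data:
            return None  # assumption: a missing key yields no value
        data = data[key]
    return data


def json_value(raw: str, path: str):
    return _lookup(json.loads(raw), path)


def jwt_value(token: str, path: str):
    """Decode the JWT payload segment (no signature check) and look up a claim."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore base64url padding
    return _lookup(json.loads(base64.urlsafe_b64decode(payload)), path)


def _segment(obj) -> str:
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).decode().rstrip("=")


# Build an unsigned demo token with a nested custom claim.
demo_token = ".".join([_segment({"alg": "none"}),
                       _segment({"client_attrs": {"unitid": "u-42"}}),
                       "signature"])
print(json_value('{"shop": {"floor": 3}}', "shop.floor"))
print(jwt_value(demo_token, "client_attrs.unitid"))
```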
#16539 Added support for keeping track of metric aliases when utilizing the spb_decode Rule Engine function.

Now, after a device or edge of network (EoN) node publishes its DBIRTH/NBIRTH messages, alias mappings in said message will be stored and used when the client later uses spb_decode on a message matching the DDATA/NDATA topic patterns. The original names of the metrics will be added to the output of spb_decode.

Note: when executing fallback actions, the mapping is not available in the environment they run in. This means that, if a fallback action republishes the undecoded DDATA/NDATA payload to a Sparkplug B DDATA/NDATA topic, the metric name fields will not be populated by the alias mapping.

#16581 Added a new Rule SQL function: spb_zip_kvs.

Given an already decoded, valid Sparkplug B message, it'll go through the metrics and "zip" each property name and its value together.

- properties (and any nested PropertySet values) have their keys and values fields removed and the values of the two former fields zipped together and merged with the original map. Values that have the PropertySet or PropertySetList types are recursively transformed like this.
- Values of PropertySetList type have their propertyset field removed and replaced by an array of PropertySets, transformed following the above item's description.
- If present, dataset_value field is transformed in a similar fashion: its columns and rows fields are removed and their values zipped together in an object merged with the original object. types and num_of_columns fields are removed from output.
- Other values/fields are untouched.

For example, given this input decoded Sparkplug B message:

{
  "metrics": [
    {
      "properties": {
        "values": [
          {"int_value": 99},
          {
            "propertyset_value": {
              "values": [{"int_value": 999}],
              "keys": ["inner"]
            }
          },
          {
            "propertysets_value": {
              "propertyset": [
                {
                  "values": [{"int_value": 1}],
                  "keys": ["inner1"]
                },
                {
                  "values": [{"int_value": 2}],
                  "keys": ["inner2"]
                }
              ]
            }
          }
        ],
        "keys": [
          "leaf",
          "nested_prop",
          "nested_prop_list"
        ]
      }
    },
    {
      "dataset_value": {
        "num_of_columns": 2,
        "types": [7, 12],
        "rows": [
          {
            "elements": [
              {"int_value": 3},
              {"string_value": "3"}
            ]
          },
          {
            "elements": [
              {"int_value": 4},
              {"string_value": "4"}
            ]
          }
        ],
        "columns": ["col1", "col2"]
      }
    }
  ]
}

Then, the output of spb_zip_kvs will be:

{
  "metrics": [
    {
      "properties": {
        "nested_prop_list": {
          "propertysets_value": [
            {"inner1": {"int_value": 1}},
            {"inner2": {"int_value": 2}}
          ]
        },
        "nested_prop": {
          "propertyset_value": {"inner": {"int_value": 999}}
        },
        "leaf": {"int_value": 99}
      }
    },
    {
      "dataset_value": {
        "col2": {"elements": [{"int_value": 4}, {"string_value": "4"}]},
        "col1": {"elements": [{"int_value": 3}, {"string_value": "3"}]}
      }
    }
  ]
}
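
The transformation described above can be approximated in a few lines of Python. This is an illustrative sketch only, not EMQX's implementation: the function names (spb_zip_kvs_sketch, zip_property_set, zip_property_value, zip_dataset) are hypothetical, and the handling of missing fields is an assumption.

```python
def zip_property_set(pset):
    """Zip the parallel `keys` and `values` lists of a PropertySet into one map."""
    # Keep any other fields of the original map, drop `keys` and `values`.
    zipped = {k: v for k, v in pset.items() if k not in ("keys", "values")}
    for key, value in zip(pset.get("keys", []), pset.get("values", [])):
        zipped[key] = zip_property_value(value)
    return zipped


def zip_property_value(value):
    """Recurse into PropertySet and PropertySetList values; leave others as-is."""
    out = dict(value)
    if "propertyset_value" in out:
        out["propertyset_value"] = zip_property_set(out["propertyset_value"])
    if "propertysets_value" in out:
        # The `propertyset` wrapper field is replaced by a plain array.
        sets = out["propertysets_value"].get("propertyset", [])
        out["propertysets_value"] = [zip_property_set(s) for s in sets]
    return out


def zip_dataset(dataset):
    """Zip `columns` with `rows`; drop `types` and `num_of_columns`."""
    dropped = ("columns", "rows", "types", "num_of_columns")
    zipped = {k: v for k, v in dataset.items() if k not in dropped}
    zipped.update(zip(dataset.get("columns", []), dataset.get("rows", [])))
    return zipped


def spb_zip_kvs_sketch(message):
    """Apply the zipping rules to every metric of a decoded message."""
    metrics = []
    for metric in message.get("metrics", []):
        metric = dict(metric)
        if "properties" in metric:
            metric["properties"] = zip_property_set(metric["properties"])
        if "dataset_value" in metric:
            metric["dataset_value"] = zip_dataset(metric["dataset_value"])
        metrics.append(metric)
    return {**message, "metrics": metrics}
```

Applied to the example input above, this sketch yields the same structure as the example output (key order aside).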

REST API

#16718 Improve REST API Swagger spec.

Previously, summaries and descriptions of spec fields were mixed together. Now, summaries are brief, simple and punctuation-free, while descriptions provide all the details.
#16735 EMQX now supports plugin-defined HTTP API callbacks under /api/v5/plugin_api/{plugin}/....

This allows plugin authors to expose plugin-specific API endpoints through the dashboard API service, with consistent authentication and HTTP error handling.

Observability

#16656 Made system monitor reports such as busy_port and long_schedule more informative by including process labels for easier troubleshooting.
#16744 Supports end-to-end tracing of messages published via HTTP API.

Performance

#16413 Improve subscription handling performance.
#16492 Slightly improve idle system memory usage.
#16757 Set os_mon to collect only system-wide memory statistics by default, reducing per-process memory scanning overhead.

Bug Fixes

Core MQTT Functionalities

#16480 Fixed an issue where WebSocket connections could crash after the peer closed the connection, typically observed under moderate load.

crasher: initial call: cowboy_tls:connection_process/4,
error: {{case_clause,{error,closed}},[
{cowboy_websocket_linger,websocket_send_close,2,[{file,"cowboy_websocket_linger.erl"},{line,752}]},
{cowboy_websocket_linger,websocket_close,3,[{file,"cowboy_websocket_linger.erl"},{line,743}]},
{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}
]}
messages: [
{ssl,{sslsocket,{gen_tcp,#Port<...>,...},[...]},<<130,130,27,93,145,101,251,93>>},
{ssl_closed,{sslsocket,{gen_tcp,#Port<...>,...},[...]}}
], ...

#16515 Fixed a bug that caused WebSocket connections to crash when receiving broker messages larger than the client's advertised Maximum-Packet-Size.
#16553 Fixed an issue where not all retained messages would be delivered if a subscriber hit the retained message dispatch rate limit.

If the dispatch rate limit is reached while iterating over retained topics, then the client process will retry the iteration at a later time with exponential back-off (minimum 300 ms, maximum 10 s).
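
To give a feel for the retry timing, here is a minimal sketch of an exponential back-off bounded by the limits quoted above. The doubling factor and the function name backoff_delays_ms are assumptions for illustration; the changelog states only the 300 ms minimum and 10 s maximum.

```python
def backoff_delays_ms(attempts, minimum=300, maximum=10_000, factor=2):
    """Yield retry delays in milliseconds, growing from `minimum` up to `maximum`."""
    delay = minimum
    for _ in range(attempts):
        yield delay
        # Assumed growth rule: multiply by `factor`, never exceeding `maximum`.
        delay = min(delay * factor, maximum)


if __name__ == "__main__":
    print(list(backoff_delays_ms(7)))
```

Under the assumed doubling factor, the delay reaches the 10 s cap on the seventh attempt.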

The retainer.flow_control.batch_deliver_number configuration has been deprecated.
The retainer.flow_control.batch_read_number no longer supports being set to 0 to mean read all remaining retained messages at once. If set to 0, it'll default to 1000 messages.
#16569 Fixed a rare race condition that could cause the supporting emqx_flapping process for flapping detection to crash under high system load.
#16651 Fixed a rare connection process crash during shutdown caused by operating on an already closed socket, typically under high system stress.
Prior to this fix, such a race condition typically resulted in an error-level log containing {badmatch,{ok,{sock_error,closed}....
#16675 Fixed timestamp ordering issue where disconnected_at could be later than connected_at during session takeover or discard scenarios.

Previously, disconnected_at was recorded too late (in ensure_disconnected), after the new session's connected_at was already set. This caused a race condition where disconnected_at > connected_at, making it difficult to track client presence state externally.

The fix records disconnected_at immediately when takeover begins or when discard is received, ensuring it's always earlier than the new session's connected_at. This ensures correct timestamp ordering for external presence state tracking systems.
#16715 Fixed an issue where retained $SYS messages (for example, broker/node identity topics) were stored without expiry, which could leave stale node identifiers visible in Dashboard views after StatefulSet rotation.

Now, newly published retained $SYS messages include Message-Expiry-Interval = 3600 (1 hour).

For already existing stale retained $SYS entries created before this change, you can manually clear them by publishing an empty retained message to the stale topic:
```
emqx eval 'emqx:publish(emqx_message:set_flag(retain, true, emqx_message:make(emqx_sys, <<"$SYS/brokers/[email protected]/sysdescr">>, <<>>))).'
```
Replace the topic in the command with the stale $SYS/... topic you want to remove.
#16731 Fixed a crash in emqx ctl subscriptions list that could happen when shared subscriptions were present.

Before this fix, listing subscriptions could fail for some clients and return no output.

After this fix, emqx ctl subscriptions list works reliably with both regular and shared subscriptions.
#16782 Fixed MQTT v5 protocol handling for invalid PUBLISH properties.

If a client sends a PUBLISH packet containing Subscription-Identifier, EMQX now treats it as a protocol error and disconnects the client.

Gateway

#16603 Fixed the CoAP Gateway when running in DTLS connection mode.
#16670 NATS gateway now enforces the max publish payload, honors the echo option (no local delivery), and improves publish/subscribe subject handling and related error messages.

Access Control

#16423 Added support for verifying the 'aud' (audience) claim in JWT authentication.

When the 'aud' claim is configured in verify_claims, the JWT token must include a valid 'aud' claim. The verification supports both string and array formats:
- If 'aud' is a string, it must exactly match the expected value.
- If 'aud' is an array, at least one element in the array must match the expected value.
- Empty string or empty array will fail verification.
- Missing 'aud' claim will fail verification when it's configured in verify_claims.
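
The rules above can be summarized in a short Python sketch. It mirrors only the behavior described in this entry; the function name verify_aud and the claims-as-dict representation are hypothetical, and this is not EMQX's implementation.

```python
def verify_aud(claims, expected):
    """Return True if the decoded JWT claims satisfy the expected audience."""
    if "aud" not in claims:
        # A missing claim fails when `aud` is configured in verify_claims.
        return False
    aud = claims["aud"]
    if isinstance(aud, str):
        # String form: non-empty and an exact match.
        return aud != "" and aud == expected
    if isinstance(aud, list):
        # Array form: at least one element must match; an empty array never does.
        return expected in aud
    return False
```

For example, verify_aud({"aud": ["emqx", "other"]}, "emqx") passes, while verify_aud({"aud": []}, "emqx") and verify_aud({}, "emqx") both fail.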
#16459 Fixed an issue in the SCRAM authentication HTTP API. Previously, an incorrect user ID was returned for the created user in the user creation API call.

Data Integration

#16507 Previously, when an MQTT Source's Connector recovered after losing its connection, topics would not be re-subscribed and the Source would stop working until the Connector itself was restarted. Now, the Source will re-subscribe upon reconnect.
#16542 Fixed an issue where Kafka producer connections could disconnect prematurely when Kafka was overloaded, causing excessive produce request retries.
The request timeout is now automatically set to at least twice the metadata request timeout (with a minimum of 30 seconds),
reducing unnecessary reconnections and retries when metadata requests take longer than expected.
This is especially beneficial when metadata request timeout is configured to a small value.
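
One way to read the rule above is as a simple lower bound. The sketch below is an interpretation, not the connector's actual code; the function name and the use of milliseconds are assumptions.

```python
def effective_request_timeout_ms(metadata_request_timeout_ms):
    """Request timeout: at least twice the metadata timeout, and never below 30 s."""
    return max(2 * metadata_request_timeout_ms, 30_000)
```

With a 5 s metadata request timeout this gives 30 s; with a 20 s metadata request timeout it gives 40 s.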
#16622 Fixed an issue where, if an Action used async query mode and its Connector was disconnected after more than one health check, its Fallback Actions could be triggered twice.
#16657 Fixed an issue where, when importing configuration from an older node version into a newer one, values would not be upgraded according to newer code, leading to strange behavior.

One such example is importing a MQTT Connector with static clientids from 5.10.0 into 6.0.0. In 5.10.0, usernames and passwords could not be associated with particular static clientids, and this was represented internally in a certain way. Later versions added the capability of creating those associations, with a different internal representation. This subtle internal representation conversion was missing when importing such configurations in previous EMQX versions.
#16659 When using an older MQTT Connector configuration with static clientids (from 5.10.0 and earlier) on later EMQX versions, the username and password at the root of the configuration was ignored. This could cause trouble when upgrading and keeping the same configuration, as the MQTT clients would stop using the credentials.

Now, if there are username and/or password fields in the root Connector, those credentials are merged with any specific ones specified per clientid, the latter taking precedence.
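
The precedence rule can be pictured as a dictionary merge. This is a hedged sketch: the function name and the shape of the two inputs are hypothetical and do not reflect the Connector's real configuration schema.

```python
CREDENTIAL_FIELDS = ("username", "password")


def effective_credentials(root, per_clientid):
    """Merge root-level credentials with per-clientid ones; the latter win."""
    merged = {k: root[k] for k in CREDENTIAL_FIELDS if k in root}
    merged.update({k: v for k, v in per_clientid.items() if k in CREDENTIAL_FIELDS})
    return merged
```

For instance, a root of {"username": "u", "password": "p"} combined with a per-clientid entry of {"username": "dev1"} results in {"username": "dev1", "password": "p"}.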
#16723 Fixed an issue with RabbitMQ Connector/Action/Source where, if some connection or channel processes died unexpectedly, the Connector/Action/Source would be reported as disconnected and never recover by itself without restarting it.
#16742 Fixes the issue of GreptimeDB TLS connection failure.

Durable Storage

#16512 Improve handling of recoverable errors in the durable session.
Durable sessions will now retry the creation of durable storage iterators when that operation fails due to a network issue.
Previously, the whole session would get disconnected.

Fix problem with the retry mechanism in the emqx_ds_client component.
Previously, the number of retry attempts on recoverable errors was limited.

Fix problems with shared subscriptions:
- Fix a problem with the shared subscription leader not coming up after node restart.
- The shared subscription leader no longer advertises streams that reached the end of replay to the clients.
- Make the shared subscription leader state checkpoint transaction options configurable.
#16614 Improvements and bug fixes related to durable storage feature.
- Improved handling of configuration inconsistencies between the nodes.
  
  Previously, when a durable storage was created in a cluster where
  nodes had different initial durable storage configuration, the
  replicas would not converge. This change addresses this problem by
  replicating the configuration of the shard leader node to the
  replicas during initialization of the storage and subsequent
  configuration changes.
  
  Warning: this change is not backward-compatible. During a rolling
  cluster upgrade the shards will pause until the majority of their
  replicas are upgraded to the new version of EMQX, after which
  downgrade to the previous versions of EMQX will become impossible.
- Fixed an issue in the durable storage subscription mechanism.
  
  Previously, a durable subscription created with a fresh iterator
  could miss a stored message with the timestamp precisely matching
  timestamp of the iterator.
#16770 Improve stability of durable sessions during takeover and garbage collection.

Clustering

#16393 Improved the stability of the Cluster Link route replication under unstable network conditions.
#16465 Upgraded gen_rpc to 3.5.1.

Prior to the gen_rpc upgrade, EMQX could produce a long tail of crash logs due to connect timeouts if a peer node was unreachable.
The new version of gen_rpc no longer has the long tail and converts crash logs to more readable error logs.
The frequent "failed_to_connect_server" log is also throttled to avoid spamming.
#16544 Improve robustness of cluster autoclean procedure.

Previously, if the autoclean feature was disabled during the initial start of the node, it would never activate after a configuration change.
This fix resolves this issue.
#16739 Improved recovery time of a cluster after a simultaneous restart of all nodes.

The built-in Mria database management system no longer waits for a full sync of an internal table used to generate transaction synchronization events.

Observability

#16537 Fixed formatter crash when logging gen_rpc errors.

Prior to this fix, EMQX may report "FORMATTER CRASH" errors when gen_rpc logged certain error messages (e.g., transmission timeout errors).
The formatter now handles these error messages correctly without crashing.
#16661 Improve topic_metrics and cluster_rpc logging when invalid topic is requested.
#16674 Ensure Erlang pid is printed as a log data field.

#16699 Previously, under certain race conditions, long and cryptic logs like the following could be printed:

2026-02-03T13:53:54.576326+00:00 [error] Generic server <0.11323236.0> terminating. Reason: {{badkey,'actions.success'},[{erlang,map_get,['actions.success',#{}],[{error_info,#{module => erl_erts_errors}}]},{emqx_metrics_worker,idx_metric,4,[{file,"emqx_metrics_worker.erl"},{line,683}]},{emqx_metrics_worker,inc,4,[{file,"emqx_metrics_worker.erl"},{line,322}]},{emqx_rule_runtime,do_eval_action_reply_t...

Now, we print more meaningful information to help debug the issue.

Security

#16545 Fixed node.cookie handling of # character. Previously, if the cookie contained #, only the prefix before # would take effect.
For example, if abc#d was configured, only abc was used as the cookie.

Added validation to reject problematic characters: backslash, single quote, double quote, and space.
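
As a rough illustration of the validation described above, the following sketch rejects the four listed characters and accepts a cookie containing #. The function name validate_cookie and the use of ValueError are assumptions; EMQX's actual check lives in its configuration schema.

```python
# Characters rejected per the entry above: backslash, single quote, double quote, space.
FORBIDDEN_COOKIE_CHARS = set("\\'\" ")


def validate_cookie(cookie):
    """Return the cookie unchanged, or raise ValueError if it has a rejected character."""
    bad = sorted(set(cookie) & FORBIDDEN_COOKIE_CHARS)
    if bad:
        raise ValueError(f"node.cookie contains forbidden characters: {bad!r}")
    return cookie
```

Here validate_cookie("abc#d") returns the full string, whereas validate_cookie("a b") raises an error.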
#16664 Previously, it was possible to upload managed certificate files associated with non-existent managed namespaces. Now, namespace existence is checked before accepting the upload.
#16692 Fixed a CRL cache regression where emqx_crl_cache:evict/1 did not fully clear internal URL state.
After eviction, the same CRL URL now re-registers correctly on next use, restores its refresh timer, and avoids repeated HTTP fetches per connection.

Plugin

#16784 Reduced noisy plugin startup warnings in single-node deployments.

EMQX no longer tries to fetch plugin config from the local node during cluster config sync, avoiding repeated config_not_found_on_node warnings at startup.
#16823 Fixed a Dashboard plugin management issue for preinstalled plugins.

When a plugin package is unpacked into plugins/ before node startup, starting it from the Dashboard no longer causes Plugin Config Not Found on the plugin config page.

Miscellaneous

#16620 Fix CRC32C dynamic library load issue on aarch64.


e5.10.3 Breaking risk 5mo

Notable features

TLS 1.3 stateless session ticket resumption with optional certificates
SAML IdP signature verification configuration options
IoTDB Table Model data integration support

Full changelog

Enhancements

Deployment

#16491 Start releasing packages for macOS 15 (Sequoia)

Observability

#16135 Added two new metrics and corresponding rates for the GET /monitor_current HTTP API: rules_matched and actions_executed. They track the number of rules that matched and action execution rate (i.e., success + failure), respectively.
#16324 Added support for end-to-end tracing of messages published via HTTP API.

Security

#16625 Added configuration options idp_signs_envelopes and idp_signs_assertions to SAML SSO backend to control signature verification behavior.
Previously, SAML signature verification was not working correctly because the IdP certificate fingerprint was not being extracted from metadata and passed to esaml for verification.

Both options default to false for backwards compatibility with existing configurations. Users who want to enable signature verification should explicitly set these to true when their IdP is configured to sign SAML responses.
#16456 Added support for TLS 1.3 session ticket resumption.

EMQX now supports TLS 1.3 session resumption using stateless session tickets, allowing clients to resume TLS sessions without server-side session state storage.

Node-level configuration: node.tls_stateless_tickets_seed is the secret key seed for generating TLS 1.3 stateless session tickets. Listener-level configuration: listeners.ssl.<name>.ssl_options.session_tickets enables TLS 1.3 session resumption using stateless session tickets.
Possible values are disabled (default), stateless, and stateless_with_cert (includes certificate information).

Session tickets are only generated when node.tls_stateless_tickets_seed is configured (non-empty) and session_tickets is enabled in listener SSL options.
If session_tickets is enabled but node.tls_stateless_tickets_seed is empty, session tickets will not be generated and an error log will be emitted when starting the listener.

Gateway

#16220 Added the jt808.frame.parse_unknown_message option, enabling the JT808 gateway to transparently forward unknown messages.
#16596 Added support for JT/T 808 protocol 2019.
#16627 Add GBK character encoding support for JT/T 808 gateway.

The JT/T 808 protocol specifies GBK encoding for STRING type fields. A new frame.string_encoding configuration option is added:
- utf8 (default): Pass through strings as-is (backward-compatible)
- gbk: Convert GBK-encoded strings from devices to UTF-8 for MQTT, and UTF-8 from MQTT to GBK for devices
This affects string fields including license plates, driver names, text messages, area names, and client parameters.
MQTT payloads always use UTF-8 encoding regardless of this setting.
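
The conversion implied by the gbk setting can be sketched with Python's built-in codecs. The function names and the string_encoding parameter below are illustrative stand-ins for the frame.string_encoding option, not gateway code.

```python
def device_to_mqtt(raw, string_encoding="utf8"):
    """Convert a STRING field received from a device into bytes for MQTT."""
    if string_encoding == "gbk":
        # Device sends GBK; MQTT payloads carry UTF-8.
        return raw.decode("gbk").encode("utf-8")
    return raw  # utf8 (default): passed through as-is


def mqtt_to_device(raw, string_encoding="utf8"):
    """Convert a UTF-8 STRING field from MQTT into bytes for the device."""
    if string_encoding == "gbk":
        return raw.decode("utf-8").encode("gbk")
    return raw


if __name__ == "__main__":
    plate_gbk = "京A12345".encode("gbk")
    print(device_to_mqtt(plate_gbk, "gbk").decode("utf-8"))
```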

Data Integration

#16511 Added support for the IoTDB Table Model in the data integration.

Bug Fixes

Core MQTT Functionalities

#16349 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.
#16514 Fixed a bug that caused WebSocket connections to crash when receiving broker messages larger than the client's advertised Maximum-Packet-Size.

Rule Engine

#16489 Fixed an issue where the following rule functions always returned undefined:
msgid/0, qos/0, topic/0, topic/1, flags/0, flag/1,
clientid/0, username/0, peerhost/0, payload/0, payload/1.

Note: This is a backward compatibility fix for EMQX v4. These functions are not documented in EMQX v5 and later. The encouraged usage is to directly reference fields from the rule evaluation context. For example, SELECT clientid ... instead of SELECT clientid().

Data Integration

#16263 Previously, the Kafka consumer connector performed health checks by verifying partition leader connectivity for all partitions.
In a clustered deployment, each EMQX node is assigned only a subset of partitions, causing leader connections for unassigned partitions to remain idle.
Since Kafka closes idle connections after a timeout (10 minutes by default), this behavior could trigger false connectivity alarms.

The health check now verifies leader connectivity only for the partitions assigned to the current EMQX node, preventing unnecessary idle connections and false alarms.
#16336 Fixed a race condition which may cause timeout when testing connectivity or stopping a connector from the dashboard.
#16383 Previously, when using IoTDB Connector with its REST API driver, credentials would not be checked during health checks. Now, we send a no-op query during IoTDB connector health check. This enables early detection of misconfigured client credentials.
#16415 Upgraded Apache Pulsar client to 2.1.2.

When Pulsar producer action's batch_size is configured to 1, the producer will now encode single messages instead of single-element batches.
This enables consumers to share load using Key Share strategy.
#16507 Previously, when an MQTT Source's Connector recovered after losing its connection, topics would not be re-subscribed and the Source would stop working until the Connector itself was restarted. Now, the Source will re-subscribe upon reconnect.
#16585 Fixed an issue with GreptimeDB TLS connection failures.
#16618 The Kafka request timeout is now automatically set to at least twice the metadata request timeout (with a minimum of 30 seconds),
reducing unnecessary reconnections and retries when metadata requests take longer than expected.
This is especially beneficial when metadata request timeout is configured to a small value.
#16622 Fixed an issue where, if an Action used async query mode and its Connector was disconnected after more than one health check, its Fallback Actions could be triggered twice.

Clustering

#16269 Fixed an issue in the Cluster Link route replication protocol recovery sequence where re-bootstrapping was incorrectly skipped even though the remote side needed it.
#16317 Fixed an issue in Cluster Link garbage-collection logic that could accidentally remove live routes from the internal routing table in the process of cleaning up stale route replication state. This problem occurred only when multiple independent Cluster Links were set up, and some of these links went down for relatively long periods of time.
#16452 Upgraded gen_rpc to 3.5.1.

Prior to the gen_rpc upgrade, EMQX may experience a long tail of crash logs due to connection timeout if a peer node is unreachable.
The new version of gen_rpc no longer has the long tail and converts crash logs to more readable error logs,
and the frequent log "failed_to_connect_server" is also throttled to avoid log spamming.
#16543 Improved robustness of cluster autoclean procedure.

Previously, if the autoclean feature was disabled during the initial start of the node, it would never activate after a configuration change.
This fix resolves this issue.

Access Control

#16304 Fixed an issue where Multi-Factor Authentication (MFA) could not be enabled after upgrading EMQX from versions earlier than 5.3.0 due to incompatible login-user database records.
#16541 Fixed an issue where OIDC issuer URLs were automatically normalized with a trailing slash when saved to the configuration file, causing issuer mismatch errors when the OIDC provider's discovery document returned the issuer without a trailing slash.

Observability

#16418 Reduced the volume of logs generated when a resource exception occurs (resource_exception). These logs are now throttled, and some potentially large terms are redacted from them.
#16535 Fixed formatter crash when logging gen_rpc errors.

Prior to this fix, EMQX would crash with "FORMATTER CRASH" errors when gen_rpc logged certain error messages (e.g., transmission timeout errors). The formatter now handles these error messages correctly without crashing.

Gateway

#16609 Fixed JT/T 808 gateway parameter setting (0x8103) and query response (0x0104) message handling for CAN bus ID parameters (0x0110~0x01FF), which should use BYTE[8] data type with base64 encoding in JSON instead of string type.
#16606 Fixed CoAP Gateway working in connection mode over DTLS.

Breaking Changes

Deployment

#16491 Stop releasing packages for macOS 13 (Ventura)


6.0.2 Breaking risk 6mo

Notable features

base64_encode/decode with no_padding and urlsafe options
json_value and jwt_value functions for nested value extraction
Sparkplug B metric alias tracking in spb_decode

Full changelog

6.0.2

Release Date: 2026-01-16

Make sure to check the breaking changes and known issues before upgrading to 6.0.2.

Enhancements

Security

#16461 EMQX now supports TLS 1.3 session resumption using stateless session tickets, allowing clients to resume TLS connections without requiring server-side session state.

Configuration
- Node-level: node.tls_stateless_tickets_seed
  
  Secret key seed used to generate TLS 1.3 stateless session tickets.
- Listener-level: listeners.ssl.<name>.ssl_options.session_tickets
  
  Enables TLS 1.3 session resumption. Supported values:
  - disabled (default)
  - stateless
  - stateless_with_cert (includes certificate information in the ticket)
Notes
- Session tickets are generated only when node.tls_stateless_tickets_seed is configured (non-empty), and session_tickets is enabled in listener SSL options.
- If session_tickets is enabled but node.tls_stateless_tickets_seed is empty, session tickets will not be generated and an error log will be emitted when starting the listener.
This PR also included a fix for the TLS 1.2 session resumption configuration. Previously, the reuse_sessions option for SSL listener did not take effect, i.e. EMQX always tried to enable TLS 1.2 session resumption. It is now possible to turn it off. Please note that TLS 1.2 session resumption will be disabled by default starting version 6.2.0.

Rule Engine

#16524 Enhanced base64 encoding and decoding functions in rule engine SQL with support for padding and URL-safe options.

The base64_encode and base64_decode functions now support optional parameters to control encoding behavior:
- no_padding: Encode or decode without padding characters (=). Useful when you need to remove padding from encoded strings or decode strings that do not have padding.
- urlsafe: Use URL-safe base64 encoding/decoding. Replaces + with - and / with _, making the encoded string safe to use in URLs without encoding.
These options can be used individually or combined in any order.

Examples in rule SQL:

Encode without padding:
```
SELECT base64_encode(payload, 'no_padding') as encoded FROM "t/#"
```
Encode with URL-safe characters:
```
SELECT base64_encode(payload, 'urlsafe') as encoded FROM "t/#"
```
Encode with both options (no padding and URL-safe):
```
SELECT base64_encode(payload, 'no_padding', 'urlsafe') as encoded FROM "t/#"
```
Decode URL-safe base64:
```
SELECT base64_decode(payload, 'urlsafe') as decoded FROM "t/#"
```
Decode unpadded URL-safe base64:
```
SELECT base64_decode(payload, 'urlsafe', 'no_padding') as decoded FROM "t/#"
```
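
For readers who want to predict what these options produce, the following Python sketch reproduces the described behavior with the standard base64 module. The helper names b64_encode and b64_decode are hypothetical; they imitate the option semantics stated above and are not the rule engine's code.

```python
import base64


def b64_encode(data, *options):
    """Encode bytes, honoring the 'urlsafe' and 'no_padding' options in any order."""
    encode = base64.urlsafe_b64encode if "urlsafe" in options else base64.b64encode
    text = encode(data).decode("ascii")
    return text.rstrip("=") if "no_padding" in options else text


def b64_decode(text, *options):
    """Decode text, restoring stripped padding when 'no_padding' is given."""
    if "no_padding" in options:
        text += "=" * (-len(text) % 4)
    decode = base64.urlsafe_b64decode if "urlsafe" in options else base64.b64decode
    return decode(text)


if __name__ == "__main__":
    payload = b"\xfb\xff"
    print(b64_encode(payload))                           # +/8=
    print(b64_encode(payload, "urlsafe"))                # -_8=
    print(b64_encode(payload, "no_padding", "urlsafe"))  # -_8
```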
#16533 Added two new variadic expression helper functions, json_value and jwt_value, for extracting values from JSON data and JWT tokens using dot-separated key paths.
- json_value extracts values from JSON binary strings by navigating nested objects with a dot-separated key path.
- jwt_value decodes the payload of a JWT and extracts claim values using the same dot-separated path syntax.
Examples:
- If username contains a JSON object, you can access a nested field with json_value(username, 'shop.floor').
- If password contains a JWT with a customized claim, you can access a nested value with jwt_value(password, 'client_attrs.unitid').
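
The dot-separated path lookup can be illustrated in Python as follows. This is a sketch under assumptions: the function names are hypothetical, returning None for a missing key is a guess at the behavior, and the JWT helper only decodes the payload segment without verifying any signature.

```python
import base64
import json


def path_value(data, path):
    """Walk nested dicts along a dot-separated key path; None if a key is missing."""
    current = data
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def json_value_sketch(text, path):
    """Parse a JSON string and extract the value at `path`."""
    return path_value(json.loads(text), path)


def jwt_value_sketch(token, path):
    """Decode the payload (second segment) of a JWT and extract the value at `path`."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    return path_value(payload, path)
```

For example, json_value_sketch('{"shop": {"floor": 3}}', 'shop.floor') returns 3.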
#16539 Added support for tracking Sparkplug B metric aliases when using the spb_decode Rule Engine function.

After a device or Edge of Network (EoN) node publishes its NBIRTH or DBIRTH messages, EMQX records the alias-to-name mappings defined in those messages. When spb_decode is later applied to NDATA or DDATA messages from the same session, the original metric names are automatically restored and included in the decoded output.

Note: when executing fallback actions, the mapping is not available in the environment where they run. This means that, if a fallback action republishes the undecoded DDATA/NDATA payload to a Sparkplug B DDATA/NDATA topic, the metric name fields will not be populated by the alias mapping.
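
Conceptually, alias tracking amounts to remembering an alias-to-name table from the birth message and applying it to later data messages. The sketch below shows that idea only; the function names and the flat metric dictionaries are simplifications, and how EMQX stores the mapping is not shown here.

```python
def record_aliases(birth_metrics):
    """Build an alias -> name table from the metrics of an NBIRTH/DBIRTH message."""
    return {m["alias"]: m["name"] for m in birth_metrics if "alias" in m and "name" in m}


def restore_names(data_metrics, aliases):
    """Add the original name to NDATA/DDATA metrics that carry only an alias."""
    restored = []
    for metric in data_metrics:
        metric = dict(metric)
        if "name" not in metric and metric.get("alias") in aliases:
            metric["name"] = aliases[metric["alias"]]
        restored.append(metric)
    return restored


if __name__ == "__main__":
    table = record_aliases([{"name": "temperature", "alias": 1}])
    print(restore_names([{"alias": 1, "int_value": 21}], table))
```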

Durable Storage

#16136 Improved resource management and performance for durable storage.

Introduced a concept of a durable storage database group. Certain resources (such as memtable size and disk usage quota) can be shared between the group members.

Added the following new metrics (per DB group):
- emqx_ds_disk_usage: Total size of SST files
- emqx_ds_write_buffer_memory_usage: RocksDB memtable size
- emqx_ds_total_trash_size: Disk usage by trash SST files
Added the following group configurations:
- durable_storage.db_groups.<group>.storage_quota: Soft quota for the SST files size
- durable_storage.db_groups.<group>.write_buffer_size: Maximum memtable size
- durable_storage.db_groups.<group>.rocksdb_nthreads_high and durable_storage.db_groups.<group>.rocksdb_nthreads_low: Size of RocksDB thread pools.
Added a new alarm that is raised when the quota is exceeded: db_storage_quota_exceeded:<DB>. Please refer to the "Storage Quota" section of the documentation for more details.

Default session checkpoint interval has been changed to 15s.
#16286 Optimized the default durable storage settings to reduce CPU load. This PR disables subscriptions for DBs that don't use them.

Performance

#16413 Improved subscription handling performance by reducing redundant monitoring of MQTT session processes.

Bug Fixes

Core MQTT Functionalities

#16354 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.
#16515 Fixed an issue where WebSocket connections could crash when the broker sent messages exceeding the client-advertised Maximum-Packet-Size.
#16569 Fixed a rare race condition that could cause the supporting emqx_flapping process for flapping detection to crash under high system load.

Data Integration

#16265 The health check now verifies leader connectivity only for the partitions assigned to the current EMQX node, preventing unnecessary idle connections and false alarms.

Previously, the Kafka source connector checked leader connectivity for all partitions. In clustered deployments, each node owns only a subset of partitions, leaving connections to unassigned partition leaders idle. Because Kafka closes idle connections after a timeout (10 minutes by default), this could result in false connectivity alarms.
#16542 Fixed an issue where Kafka producer connections could disconnect prematurely when Kafka was overloaded, leading to excessive produce request retries.

The produce request timeout is now automatically set to at least twice the metadata request timeout, with a minimum of 30 seconds. This reduces unnecessary reconnections and retries when metadata requests take longer than expected, especially when the metadata request timeout is configured to a small value.
#16352 Upgraded Apache Pulsar client to 2.1.2. When Pulsar producer action's batch_size is configured to 1, the producer will now encode single messages instead of single-element batches. This should allow consumers to share load using Key Share strategy.
#16383 Improved the IoTDB Connector health check when using the REST API driver.

Previously, client credentials were not validated during health checks. The health check now sends a lightweight no-op query, allowing misconfigured credentials to be detected early.
#16507 Fixed an issue where an MQTT Source would stop receiving messages after its Connector reconnected.

Previously, when an MQTT Source’s Connector recovered from a connection loss, its topics were not re-subscribed, causing the Source to stop working until the Connector was restarted. The Source now automatically re-subscribes upon reconnect.

Clustering

#16269 Fixed an issue in the Cluster Linking route replication protocol recovery sequence where re-bootstrapping was incorrectly skipped even though the remote side needed it.
#16317 Fixed an issue in Cluster Linking garbage-collection logic that could incorrectly remove active routes from the internal routing table while cleaning up stale route replication state.

This issue could occur only in setups with multiple independent Cluster Links, where some links remained down for extended periods.
#16465 Upgraded gen_rpc to 3.5.1.

Before the gen_rpc upgrade, EMQX could produce a long tail of crash logs due to connect timeouts if a peer node was unreachable. The new version of gen_rpc no longer has the long tail and converts crash logs to more readable error logs. The frequent "failed_to_connect_server" log is also throttled to avoid spamming.
#16544 Improved the robustness of the cluster autoclean procedure. Previously, if the autoclean feature was disabled during the initial startup of a node, it would not be activated after subsequent configuration changes.

Upgrade

#16308 Fixed an issue where Multi-Factor Authentication (MFA) could not be enabled after upgrading EMQX from versions earlier than 5.3.0 due to incompatible login-user database records.

Configuration Management

#16397 Added TLS certificate and key file validation before listener startup.

EMQX now performs basic validation when parsing SSL listener configuration and emits error-level logs if invalid PEM files are detected (for example, invalid_pem_file_ignored and bad_keyfile_ignored). This makes troubleshooting easier as administrators can observe errors when starting/reconfiguring, instead of troubleshooting TLS handshake failures.

Access Control

#16423 Added support for verifying the JWT aud (audience) claim during authentication.

When the aud claim is configured in verify_claims, the JWT must include a valid aud value. Both string and array formats are supported:
- If aud is a string, it must exactly match the configured value.
- If aud is an array, at least one element must match the configured value.
- An empty string or empty array fails verification.
- The verification also fails if the aud claim is missing when it is configured in verify_claims.
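The aud rules listed above can be summarized in a few lines of Python. This is only an illustrative sketch of the described behavior, not EMQX's implementation; the function name aud_matches and the dictionary representation of the decoded claims are assumptions made for the example.

```python
from typing import Any, Mapping


def aud_matches(claims: Mapping[str, Any], expected: str) -> bool:
    """Hypothetical helper mirroring the aud rules described above."""
    if "aud" not in claims:
        # A missing aud claim fails when aud is configured in verify_claims.
        return False
    aud = claims["aud"]
    if isinstance(aud, str):
        # String form: must be non-empty and match exactly.
        return aud != "" and aud == expected
    if isinstance(aud, (list, tuple)):
        # Array form: at least one element must match; an empty array fails.
        return any(isinstance(item, str) and item == expected for item in aud)
    return False


print(aud_matches({"aud": ["billing", "emqx"]}, "emqx"))
print(aud_matches({"aud": []}, "emqx"))
```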
#16459 Fixed an issue in the SCRAM authentication HTTP API. Previously, an incorrect user ID was returned for the created user in the user creation API call.

Observability

#16417 Reduced log volume for resource_exception events. Logs generated when a resource exception occurs are now throttled, and potentially large terms are redacted to prevent excessive log output.
#16537 Fixed a formatter crash triggered by certain gen_rpc error messages.

Previously, EMQX could crash with a “FORMATTER CRASH” error when gen_rpc logged specific errors (such as transmission timeouts). The formatter now safely handles these messages without crashing.


e5.8.9 Breaking risk 6mo

Minor fixes and improvements.

Full changelog

Enhancements

#16491 Started releasing packages for macOS 15 (Sequoia).
#15944 Improved the information returned when a resource is marked as disconnected for the following Connectors: LDAP, Syskeeper, IoTDB, Snowflake (aggregated), JWKS Authentication.
#15911 Now, for the HTTP Action, the HTTP request timeout is taken to be the same as resource_opts.request_ttl. Previously, it was a fixed, non-configurable value of 30 seconds.
#15845 Extended the static_clientids configuration of MQTT Connector to allow specifying usernames and passwords associated with each clientid.

Bug Fixes

Core MQTT Functionalities

#16349 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.

#16081 Fixed an issue where, if a client used extended authentication mechanisms and memory sessions, it could crash with a session_stepdown_request_exception error and calling_self reason.

e.g.:

2025-09-24T07:13:08.973954+08:00 [error] clientid: someclientid, msg: session_stepdown_request_exception, peername: 127.0.0.1:41782, username: admin, error: exit, reason: calling_self, stacktrace: [{gen_server,call,3,[{file,"gen_server.erl"},{line,1222}]},{emqx_cm,request_stepdown,4,[{file,"emqx_cm.erl"},{line,427}]},{emqx_cm,do_takeover_begin,2,[{file,"emqx_cm.erl"},{line,398}]},{emqx_cm,takeover_session,2,[{file,"emqx_cm.erl"},{line,384}]},{emqx_cm,takeover_session_begin,2,[{file,"emqx_cm.erl"},{line,305}]},{emqx_session_mem,open,4,[{file,"emqx_session_mem.erl"},{line,210}]},{emqx_session,open,3,[{file,"emqx_session.erl"},{line,263}]},{emqx_cm,'-open_session/4-fun-1-',4,[{file,"emqx_cm.erl"},{line,290}]},{emqx_cm_locker,trans,2,[{file,"emqx_cm_locker.erl"},{line,32}]},{emqx_channel,post_process_connect,2,[{file,"emqx_channel.erl"},{line,575}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,852}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,470}]},{emqx_connection,process_msgs,2,[{file,"emqx_connection.erl"},{line,462}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,406}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}], action: {takeover,'begin'}, ...

#15872 Eliminated the unclean_terminate warning log emitted when a client is disconnected after a CONNACK with a non-zero reason code is sent.
#15902 Upgraded MQTT client library to 1.13.8.

This improves MQTT bridge connectivity:
- The Connector now automatically reconnects when the peer broker does not reply with PINGRESP.
- Bridge-over-TLS failures are handled more promptly if the connection breaks while waiting for CONNACK.
#15884 Resolved an issue where, in rare cases, the global routing table could indefinitely retain routing information for nodes that had long since left the cluster.

This also fixes a race condition that could cause accumulating inconsistencies in the routing table and shared subscription state when a large number of shared subscribers disconnect simultaneously.

Clustering

#16452 Upgraded gen_rpc to 3.5.1.

Prior to the gen_rpc upgrade, EMQX could produce a long tail of crash logs due to connect timeouts when a peer node was unreachable.
The new version of gen_rpc no longer has the long tail and converts crash logs to more readable error logs.
The frequent "failed_to_connect_server" log is also throttled to avoid spamming.

Cluster Linking

#16317 Fixed an issue in Cluster Link garbage-collection logic that could accidentally remove live routes from the internal routing table in the process of cleaning up stale route replication state. This problem occurred only when multiple independent Cluster Links were set up, and some of these links went down for relatively long periods of time.
#16269 Fixed an issue in the Cluster Link route replication protocol recovery sequence where re-bootstrapping was incorrectly skipped even though the remote side needed it.

Data Integration

#16415 Upgraded Apache Pulsar client to 2.1.2.

When a Pulsar producer action's batch_size is configured to 1, the producer now encodes single messages instead of single-element batches.
This should allow consumers to share load using Key Share strategy.
#16383 When using the IoTDB Connector with the REST API driver, credentials were previously not validated during health checks. Health checks now issue a no-op query to IoTDB, ensuring that invalid or misconfigured client credentials are detected early.
#16336 Fixed a race condition that could cause a timeout when testing connectivity or stopping a connector from the dashboard.
#16263 The health check now verifies leader connectivity only for the partitions assigned to the current EMQX node, preventing unnecessary idle connections and false alarms.

Previously, the Kafka source connector checked leader connectivity for all partitions. In clustered deployments, each node owns only a subset of partitions, leaving connections to unassigned partition leaders idle. Because Kafka closes idle connections after a timeout (10 minutes by default), this could result in false connectivity alarms.
#16138 Fixed Redis cluster failover issue. With this fix, failed PING responses now trigger a cluster topology refresh, ensuring that connector management promptly recovers and updates its view of the Redis cluster after failovers.

Previously, EMQX’s Redis cluster client only refreshed the cluster topology when regular queries (e.g., GET) failed. However, periodic PING commands did not trigger a refresh when they failed.
This could cause the connector to remain in a “connecting” state and keep using outdated topology information if no new queries were made after a failover.
#16043 Fixed log details for Kafka data integration when "not_all_kafka_partitions_connected" happened.
#15906 Upgraded Kafka producer library Wolff from 4.0.12 to 4.0.13, which adds handling for the record_list_too_large error in ProduceResponse.
#15866 Upgraded Kafka producer library Wolff to 4.0.12 to improve handling of temporarily missing partitions in Kafka metadata responses.

In rare race conditions, Kafka may return an incomplete partition list.
Previously, this was only handled when a topic was recreated with fewer partitions, but not when partitions were temporarily missing.
This gap could cause the partition producer to stall and block shutdown indefinitely.
#15836 Enriched the returned information when a Kafka Consumer Source fails to be added, for example, due to denied topic ACLs.
#15826 Now, if the Kafka broker returns an ACL denied response, the connection is considered healthy. Previously, if the user used in a Kafka Consumer Connector did not have permissions to read the special ____emqx_consumer_probe group used for health checks, the health check would fail.
#15827 Fixed atom and process leaks in the GreptimeDB driver.

Fixed a function_clause error that could arise if certain incorrect write syntaxes were used in GreptimeDB Actions.
#15910 Fixed an issue with connectors where a pool of workers could fail to recover from a failure if multiple workers crashed simultaneously in large worker pools.

Connectors affected and fixed:
- MySQL
- PostgreSQL
- Oracle
- SQLServer
- TDEngine
- Cassandra
- Dynamo
- HTTP
- Couchbase
- GCP PubSub
- Snowflake
Upgraded gun and related dependencies to 2.1.0.

Security and Authentication

#16237 Fixed an issue where logs related to OIDC SSO could still be emitted after OIDC SSO was disabled.
#16217 Fixed an issue where OIDC callback could fail to find the session during login in a multi-node cluster.
#15844 Added validation to forbid adding empty usernames to the built-in database authenticator. Such users cannot later be deleted via the HTTP API, because an empty username breaks the API path.

If you have such a user and wish to delete it, run the following in an EMQX console:
```
mria:transaction(emqx_authn_shard, fun() -> mnesia:delete(emqx_authn_mnesia, {'mqtt:global',<<>>}, write) end).
```
#15818 Corrected handling of {allow|deny, all} ACL rules.

Previously, these rules were internally translated to match #, which incorrectly failed to match topics prefixed with $ (e.g. $testtopic/1) due to MQTT spec restrictions.
Now, a special internal value is used to ensure {allow|deny, all} rules correctly match any topic, including $-prefixed ones.
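To see why translating all to # was not enough, the sketch below implements ordinary MQTT topic-filter matching, including the rule that a filter starting with a wildcard does not match topics beginning with $. The ALL sentinel and the rule_matches function are hypothetical stand-ins for the "special internal value" mentioned above; they illustrate the idea only and are not EMQX's internal representation.

```python
def mqtt_match(topic: str, topic_filter: str) -> bool:
    """Plain MQTT topic-filter matching, including the $-topic restriction."""
    t_levels = topic.split("/")
    f_levels = topic_filter.split("/")
    # MQTT rule: a filter that starts with a wildcard must not match $-topics.
    if topic.startswith("$") and f_levels[0] in ("#", "+"):
        return False
    for i, f in enumerate(f_levels):
        if f == "#":
            return True  # multi-level wildcard matches the remainder
        if i >= len(t_levels):
            return False
        if f != "+" and f != t_levels[i]:
            return False
    return len(t_levels) == len(f_levels)


ALL = object()  # hypothetical sentinel meaning "any topic at all"


def rule_matches(topic: str, rule_topic) -> bool:
    """Match a topic against a rule; ALL bypasses wildcard matching entirely."""
    if rule_topic is ALL:
        return True
    return mqtt_match(topic, rule_topic)


print(mqtt_match("$testtopic/1", "#"))    # the old translation misses this topic
print(rule_matches("$testtopic/1", ALL))  # the sentinel matches it
```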
#15899 Improved memory usage: authorization (authz) cache is now cleared immediately when a client disconnects, reducing unnecessary memory consumption.

Rule Engine

#16028 Fixed rule engine jq function memory leak.

Previously, using the jq built-in function index (e.g. .key | index("name")) resulted in a memory leak.

Observability

#15967 Prevented rapid memory growth caused by Mnesia transaction blocking when cleaning up large volumes of audit logs.
#15963 Reduced excessive audit log generation triggered by operations from the remote console.
#15863 Fixed license quota alarm text.

Durable Storage

#14674 Limited the number and size of RocksDB info log files created by EMQX durable storage.

Breaking Changes

#16491 Stopped releasing packages for macOS 13 (Ventura).
#16062 Fixed an issue where RocketMQ action was disregarding the given payload template and rendering the whole Rule output.


v5.8.9 Breaking risk 6mo

Breaking changes

macOS 13 (Ventura) packages no longer released

Notable features

macOS 15 (Sequoia) package support added
HTTP Action timeout configurable via resource_opts.request_ttl
MQTT Connector static_clientids supports per-client credentials

Full changelog

Enhancements

#16491 Started releasing packages for macOS 15 (Sequoia).
#15911 Now, for the HTTP Action, the HTTP request timeout is taken to be the same as resource_opts.request_ttl. Previously, it was a fixed, non-configurable value of 30 seconds.
#15845 Extended the static_clientids configuration of MQTT Connector to allow specifying usernames and passwords associated with each clientid.

Bug Fixes

Core MQTT Functionalities

#16349 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.

#16081 Fixed an issue where, if a client used extended authentication mechanisms and memory sessions, it could crash with a session_stepdown_request_exception error and calling_self reason.

e.g.:

2025-09-24T07:13:08.973954+08:00 [error] clientid: someclientid, msg: session_stepdown_request_exception, peername: 127.0.0.1:41782, username: admin, error: exit, reason: calling_self, stacktrace: [{gen_server,call,3,[{file,"gen_server.erl"},{line,1222}]},{emqx_cm,request_stepdown,4,[{file,"emqx_cm.erl"},{line,427}]},{emqx_cm,do_takeover_begin,2,[{file,"emqx_cm.erl"},{line,398}]},{emqx_cm,takeover_session,2,[{file,"emqx_cm.erl"},{line,384}]},{emqx_cm,takeover_session_begin,2,[{file,"emqx_cm.erl"},{line,305}]},{emqx_session_mem,open,4,[{file,"emqx_session_mem.erl"},{line,210}]},{emqx_session,open,3,[{file,"emqx_session.erl"},{line,263}]},{emqx_cm,'-open_session/4-fun-1-',4,[{file,"emqx_cm.erl"},{line,290}]},{emqx_cm_locker,trans,2,[{file,"emqx_cm_locker.erl"},{line,32}]},{emqx_channel,post_process_connect,2,[{file,"emqx_channel.erl"},{line,575}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,852}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,470}]},{emqx_connection,process_msgs,2,[{file,"emqx_connection.erl"},{line,462}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,406}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}], action: {takeover,'begin'}, ...

#15872 Eliminated the unclean_terminate warning log emitted when a client is disconnected after a CONNACK with a non-zero reason code is sent.
#15902 Upgraded MQTT client library to 1.13.8.

This improves MQTT bridge connectivity:
- The Connector now automatically reconnects when the peer broker does not reply with PINGRESP.
- Bridge-over-TLS failures are handled more promptly if the connection breaks while waiting for CONNACK.
#15884 Resolved an issue where, in rare cases, the global routing table could indefinitely retain routing information for nodes that had long since left the cluster.

This also fixes a race condition that could cause accumulating inconsistencies in the routing table and shared subscription state when a large number of shared subscribers disconnect simultaneously.

Clustering

#16452 Upgraded gen_rpc to 3.5.1.

Prior to the gen_rpc upgrade, EMQX could produce a long tail of crash logs due to connect timeouts when a peer node was unreachable.
The new version of gen_rpc no longer has the long tail and converts crash logs to more readable error logs.
The frequent "failed_to_connect_server" log is also throttled to avoid spamming.

Security and Authentication

#15844 Added validation to forbid adding empty usernames to the built-in database authenticator. Such users cannot later be deleted via the HTTP API, because an empty username breaks the API path.

If you have such a user and wish to delete it, run the following in an EMQX console:
```
mria:transaction(emqx_authn_shard, fun() -> mnesia:delete(emqx_authn_mnesia, {'mqtt:global',<<>>}, write) end).
```
#15818 Corrected handling of {allow|deny, all} ACL rules.

Previously, these rules were internally translated to match #, which incorrectly failed to match topics prefixed with $ (e.g. $testtopic/1) due to MQTT spec restrictions.
Now, a special internal value is used to ensure {allow|deny, all} rules correctly match any topic, including $-prefixed ones.
#15899 Improved memory usage: authorization (authz) cache is now cleared immediately when a client disconnects, reducing unnecessary memory consumption.

Rule Engine

#16028 Fixed rule engine jq function memory leak.

Previously, using the jq built-in function index (e.g. .key | index("name")) resulted in a memory leak.

Durable Storage

#14674 Limited the number and size of RocksDB info log files created by EMQX durable storage.

Breaking Changes

#16491 Stopped releasing packages for macOS 13 (Ventura).


6.1.0 Breaking risk 6mo

Notable features

MQTT Streams with ordering guarantees and multiple consumption
Namespaced metrics for messages, sessions, and integrations
OAuth authentication for Kafka producers

Full changelog

Feature Highlights

EMQX 6.1.0 introduces MQTT Streams, enhanced namespace capabilities, new data integrations, and centralized certificate management.

MQTT Streams

MQTT Streams provide durable collections of messages identified by a topic filter, with explicit lifecycle management. Messages matching a stream's topic filter are automatically appended, enabling consumption with ordering guarantees and support for multiple consumers. Clients can subscribe to streams using the special topic format $s/<timestamp>/topic/filter to consume messages from a specific point in time.

Enhanced Namespace Capabilities

Configurations for namespace and isolation settings are now grouped together in the dashboard.
Expanded namespace functionality with namespaced metrics, authentication, and authorization.
Namespaced metrics are now available for messages, sessions, and data integration operations, exposed via Prometheus endpoints.
Built-in authentication and authorization backends now support namespace-specific users and rules, enabling better multi-tenant isolation.
Added automatic topic isolation using client namespaces as mountpoints.

New Data Integrations

AWS Timestream for InfluxDB connector
EMQX Tables connector
InfluxDB API v3 support for InfluxDB and AWS Timestream connectors
OAuth authentication for Kafka and Confluent Producer connectors
Parquet file support for Azure Blob Storage and S3 Actions in Aggregated mode

Certificate Management

Added centralized certificate management via HTTP API, allowing certificates to be managed independently and referenced in SSL options for listeners and connectors.

Enhancements

Message Queue and Streams

#16326 Implemented MQTT Streams.

MQTT Streams are durable collections of messages identified by a topic filter.
They have an explicit lifecycle, and any published message that matches the Stream's topic filter is automatically appended to the stream.
Streams allow consumption of messages with ordering guarantees and can be consumed multiple times.
To consume messages from a stream, clients can subscribe to a special topic of the form
$s/<timestamp>/topic/filter, where topic/filter refers to an existing Stream. Subscribing with a timestamp allows consumption to begin at a specific point in time. The timestamp may be a Unix timestamp in microseconds or one of two special values: earliest or latest.
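As a small illustration of the topic format described above, the following sketch builds a Stream subscription topic from a topic filter and a start position. The helper names stream_subscription_topic and unix_micros are invented for this example; only the $s/<timestamp>/topic/filter shape, the microsecond unit, and the earliest and latest values come from the release note.

```python
def unix_micros(seconds: int) -> int:
    """Convert whole Unix seconds to microseconds."""
    return seconds * 1_000_000


def stream_subscription_topic(topic_filter: str, start="earliest") -> str:
    """Build a '$s/<timestamp>/<topic filter>' subscription topic.

    `start` is either 'earliest', 'latest', or a Unix timestamp in microseconds.
    """
    if isinstance(start, str):
        if start not in ("earliest", "latest"):
            raise ValueError("start must be 'earliest', 'latest', or an integer")
        position = start
    else:
        position = str(int(start))
    return f"$s/{position}/{topic_filter}"


print(stream_subscription_topic("sensors/#"))
print(stream_subscription_topic("sensors/#", unix_micros(1_700_000_000)))
```

The resulting string would be passed as the topic of an ordinary MQTT SUBSCRIBE by whichever client library is in use.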
#16454 For Message Queues and Streams, a reconfigured garbage collection interval is now applied immediately. Previously, the new interval was applied only after the next garbage collection cycle.

Core MQTT Functionalities

#16099 Added a new rule engine event: $events/client/ping. This is triggered when a client sends a PINGREQ packet.

Access Control

#16132 Added an HTTP API to manage certificates in a centralized manner.
#16154 Added support for referencing managed certificate files in SSL options of listeners and clients.
#16266 Added a new authorization.include_mountpoint configuration. When enabled, topics will be prefixed by the listener's mountpoint before being evaluated by authorization backends.
#16272 Added support for specifying namespaced rules when using the built-in authorization backend. Now, MQTT clients that belong to a namespace will consider only their namespaced rules when authorizing actions.
#16345 Added support for specifying namespaced users when using the built-in authentication backend. Now, MQTT clients that belong to a namespace will consider only their namespaced data when authenticating.

Data Integration

#15905 Now, for the HTTP Action, the HTTP request timeout is taken to be the same as resource_opts.request_ttl. Previously, it was a fixed, non-configurable value of 30 seconds.
#16169 Updated our parquer dependency to support encoding timestamp Iceberg types to Parquet files.
#16179 Added support for writing Parquet files when using the Aggregated mode in Azure Blob Storage and S3 Actions.
#16267 Added a new Connector and Action that appends data to AWS Timestream for InfluxDB.
#16290 Added support for OAuth authentication when using Kafka and Confluent Producer Connectors.
#16316 Changed the default batch size and time for multiple actions. Actions that previously supported batch operations had their defaults increased, so that now batching is the default behavior for them.
#16372 Added support for InfluxDB API v3 to InfluxDB and AWS Timestream Connectors.
#16396 Added a new Connector and Action that appends data to EMQX Tables.

Durable Storage

#16136 Improved resource management and performance for durable storage.

Introduced a concept of durable storage database group. Certain resources (such as memtable size and disk usage quota) can be shared between the group members.

Added the following new metrics (per DB group):
- emqx_ds_disk_usage: Total size of SST files
- emqx_ds_write_buffer_memory_usage: RocksDB memtable size
- emqx_ds_total_trash_size: Disk usage by trash SST files
Added the following group configurations:
- durable_storage.db_groups.<group>.storage_quota: Soft quota for the SST files size
- durable_storage.db_groups.<group>.write_buffer_size: Maximum memtable size
- durable_storage.db_groups.<group>.rocksdb_nthreads_high and durable_storage.db_groups.<group>.rocksdb_nthreads_low: Size of RocksDB thread pools.
Added a new alarm that is raised when the quota is exceeded: db_storage_quota_exceeded:<DB>. Please refer to the "Storage Quota" section of the documentation for more details.

Default session checkpoint interval has been changed to 15s.
#16286 Optimized the default durable storage settings to reduce CPU load. This PR disables subscriptions for DBs that don't use them.

Namespace

#16211 Added initial support for namespaced metrics:
- Messages received
  - Count
  - Bytes
- Messages sent
  - Count
  - Bytes
- Number of sessions
- Data integration
  - Number of actions triggered
- DB records
  - Number of AuthN records
  - Number of AuthZ records
Clients in managed namespaces will bump the namespaced metrics above, as well as continue to bump the usual global metrics.

These metrics are exposed in Prometheus format to be scraped from the GET /prometheus/ns/stats endpoint. By specifying the ns=NAMESPACE query parameter, only data from NAMESPACE will be returned. Omitting this parameter causes data from all namespaces to be scraped. Namespaces are added as labels to metrics.
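A scrape URL for this endpoint can be assembled as in the sketch below. The API base URL used in the example (host, port, and path prefix) is an assumption about a particular deployment, and ns_stats_url is a hypothetical helper; only the /prometheus/ns/stats path and the ns query parameter come from the note above.

```python
from typing import Optional
from urllib.parse import urlencode


def ns_stats_url(api_base: str, ns: Optional[str] = None) -> str:
    """Build the namespaced-metrics scrape URL, optionally limited to one namespace."""
    url = api_base.rstrip("/") + "/prometheus/ns/stats"
    if ns is not None:
        # Restrict the scrape to a single namespace.
        url += "?" + urlencode({"ns": ns})
    return url


# Assumed base URL; adjust to the actual deployment.
base = "http://localhost:18083/api/v5"
print(ns_stats_url(base))             # all namespaces
print(ns_stats_url(base, "tenant-a")) # one namespace
```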
#16314 Now, global admin users will see resources from all namespaces (by default) when listing namespaced resources (connectors/sources/actions/rules). They may focus on one particular namespace when performing CRUD operations by passing the ns=NS query parameter. If they want to list only the global namespace resources, they omit ns and pass only_global=true query parameter. Namespaced resources now return the namespace field to denote where they come from, with namespace being null for global resources to distinguish them from a potential namespace called "global".
#16360 Added a GET /mt/ns/:ns/metrics endpoint that will return namespace-specific metrics in JSON format.
#16472 Added a new configuration option namespace_as_mountpoint to enable automatic topic isolation using client namespaces.

When enabled, EMQX uses the client's namespace (from client_attrs.tns) as a topic mountpoint if no mountpoint is configured on the listener.

Topics are automatically prefixed with the namespace for PUBLISH, SUBSCRIBE, UNSUBSCRIBE, and Will messages, and the prefix is stripped when delivering messages to clients.

This setting is ignored if the listener already has a mountpoint configured, ensuring existing configurations take precedence.
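The precedence and prefixing behavior described above can be sketched as follows. This is a simplified model rather than EMQX code: the function names are hypothetical, and the exact prefix format (here the namespace followed by a slash) is an assumption, since the release note does not spell it out.

```python
from typing import Optional


def effective_mountpoint(listener_mountpoint: Optional[str],
                         namespace: Optional[str],
                         namespace_as_mountpoint: bool) -> Optional[str]:
    """Pick the mountpoint: the listener's own setting always takes precedence."""
    if listener_mountpoint:
        return listener_mountpoint
    if namespace_as_mountpoint and namespace:
        return f"{namespace}/"  # ASSUMPTION: prefix format is "<namespace>/"
    return None


def mount(mountpoint: Optional[str], topic: str) -> str:
    """Prefix a topic on PUBLISH, SUBSCRIBE, UNSUBSCRIBE, or Will."""
    return topic if mountpoint is None else mountpoint + topic


def unmount(mountpoint: Optional[str], topic: str) -> str:
    """Strip the prefix again before delivering a message to the client."""
    if mountpoint is not None and topic.startswith(mountpoint):
        return topic[len(mountpoint):]
    return topic


mp = effective_mountpoint(None, "tenant-a", True)
internal = mount(mp, "sensors/1")
print(internal)               # topic as seen inside the broker
print(unmount(mp, internal))  # topic as seen by the client
```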

Observability

#16135 Added two new metrics and corresponding rates for the GET /monitor_current HTTP API: rules_matched and actions_executed. They track the number of rules that matched and action execution rate (i.e., success + failure), respectively.
#16213 Added MQTT client ID as a process label so crash logs (including max-heap and force-shutdown errors) now include the client ID for easier troubleshooting.

Performance

#16368 Upgraded the underlying runtime system from Erlang/OTP 27 to Erlang/OTP 28.
#16377 Reduced the number of pre-allocated metrics counters, which should contribute to reduced memory usage, especially in clusters using lots of namespaces.

MQTT over QUIC

#16133 MQTT over QUIC: Added support for connection probing using datagrams.

EMQX now supports zero-length datagram packets sent by clients to test connectivity. Clients can also send non-zero-length datagram packets, but they will be ignored by EMQX.

Bug Fixes

Core MQTT Functionalities

#16344 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.

Access Control

#16308 Fixed an issue where Multi-Factor Authentication (MFA) could not be enabled after upgrading EMQX from versions earlier than 5.3.0 due to incompatible login-user database records.
#16446 Fixed an issue with authenticator metrics when using SCRAM in which the 'Total' count would be incremented twice for each authentication attempt, and the 'Success' count would not be bumped.

Data Integration

#16265 The health check now verifies leader connectivity only for the partitions assigned to the current EMQX node, preventing unnecessary idle connections and false alarms.

Previously, the Kafka source connector checked leader connectivity for all partitions. In clustered deployments, each node owns only a subset of partitions, leaving connections to unassigned partition leaders idle. Because Kafka closes idle connections after a timeout (10 minutes by default), this could result in false connectivity alarms.
#16352 Upgraded Apache Pulsar client to 2.1.2. When a Pulsar producer action's batch_size is configured to 1, the producer now encodes single messages instead of single-element batches. This should allow consumers to share load using Key Share strategy.
#16383 Previously, when using IoTDB Connector with its RestAPI driver, credentials would not be checked during health checks. Now, we send a no-op query during IoTDB connector health-check. This enables early detection of misconfigured client credentials.

Message Queue

#16270 Fixed a shutdown handling issue in the EMQX message queue consumer.

Clustering

#16453 Upgraded gen_rpc to 3.5.1.

Prior to the gen_rpc upgrade, EMQX could produce a long tail of crash logs due to connect timeouts when a peer node was unreachable. The new version of gen_rpc no longer has the long tail and converts crash logs to more readable error logs.
The frequent "failed_to_connect_server" log is also throttled to avoid spamming.

Cluster Linking

#16269 Fixed an issue in the Cluster Link route replication protocol recovery sequence where re-bootstrapping was incorrectly skipped even though the remote side needed it.
#16317 Fixed an issue in Cluster Link garbage-collection logic that could accidentally remove live routes from the internal routing table in the process of cleaning up stale route replication state. This problem occurred only when multiple independent Cluster Links were set up, and some of these links went down for relatively long periods of time.

Observability

#16417 Reduced the volume of logs generated when a resource exception occurs (resource_exception). These logs are now throttled, and some potentially large terms are redacted from them.
#16434 Previously, using the HTTP API to force deactivate an alarm would not clear it from all nodes. Now, clearing an alarm name will clear it from all nodes.

Gateway

#16425 Improved the returned errors when creating or updating a Gateway via the HTTP API.

Miscellaneous

#16397 Added TLS certificate validation before listener start. Fail-fast if listener is misconfigured with invalid certificates.
#16311 Updated error codes to correct terminology from misspelled 'REST_FAILED' to 'RESET_FAILED'.

Breaking Changes

#16368 The internal regular expression engine has been upgraded to PCRE2, providing improved matching performance and stricter syntax enforcement.

If you use the regex_match, regex_replace, or regex_extract functions in Rule Engine SQL, some existing regular expressions that relied on lenient or undefined behavior may no longer compile or match as expected.

Key changes to be aware of include:
- Stricter escaping rules: Invalid or unnecessary escape sequences that were previously ignored are now treated as errors.
  - Broken: [\w-\.], escaping . inside a character class is unnecessary and no longer accepted; only metacharacters require escaping.
  - Broken: \x without valid hexadecimal digits (for example, \xGG) now causes a compilation error instead of being interpreted as a literal x.
- Stricter group name validation: Regular expressions with duplicate or empty named capture groups are no longer permitted.
Action required: Review and validate all Rule Engine SQL definitions that use regular expressions. For complex patterns, verify compatibility with a PCRE2-compliant tester (most online regex tools support PCRE2) or test thoroughly in a staging environment before upgrading.
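As a starting point for that review, a short script can list which rules call the affected functions. The sketch below assumes the rule SQL has already been exported into a dictionary keyed by rule ID (how it is exported is left open), and rules_to_review is a hypothetical helper. It only flags rules for manual inspection; it does not check PCRE2 compatibility itself, and Python's re module is used here purely to find the function calls.

```python
import re
from typing import Dict, List

REGEX_FUNCS = ("regex_match", "regex_replace", "regex_extract")
# Matches a call to any of the affected Rule Engine SQL functions.
_CALL = re.compile(r"\b(" + "|".join(REGEX_FUNCS) + r")\s*\(")


def rules_to_review(rules: Dict[str, str]) -> Dict[str, List[str]]:
    """Return {rule_id: [regex functions used]} for rules needing manual review."""
    flagged = {}
    for rule_id, sql in rules.items():
        used = sorted(set(_CALL.findall(sql)))
        if used:
            flagged[rule_id] = used
    return flagged


example_rules = {
    "r1": "SELECT regex_match(clientid, '^dev-[0-9]+$') AS ok FROM 't/#'",
    "r2": "SELECT * FROM 't/#'",
}
print(rules_to_review(example_rules))
```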


e5.9.2 Breaking risk 8mo

Security fixes

Fixed TLS connection race condition during certificate renewal
Fixed OIDC SSO login in multi-node clusters

Notable features

S3 IAM role support without manual credentials
Kafka API support expansion
HTTP request timeout configuration

Full changelog

Enhancements

Core MQTT Functionalities

#15773 Throttled client ID registration during reconnects.
- When a previous session cleanup is still in progress, new connections using the same client ID are now throttled. This prevents instability when clients reconnect aggressively.
- Affected clients receive reason code 137 (Server Busy) in the CONNACK with Reason-String "THROTTLED", and should retry after the cleanup completes.
- Fixed the reason code returned when another connection registers the same client ID; now correctly returns 137 instead of 133.
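On the client side, one way to handle the throttling response is to retry with an increasing delay. The sketch below is library-agnostic and makes several assumptions: connect is a caller-supplied function that attempts a connection and returns the CONNACK reason code, and the exponential backoff policy and its parameters are illustrative choices, not something the release note prescribes.

```python
import time
from typing import Callable

SERVER_BUSY = 137  # CONNACK reason code returned while the client ID is throttled


def connect_with_backoff(connect: Callable[[], int],
                         max_attempts: int = 5,
                         base_delay: float = 0.5,
                         sleep: Callable[[float], None] = time.sleep) -> int:
    """Retry `connect` while the broker answers Server Busy; return the last reason code."""
    delay = base_delay
    reason_code = SERVER_BUSY
    for attempt in range(max_attempts):
        reason_code = connect()
        if reason_code != SERVER_BUSY:
            return reason_code
        if attempt < max_attempts - 1:
            sleep(delay)  # give the previous session cleanup time to finish
            delay *= 2    # back off instead of reconnecting aggressively
    return reason_code


# Simulated broker: busy twice, then accepts the connection (reason code 0).
replies = iter([137, 137, 0])
print(connect_with_backoff(lambda: next(replies), sleep=lambda _: None))
```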

Data Integration

#15542 Upgraded our erlcloud library to 3.8.3.0. This allows one to set up an S3 Connector without specifying Access Key Id and Secret Access Key, so long as the EC2 instance EMQX is running in has the correct IAM permissions to read/write to the configured bucket(s).
#15585 Updated the brod client to version 4.4.4, expanding support for a wider range of Kafka APIs. This update addresses the deprecation of JoinGroups API versions v0 - v1.
#15845 The static_clientids configuration for the MQTT Connector now supports specifying a username and password for each client ID. This is particularly useful for scenarios like connecting to Azure IoT Hub, where each device (client ID) requires a unique set of credentials. This enhancement helps ensure successful connections across multiple nodes in a clustered environment.
#15911 The HTTP request timeout for the HTTP Action is now configurable via the resource_opts.request_ttl setting. Previously, this timeout was fixed at 30 seconds and could not be adjusted.

Observability

#15499 Added a force deactivate alarm API endpoint to allow administrators to forcibly deactivate active alarms.
#15944 Improved the information returned when a resource is marked as disconnected for the following Connectors: LDAP, Syskeeper, IoTDB, Snowflake (aggregated), JWKS Authentication.

Performance

#15536 Disabled the node.global_gc_interval configuration by default.
#15539 Optimized Erlang VM parameters to improve performance and stability:
- Increased buffer size for distributed channels to 32 MB (+zdbbl 32768) to prevent busy_dist_port alarms during intensive Mnesia operations.
- Disabled scheduler busy-waiting (+sbwt none +sbwtdcpu none +sbwtdio none) to lower CPU usage reported by the operating system.
- Set scheduler binding type to db (+stbt db) to reduce message latency.
#15907 Improved system memory usage.
- Authorization (authz) cache is now cleared immediately when a client disconnects, reducing unnecessary memory consumption.
- Fields such as client ID, username, password, and topic are now copied into new binaries when they exceed 64 bytes, instead of being kept as slices of the raw packet. This reduces the 'binary' portion of Erlang VM memory usage.
#15949 Changed the default value of the parse_unit option in listener configuration from chunk to frame. This change can significantly reduce CPU usage when the payload size exceeds the socket buffer (default is 4 KB).

Note: With parse_unit = frame, if a PUBLISH packet exceeds the maximum allowed size, EMQX will close the connection instead of sending a DISCONNECT packet.
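
The effect of frame-level parsing can be sketched as follows. This is a minimal Python illustration, not EMQX's parser: it decodes the Remaining Length field of the MQTT fixed header so that the full frame size is known before the payload arrives, which is what makes an early size check possible. The function names and the max_packet_size parameter are assumptions made for this example.

```python
from typing import Optional, Tuple


def decode_remaining_length(buf: bytes) -> Optional[Tuple[int, int]]:
    """Decode the MQTT variable byte integer that follows the first header byte.

    Returns (remaining_length, header_size), or None if more bytes are needed.
    Raises ValueError if the encoding would need more than four bytes.
    """
    value = 0
    multiplier = 1
    for i in range(1, 5):  # the length field occupies at most bytes 1..4
        if i >= len(buf):
            return None  # fixed header is not complete yet
        byte = buf[i]
        value += (byte & 0x7F) * multiplier
        if byte & 0x80 == 0:  # continuation bit clear: last length byte
            return value, i + 1
        multiplier *= 128
    raise ValueError("malformed remaining length")


def frame_decision(buf: bytes, max_packet_size: int) -> str:
    """Classify a receive buffer as 'need_more', 'too_large', or 'complete'."""
    decoded = decode_remaining_length(buf)
    if decoded is None:
        return "need_more"
    remaining, header_size = decoded
    total = header_size + remaining
    if total > max_packet_size:
        # Hypothetical policy mirroring the note above: give up on the connection.
        return "too_large"
    return "complete" if len(buf) >= total else "need_more"
```

In this sketch an oversized frame is detected as soon as the length bytes are available, without buffering the payload.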
#16165 Optimized the performance of the GET /clients_v2 API. Previously, when the cluster had around 50,000 clients or more, API calls to retrieve the client list could be extremely slow or even time out.

Bug Fixes

Core MQTT Functionalities

#15884 Resolved an issue where, in rare cases, the global routing table could indefinitely retain routing information for nodes that had long left the cluster.
#15518 Resolved a race condition that may lead to accumulating inconsistencies in the routing table and shared subscriptions state in the cluster when a large number of shared subscribers disconnect simultaneously.
#15872 Eliminated the unclean_terminate warning log emitted when a client is disconnected after a CONNACK with a non-zero reason code has been sent.

Deployment

#15553 Fixed an issue in the Helm chart where deploying EMQX with default values started multiple replicas and caused all nodes except one to crash. The chart now defaults to a single replica, since clustered deployments require a Commercial License.
#15580 Added a new emqxLicenseSecretRef variable to the EMQX Enterprise Helm chart. This allows users to specify a Kubernetes Secret containing the EMQX license key, so the license is applied automatically.

This replaces the non-functional emqxLicenseSecretName variable, which created and mounted a secret file but did not pass the license to EMQX.
#15712 Fixed a node boot-up failure during rolling upgrades from older versions (before 5.9).

In previous EMQX versions (before 5.9), a bug in the ZIP timestamp encoder could store an invalid “seconds” value in archive entries (values corresponding to the 30th or 31st 2-second slot in DOS time format).
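
To make the invalid "seconds" value concrete, here is a minimal Python sketch of the 16-bit MS-DOS time layout used in ZIP entries (5 bits of hours, 6 bits of minutes, 5 bits of 2-second slots). It is an illustration only, not the EMQX encoder, and the function names are hypothetical.

```python
def encode_dos_time(hour: int, minute: int, second: int) -> int:
    """Pack a time of day into the 16-bit MS-DOS time format."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 59):
        raise ValueError("time of day out of range")
    # Seconds are stored at 2-second resolution in the low 5 bits.
    return (hour << 11) | (minute << 5) | (second // 2)


def decode_dos_time(value: int) -> tuple:
    """Unpack (hour, minute, second) without validating the fields."""
    return (value >> 11) & 0x1F, (value >> 5) & 0x3F, (value & 0x1F) * 2


def has_valid_seconds(value: int) -> bool:
    """The 5-bit field can hold 0..31, but only slots 0..29 map to 0..58 seconds."""
    return (value & 0x1F) <= 29
```

Slots 30 and 31 decode to 60 and 62 seconds, which is why a strict reader may reject such an entry.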
#15863 Fixed the license quota alarm message to correctly reflect session quotas instead of live connections.

Security

#15581 Upgraded Erlang/OTP version from 26.2.5.2 to 26.2.5.14. This upgrade includes two TLS-related fixes from OTP that affect EMQX:
- Fixed a crash in TLS connections caused by a race condition during certificate renewal.
- Added support for RSA certificates signed with RSASSA-PSS parameters. Previously, such certificates could cause TLS handshakes to fail with a bad_certificate / invalid_signature error.
#16237 Fixed an issue where OIDC SSO–related logs might still be printed even after SSO was disabled.
#16217 Fixed an issue where the OIDC login callback could fail to locate the user session in multi-node cluster environments.

Access Control

#15818 Corrected handling of {allow|deny, all} ACL rules.

Previously, these rules were internally translated to match #, which incorrectly failed to match topics prefixed with $ (e.g. $testtopic/1) due to MQTT spec restrictions.
Now, a special internal value is used to ensure {allow|deny, all} rules correctly match any topic, including $-prefixed ones.
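
The difference can be sketched in Python. This is a simplified matcher written for illustration, not EMQX's implementation; ALL_TOPICS is a hypothetical stand-in for the special internal value mentioned above.

```python
ALL_TOPICS = object()  # hypothetical sentinel meaning "any topic at all"


def mqtt_match(topic: str, topic_filter: str) -> bool:
    """Match a topic name against a filter using MQTT wildcard rules."""
    t_levels = topic.split("/")
    f_levels = topic_filter.split("/")
    # A filter that starts with a wildcard must not match "$"-prefixed topics.
    if topic.startswith("$") and f_levels[0] in ("#", "+"):
        return False
    for i, f in enumerate(f_levels):
        if f == "#":
            return True  # multi-level wildcard matches all remaining levels
        if i >= len(t_levels):
            return False
        if f != "+" and f != t_levels[i]:
            return False
    return len(t_levels) == len(f_levels)


def rule_matches(topic: str, rule_topic) -> bool:
    """Apply a rule whose topic is either a filter string or the ALL_TOPICS sentinel."""
    if rule_topic is ALL_TOPICS:
        return True
    return mqtt_match(topic, rule_topic)
```

With a rule translated to `#`, `rule_matches("$testtopic/1", "#")` is false, whereas the sentinel matches every topic, including `$`-prefixed ones.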
#15844 Added validation to forbid adding empty usernames to the built-in database authenticator. Such users could not be deleted later via the HTTP API, because an empty username produces an invalid API path.

If you have such a user and wish to delete it, run the following in an EMQX console:
```
mria:transaction(emqx_authn_shard, fun() -> mnesia:delete(emqx_authn_mnesia, {'mqtt:global',<<>>}, write) end).
```
#15899 Improved memory management by ensuring that the authorization (authz) cache is cleared immediately when a client disconnects, reducing unnecessary memory consumption.

#16081 Fixed an issue where clients using extended authentication and memory-based sessions could crash with a session_stepdown_request_exception caused by a calling_self error.

Example error log

2025-09-24T07:13:08.973954+08:00 [error] clientid: someclientid, msg: session_stepdown_request_exception, peername: 127.0.0.1:41782, username: admin, error: exit, reason: calling_self, stacktrace: [{gen_server,call,3,[{file,"gen_server.erl"},{line,1222}]},{emqx_cm,request_stepdown,4,[{file,"emqx_cm.erl"},{line,427}]},{emqx_cm,do_takeover_begin,2,[{file,"emqx_cm.erl"},{line,398}]},{emqx_cm,takeover_session,2,[{file,"emqx_cm.erl"},{line,384}]},{emqx_cm,takeover_session_begin,2,[{file,"emqx_cm.erl"},{line,305}]},{emqx_session_mem,open,4,[{file,"emqx_session_mem.erl"},{line,210}]},{emqx_session,open,3,[{file,"emqx_session.erl"},{line,263}]},{emqx_cm,'-open_session/4-fun-1-',4,[{file,"emqx_cm.erl"},{line,290}]},{emqx_cm_locker,trans,2,[{file,"emqx_cm_locker.erl"},{line,32}]},{emqx_channel,post_process_connect,2,[{file,"emqx_channel.erl"},{line,575}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,852}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,470}]},{emqx_connection,process_msgs,2,[{file,"emqx_connection.erl"},{line,462}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,406}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}], action: {takeover,'begin'}, ...

Data Integration

#15616 Kafka connections are now considered healthy even if a topic_authorization_failed error is returned for the default probing topic.
#15826 Improved Kafka consumer connector health check behavior with restricted ACLs. Previously, Kafka Consumer Connector health checks could fail if the configured user lacked permission to access the internal ____emqx_consumer_probe consumer group used for the check. With this fix, if the Kafka broker returns an "ACL denied" response, EMQX will treat the connection as healthy.
#15827 Fixed atom and process leaks in the GreptimeDB driver.

Fixed a function_clause error that could arise if certain incorrect write syntaxes were used in GreptimeDB Actions.
#15836 Enriched the returned information when a Kafka Consumer Source fails to be added, for example, due to denied topic ACLs.
#15850 Fixed an issue where the MQTT bridge incorrectly showed a stale connection as Connected, and failed to re-establish the connection.
#15866 Upgraded the Kafka producer library Wolff to 4.0.12 to improve handling of temporarily missing partitions in Kafka metadata responses.

In rare race conditions, Kafka may return an incomplete partition list. Previously, this was only handled when a topic was recreated with fewer partitions, but not when partitions were temporarily missing. This gap could cause the partition producer to stall and block shutdown indefinitely.
#15906 Upgraded Kafka producer library Wolff from 4.0.12 to 4.0.13, which adds handling for the record_list_too_large error in ProduceResponse.
#15902 Upgraded the MQTT client library to 1.13.8. This improves MQTT bridge connectivity:
- The Connector automatically reconnects when the peer broker does not reply with PINGRESP.
- A TLS bridge failure is handled more promptly if the connection breaks while waiting for CONNACK.
#15910 Fixed an issue with Connectors where a pool of workers could fail to recover from a failure if multiple workers crashed simultaneously in large worker pools.

Connectors affected and fixed:
- MySQL
- PostgreSQL
- Oracle
- SQLServer
- TDEngine
- Cassandra
- Dynamo
- HTTP
- Couchbase
- GCP PubSub
- Snowflake
Upgraded gun and related dependencies to 2.1.0.

#16010 Fixed an issue where a Republish Fallback Action could fail with a function_clause error if the originating rule's SQL did not include the metadata field from the rule environment.

Example error log:

[error] tag: RESOURCE, msg: failed_to_trigger_fallback_action, reason: {error,function_clause}, fallback_kind: republish, primary_action_resource_id: <<"action:type:name:connector:type:name">>, republish_topic: <<"republish/topic">>

#16043 Improved log details for Kafka data integration when a not_all_kafka_partitions_connected event occurs.
#16046 Fixed a potential out-of-memory (OOM) crash when loading or restarting a configuration containing a Connector with several hundred Actions.
#16138 Fixed a Redis cluster failover issue that could cause the Connector to remain stuck in a "connecting" state.

Previously, EMQX’s Redis cluster client only refreshed the cluster topology when regular queries (such as GET) failed. However, failures in periodic PING commands did not trigger a refresh. As a result, after a failover, the connector could continue using the outdated cluster topology if no other commands were issued, preventing recovery.

With this fix, failed PING responses now trigger a cluster topology refresh, ensuring that the connector can detect failovers and recover promptly.
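
The behavioral change can be modeled with a toy Python class. This is a sketch under stated assumptions, not the Redis cluster client used by EMQX: the class, its callback, and the refresh_on_ping_failure flag are all hypothetical and exist only to contrast the old and new behavior.

```python
class ClusterClient:
    """Toy model of a cluster client that caches the cluster topology."""

    def __init__(self, fetch_topology):
        self._fetch_topology = fetch_topology  # hypothetical topology lookup
        self.refresh_count = 0
        self.topology = fetch_topology()

    def _refresh(self):
        self.topology = self._fetch_topology()
        self.refresh_count += 1

    def on_query_result(self, ok: bool):
        # Failed regular queries triggered a refresh before and after the fix.
        if not ok:
            self._refresh()

    def on_ping_result(self, ok: bool, refresh_on_ping_failure: bool = True):
        # Passing False models the previous behavior: a failed PING was ignored.
        if not ok and refresh_on_ping_failure:
            self._refresh()
```

With the flag set to False, an idle connector in this model keeps the stale topology after a failover until some other command fails.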

Rule Engine

#16028 Fixed rule engine jq function memory leak.

Previously, using the jq built-in function index (e.g. .key | index("name")) resulted in a memory leak.

Smart Data Hub

#15706 Fixed an indexing issue that could cause Message Transformations and Schema Validations to behave inconsistently. Deleting one item could corrupt the topic index, so that a subsequent item remained active even after being disabled.
#15708 Fixed an issue where external schema registries were not reloaded after a node restart.
#15810 Introduced spb_{en,de}code functions to correctly handle bytes_value Metrics. The original sparkplug_{en,de}code functions did not base64 encode/decode bytes_value metric values as required by the Protobuf specification. The new spb_{en,de}code functions encode and decode these fields correctly. The old sparkplug_{en,de}code functions are deprecated but retained for backward compatibility.
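
As an illustration of the base64 requirement, the Python sketch below round-trips a bytes field through a JSON representation. The dictionary layout is a simplified assumption, not the full Sparkplug B schema or the output of the EMQX functions, and the helper names are hypothetical.

```python
import base64
import json


def bytes_metric_to_json(name: str, raw: bytes) -> str:
    """Serialize a metric whose bytes_value is carried as a base64 string."""
    metric = {"name": name, "bytes_value": base64.b64encode(raw).decode("ascii")}
    return json.dumps(metric)


def bytes_metric_from_json(text: str) -> bytes:
    """Recover the original bytes from the base64-encoded bytes_value field."""
    return base64.b64decode(json.loads(text)["bytes_value"])
```

Arbitrary binary data is not valid JSON text on its own, so the base64 step is what lets the value survive the round trip unchanged.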

Observability

#15639 Fixed incorrect counting of the packets.subscribe.auth_error metric.
#15785 Resolved a crash that occurred when MQTT usernames containing non-ASCII characters were used in formatting network congestion alarm messages.
#15963 Reduced excessive audit log entries generated during looped evaluations in the remote shell (remsh).
#15967 Fixed an issue where Mnesia transaction blocking during the cleanup of large volumes of audit logs could lead to rapid memory growth.

Gateway

#15679 Fixed incorrect global chain names for the ExProto, JT/T 808, GB/T 32960, and OCPP gateways. Built-in authentication data for these gateways was previously grouped under unknown:global, causing conflicts between gateways.
#15699 Fixed an issue where built-in authentication data for gateways (e.g., CoAP) was incorrectly removed when a node was stopped or restarted.
#15822 Fixed an issue where the OCPP connection would crash after sending a certain number of messages.

Rate Limit

#15794 Improved the behavior of connection rate limit updates to ensure that changes (e.g., to burst rate or rate thresholds) are applied immediately after the listener configuration is updated. Previously, parts of the internal limiter state were not refreshed correctly, which could result in rate limits appearing stricter than configured.

ExHook

#15683 Fixed ExHook TLS options so that gRPC clients can correctly verify the server hostname during the TLS handshake.

Breaking Changes

#15753 Listener connection rate limits (max_conn_rate and max_conn_burst) are now enforced per listener rather than per acceptor, restoring the behavior before 5.9.0.

As a result, configurations from versions 5.9.0 and 5.9.1 are incompatible: the specified rate values must be scaled up by the number of acceptors configured for each listener to preserve the same effective limits.
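
As a worked example of this scaling, the sketch below multiplies a per-acceptor rate by the number of acceptors. It assumes rates are written as a count and a period separated by a slash, such as 100/s; the helper name is hypothetical and not part of EMQX.

```python
def scale_rate_for_listener(rate: str, acceptors: int) -> str:
    """Convert a per-acceptor rate (5.9.0/5.9.1 semantics) to a per-listener rate."""
    if acceptors < 1:
        raise ValueError("acceptors must be a positive integer")
    count, sep, period = rate.partition("/")
    if not sep or not period:
        raise ValueError("expected a rate such as '100/s'")
    # Each acceptor previously enforced the limit on its own, so the
    # effective listener-wide limit was count * acceptors.
    return f"{int(count) * acceptors}/{period}"
```

For instance, a listener with 16 acceptors and max_conn_rate set to 100/s on 5.9.0 or 5.9.1 would need 1600/s after the upgrade to keep the same effective limit; max_conn_burst would be scaled the same way.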
#16062 Fixed an issue where RocketMQ actions ignored the configured payload template and sent the entire rule output instead.

If you relied on the previous (incorrect) behavior, you may need to update your payload templates to ensure messages are formatted as expected.
#16284 Stopped releasing packages for macOS 13 and CentOS 7.


e6.0.1 Breaking risk 8mo

Security fixes

TLS certificate garbage collection now preserves actively used certificates
Fixed RSA signature verification with missing default configurations

Notable features

Message Queue configuration options and auto-creation
GreptimeDB ingester v0.2.3 upgrade with row-based gRPC
Optimized GET /clients_v2 API performance

Full changelog

Enhancements

Message Queue

#16080 Added a configuration option to disable the Message Queues feature. Disabling Message Queues can slightly reduce the resource usage in the cluster. When Durable Sessions are also disabled, EMQX avoids maintaining Durable Storage, further reducing administrative overhead and improving performance.
#16096 Added support for automatic creation of message queues when clients subscribe to non-existent $q/ topics. New configuration options are available to enable auto-creation for both regular and last-value semantics queues.
#16097 Optimized message writing to regular message queues by replacing transactional appends with dirty append functions. For QoS 0 messages, asynchronous append operations are now used. These changes significantly improve the performance of message insertion into regular queues.
#16098 Added a maximum queue count configuration option to limit the total number of message queues in the system.
#16152 Introduced per-queue limits for maximum message count and total message size. Also added new metrics to monitor message append latency and help diagnose performance or queue-limiting issues.

Data Integration

#16121 Upgraded the GreptimeDB ingester client to v0.2.3, which fixes several bugs and introduces support for row-based gRPC protocol (the column-based protocol is now deprecated).

Additionally, updated the CI image to the latest stable version of GreptimeDB.
#16127 Fixed an invalid string value issue in the GreptimeDB connector, following the changes introduced in #16121.

Performance

#15949 Changed the default value of the parse_unit option in listener configuration from chunk to frame. This change can significantly reduce CPU usage when the payload size exceeds the socket buffer (default is 4 KB).

Note: With parse_unit = frame, if a PUBLISH packet exceeds the maximum allowed size, EMQX will close the connection instead of sending a DISCONNECT packet.
#16165 Optimized the performance of the GET /clients_v2 API. Previously, when the cluster had around 50,000 clients or more, API calls to retrieve the client list could be extremely slow or even time out.

Bug Fixes

Core MQTT Functionalities

#15884 Resolved an issue where, in rare cases, the global routing table could indefinitely retain routing information for nodes that had long left the cluster.
#15518 Resolved a race condition that may lead to accumulating inconsistencies in the routing table and shared subscriptions state in the cluster when a large number of shared subscribers disconnect simultaneously.

Upgrade

#16047 Added support to perform rolling upgrade from EMQX Enterprise base version 5.8.0 and newer to 6.0. During the upgrade, legacy configurations are automatically migrated to the new format supported in 6.0. Specifically, the deprecated bridges configuration root is converted into the new connectors, sources, and actions roots.

However, the GCP PubSub Consumer and Kafka Consumer sources will still require manual changes. If any source configuration still includes the deprecated topic_mapping field, it must be removed. Then, for each entry previously defined in topic_mapping, a separate "Source + Rule" pair must be created manually.

Security

#16156 Fixed an issue where some dependencies were missing default configurations compared to EMQX 5.10, potentially causing RSA signature verification failures. The missing defaults could lead to errors, such as the following log message:
```
{sign_unsupported,[[{rsa_padding,rsa_pkcs1_padding}]]}, [{jose_jwa_unsupported,verify,5,[{file,"src/jwa/jose_jwa_unsupported.erl"},{line,55}]}
```
#16175 Fixed an issue with periodic TLS certificate garbage collection. Previously, the garbage collection process incorrectly deleted certificate files that were actively used by configurations in managed namespaces.

Access Control

#16081 Fixed an issue where clients using extended authentication and memory-based sessions could crash with a session_stepdown_request_exception caused by a calling_self error.

Example error log

2025-09-24T07:13:08.973954+08:00 [error] clientid: someclientid, msg: session_stepdown_request_exception, peername: 127.0.0.1:41782, username: admin, error: exit, reason: calling_self, stacktrace: [{gen_server,call,3,[{file,"gen_server.erl"},{line,1222}]},{emqx_cm,request_stepdown,4,[{file,"emqx_cm.erl"},{line,427}]},{emqx_cm,do_takeover_begin,2,[{file,"emqx_cm.erl"},{line,398}]},{emqx_cm,takeover_session,2,[{file,"emqx_cm.erl"},{line,384}]},{emqx_cm,takeover_session_begin,2,[{file,"emqx_cm.erl"},{line,305}]},{emqx_session_mem,open,4,[{file,"emqx_session_mem.erl"},{line,210}]},{emqx_session,open,3,[{file,"emqx_session.erl"},{line,263}]},{emqx_cm,'-open_session/4-fun-1-',4,[{file,"emqx_cm.erl"},{line,290}]},{emqx_cm_locker,trans,2,[{file,"emqx_cm_locker.erl"},{line,32}]},{emqx_channel,post_process_connect,2,[{file,"emqx_channel.erl"},{line,575}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,852}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,470}]},{emqx_connection,process_msgs,2,[{file,"emqx_connection.erl"},{line,462}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,406}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}], action: {takeover,'begin'}, ...

Clustering

#16123 Fixed a bug in the component managing Mria replication that could cause cluster joins to hang or remain incomplete in core-replicant clusters.

During cluster changes involving adding new core nodes, those new core nodes could sometimes fail to start replication-related processes required by replicants. As a result, upgraded or newly added replicants could hang during startup.

In Kubernetes deployments, this often caused readiness probes to fail, leading the controller to repeatedly restart the affected replicant pods.

This issue typically affected upgrade rollouts involving the addition of new core and replicant nodes. For example, adding two cores and two replicants (running a newer EMQX version) to an existing cluster with 2 cores and 2 replicants.

Rule Engine

#16028 Fixed rule engine jq function memory leak.

Previously, using the jq built-in function index (e.g. .key | index("name")) resulted in a memory leak.

Data Integration

#16010 Fixed an issue where a Republish Fallback Action could fail with a function_clause error if the originating rule's SQL did not include the metadata field from the rule environment.

Example error log:

[error] tag: RESOURCE, msg: failed_to_trigger_fallback_action, reason: {error,function_clause}, fallback_kind: republish, primary_action_resource_id: <<"action:type:name:connector:type:name">>, republish_topic: <<"republish/topic">>

#16046 Fixed a potential out-of-memory (OOM) crash when loading or restarting a configuration containing a Connector with several hundred Actions.
#16140 Fixed a Redis cluster failover issue that could cause the Connector to remain stuck in a "connecting" state.

Previously, EMQX’s Redis cluster client only refreshed the cluster topology when regular queries (such as GET) failed. However, failures in periodic PING commands did not trigger a refresh. As a result, after a failover, the connector could continue using the outdated cluster topology if no other commands were issued, preventing recovery.

With this fix, failed PING responses now trigger a cluster topology refresh, ensuring that the connector can detect failovers and recover promptly.

MQTT Durable Sessions

#16105 Optimized durable storage performance. In particular, this change reduces the latency of CONNACK for clients using a durable session.
#16129 Durable storage transaction configuration can now be changed at runtime. Previously, changing this configuration required a node restart.

Observability

#15963 Reduced excessive audit log entries generated during looped evaluations in the remote shell (remsh).
#15967 Fixed an issue where Mnesia transaction blocking during the cleanup of large volumes of audit logs could lead to rapid memory growth.

#16060 Fixed a logger formatter crash that could occur for some debug-level log messages containing deeply nested terms with non-ASCII characters.

Example error log

2025-09-29T06:55:34.120640+00:00 debug: FORMATTER CRASH: {report,#{request => #{messages => [#{role => <<"user">>,content => <<"{\"msg\": \"hello\"}">>}],system => <<"将输入的 JSON 数据中，值为数字的 value 相加起来，并输出，只需返回输出结果。"/utf8>>,model => <<"claude-3-haiku-20240307">>,max_tokens => 100},msg => emqx_ai_completion_request}}
2025-09-29T06:55:34.120780+00:00 [debug] formatter_crashed: emqx_logger_textfmt, config: #{time_offset => [],chars_limit => unlimited,depth => 100,single_line => true,template => ["[",level,"] ",msg,"\n"],with_mfa => false,timestamp_format => auto,payload_encode => text}, log_event: #{meta => #{line => 44,pid => <0.281254.0>,time => 1759128934120640,file => "emqx_ai_completion_anthropic.erl",gl => <0.4317.0>,mfa => {emqx_ai_completion_anthropic,call_completion,3},report_cb => fun logger:format_otp_report/1,matched => <<"t/1">>,namespace => global,clientid => <<"c_emqx">>,trigger => <<"t/1">>,rule_id => <<"r1sczoo0">>,rule_trigger_ts => [1759128934120]},msg => {report,#{request => #{messages => [#{role => <<"user">>,content => <<"{\"msg\": \"hello\"}">>}],system => <<"将输入的 JSON 数据中，值为数字的 value 相加起来，并输出，只需返回输出结果。"/utf8>>,model => <<"claude-3-haiku-20240307">>,max_tokens => 100},msg => emqx_ai_completion_request}},level => debug}, reason: {error,badarg,[{erlang,iolist_to_binary,[["[",[["messages",": ",[[91,[[35,123,[["role"," => ",[60,60,"\"user\"",62,62]],44,["content"," => ",[60,60,"\"{\\\"msg\\\": \\\"hello\\\"}\"",62,62]]],125]],93]]],", ",["system",": ","将输入的 JSON 数据中，值为数字的 value 相加起来，并输出，只需返回输出结果。"],", ",["model",": ","claude-3-haiku-20240307"],", ",["max_tokens",": ","100"]],"]"]],[{error_info,#{module => erl_erts_errors}}]},{emqx_trace_formatter,format_term,2,[{file,"emqx_trace_formatter.erl"},{line,126}]},{emqx_logger_textfmt,format_term,2,[{file,"emqx_logger_textfmt.erl"},{line,230}]},{emqx_logger_textfmt,try_encode_meta,4,[{file,"emqx_logger_textfmt.erl"},{line,206}]},{lists,foldl_1,3,[{file,"lists.erl"},{line,2151}]},{emqx_logger_textfmt,enrich_report,3,[{file,"emqx_logger_textfmt.erl"},{line,102}]},{emqx_logger_textfmt,format,2,[{file,"emqx_logger_textfmt.erl"},{line,24}]}]}

#16134 Fixed a backward compatibility issue that could prevent new Log Traces from being created in some cases.

Rate Limit

#16160 Improved the rate limiting algorithm for individual client connections. Previously, clients could temporarily exceed their publish rate limits, particularly just after connecting or after periods of inactivity.

This update makes the limiter behavior more predictable and consistent, ensuring rate limits are correctly enforced from the start of a connection.

Breaking Changes

#16061 Fixed an issue where RocketMQ actions ignored the configured payload template and sent the entire rule output instead.

If you relied on the previous (incorrect) behavior, you may need to update your payload templates to ensure messages are formatted as expected.


e5.10.2 Breaking risk 8mo

Breaking changes

RocketMQ actions now respect payload templates instead of sending entire rule output; existing templates may need updating

Notable features

Kafka API version deprecation updates
GreptimeDB custom timestamp column support

Full changelog

Enhancements

Data Integration

#16183 EMQX now logs messages about dropped expired messages (buffer_worker_dropped_expired_messages) at the warning level, and throttles such messages per resource ID. This helps identify when specific external resources are not keeping up with incoming message rates, potentially leading to message drops.
#16206 Added the allow_auto_topic_creation configuration option to the Kafka Producer Connector. When enabled, EMQX allows Kafka to automatically create a topic if it doesn’t exist when a client sends a metadata fetch request.
#16209 Added support for specifying a custom timestamp column name (ts_column) parameter to GreptimeDB Connector.

Performance

#15949 Changed the default value of the parse_unit option in listener configuration from chunk to frame. This change can significantly reduce CPU usage when the payload size exceeds the socket buffer (default is 4 KB).

Note: With parse_unit = frame, if a PUBLISH packet exceeds the maximum allowed size, EMQX will close the connection instead of sending a DISCONNECT packet.
#16165 Optimized the performance of the GET /clients_v2 API. Previously, when the cluster had around 50,000 clients or more, API calls to retrieve the client list could be extremely slow or even time out.

Bug Fixes

Core MQTT Functionalities

#15884 Resolved an issue where, in rare cases, the global routing table could indefinitely retain routing information for nodes that had long left the cluster.
#15518 Resolved a race condition that may lead to accumulating inconsistencies in the routing table and shared subscriptions state in the cluster when a large number of shared subscribers disconnect simultaneously.

Access Control

#16081 Fixed an issue where clients using extended authentication and memory-based sessions could crash with a session_stepdown_request_exception caused by a calling_self error.

Example error log

2025-09-24T07:13:08.973954+08:00 [error] clientid: someclientid, msg: session_stepdown_request_exception, peername: 127.0.0.1:41782, username: admin, error: exit, reason: calling_self, stacktrace: [{gen_server,call,3,[{file,"gen_server.erl"},{line,1222}]},{emqx_cm,request_stepdown,4,[{file,"emqx_cm.erl"},{line,427}]},{emqx_cm,do_takeover_begin,2,[{file,"emqx_cm.erl"},{line,398}]},{emqx_cm,takeover_session,2,[{file,"emqx_cm.erl"},{line,384}]},{emqx_cm,takeover_session_begin,2,[{file,"emqx_cm.erl"},{line,305}]},{emqx_session_mem,open,4,[{file,"emqx_session_mem.erl"},{line,210}]},{emqx_session,open,3,[{file,"emqx_session.erl"},{line,263}]},{emqx_cm,'-open_session/4-fun-1-',4,[{file,"emqx_cm.erl"},{line,290}]},{emqx_cm_locker,trans,2,[{file,"emqx_cm_locker.erl"},{line,32}]},{emqx_channel,post_process_connect,2,[{file,"emqx_channel.erl"},{line,575}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,852}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,470}]},{emqx_connection,process_msgs,2,[{file,"emqx_connection.erl"},{line,462}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,406}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}], action: {takeover,'begin'}, ...

Rule Engine

#16028 Fixed rule engine jq function memory leak.

Previously, using the jq built-in function index (e.g. .key | index("name")) resulted in a memory leak.

Data Integration

#16010 Fixed an issue where a Republish Fallback Action could fail with a function_clause error if the originating rule's SQL did not include the metadata field from the rule environment.

Example error log:

[error] tag: RESOURCE, msg: failed_to_trigger_fallback_action, reason: {error,function_clause}, fallback_kind: republish, primary_action_resource_id: <<"action:type:name:connector:type:name">>, republish_topic: <<"republish/topic">>

#16043 Improved log details for Kafka data integration when a not_all_kafka_partitions_connected event occurs.
#16046 Fixed a potential out-of-memory (OOM) crash when loading or restarting a configuration containing a Connector with several hundred Actions.
#16138 Fixed a Redis cluster failover issue that could cause the Connector to remain stuck in a "connecting" state.

Previously, EMQX’s Redis cluster client only refreshed the cluster topology when regular queries (such as GET) failed. However, failures in periodic PING commands did not trigger a refresh. As a result, after a failover, the connector could continue using the outdated cluster topology if no other commands were issued, preventing recovery.

With this fix, failed PING responses now trigger a cluster topology refresh, ensuring that the connector can detect failovers and recover promptly.
#16212 Removed Kafka producer linger time when the buffer queue is in memory mode.

Observability

#15963 Reduced excessive audit log entries generated during looped evaluations in the remote shell (remsh).
#15967 Fixed an issue where Mnesia transaction blocking during the cleanup of large volumes of audit logs could lead to rapid memory growth.

Breaking Changes

#16062 Fixed an issue where RocketMQ actions ignored the configured payload template and sent the entire rule output instead.

If you relied on the previous (incorrect) behavior, you may need to update your payload templates to ensure messages are formatted as expected.


e6.0.0 Breaking risk 9mo

Notable features

Message Queues with offline storage and last-value retention
Namespace feature with role-based access control
Enhanced LDAP with extended ACL rules and client-side caching

Full changelog

Feature Highlights

EMQX Enterprise 6.0.0 is the first release of the EMQX Enterprise version 6 series, bringing significant architectural improvements and new capabilities.

Message Queue

The native Message Queue feature unifies real-time MQTT publish/subscribe with persistent asynchronous queuing. The server buffers messages that match a topic filter, retaining them even when subscribers are offline. Clients can consume these messages through the special $q/{topic} topic, ensuring reliable message delivery.

Message Queues support offline message storage, last-value retention, and flexible dispatch strategies, enhancing MQTT with both real-time and durable messaging capabilities.

Namespace

The Namespace feature improves multi-tenancy and observability with namespace-level roles in the Dashboard. Users are restricted to their own resources (e.g., Rules, Actions, and Connectors) with fine-grained permissions such as Administrator or Viewer, and roles can be managed via the Dashboard, API, or CLI, simplifying multi-tenant operations.

Session count tracking has also been optimized: counts refresh on demand when there are fewer than 1,000 connections, and every 5 seconds otherwise. During rolling upgrades from older versions, counts may temporarily appear inconsistent, but will stabilize once all nodes are updated.

MQTT Durable Sessions

Durable storage has been optimized by separating session data from the broker’s other metadata, significantly reducing RAM usage and improving storage efficiency.

New configuration options provide finer control over RocksDB memory usage and performance. In addition, the default serialization schema for stored messages has been updated to ASN.1, further enhancing efficiency.

New Data Integrations

Google BigQuery
Google AlloyDB
CockroachDB
AWS Redshift

Enhanced Integration

AWS:
- Support for Instance Metadata Service v2 APIs from EC2 instances when using S3 or S3Tables data integration. This enables seamless access to S3 buckets without manual AWS credential configuration, leveraging IAM roles for better security.
- Parquet format support for S3 Tables Action.
RabbitMQ: Define custom Headers and Properties Templates in RabbitMQ Sink to enhance message routing and compatibility within RabbitMQ.
Snowflake: Snowpipe Streaming upload mode for Snowflake Action (preview feature).
RocketMQ: New key and tag template fields in Action, along with a key_dispatch option for the Produce Strategy, allowing greater customization of message metadata.

Elixir Support

All packages now ship with Elixir support through the Mix build system, opening EMQX to the Elixir community and enabling better tooling with IEx console.

Enhanced LDAP Support

LDAP authorization now supports extended ACL rules in JSON format, and LDAP authentication can fetch ACL rules directly from LDAP with client-side caching.

Improved Tracing

Configurable limits for maximum traces (trace.max_traces) and trace file sizes (trace.max_file_size).
After max_file_size is reached, the trace log will rotate to a new file instead of halting.

Cluster Management

New cluster.description configuration option allows users to set and display custom cluster descriptions in the EMQX Dashboard.

Enhancements

Message Queue

#15789 Implemented Message Queues, which are collections of messages identified by topic_filter. Each queue has an explicit lifecycle and is automatically replenished with published messages matched with the queue's topic filter during the queue's lifetime. Clients can cooperatively consume messages from a queue by subscribing to a special topic in the format: $q/{topic}.

Core MQTT Functionalities

#15805 Introduced a dedicated worker pool for handling sharded fanout message delivery.
Previously, the broker pool handled both subscription management and message dispatch, which could lead to scheduling contention. This change separates the fanout dispatch workload into its own pool to ensure more balanced and efficient handling of pub/sub operations.

Access Control

#15349 Optimized external resource management for authentication and authorization. Previously, EMQX could remain connected to a resource configured for a disabled authenticator or authorizer.
#15294 Enhanced LDAP authentication and authorization. LDAP authorization now supports extended ACL rules in JSON format. LDAP authentication can now fetch ACL rules from LDAP. These rules are cached in the client's metadata, so authorization is performed without additional LDAP queries.
#15730 Added support for overriding the client ID based on authentication results. If an authentication backend returns a clientid_override attribute upon successful authentication, it will replace the client’s original client ID.

The following backends now support clientid_override:
- HTTP
- JWT
- LDAP
- MongoDB
- MySQL
- Postgres
- Redis
#15820 Changed default value of config authorization.no_match from allow to deny for better security defaults.

Clustering

#15600 Introduced a new configuration option cluster.description that allows you to add a descriptive label to the EMQX cluster. This description can be updated via PUT /cluster, and retrieved with the GET /cluster API.

LLM-Based MQTT Data Processing

#15467 Exposed transport configuration options for AI Completion Providers. Users can now configure connection timeouts and the maximum number of connections to AI Completion Providers. This helps prevent checkout_timeout errors when message throughput is high and the provider is under load.
Flow designer supports integrating with the Google Gemini model.
#15631 Added a new API endpoint to list all models available for an AI provider.
#15724 Introduced openai_response type for AI Completion Providers and completion profiles to use OpenAI's response API.

Data Integration

#15418 EMQX supports data integration with BigQuery.
#15401 Added support for the Snowpipe Streaming upload mode in the Snowflake Action.
Note: Snowpipe Streaming is currently a preview feature and is only available for Snowflake accounts hosted on AWS.
#15387 Added rate limiting to Kinesis Producer Connector and Action health checks to comply with AWS API quotas and improve cluster behavior.
- Health check calls to ListStreams and DescribeStream are now limited to 5/s and 10/s per Connector, respectively, matching AWS rate limits.
- A distributed limiter is coordinated by a core node in the cluster to enforce these limits consistently.
- If a health check is throttled or times out, the Connector or Action will now retain its previous status instead of being marked as disconnected.
Also introduced a new resource_opts.health_check_interval_jitter, which adds a uniform random delay to resource_opts.health_check_interval to reduce the chance of multiple Actions under the same Connector running health checks at the same time.
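The jitter described above can be sketched as follows. This is only an illustration of adding a uniform random delay to a fixed interval, not EMQX's implementation: the function name, the millisecond units, and the assumption that the delay is drawn from the range 0 to the jitter value are all hypothetical.

```python
import random


def next_health_check_delay_ms(interval_ms: float, jitter_ms: float,
                               rng: random.Random) -> float:
    """Return the base interval plus a uniform random delay.

    Assumption (not confirmed by the release note): the extra delay is
    drawn uniformly from [0, jitter_ms].
    """
    if interval_ms < 0 or jitter_ms < 0:
        raise ValueError("interval and jitter must be non-negative")
    return interval_ms + rng.uniform(0, jitter_ms)


# Several Actions under one Connector end up with spread-out check times.
rng = random.Random(42)
delays = [next_health_check_delay_ms(15_000, 2_000, rng) for _ in range(5)]
```

With a jitter of zero the function degenerates to the plain interval, which matches the behavior before the option existed.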
#15176 Upgraded the GreptimeDB Connector client and supported an optional new parameter ttl to set the default time-to-live for automatically created tables.
#15649 EMQX supports data integration with Google Cloud AlloyDB, CockroachDB, and AWS Redshift.
#15635 Added new key and tag template fields in the RocketMQ Action, allowing customization of the message's key and tag. Also, introduced a new key_dispatch option for the Produce Strategy field.
#15621 Now, access_key_id and secret_access_key are optional fields for the S3 Tables Connector. If omitted, they'll be obtained from the Instance Metadata Service v2 APIs from the EC2 instance where EMQX is deployed.
#15628 Removed HStreamDB data integration.
#15544 Added Arrow Flight SQL NIF driver support for Datalayers Integration.
#15637 Added support for templating message headers and properties for the RabbitMQ Action.
#15864 Removed the deprecated "Bridges V1" APIs and configuration schemas. All endpoints under /bridges/* and configuration entries under the bridges root key are no longer available, as data integrations have fully migrated to the "Connectors/Actions/Sources" model.
#15583 Updated the brod client to version 4.4.4, expanding support for a wider range of Kafka APIs. This update addresses the deprecation of JoinGroups API versions v0 - v1.

Smart Data Hub

#15525 Prevented deletion of internal schemas that are still in use. If a schema is referenced by a Schema Validation or Message Transformation, it can no longer be removed to avoid runtime errors and configuration inconsistencies.

Durable Storage

#15463 Improved durable storage RAM usage and storage efficiency.
- Introduced the following configuration parameters for the durable storage to improve control over RocksDB memory usage and storage performance:
  - durable_storage.messages.rocksdb.write_buffer_size: RocksDB memtable size per shard.
  - durable_storage.messages.rocksdb.cache_size: RocksDB block cache size per shard.
  - durable_storage.messages.rocksdb.max_open_files: Limits the number of file descriptors used by RocksDB per shard.
  - durable_storage.messages.layout.wildcard_thresholds: Allows tuning the wildcard thresholds for the wildcard_optimized_v2 storage layout.
- Additionally, the default serialization_schema for stored messages has been changed to asn1.
#16044 Some configuration fields for durable sessions have been removed or renamed, and the old names are marked as deprecated:
- durable_sessions.heartbeat_interval has been renamed to durable_sessions.checkpoint_interval.
- durable_sessions.idle_poll_interval and durable_sessions.renew_streams_interval have been removed, as sessions are now fully event-driven.
- durable_sessions.session_gc_interval and durable_sessions.session_gc_batch_size have been removed as obsolete.

CLI

#15399 The node_dump tool now exports the current system configuration in HOCON format, with sensitive information (such as passwords and secrets) automatically redacted for security.

Namespace

#15841 Improved the refresh rate of the session count for namespaced sessions.
- If a namespace has fewer than 1000 connections, its session count is now updated on demand.
- For namespaces with 1000 or more connections, the count is updated every 5 seconds.
During a rolling upgrade from versions prior to 6.0, session counts may appear inconsistent due to changes in the internal tracking tables. This is expected: as clients reconnect to upgraded nodes, the session counts will gradually stabilize and become accurate once all nodes are running version 6.0 or later.
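The two refresh modes above can be sketched as a small cache. This is a hypothetical model, not EMQX's internal code: the class name, the injected count_sessions callback, and the use of the last known count to choose the mode are assumptions. Only the 1000-connection threshold and the 5-second interval come from the note.

```python
import time
from typing import Callable, Optional

ON_DEMAND_THRESHOLD = 1000  # connections, from the release note
REFRESH_INTERVAL_S = 5.0    # seconds, from the release note


class NamespaceSessionCounter:
    """Caches a namespace's session count with two refresh modes."""

    def __init__(self, count_sessions: Callable[[], int],
                 clock: Callable[[], float] = time.monotonic):
        self._count_sessions = count_sessions  # hypothetical expensive count
        self._clock = clock
        self._cached: Optional[int] = None
        self._cached_at = 0.0

    def get(self) -> int:
        now = self._clock()
        # Small namespaces are recounted on every request (on demand).
        small = self._cached is None or self._cached < ON_DEMAND_THRESHOLD
        # Large namespaces are recounted at most once per interval.
        stale = now - self._cached_at >= REFRESH_INTERVAL_S
        if small or stale:
            self._cached = self._count_sessions()
            self._cached_at = now
        return self._cached
```

The design choice this illustrates is the trade-off between accuracy and cost: exact counts are cheap for small namespaces, while large ones tolerate a few seconds of staleness.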

Observability

#15594 Introduced a new configuration option trace.max_traces to control the maximum number of active cluster-wide traces. This limit does not apply to node-local traces managed using emqx ctl trace.

This update also optimized tracing implementation to eliminate potential atom leaks per created trace.
#15556 Introduced a new configuration option trace.max_file_size to limit the maximum file size for each individual trace.
#15650 Implemented automatic trace log rotation.

When a trace file size exceeds trace.max_file_size, EMQX no longer discards all subsequent events and emits an incomprehensible warning to stderr. Instead, portions of the oldest events are discarded while the most recent ones are retained.

As such, this also implies that:
- EMQX now maintains multiple trace log files per active trace. The layout of the trace directory has changed accordingly.
- Trace API has been updated to reflect this behavior. The Log Stream API may return new errors, such as when a stream becomes stale due to a slow consumer.
#15904 Support viewing and updating of tracing configuration through Trace API.

Performance

#15451 Introduced an experimental socket backend for TCP listeners, aimed at improving message processing latency and reducing compute resource usage. The feature can be enabled with the new tcp_backend listener option.

Build and Tooling

#15484 Switched the build system to Elixir's Mix, enabling all packages to include native Elixir support. This change improves developer tooling, allows integration with Elixir dependencies when needed, and enables use of the IEx shell as a more powerful EMQX console.

License

#15921 Introduced a license alarm for cluster-wide maximum transactions per second (TPS).
- Each node calculates TPS as the average number of MQTT messages sent and received over the past 10 seconds.
- The total cluster TPS is aggregated every 5 seconds.
- If the observed TPS exceeds the licensed limit, an alarm is triggered.
- The alarm remains active until a license with a higher TPS allowance is applied.
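As a rough sketch of the calculation described above: each node averages its message count over a 10-second window, and the per-node values are summed and compared with the licensed limit. The class and function names are hypothetical, and the sketch leaves out the 5-second aggregation timer and the rule that the alarm stays active until a new license is applied.

```python
from collections import deque
from typing import Deque, Sequence, Tuple

WINDOW_S = 10.0  # per-node averaging window, from the release note


class NodeTpsMeter:
    """Tracks messages sent and received on one node (hypothetical helper)."""

    def __init__(self) -> None:
        self._events: Deque[Tuple[float, int]] = deque()  # (timestamp, count)

    def record(self, now: float, messages: int) -> None:
        self._events.append((now, messages))

    def tps(self, now: float) -> float:
        # Drop samples that have fallen out of the 10-second window.
        while self._events and self._events[0][0] <= now - WINDOW_S:
            self._events.popleft()
        return sum(count for _, count in self._events) / WINDOW_S


def tps_limit_exceeded(node_tps: Sequence[float], licensed_tps: float) -> bool:
    """Cluster TPS is the sum of node averages; alarm when it exceeds the limit."""
    return sum(node_tps) > licensed_tps
```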

MQTT over QUIC

#15997 Added support for disabling QUIC stack loading by setting the environment variable QUICER_SKIP_NIF_LOAD=1.

Bug Fixes

Core MQTT Functionalities

#15396 Removed redundant cleanup operations for shared subscriptions of disconnected clients. These operations were prone to crashes under high disconnect volumes and could lead to inconsistencies in the global broker state.
#15361 Fixed a function_clause error when parsing a malformed User-Property pair with invalid (too short) length.
#15783 Ensured that any changes to connection rate limits take effect immediately after the listener update has completed. Previously, parts of the internal limiter state were not directly affected by configuration changes. For example, after increasing the burst rate, the effective rate limit could appear stricter than expected.

Access Control

#15489 Fixed OIDC issuer URL validation in Single Sign-On (SSO) settings. Previously, issuer URLs containing a port number (for example, https://xxxxxxxx:8443/webman/sso/.well-known/openid-configuration) were rejected with a bad_port_number error. These URLs are now supported.

Rule Engine

#15569 Fixed an issue where a Republish Rule Action could fail if the direct_dispatch template was empty or resolved to a non-boolean value. In these cases, the default value false is now used.

Data Integration

#15522 Fixed an issue where Snowflake Connector would fail to start correctly if username was not provided.
#15476 Fixed a missing callback in emqx_connector_aggreg_delivery that caused a crash when formatting delivery process status for aggregated-mode Actions (e.g., Azure Blob Storage, Snowflake, S3 Tables).
This occurred during failures or when inspecting delivery processes with gen_server:format_status/1. The issue is now resolved, and more detailed delivery status information will be logged.
#15394 Fixed a rare race condition where Action metrics could become inconsistent due to unexpected asynchronous replies.
#15647 Fixed an issue where a MongoDB Connector was marked as Disconnected if the MongoDB account specified in the connector configuration lacked privileges to perform find queries on the foo collection.
#15603 Fixed an issue in the MQTT bridge where a stale connection could be shown as Connected and would not automatically reconnect.
#15383 Fixed a potential resource leak in MQTT bridge. When a bridge failed to start, the topic index table was not properly cleaned up.
#15786 Fixed a potential atom leak when probing RocketMQ Connectors.
#15806 Improved validation for Oracle Actions during creation. Previously, in rare cases, an Action containing an invalid SQL statement could be added successfully.
#15848 Improved error reporting for the Oracle Connector. When the connector becomes disconnected, its status now includes a more specific reason, making diagnostics easier.
#15693 Fixed a resource leak in Postgres-based bridges. Under certain race conditions during pool initialization, deleting a Connector could leave its connection pool behind. This has been corrected to ensure connection pools are properly cleaned up.
#15543 Fixed an issue in HTTP Server data integration when sending large payloads. If the payload size was 10 MB or more, the HTTP request could fail.

Smart Data Hub

#15839 Fixed an encoding issue with Protobuf schemas that use map<_, _> fields.
Previously, schemas containing map<string, string> fields could fail to encode valid payloads, resulting in cryptic runtime errors.

Example schema:

syntax = "proto3";

message test {
  map<string, string> args = 1;
}

Example rule:

SELECT
schema_encode('xxx', json_decode(payload), 'test') as protobuf_test
FROM
"t/#"

Example payload failed to be encoded:

{
  "args": {
    "env": "stag"
  }
}

Previous error similar to:

2025-06-17T06:59:22.725785+00:00 [warning] tag: RULE_SQL_EXEC, clientid: c_emqx, msg: SELECT_clause_exception, reason: {error,{gpb_type_error,{bad_unicode_string,[{value,env},{path,"test.args.key"}]}},[{'$schema_parser_xxx',mk_type_error,3,[{file,"$schema_parser_xxx.erl"},{line,437}]},{'$schema_parser_xxx','-v_map<string,string>/3-lc$^0/1-0-',3,[{file,"$schema_parser_xxx.erl"},{line,429}]},{'$schema_parser_xxx','v_map<string,string>',3,[{file,"$schema_parser_xxx.erl"},{line,429}]},{'$schema_parser_xxx',v_msg_test,3,[{file,"$schema_parser_xxx.erl"},{line,404}]},{'$schema_parser_xxx',encode_msg,3,[{file,"$schema_parser_xxx.erl"},{line,73}]},{emqx_schema_registry_serde,with_serde,2,[{file,"emqx_schema_registry_serde.erl"},{line,212}]}...

Observability

#15931 Resolved a bug where spurious but harmless error logs could appear during node startup:

[error] Generic event handler emqx_alarm_handler crashed ...
Reason: {aborted,{no_exists,[emqx_activated_alarm,runq_overload]}}

#15973 Fixed a bug where an alarm activation timeout could crash the connection process under certain conditions.

MQTT over QUIC

#15614 QUIC Listener: When TLS key logging (SSLKEYLOGFILE) is enabled, EMQX now dumps TLS keys even if the handshake fails.

Clustering

#16021 Fixed issues that occasionally prevented the DS Raft backend from functioning correctly when an existing node joined a new cluster and subsequently became a member of DS replica sets.

Cluster Linking

#15894 Previously, when listing all cluster links via GET /cluster/links, disabled links would be returned having an inconsistent status. Now they are returned as disconnected.

Performance

#15696 Added connection rate limiting support for WebSocket (WS) and WebSocket Secure (WSS) listeners.
The max_conn_rate and max_conn_burst configuration options are now enforced: incoming connections exceeding the defined rate are immediately closed upon acceptance, consistent with existing TCP listener behavior.

Additionally, the behavior of max_connections has been updated. When the connection limit is exceeded, WS/WSS listeners now close connections immediately before any HTTP handshake, resulting in an abrupt socket close instead of returning an HTTP 429 response.
#15854 Reduced the default active_n value from 100 to 10 to improve MQTT client responsiveness, especially under high message rates with small payloads.

The lower active_n introduces more backpressure at the TCP layer, stricter than the default Receive-Maximum of 32, which helps in the following scenarios:
- The client process is blocked by external authorization checks
- Data integration operations are delaying message handling
- The system is under heavy load or nearing resource limits
#15981 Prevented excessive memory growth caused by Mnesia transaction blocking during cleanup of large volumes of audit logs. This improves system stability and memory efficiency during heavy audit log maintenance operations.

Breaking Changes

Deprecated Packages

#15939 Stopped releasing packages for systems that have already reached end-of-life:
- Debian 10 (Buster)
- Enterprise Linux (CentOS) 7
- Ubuntu 18.04
- Ubuntu 20.04
- macOS 13 (Ventura)
#16050 Stopped releasing packages for Amazon Linux 2. It will reach end-of-life on June 30, 2026.

Durable Sessions

If the durable sessions feature was not enabled before, you can ignore this section.

In EMQX 6.0, the internal representation of durable sessions and their messages has changed.
Clusters previously running on version 5.x with durable sessions enabled must be recreated from a clean state when upgrading to 6.0.

For detailed upgrade instructions, see the rolling upgrade documentation.

#15496 The state of durable sessions has been migrated from Mnesia to a new database built on EMQX durable storage.
- As a result, all durable session states created before 6.0.0 will be lost during the migration.
- This change resolves potential session state corruption caused by Mnesia’s limited transaction isolation (see #14039).
- It also improves the performance and scalability of durable sessions through sharding and a more efficient data representation.

Will Message Behavior

Authorization checks for durable sessions are now performed at the moment of client disconnection to determine whether the will message may be published.

Previously, these checks were deferred until after the configured Will-Delay-Interval had expired.

Configuration Changes

Durable Sessions

The durable_storage.messages.n_sites parameter has been renamed to durable_storage.n_sites. It is now shared by all durable storage types.
durable_storage.sessions and durable_storage.timers have been added.
#15734 Improved the reliability and throughput of durable sessions.

Durable Storage

durable_storage.messages.n_sites has been renamed to durable_storage.n_sites, which now applies to all durable storage types.
Added new configuration entries for durable_storage.sessions and durable_storage.timers.

RocketMQ

#15635 The parameters.strategy field no longer accepts key templates (which previously implied the key_dispatch strategy).
Instead, set parameters.strategy = key_dispatch explicitly and specify the key template in parameters.key.

Rate Limit

#15743 Listener connection rate limits (max_conn_rate and max_conn_burst) are now enforced per listener rather than per acceptor, restoring the behavior before 5.9.0. As a result, configurations from versions 5.9.0, 5.9.1, and 5.10.0 are incompatible: the specified rate values must be scaled up by the number of acceptors configured for each listener to preserve the same effective limits.
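The scaling described above is simple arithmetic: the per-acceptor value is multiplied by the listener's acceptor count. The sketch below is a hypothetical migration helper, not an EMQX tool. It assumes rate values written as a number followed by a slash and a unit (such as "100/s"), and the acceptor count must be read from your own listener configuration.

```python
import re


def scale_rate(rate: str, acceptors: int) -> str:
    """Convert a per-acceptor rate into the equivalent per-listener rate.

    Assumes the rate is written as "<integer>/<unit>"; other formats are
    rejected rather than guessed.
    """
    if acceptors < 1:
        raise ValueError("acceptors must be a positive integer")
    match = re.fullmatch(r"(\d+)/(\w+)", rate.strip())
    if match is None:
        raise ValueError(f"unsupported rate format: {rate!r}")
    count, unit = int(match.group(1)), match.group(2)
    return f"{count * acceptors}/{unit}"


# A listener with 16 acceptors that was limited to 100/s per acceptor
# needs 1600/s per listener to keep the same effective limit.
new_rate = scale_rate("100/s", 16)
```

The same multiplication applies to max_conn_burst.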


e5.10.1 Breaking risk 10mo

Notable features

Authorization cache cleared on client disconnect reduces memory consumption
Kinesis health check rate limiting complies with AWS API quotas
MQTT bridge stale connection detection and automatic reconnection

Full changelog

Enhancements

Performance

#15907 Improved system memory usage. Fields such as client ID, username, password, and topic are now copied into new binaries when they exceed 64 bytes, instead of being kept as slices of the raw packet. This reduces the 'binary' portion of memory usage in the Erlang VM.
#15899 Authorization (authz) cache is now cleared immediately when a client disconnects, reducing unnecessary memory consumption.

Observability

#15499 Added a force deactivate alarm API endpoint to allow administrators to forcibly deactivate active alarms.
#15364 Added HTTP header configuration items to the OpenTelemetry integration to adapt to collectors with HTTP authentication.

Access Control

#15294 Enhanced LDAP authentication and authorization.
LDAP authorization now supports an extended ACL rule format using JSON, in addition to the existing simple topic list. ACL rules can also be fetched from LDAP during authentication based on client information, and are cached in the client’s metadata to avoid repeated LDAP queries during authorization.
#15349 Optimized external resource management for authentication and authorization. Previously, EMQX could remain connected to a resource configured for a disabled authentication or authorization provider.

Data Integration

#15360 Added support for writing data files in Parquet format for Amazon S3 Tables Action.
#15387 Added rate limiting to Kinesis Producer Connector and Action health checks to comply with AWS API quotas and improve cluster behavior.
- Health check calls to ListStreams and DescribeStream are now limited to 5/s and 10/s per Connector, respectively, matching AWS rate limits.
- A distributed limiter is coordinated by a core node in the cluster to enforce these limits consistently.
- If a health check is throttled or times out, the Connector or Action will now retain its previous status instead of being marked as disconnected.
Also introduced a new resource_opts.health_check_interval_jitter, which adds a uniform random delay to resource_opts.health_check_interval to reduce the chance of multiple Actions under the same Connector running health checks at the same time.
#15542 Upgraded the erlcloud library to 3.8.3.0. This allows setting up an S3 Connector without specifying an Access Key ID and Secret Access Key, as long as the EC2 instance running EMQX has the IAM permissions needed to read from and write to the configured bucket(s).
#15845 The static_clientids configuration for the MQTT Connector now supports specifying a username and password for each client ID. This is particularly useful for scenarios like connecting to Azure IoT Hub, where each device (client ID) requires a unique set of credentials. This enhancement helps ensure successful connections across multiple nodes in a clustered environment.
#15944 Improved the information returned when a resource is marked as disconnected for the following Connectors: LDAP, Syskeeper, IoTDB, Snowflake (aggregated), JWKS Authentication.
#15911 The HTTP Action now uses resource_opts.request_ttl as its HTTP request timeout. Previously, the timeout was a fixed, non-configurable value of 30 seconds.
#15371 Added tags fields to the return of GET /actions_summary and GET /sources_summary, and to the fallback actions returned in GET /actions/:id.

CLI

#15399 The node_dump tool now exports the current system configuration in HOCON format, with sensitive information (such as passwords and secrets) automatically redacted for security.

Bug Fixes

API

#15547 Resolved an issue where EMQX would fail to process HTTP requests with large bodies (e.g., 10MB) in the REST API.
#15797 To improve compatibility with EMQX 4.x, the encoding parameter has been reintroduced in the batch publish HTTP API (/api/v5/publish/bulk) as an alias for payload_encoding. This change addresses migration issues for users relying on the original encoding parameter, and ensures existing integrations using EMQX v4 APIs can continue working without requiring software-level changes.

Observability

#15785 Resolved a crash that occurred when MQTT usernames containing non-ASCII characters were used in formatting network congestion alarm messages.

Gateway

#15342 Fixed a crash in the NATS gateway caused by client info override templates referencing undefined packet fields. The system now returns an empty binary instead of undefined atom.

Core MQTT Functions

#15361 Fixed a function_clause error when parsing a malformed User-Property pair with invalid (too short) length.
#15396 Removed redundant cleanup operations for shared subscriptions of disconnected clients. These operations were prone to crashes under high disconnect volumes and could lead to inconsistencies in the global broker state.
#15416 Fixed occasional warning-level log events and crashes during session expiration of WebSocket connections. This issue was introduced by recent WebSocket performance improvements. It did not affect broker capacity, but produced log entries like the following:
- error: {function_clause,[{gen_tcp,send,[closed,[]],[{file,"gen_tcp.erl"},{line,966}]},{cowboy_websocket_linger,commands,3,[{file,"cowboy_websocket_linger.erl"},{line,665}]},...
- message: {tcp,#Port<0.364>,<<136,130,...>>}, msg: emqx_session_mem_unknown_message
#15872 Eliminated warning log unclean_terminate when disconnected after CONNACK is sent with a non-zero reason code.
#15518 Resolved a race condition that could lead to accumulating inconsistencies in the routing table and shared subscriptions state in the cluster when a large number of shared subscribers disconnect simultaneously.

Data Integration

#15394 Fixed a rare race condition where Action metrics could become inconsistent due to unexpected asynchronous replies.
#15603 Fixed an issue in the MQTT bridge where a stale connection could be shown as Connected and would not automatically reconnect.
#15826 Improved Kafka consumer connector health check behavior with restricted ACLs. Previously, Kafka Consumer Connector health checks could fail if the configured user lacked permission to access the internal ____emqx_consumer_probe consumer group used for the check. With this fix, if the Kafka broker returns an "ACL denied" response, EMQX will treat the connection as healthy.
#15827 Fixed atom and process leaks in the GreptimeDB driver.

Fixed a function_clause error that could arise if certain incorrect write syntaxes were used in GreptimeDB Actions.
#15836 Enriched the returned information when a Kafka Consumer Source fails to be added, for example, due to denied topic ACLs.
#15850 Fixed an issue with the MQTT bridge when a stale connection was displayed as Connected and the connection was not re-established.
#15866 Upgraded the Kafka producer library Wolff to 4.0.12 to improve handling of temporarily missing partitions in Kafka metadata responses.

In rare race conditions, Kafka may return an incomplete partition list.
Previously, this was only handled when a topic was recreated with fewer partitions, but not when partitions were temporarily missing.
This gap could cause the partition producer to stall and block shutdown indefinitely.
#15906 Upgraded Kafka producer library Wolff from 4.0.12 to 4.0.13, which adds handling for the record_list_too_large error in ProduceResponse.
#15902 Upgraded the MQTT client library to 1.13.8.

This improves MQTT bridge connectivity:
- The Connector now reconnects automatically when the peer broker does not reply with PINGRESP.
- A TLS bridge failure is handled more promptly if the connection breaks while waiting for CONNACK.
#15910 Fixed an issue with Connectors where a pool of workers could fail to recover from a failure if multiple workers crashed simultaneously in large worker pools.

Connectors affected and fixed:
- MySQL
- PostgreSQL
- Oracle
- SQLServer
- TDEngine
- Cassandra
- Dynamo
- HTTP
- Couchbase
- GCP PubSub
- Snowflake
Upgraded gun and related dependencies to 2.1.0.

Deployment

#15553 Fixed an issue in the Helm chart where deploying EMQX with default values started multiple replicas and caused all nodes except one to crash. The chart now defaults to a single replica, since clustered deployments require a Commercial License.
#15712 Fixed a node boot-up failure during rolling upgrades from older versions (before 5.9).

In previous EMQX versions (before 5.9), a bug in the ZIP timestamp encoder could store an invalid “seconds” value in archive entries (values corresponding to the 30th or 31st 2-second slot in DOS time format).
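For context, the DOS time format used in ZIP entries packs the time into 16 bits: 5 bits for the hour, 6 bits for the minute, and 5 bits for the seconds divided by two. The 5-bit field can hold 0 to 31, but only slots 0 to 29 (0 to 58 seconds) are valid, so slots 30 and 31 decode to 60 and 62 seconds. The following sketch illustrates that arithmetic; the function names are hypothetical and the code is not taken from EMQX.

```python
from typing import Tuple


def decode_dos_time(value: int) -> Tuple[int, int, int]:
    """Split a 16-bit DOS time into (hour, minute, second)."""
    hour = (value >> 11) & 0x1F   # bits 15-11
    minute = (value >> 5) & 0x3F  # bits 10-5
    slot = value & 0x1F           # bits 4-0, seconds in 2-second steps
    return hour, minute, slot * 2


def has_valid_seconds(value: int) -> bool:
    """Slots 30 and 31 would mean 60 or 62 seconds, which is invalid."""
    return (value & 0x1F) <= 29


good = (12 << 11) | (34 << 5) | 28  # 12:34:56
bad = (12 << 11) | (34 << 5) | 30   # seconds field decodes to 60
```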
#15863 Fixed license quota alarm text.

Clustering

#15788 Fixed etcd cluster discovery issue. Resolved an issue where EMQX nodes from different clusters could mistakenly join each other when using a shared etcd server. This was caused by a bug in the etcd client library.

Rate Limit

#15794 Improved the behavior of connection rate limit updates to ensure that changes (e.g., to burst rate or rate thresholds) are applied immediately after the listener configuration is updated. Previously, parts of the internal limiter state were not refreshed correctly, which could result in rate limits appearing stricter than configured.

Smart Data Hub

#15810 Introduced spb_{en,de}code functions to correctly handle bytes_value metrics. The original sparkplug_{en,de}code functions did not base64 encode/decode bytes_value metric values as required by the Protobuf specification. The old sparkplug_{en,de}code functions are now deprecated but remain available for backward compatibility.
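To illustrate the requirement: in the Protobuf JSON mapping, a bytes field is represented as a base64 string, so raw bytes must be encoded when a message is rendered as JSON and decoded on the way back. The helper names below are hypothetical and this is not the code of the EMQX functions, only a minimal sketch of the conversion.

```python
import base64


def bytes_value_to_json(raw: bytes) -> str:
    """Render a bytes_value metric as a base64 string for JSON output."""
    return base64.b64encode(raw).decode("ascii")


def bytes_value_from_json(text: str) -> bytes:
    """Parse a base64 string from JSON back into raw bytes."""
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently skipping them.
    return base64.b64decode(text, validate=True)


encoded = bytes_value_to_json(b"\x00\x01\xff")
decoded = bytes_value_from_json(encoded)
```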

Access Control

#15818 Corrected handling of {allow|deny, all} ACL rules.

Previously, these rules were internally translated to match #, which incorrectly failed to match topics prefixed with $ (e.g. $testtopic/1) due to MQTT spec restrictions.
Now, a special internal value is used to ensure {allow|deny, all} rules correctly match any topic, including $-prefixed ones.
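The underlying rule is that, per the MQTT specification, a topic filter starting with a wildcard does not match topic names beginning with $. The sketch below shows a simplified matcher that follows this rule, plus a sentinel that bypasses it. The names mqtt_match, MATCH_ALL, and rule_matches are hypothetical, and the code does not reflect how EMQX represents its internal value.

```python
def mqtt_match(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT topic matching with + and # wildcards."""
    f_levels = topic_filter.split("/")
    t_levels = topic.split("/")
    # A leading wildcard must not match topics that start with "$".
    if t_levels[0].startswith("$") and f_levels[0] in ("#", "+"):
        return False
    for i, level in enumerate(f_levels):
        if level == "#":
            return True  # matches this level and everything below it
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            return False
    return len(f_levels) == len(t_levels)


MATCH_ALL = object()  # sentinel standing in for an "all" rule


def rule_matches(rule_topic, topic: str) -> bool:
    """An "all" rule matches every topic, including $-prefixed ones."""
    if rule_topic is MATCH_ALL:
        return True
    return mqtt_match(rule_topic, topic)
```

Translating "all" to the filter # would go through mqtt_match and miss $testtopic/1, which is the behavior the fix removes.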
#15844 Added validation to forbid adding empty usernames to the built-in database authenticator. Such users cannot be deleted via the HTTP API later, because an empty username produces an invalid API path.

If you have such a user and wish to delete it, run the following in an EMQX console:
```
mria:transaction(emqx_authn_shard, fun() -> mnesia:delete(emqx_authn_mnesia, {'mqtt:global',<<>>}, write) end).
```

Breaking Changes

#15752 Listener connection rate limits (max_conn_rate and max_conn_burst) are now enforced per listener rather than per acceptor, restoring the pre-5.9.0 behavior. As a result, configurations from versions 5.9.0, 5.9.1 and 5.10.0 are incompatible: specified rates must be scaled up by the number of acceptors configured for respective listeners.


v5.8.8 Breaking risk 10mo

Security fixes

Fixed TLS connection race condition during certificate renewal
Added support for RSA-PSS certificate signatures

Notable features

Client ID registration throttling prevents aggressive reconnect instability
Erlang VM parameters tuned for message latency and CPU usage
Global garbage collection disabled by default

Full changelog

Enhancements

Deployment

#15813 Added package release for Debian 13 (Trixie), and updated Docker images to use Debian 13 as the base.

Core MQTT Functionalities

#15773 Throttled client ID registration during reconnects.
- When a previous session cleanup is still in progress, new connections using the same client ID are now throttled. This prevents instability when clients reconnect aggressively.
- Affected clients receive reason code 137 (Server Busy) in the CONNACK with Reason-String "THROTTLED", and should retry after the cleanup completes.
- Fixed the reason code returned when another connection registers the same client ID; now correctly returns 137 instead of 133.
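On the client side, a throttled connection attempt can be handled by retrying after a delay. The sketch below is one possible approach, assuming an exponential backoff and a hypothetical connect callback that returns the CONNACK reason code. Neither is prescribed by EMQX; only the reason code 137 (Server Busy) comes from the note.

```python
from typing import Callable, Iterable, List

SERVER_BUSY = 137  # MQTT 5.0 CONNACK reason code 0x89


def retry_delays(base_s: float = 1.0, factor: float = 2.0,
                 cap_s: float = 30.0, attempts: int = 5) -> List[float]:
    """Exponential backoff schedule, capped at cap_s seconds."""
    return [min(cap_s, base_s * factor ** i) for i in range(attempts)]


def connect_with_retry(connect: Callable[[], int],
                       sleep: Callable[[float], None],
                       delays: Iterable[float]) -> int:
    """Retry while the broker answers Server Busy; return the last code."""
    for delay in delays:
        code = connect()
        if code != SERVER_BUSY:
            return code
        sleep(delay)  # give the previous session's cleanup time to finish
    return connect()
```

In real use, connect would wrap an MQTT client library call and sleep would be time.sleep. Both are injected here so the logic can be exercised without a broker.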

Observability

#15499 Added a force deactivate alarm API endpoint to allow administrators to forcibly deactivate active alarms.

Performance

#15536 Disabled the node.global_gc_interval configuration by default to improve overall performance stability, as it caused CPU fluctuations and higher message latency while providing little benefit over Erlang’s built-in garbage collector.
#15539 Optimized Erlang VM parameters to improve performance and stability:
- Increased buffer size for distributed channels to 32 MB (+zdbbl 32768) to prevent busy_dist_port alarms during intensive Mnesia operations.
- Disabled scheduler busy-waiting (+sbwt none +sbwtdcpu none +sbwtdio none) to lower CPU usage reported by the operating system.
- Set scheduler binding type to db (+stbt db) to reduce message latency.

Bug Fixes

Deployment

#15580 Added a new emqxLicenseSecretRef variable to the EMQX Enterprise Helm chart. This allows users to specify a Kubernetes Secret containing the EMQX license key, so the license is applied automatically.

This replaces the non-functional emqxLicenseSecretName variable, which created and mounted a secret file but did not pass the license to EMQX.

Clustering

#14778 Fixed an issue where a node could not join a running cluster if that node had broken symlinks in its data/certs or data/authz directories.

Security

#15581 Upgraded Erlang/OTP version from 26.2.5.2 to 26.2.5.14. This upgrade includes two TLS-related fixes from OTP that affect EMQX:
- Fixed a crash in TLS connections caused by a race condition during certificate renewal.
- Added support for RSA certificates signed with RSASSA-PSS parameters. Previously, such certificates could cause TLS handshakes to fail with a bad_certificate / invalid_signature error.

Observability

#15639 Fixed an issue where the packets.subscribe.auth_error metric was not incremented when subscription authentication failed.

Gateway

#15679 Fixed incorrect global chain names for the ExProto gateways. Built-in authentication data for these gateways was previously grouped under unknown:global, causing conflicts between gateways.
#15699 Fixed an issue where built-in authentication data for gateways (e.g., CoAP) was incorrectly removed when a node was stopped or restarted.

ExHook

#15683 Fixed ExHook TLS options so that gRPC clients can correctly verify the server hostname during the TLS handshake.

e5.8.8 Breaking risk 10mo

Security fixes

Fixed TLS connection race condition during certificate renewal
Added support for RSA-PSS certificate signatures

Notable features

Client ID registration throttling prevents aggressive reconnect instability
Erlang VM parameters tuned for message latency and CPU usage
Global garbage collection disabled by default

Full changelog

Enhancements

Deployment

#15813 Added package release for Debian 13 (Trixie), and updated Docker images to use Debian 13 as the base.

Core MQTT Functionalities

#15773 Throttled client ID registration during reconnects.
- When a previous session cleanup is still in progress, new connections using the same client ID are now throttled. This prevents instability when clients reconnect aggressively.
- Affected clients receive reason code 137 (Server Busy) in the CONNACK with Reason-String "THROTTLED", and should retry after the cleanup completes.
- Fixed the reason code returned when another connection registers the same client ID; now correctly returns 137 instead of 133.

Data Integration

#15542 Upgraded our erlcloud library to 3.8.3.0. This allows users to set up an S3 Connector without specifying Access Key Id and Secret Access Key, so long as the EC2 instance EMQX is running in has the correct IAM permissions to read/write to the configured bucket(s).
#15585 Updated the brod client to version 4.4.4, expanding support for a wider range of Kafka APIs. This update addresses the deprecation of JoinGroups API versions v0 - v1.

Observability

#15499 Added a force deactivate alarm API endpoint to allow administrators to forcibly deactivate active alarms.

Performance

#15536 Disabled the node.global_gc_interval configuration by default to improve overall performance stability, as it caused CPU fluctuations and higher message latency while providing little benefit over Erlang’s built-in garbage collector.
#15539 Optimized Erlang VM parameters to improve performance and stability:
- Increased buffer size for distributed channels to 32 MB (+zdbbl 32768) to prevent busy_dist_port alarms during intensive Mnesia operations.
- Disabled scheduler busy-waiting (+sbwt none +sbwtdcpu none +sbwtdio none) to lower CPU usage reported by the operating system.
- Set scheduler binding type to db (+stbt db) to reduce message latency.

Bug Fixes

Deployment

#15580 Added a new emqxLicenseSecretRef variable to the EMQX Enterprise Helm chart. This allows users to specify a Kubernetes Secret containing the EMQX license key, so the license is applied automatically.

This replaces the non-functional emqxLicenseSecretName variable, which created and mounted a secret file but did not pass the license to EMQX.

Clustering

#14778 Fixed an issue where a node could not join a running cluster if that node had broken symlinks in its data/certs or data/authz directories.

Security

#15581 Upgraded Erlang/OTP version from 26.2.5.2 to 26.2.5.14. This upgrade includes two TLS-related fixes from OTP that affect EMQX:
- Fixed a crash in TLS connections caused by a race condition during certificate renewal.
- Added support for RSA certificates signed with RSASSA-PSS parameters. Previously, such certificates could cause TLS handshakes to fail with a bad_certificate / invalid_signature error.

Data Integration

#15616 Kafka connections are now considered healthy even if a topic_authorization_failed error is returned for the default probing topic.

Smart Data Hub

#15706 Fixed an indexing issue that could cause Message Transformations and Schema Validations to behave inconsistently. Deleting one item could corrupt the topic index, so that a subsequent item remained active even after being disabled.
#15708 Fixed an issue where external schema registries were not reloaded after a node restart.

Observability

#15639 Fixed an issue where the packets.subscribe.auth_error metric was not incremented when subscription authentication failed.

Gateway

#15679 Fixed incorrect global chain names for the ExProto, JT/T 808, GB/T 32960, and OCPP gateways. Built-in authentication data for these gateways was previously grouped under unknown:global, causing conflicts between gateways.
#15699 Fixed an issue where built-in authentication data for gateways (e.g., CoAP) was incorrectly removed when a node was stopped or restarted.
#15822 Fixed an issue where the OCPP connection would crash after sending a certain number of messages.

ExHook

#15683 Fixed ExHook TLS options so that gRPC clients can correctly verify the server hostname during the TLS handshake.

v5.8.7 Bug fix 1y

Minor fixes and improvements.

Full changelog

Bug Fixes

#15383 Fixed a potential resource leak in the MQTT bridge. When the bridge failed to start, the topic index table was not properly cleaned up. This fix ensures that the index table is correctly deleted to prevent resource leaks.

e5.9.1 Bug fix 1y

Notable features

OpenTelemetry HTTP header configuration for authenticated collectors
Snowflake Connector private key file path support
Configuration key removal via emqx ctl conf remove command

Full changelog

Enhancements

#15364 Added support for custom HTTP headers in the OpenTelemetry gRPC (over HTTP/2) integration. This enhancement enables compatibility with collectors that require HTTP authentication.
#15160 Added the DELETE /mt/bulk_delete_ns API for multi-tenancy management, which allows deleting namespaces in bulk.
#15158 Added a new emqx ctl conf remove x.y.z command, which removes the configuration key path x.y.z from the existing configuration.
#15157 Added support for specifying a private key file path for the Snowflake Connector instead of a password.

Users should use either a password or a private key, or neither (in which case the parameters are set in /etc/odbc.ini).
#15043 Instrumented the DS Raft backend with basic metrics to provide insights into cluster status, database overview, shard replication, and replica transitions.

Bug Fixes

Data Integration

#15331 Fixed an issue in the InfluxDB action where line protocol conversion failed if the timestamp in WriteSyntax was left blank and no timestamp field was provided in the rule.
Now the system's current millisecond value is used instead, and millisecond precision is enforced.
#15274 Improved the resilience of Postgres, Matrix, and TimescaleDB connectors by triggering a full reconnection on any health check failure. Previously, failed health checks could leave the connection in a broken state, causing operations to hang and potentially leading to out-of-memory issues.
#15154 Fixed a rare race condition in Actions running in aggregated mode (e.g., S3, Azure Blob Storage, Snowflake) that could lead to a crash with errors like:
```
** Reason for termination ==
** {function_clause,[{emqx_connector_aggregator,handle_close_buffer,[...], ...
```
#15147 Fixed an issue where some Actions failed to emit trace events during rule testing with simulated input data, even after request rendering.

Affected Actions:
- Couchbase
- Snowflake
- IoTDB (Thrift driver)
#15383 Fixed a potential resource leak in the MQTT bridge. When the bridge failed to start, the topic index table was not properly cleaned up. This fix ensures that the index table is correctly deleted to prevent resource leaks.

Smart Data Hub

#15224 Fixed an issue where updating an External Schema Registry via the Dashboard would unintentionally overwrite the existing password with ******. The password is now correctly preserved during updates.
#15190 Enhanced Message Transformation by allowing hard-coded values for QoS and topic.

Observability

#15299 Fixed a badarg error that occurred when exporting OpenTelemetry metrics.

Telemetry

#15216 Fixed a crash in the emqx_telemetry process that could occur when plugins were activated.

Access Control

#15184 Fixed the formatting of error messages returned when creating a blacklist fails.

Clustering

#15180 Reduced the risk of deadlocks during channel registration by fixing improper handling of badrpc errors in the ekka_locker module. These errors previously led to false positives in lock operations, potentially causing inconsistent cluster state and deadlocks.

Security

#15159 Improved handling of Certificate Revocation List (CRL) Distribution Point URLs by stopping their refresh after repeated failures (default: 60 seconds). This prevents excessive error logs from unreachable URLs and improves overall system stability.

e5.10.0 Breaking risk 1y

Notable features

NATS Gateway accepts NATS client connections with MQTT transformation
Subscription maximum QoS control via topic matching rules
WebSocket performance improvements (20% CPU reduction)

Full changelog

Release Date: 2025-06-10

Make sure to check the breaking changes and known issues before upgrading to EMQX 5.10.0.

Enhancements

Core MQTT Functionalities

#15118 Provided a new configuration option mqtt.subscription_max_qos_rules to control the maximum QoS level allowed per client subscription. This allows administrators to limit the QoS requested in SUBSCRIBE packets based on matching rules for specific topics. Currently, only a limited set of matching rules (predicates) is supported, based on the topic in the SUBSCRIBE packet.
#15246 Improved WebSocket connections performance and resource consumption.
- Reduced CPU usage by approximately 20% and slightly lowered memory consumption, according to synthetic benchmarks measuring 1-on-1 MQTT messaging performance.
- Improved connection setup efficiency when the listener-wide connection limit is enabled, especially on nodes managing a large number of connections.

Deployment

#14791 Added support for custom annotations on the EMQX StatefulSet in the Helm chart, enabling automated pod restarts on ConfigMap or Secret changes. This improves automation and reliability when managing EMQX on Kubernetes.

Access Control

#15250 Improved LDAP bind authentication to correctly extract the is_superuser flag from LDAP entry attributes.
Previously, the is_superuser value was always set to false, even when the LDAP entry included a valid isSuperuser attribute.
#15249 Improved LDAP authentication and authorization.
- Added validation for the LDAP filter and base_dn settings.
- Fixed various variable interpolation issues.

Rule Engine

#15001 Added the ai_completion function to Rule Engine SQL, which allows AI services to be used to process data.
#15201 Added a base_url option to the AI completion provider configuration.
#15188 Rule event topics now have namespaces.

| Previous event topic | New event topic |
| :-------------------------------------- | :-------------------------------------- |
| $events/client_connected | $events/client/connected |
| $events/client_disconnected | $events/client/disconnected |
| $events/client_connack | $events/client/connack |
| $events/client_check_authz_complete | $events/auth/check_authz_complete |
| $events/client_check_authn_complete | $events/auth/check_authn_complete |
| $events/session_subscribed | $events/session/subscribed |
| $events/session_unsubscribed | $events/session/unsubscribed |
| $events/message_delivered | $events/message/delivered |
| $events/message_acked | $events/message/acked |
| $events/message_dropped | $events/message/dropped |
| $events/delivery_dropped | $events/message/delivery_dropped |
| $events/message_transformation_failed | $events/message_transformation/failed |
| $events/schema_validation_failed | $events/schema_validation/failed |

Previous event topics are kept for backwards compatibility.
#15175 Added support for matching event topics in Rule Engine using wildcards. Now, it's possible to use $events/#, $events/sys/+ and similar for matching multiple events at once.
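
To illustrate how wildcard filters line up with the namespaced event topics, here is a minimal sketch of standard MQTT topic-filter matching. It assumes the Rule Engine applies ordinary MQTT wildcard semantics to event topics, which the notes above suggest but do not spell out. The function name is illustrative.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Match a topic against a filter using MQTT '+' and '#' wildcards."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and everything below it.
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


print(topic_matches("$events/#", "$events/client/connected"))         # True
print(topic_matches("$events/client/+", "$events/client/connected"))  # True
print(topic_matches("$events/client/+", "$events/message/acked"))     # False
print(topic_matches("$events/client/+", "$events/client_connected"))  # False
```

The last call shows why the namespaced names matter: a single-level filter such as $events/client/+ cannot select the older flat topic names.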

Smart Data Hub

#15174 Added support to upload Protobuf source file bundles for Schema Registry.

For example, assuming that the Protobuf source file bundle is at /tmp/bundle.tar.gz and has the following file structure, with a.proto being the root Protobuf schema file:
```
.
├── a.proto
├── c.proto
└── nested
    └── b.proto
```
Then, to create a new schema using that bundle via the HTTP API:
```
curl -v http://127.0.0.1:18083/api/v5/schema_registry_protobuf/bundle \
  -XPOST \
  -H "Authorization: Bearer xxxx" \
  -F bundle=@/tmp/bundle.tar.gz \
  -F name=my_cool_schema \
  -F root_proto_file=a.proto
```
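
For reference, a bundle with the layout above could be assembled as in the following sketch, which uses Python's standard tarfile module. The .proto contents are placeholder assumptions. Placing the files at the archive root without a leading ./ is also an assumption about what the API accepts.

```python
import io
import tarfile


def build_bundle(files: dict) -> bytes:
    """Pack {relative_path: text} into an in-memory gzip-compressed tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path, text in sorted(files.items()):
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Placeholder schema sources; only the file layout mirrors the example above.
bundle = build_bundle({
    "a.proto": 'syntax = "proto3";\nimport "c.proto";\nimport "nested/b.proto";\n',
    "c.proto": 'syntax = "proto3";\nmessage C { string id = 1; }\n',
    "nested/b.proto": 'syntax = "proto3";\nmessage B { int32 n = 1; }\n',
})

# Writing these bytes to /tmp/bundle.tar.gz would give the file used by the curl call.
with tarfile.open(fileobj=io.BytesIO(bundle), mode="r:gz") as tar:
    print(tar.getnames())  # ['a.proto', 'c.proto', 'nested/b.proto']
```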

Data Integration

#15248 EMQX now supports data integration with Doris, writing data using SQL statements.
#15218 Added support for IAM authentication in Kafka Producer and Consumer Connectors when connecting to Amazon MSK (Managed Streaming for Apache Kafka). When EMQX runs on AWS EC2, it uses the AWS SDK to generate OAuth Bearer tokens for Kafka clients.
#15157 Added support for specifying a private key file path for the Snowflake Connector instead of a password.

Users should use either a password or a private key, or neither (in which case the parameters are set in /etc/odbc.ini).
#14983 EMQX supports data integration with S3Tables.

Current limitations:
- Only S3Tables catalogs are supported (hence table data and metadata must live in S3).
- Only Iceberg table format version 2 is supported.
- Only the following partition transform functions are supported:
  - identity
  - void
  - bucket[N]
- Data files are written only in Avro.
#15331 Fixed an issue in the InfluxDB action where line protocol conversion failed when the timestamp in WriteSyntax was left blank and the rule provided no timestamp field. The system's current millisecond value is now used instead, and millisecond precision is enforced.
#15348 Made middlebox_comp_mode configurable for SSL clients. The option, which was previously always enabled (true) for all TLS 1.3 connections, remains true by default to maintain compatibility with most network environments.

In rare cases where TLS fails with an error such as: unexpected_message, TLS client: In state hello_retry_middlebox_assert ..., try setting middlebox_comp_mode to false.

Multi-Tenancy

#15253 Added two new multi-tenancy APIs: GET /mt/ns_list_details and GET /mt/ns_list_managed_details. Both work like their existing counterparts, but return extra metadata associated with each namespace in addition to its name.
#15160 Added the DELETE /mt/bulk_delete_ns API for multi-tenancy management, which allows deleting namespaces in bulk.

CLI

#15158 Added a new emqx ctl conf remove x.y.z command, which removes the configuration key path x.y.z from the existing configuration.

Gateway

#15138 Introduced the NATS Gateway, which accepts NATS client connections over the TCP/TLS and WS/WSS transports.

For example, the NATS gateway will transform the following NATS message into an MQTT message with the topic sub/t and payload hello, while supporting seamless integration with existing EMQX features such as the rule engine, data integration, and more:
```
PUB sub.t 5  
hello
```
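
The transformation in this example can be sketched as follows. The parser handles only the PUB frame shape shown above (with an optional reply-to field). The subject-to-topic rule of replacing "." with "/" is inferred from this single example rather than taken from the gateway's documented behavior, and the function name is illustrative.

```python
def nats_pub_to_mqtt(frame: bytes):
    """Parse a NATS 'PUB <subject> [reply-to] <#bytes>' frame into (topic, payload)."""
    header, sep, rest = frame.partition(b"\r\n")
    parts = header.split()
    if not sep or len(parts) not in (3, 4) or parts[0].upper() != b"PUB":
        raise ValueError("not a complete PUB frame")
    size = int(parts[-1])
    payload = rest[:size]
    if len(payload) != size:
        raise ValueError("payload shorter than the declared size")
    subject = parts[1].decode("utf-8")
    # Assumed mapping: NATS subject tokens become MQTT topic levels.
    return subject.replace(".", "/"), payload


print(nats_pub_to_mqtt(b"PUB sub.t 5\r\nhello\r\n"))  # ('sub/t', b'hello')
```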

Durable Storage

#15043 Instrumented the DS Raft backend with basic metrics to provide insights into cluster status, database overview, shard replication, and replica transitions.

Bug Fixes

Access Control

#15184 Fixed an issue where the error message format was incorrect when creating a new banned list record failed.

Clustering

#15304 Fixed an issue with core node discovery by replicant nodes when using the static discovery strategy.

Previously, the replicants could ignore core nodes not explicitly listed in the static_seeds list.
This could lead to an inconsistent cluster view and load imbalance.
#15180 Fixed an issue in ekka_locker where RPC (badrpc) errors were not handled correctly, causing false-positive lock successes. This could lead to inconsistent lock states and deadlocks in clustered deployments.

Security

#15159 Improved CRL Distribution Point (CDP) handling: If a CDP URL fails to refresh continuously (default timeout: 60 seconds), it will now be evicted and excluded from further refresh attempts to prevent repeated error logs.

Rule Engine

#15247 Fixed an issue where function_clause error logs would be printed when attempting to call emqx ctl conf remove dashboard.sso.<BACKEND_NAME>.

Smart Data Hub

#15285 Added content-type header to External HTTP Schema requests.
#15224 Fixed an issue where updating an External Schema Registry via the dashboard would inadvertently change the password to ******.
#15190 Added support for setting hard-coded QoS and topic values in Message Transformation.

Data Integration

#15274 Now, any health check failure for Postgres, Matrix and TimescaleDB Connectors will trigger a full reconnection. Prior to this change, there were situations where the connection would become unusable and attempts to use it would hang, potentially leading to out of memory issues.
#15234 Added trace events for rule testing both when the Action is not yet installed and for Republish Fallback actions. These now appear in the frontend while testing Rules with simulated input data.
#15219 Reduced the amount of logs generated by Clickhouse Connector when a health check timeout occurs. Also, when a health check timeout occurs for this Connector, we now mark it as connecting instead of disconnected, meaning that a full reconnect attempt will no longer be triggered by such timeouts.
#15154 Fixed a rare race condition in Actions that run in aggregated mode (S3, Azure Blob Storage, Snowflake) that could result in crash logs similar to the following:
```
** Reason for termination ==
** {function_clause,[{emqx_connector_aggregator,handle_close_buffer,[...], ...
```
#15147 When running Rule tests with simulated input data, some Actions would not emit trace events after rendering requests. This has been fixed.

Affected Actions:
- Couchbase
- Snowflake
- IoTDB (Thrift driver)
#15306 Fixed an issue where a Connector's health check response would always trigger health checks for all dependent Actions and Sources, regardless of their actual state.

Multi-Tenancy

#15242 Fixed an issue where, when a node restarted after limiters had been configured for multi-tenancy, errors like the following were logged while the limiters were initializing:

2025-05-15T16:45:13.276895+08:00 [error] clientid: ns3mqttx_620053b2_100, msg: hook_callback_exception, peername: 127.0.0.1:39364, username: ns3, reason: {limiter_group_not_found,{mt_tenant,<<"ns3">>}}, stacktrace: [{emqx_limiter,connect,1,[{file,"emqx_limiter.erl"},{line,134}]}

Observability

#15299 Fixed a badarg error when exporting OpenTelemetry metrics.

Telemetry

#15216 Fixed a crash of the emqx_telemetry process that occurred when plugins were activated.

Breaking Changes

#15289 Added a new resource_opts.health_check_timeout configuration to all Connectors, Actions and Sources, with a default value of 60 seconds. If a health check takes longer than this to return a response, the Connector/Action/Source is deemed disconnected.

Note: since the default is 60 seconds, this means that if a Connector/Action/Source previously could take more than that to return a healthy response, now it'll be deemed disconnected in such situations.
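
The effect of the timeout can be pictured with the sketch below. It is a generic illustration of "no answer within the deadline counts as disconnected", not EMQX's implementation. The function name, the status strings, and the use of a thread pool are all assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def status_after_health_check(health_check, timeout_s=60.0):
    """Return 'connected' only if health_check() answers True within timeout_s."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(health_check)
    try:
        return "connected" if future.result(timeout=timeout_s) else "disconnected"
    except FutureTimeout:
        # A late answer is treated the same as an unhealthy one.
        return "disconnected"
    finally:
        pool.shutdown(wait=False)


def slow_check():
    time.sleep(0.2)  # stands in for a backend that responds too slowly
    return True


print(status_after_health_check(lambda: True))                # connected
print(status_after_health_check(slow_check, timeout_s=0.05))  # disconnected
```

Under this model, a resource whose health check routinely took longer than 60 seconds would need a larger resource_opts.health_check_timeout to keep reporting as connected.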
#15286 Configuration option broker.routing.storage_schema is now deprecated and ignored. Legacy v1 routing storage schema is no longer supported, and EMQX will refuse to start in a cluster running older versions that still use it.
#15239 The type of multi_tenancy.default_max_sessions is now either infinity or a positive integer. Previously, 0 was accepted.
#15156 Added schema validation to the dashboard.sso.oidc.issuer field. The value must now be a valid URL.
