Breaking changes — review before upgrading.
Release history
EMQX releases
Scalable MQTT broker. Connect 100M+ IoT devices in one single cluster, move and process real-time IoT data with 1M msg/s throughput at 1ms latency.
- Kafka polling waits for data instead of returning empty batches
- RabbitMQ Connector self-recovery without manual restart
- Azure Blob Storage health check optimization for large containers
Full changelog
Enhancements
Observability and Performance
- #16746 Configured `os_mon` to collect only system-wide memory statistics by default, reducing per-process memory scanning overhead.
- #16911 Reduced the overhead of Prometheus metrics collection by avoiding accidental repeated queries of Mria statistics.
Data Integration
- #16961 Improved Kafka source polling behavior by ensuring fetch requests wait briefly for data instead of returning empty batches immediately when no records are available. This reduces unnecessary polling delays and helps Kafka consumers receive new records more consistently.
Licensing
- #16853 Made the v5 license parser forward-compatible with v6 license keys.
Bug Fixes
Clustering
- #16729 Improved recovery time of a cluster after a simultaneous restart of all nodes.
  The built-in Mria database management system no longer waits for a full sync of an internal table used to generate transaction synchronization events.
Data Integration
- #16507 Previously, when an MQTT Source's Connector recovered after losing its connection, topics would not be re-subscribed and the Source would stop working until the Connector itself was restarted. Now, the Source re-subscribes upon reconnect.
- #16618 The Kafka request timeout is now automatically set to at least twice the metadata request timeout, with a minimum of 30 seconds, reducing unnecessary reconnections and retries when metadata requests take longer than expected.
  This is especially beneficial when the metadata request timeout is configured to a small value.
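The timeout rule in #16618 can be read as a simple floor. The sketch below is one possible reading, not EMQX's implementation: the function name, the constant, and the idea that a larger user-configured value is kept as-is are all assumptions.

```python
# Hypothetical helper illustrating the rule described in #16618.
# Assumption: a user-configured request timeout larger than the floor is kept.
MIN_REQUEST_TIMEOUT_MS = 30_000  # "minimum of 30 seconds" from the entry above


def effective_request_timeout_ms(metadata_timeout_ms: int, configured_ms: int = 0) -> int:
    """Return a request timeout that is at least 2x the metadata timeout and at least 30 s."""
    return max(configured_ms, 2 * metadata_timeout_ms, MIN_REQUEST_TIMEOUT_MS)
```

With a small metadata timeout such as 5 seconds, the 30-second floor dominates; with 20 seconds, the doubled value does.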
- #16724 Fixed an issue with RabbitMQ Connector/Action/Source where, if some connection or channel processes died unexpectedly, the Connector/Action/Source would be reported as disconnected and would never recover without a manual restart.
- #16935 Fixed an issue where the health check of an Azure Blob Storage Action in aggregate mode could time out if the container contained too many blobs.
- #16971 HTTP and GCP PubSub Actions were patched to treat transient connection errors with reason `closing` as recoverable errors, reducing log noise.
Gateway
- #16606 Fixed the CoAP Gateway in connection mode over DTLS.
- #17030, #17042 Fixed CoAP client takeover handling for both UDP and DTLS connections.
  These changes improve takeover routing and token validation for reconnected clients, and keep the DTLS token takeover grace period aligned with the configured keepalive window.
Operations
- #16732 Fixed a crash in `emqx ctl subscriptions list` that could happen when shared subscriptions were present.
  Before this fix, listing subscriptions could fail for some clients and return no output.
  After this fix, `emqx ctl subscriptions list` works reliably with both regular and shared subscriptions.
Security
- #16690 Fixed a CRL cache regression where `emqx_crl_cache:evict/1` did not fully clear internal URL state.
  After eviction, the same CRL URL now re-registers correctly on next use, its refresh timer is restored, and repeated HTTP fetches per connection are avoided.
- #17012 Fixed password-based authentication backends to let the auth chain continue when the `CONNECT` packet has no password, instead of rejecting the connection immediately.
  Previously, if a client connected without a password, the first password-based authenticator (built-in database, MySQL, PostgreSQL, MongoDB, Redis, or LDAP) in the chain would return an error, blocking any subsequent authenticators (such as HTTP) from being tried.
Observability
- #16672 Ensured that the Erlang PID is printed as a log data field.
- #16699 Previously, under certain race conditions, long and cryptic logs like the following could be printed:
    2026-02-03T13:53:54.576326+00:00 [error] Generic server <0.11323236.0> terminating. Reason: {{badkey,'actions.success'}, [{erlang,map_get,['actions.success',#{}],[{error_info,#{module => erl_erts_errors}}]},{emqx_metrics_worker,idx_metric,4, [{file,"emqx_metrics_worker.erl"},{line,683}]},{emqx_metrics_worker,inc,4,...
  EMQX now prints more meaningful information to help debug the issue.
- Empty jq program now errors; use '.' instead
- String indices use code points instead of byte indices
- tonumber() rejects leading/trailing whitespace; use trim() first
- Agent-to-Agent Card Registry for autonomous AI agent discovery
- MQTT subscription filters using User-Property expressions
- GCP Workload Identity Federation authentication support
Full changelog
Enhancements
AI Interoperability
- #16840 Implemented Agent-to-Agent (A2A) Card Registry. This feature enables autonomous AI agents to discover and collaborate through a standardized, event-driven MQTT 5.0 mechanism.
- #16958 Added a focused `/api-spec.md` endpoint and `/api-spec.html` to support drill-down discovery of EMQX HTTP API context, especially for AI agents and other tools that benefit from fetching only the relevant API slices instead of a single bloated spec.
Core MQTT Functionalities
- #16612 Introduced the `emqx_setopts` app for `$SETOPTS` server-side option updates, including keepalive control topics and warning+suppression for unknown `$SETOPTS/*` publishes.
- #16887 Added optional MQTT subscription message filters controlled by `mqtt.subscription_message_filter`.
  When enabled, clients can subscribe with a `?` suffix such as `sensor/+/temperature?location=roomA&value>25` and EMQX will deliver only the messages whose MQTT 5 `User-Property` entries satisfy the filter expression. When disabled, `?` remains part of the topic filter text and no extra filtering is applied.
  Messages dropped by subscription-filter mismatch are reported through the existing `delivery.dropped` event with reason `subscription_filter` and counted by the new `delivery.dropped.filter` metric.
- #16929 Two new limiter kinds are introduced: `delivery_messages` and `delivery_bytes`. In contrast to the existing `messages` and `bytes` limiters, which limit messages published by a single client, the new limiters throttle messages received by a single client from any source. If the limit is hit, QoS 0 messages are dropped, messages with QoS > 0 are queued internally, and a retry is scheduled. The retry time is derived from the limiter's configuration.
  The new limiters are only supported for memory sessions (`durable_sessions.enable = false`).
  If unspecified, the default values are unlimited, thus keeping backwards compatibility.
- #16779 Improved handling of malformed first packets by classifying them as invalid CONNECT packets and adding better protocol hints in logs.
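To make the subscription message filter from #16887 more concrete, here is a minimal client-side sketch of how an expression such as `location=roomA&value>25` could be evaluated against `User-Property` pairs. The grammar is an assumption inferred from the single example in the entry (conditions joined by `&`, operators `=`, `>` and, by extension, `<`); the function names are hypothetical and EMQX's actual parser may differ.

```python
import re

# Hypothetical condition grammar: key, one operator, value.
_CONDITION = re.compile(r"^([^=<>]+)(=|>|<)(.*)$")


def split_filter(subscription: str):
    """Split 'topic?expr' into (topic_filter, [(key, op, value), ...])."""
    topic, _, expr = subscription.partition("?")
    conditions = []
    for part in filter(None, expr.split("&")):
        match = _CONDITION.match(part)
        if match is None:
            raise ValueError(f"unsupported condition: {part!r}")
        conditions.append(match.groups())
    return topic, conditions


def matches(conditions, user_properties) -> bool:
    """True if every condition is satisfied by at least one (key, value) pair."""
    def satisfied(key, op, expected):
        for name, value in user_properties:
            if name != key:
                continue
            if op == "=" and value == expected:
                return True
            if op in "<>":
                try:
                    lhs, rhs = float(value), float(expected)
                except ValueError:
                    continue  # non-numeric values never satisfy < or >
                if (op == ">" and lhs > rhs) or (op == "<" and lhs < rhs):
                    return True
        return False

    return all(satisfied(*condition) for condition in conditions)
```

For the example subscription, a message carrying `location=roomA` and `value=30` passes this sketch, while one with `value=20` would be dropped.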
Data Integration
- #16589 Updated the `jq` library used in the Rule Engine runtime to version 1.8.1.
  Note that the jq 1.8.1 language contains several subtle breaking changes compared to 1.6.1.
  - Providing an empty string as a jq program is now considered an error: use `"."` instead. (jq#2790)
  - String functions now use code point indices: `indices/1`, `index/1`, and `rindex/1` functions now use code point indices instead of byte indices; use `utf8bytelength/0` to get byte index if needed. (jq#3065)
  - `tonumber/0` rejects numbers with leading or trailing whitespace: use `trim/0` before calling `tonumber/0`. (jq#3055, jq#3195)
  - `last(empty)` behavior changed: `last(empty)` now yields no output values, consistent with `first(empty)`. (jq#3179)
  - `limit/2` errors on negative count, instead of silently accepting it. (jq#3181)
  - Tcl-style multiline comments supported: this may subtly affect parsing of existing code. (jq#2989)
  - Decimal number conversion changed: decimal numbers are now converted to binary64 (double) instead of decimal64. (jq#2949)
  - `nth/2` emits empty on index out of range, instead of erroring. (jq#2674)
  - String multiplication by 0 or less than 1 now emits an empty string instead of the original string. (jq#2142)
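The switch from byte indices to code point indices in #16589 only matters for strings containing non-ASCII characters. The following Python sketch (not jq code; the helper names are made up for illustration) shows how the two kinds of index diverge for the same search.

```python
def code_point_index(text: str, needle: str) -> int:
    """Index counted in Unicode code points (the jq 1.8.1 behavior described above)."""
    return text.index(needle)


def byte_index(text: str, needle: str) -> int:
    """Index counted in UTF-8 bytes (the older behavior described above)."""
    return text.encode("utf-8").index(needle.encode("utf-8"))


sample = "héllo wörld"
print(code_point_index(sample, "w"))  # 6
print(byte_index(sample, "w"))        # 7, because "é" occupies two bytes in UTF-8
```

Rules that slice strings using positions returned by `index/1` may therefore need review if payloads can contain multi-byte characters.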
- #16634 Added support for GET requests in external HTTP schema validation by allowing schema registry entries to specify the HTTP method (POST remains the default).
- #16647 Now, in GreptimeDB and EMQX Tables Actions, integer values that are not suffixed with `i` or `u` are automatically cast to float (float64) values before being sent to the database.
  In InfluxDB Write Syntax, float is the default numeric type, and integers must be annotated. Previously, when EMQX encountered a non-annotated integer, it would interpret it as a one-character string, and insertion would fail if the column was of type float.
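The typing rule from #16647 can be sketched as a small classifier for numeric literals. This is an illustration of the described behavior only; the function name and the `(tag, value)` return shape are assumptions, not part of EMQX.

```python
def coerce_numeric_field(raw: str):
    """Classify a numeric literal: 'i' suffix = integer, 'u' suffix = unsigned, otherwise float."""
    if raw.endswith("i"):
        return ("int", int(raw[:-1]))
    if raw.endswith("u"):
        value = int(raw[:-1])
        if value < 0:
            raise ValueError("unsigned literal cannot be negative")
        return ("uint", value)
    # Non-annotated numbers, including ones that look like integers, become floats.
    return ("float", float(raw))
```

For example, `42` is written as the float `42.0`, while `42i` stays an integer.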
- #16707 Added a Data Integration to consume from and publish messages to Azure Event Grid.
- #16750 Added support for using Workload Identity Federation (WIF) authentication with GCP Connectors (GCP PubSub Producer and Consumer, BigQuery), via Service Account Impersonation. At this point, only OIDC workload identity pool providers using Client Credentials grant type are supported.
- #16773 Now, when using MQTT Connector with SSL enabled, if unset, the Server Name Indication (SNI) field will be automatically filled with the server's hostname.
- #16893 Added a new Connector and Action that appends data to QuasarDB.
- #16962 Improved Kafka source polling behavior by ensuring fetch requests wait briefly for data instead of returning empty batches immediately when no records are available. This reduces unnecessary polling delays and helps Kafka consumers receive new records more consistently.
Access Control
- #16597 In MySQL and PostgreSQL authentication and authorization, improved the handling of unallowed and quoted variables in the SQL template.
- #16616 Added new configurations to the SSO OIDC backend to allow specifying `jq` expressions to extract the desired role and namespace when creating new dashboard users.
- #16759 Added new functions `timestamp_s` and `timestamp_ms` to retrieve system time in variform expressions (used e.g. to populate additional client attributes on connection).
- #16817 Added REST API endpoints to reset authentication and authorization metrics counters.
  - `POST /authentication/:id/metrics/reset` resets counters for a specific authenticator.
  - `POST /authorization/sources/:type/metrics/reset` resets counters for a specific authorization source.
- #16849 Added cookie-based authentication fallback for plugin API endpoints.
  Plugin UI iframes served by the dashboard can now authenticate via the `emqx_auth` cookie when no `Authorization` header is present. This only applies to `/api/v5/plugin_api/...` paths.
Management
- #16958 Added `emqx ctl api_keys` CLI commands to list, show, add, delete, enable, and disable API keys from the command line.
Gateway
- #16734 Added ordered `token`, `nkey`, and `jwt` internal authentication methods to the NATS Gateway to reduce the authentication feature gap with NATS Server.
Deployment and Security
- #16653 Made the Erlang distribution listener address configurable via `node.dist_bind_address`.
  For example: `node.dist_bind_address = "10.0.1.5"`.
  Previously, this required configuration in `vm.args` as `-kernel inet_dist_use_interface {10,0,1,5}`.
- #16888 Refreshed the default TLS certificate bundle shipped with EMQX packages for local development and testing.
  The new server certificate is issued for `localhost` and loopback addresses only (`localhost`, `127.0.0.1`, `::1`).
  These default certificates are intended for test and local deployment scenarios only and must not be used in production.
- #16916 Now, the `emqx_cert_expiry_at` Prometheus metric takes into account the expiry date of certificates that belong to managed certificate bundles, when they are used in MQTT listeners.
Performance
- #16500 Optimized idle memory usage and reduced the cost of maintaining rate-based metrics.
  Note that various 5-minute average rate metrics exposed via APIs are no longer exact averages over the last 300 samples, but are instead EWMAs (Exponentially Weighted Moving Averages) that approximate them closely.
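For readers unfamiliar with the EWMA mentioned in #16500, the sketch below shows the general technique with a 300-second window sampled once per second. The smoothing formula `alpha = 1 - exp(-interval / window)` is a common textbook choice and an assumption here; the entry does not state which weighting EMQX uses.

```python
import math


class Ewma:
    """Exponentially weighted moving average that keeps one number instead of 300 samples."""

    def __init__(self, window_s: float = 300.0, interval_s: float = 1.0):
        # Assumed smoothing factor; older samples decay with time constant window_s.
        self.alpha = 1.0 - math.exp(-interval_s / window_s)
        self.value = None

    def update(self, sample: float) -> float:
        if self.value is None:
            self.value = sample  # seed with the first observation
        else:
            self.value += self.alpha * (sample - self.value)
        return self.value
```

The memory benefit comes from storing only the current value per metric; the trade-off is that the reported rate reacts smoothly to changes rather than matching an exact sliding-window mean.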
- #16547 Disabled TLS 1.2 session reuse by default to reduce TLS handshake overhead.
  The TLS 1.2 session cache size is limited to 1000 entries, and the cache is local to each node.
  This makes the reuse rate very low, especially when large numbers of clients connect to a large cluster.
- #16794 Enabled node-level authentication and authorization caches by default.
  This reduces repeated backend lookups for repeated client checks out of the box, improving authentication and authorization performance in common deployments.
- #16829 Optimized the NATS gateway publish hot path to reduce per-message overhead in frame parsing, subject/topic handling, metrics updates, and ACK/message build steps.
- #16911 Reduced the overhead of Prometheus metrics collection by avoiding repeated queries of Mria statistics.
- #16550 Stopped caching subscribe ACL check results.
  MQTT subscription is mostly done once per connection life cycle, so holding the subscribe ACL check result in cache is usually a waste of RAM.
Bug Fixes
Core MQTT Functionalities
- #16721 Fixed QoS 2 duplicate handling when `await_rel_timeout` has expired.
  Previously, if a client retried a QoS 2 `PUBLISH` with `DUP=1` after the broker had expired the pending PUBREL state (default 300 seconds), the message could be published to subscribers again. EMQX now treats this retransmission as a duplicate handshake packet and returns `PUBREC` without re-delivering the application message.
- #16725 Disabled TCP connection congestion alarm by default by setting `conn_congestion.enable_alarm = false` in the default zone/global configuration.
- #16781 Fixed CONNECT validation when retained messages are unavailable.
  When `mqtt.retain_available` is set to `false`, CONNECT packets with Will Retain set are now correctly rejected with CONNACK reason `Retain not supported (0x9A)`.
- #16783 Fixed MQTT v5 SUBSCRIBE validation for the `Subscription-Identifier` upper bound.
  EMQX now accepts `268435455` (`0x0FFFFFFF`), which is the maximum valid Subscription Identifier value defined by the MQTT spec.
- #16974 In EMQX 6.1.1, when a session was subscribed to a topic filter containing retained messages and was later taken over or resumed without re-subscribing to the same topic filter, it would receive the already delivered messages again. Now, the previous behavior is restored: upon session resumption or takeover without explicit re-subscription, retained message iteration ceases.
- #16876 Changed log message `msg_publish_not_allowed` to `msg_not_routed_to_subscribers`.
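The upper bound in #16783 follows from how MQTT 5 encodes the Subscription Identifier as a Variable Byte Integer: at most four bytes, each carrying seven payload bits. The following sketch of that encoding (function name chosen for illustration) shows why `268435455` is the largest representable value.

```python
MAX_VARIABLE_BYTE_INTEGER = 268_435_455  # 0x0FFFFFFF = 2**28 - 1


def encode_variable_byte_integer(value: int) -> bytes:
    """Encode a value as an MQTT Variable Byte Integer (7 data bits per byte, MSB = continuation)."""
    if not 0 <= value <= MAX_VARIABLE_BYTE_INTEGER:
        raise ValueError("value does not fit in four bytes")
    out = bytearray()
    while True:
        byte, value = value % 128, value // 128
        if value > 0:
            byte |= 0x80  # more bytes follow
        out.append(byte)
        if value == 0:
            return bytes(out)
```

The maximum value encodes to the four bytes `FF FF FF 7F`; anything larger would need a fifth byte, which the encoding does not allow.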
Data Integration
- #16803 Improved error reporting when configuring batch operations for MySQL actions.
- #16796 Fixed handling of multiline SQL statements in connector actions.
- #16936 Fixed an issue where the health check of an Azure Blob Storage Action in aggregate mode could time out if the container contained too many blobs.
- #16955 Eliminated false health check warning logs for the Kafka producer action.
  Previously, if a Kafka producer idled for too long, Kafka could close the connection (the default is typically 10 minutes). If a Kafka producer action health check happened to run around the same moment, a false warning with message `"not_all_kafka_partitions_connected"` could be logged.
- #16972 HTTP and GCP PubSub Actions were patched to treat transient connection errors with reason `closing` as recoverable errors, reducing log noise.
- #16863 Added a warning log when an async reply is received for an already-expired request in async actions.
- #16847 Fixed a crash when a non-ASCII Unicode string is used in a message transformation expression.
- #16979 MQTT ingress bridges now support consuming from remote message queues `$queue/{name}/{bind-filter}`.
- #16999 Fixed an issue where MQTT source failed to receive messages from `$queue/` subscriptions when the remote broker has the Message Queue (mq) feature enabled. The MQ message delivery was missing the MQTT v5 Subscription-Identifier property in PUBLISH packets, which the MQTT bridge ingress relies on to route messages from queue subscriptions.
Access Control
- #16780 Fixed an issue in authorization source validation where requests missing the `type` field could trigger an internal error.
  Now EMQX returns a clear `BAD_REQUEST` validation error for this case.
- #16805 Added support for authz hook results to opt out of authorization cache storage for dynamic ACL decisions.
- #16865 Added `cert_common_name` and `cert_subject` aliases for `mqtt.client_attrs_init` expressions, alongside the existing `cn` and `dn` variables.
- #16868 Improved REST API authentication error messages to guide programmatic clients toward using API keys (Basic auth) instead of repeatedly logging in for bearer tokens. Error responses now mention the `api_key.bootstrap_file` configuration option and the `POST /api_key` endpoint for creating persistent API keys.
- #16928 Dashboard-created REST API keys are now generated randomly instead of being derived from the API key name.
- #16939 Fixed the built-in database authenticator so it no longer logs a warning when the default bootstrap file path is configured but the file does not exist.
- #16993 Fixed an issue where an error response from an OIDC SSO provider would result in a 500 error. Now a more user-friendly result is returned.
Durable Storage
- #16874 Fixed a rare issue where Durable Storage backed by DS Raft could stop accepting new messages after a sequence of quick cluster leadership changes, requiring a node restart to recover.
Clustering
- #16534 Lowered the default `net_ticktime` from 2 minutes to 1 minute to improve cluster node failure detection.
  In the event of a network outage or abrupt node termination, remaining nodes will detect the down node sooner, reducing the time before failover mechanisms activate and improving overall cluster resilience and user experience.
Plugins
- #16842 Reduced noisy plugin config warning logs when no peer node has the plugin config yet.
  Previously, when a node tried to fetch plugin config from peer nodes during startup, it would log a warning even when all peers simply didn't have the config (e.g., first node to load the plugin). Now this benign case is logged at debug level, and only genuine errors (RPC failures, timeouts) remain as warnings.
- #16843 Fixed an issue where HTTP headers and query string parameters were not passed through to plugin API handlers, causing plugins to receive empty headers and missing query parameters.
- #16904 Prevented enabling or starting multiple versions of the same plugin at once. When a newer version is enabled, older configured versions of that plugin are automatically disabled, and management API actions now return a clear error instead of reporting success while another version is still active.
Gateway
- #16536 Fixed the CoAP Gateway when running in DTLS connection mode.
- #16996 Fixed CoAP DTLS connection mode to keep sessions available after `sock_closed` and support reconnect takeover with the same `clientid` and valid `token`.
Observability
- #16879 Added `log.audit.cache_size` as the primary config key for the audit log DB cache size, while keeping `log.audit.max_filter_size` for backward compatibility.
Deployment and Security
- #16683 Added support for HTTPS CRL Distribution Point URLs in the CRL cache, so CRLs fetched from `https://` endpoints are now cached and refreshed correctly.
- #16901 Fixed RPM package OpenSSL dependency for RHEL 9.6 LTS: pinned `openssl >= 3.5.1` for RHEL >= 9.7 and `openssl >= 3.0.7` for older RHEL 9 versions.
ExHook
- #16890 Fixed an ExHook issue where successful reconnect reloads could duplicate the same server name in the running list and trigger repeated callback dispatches.
Licensing
- #16764 Refined license customer tier handling by introducing `STANDARD` and `VIP` tiers in enforcement logic and reducing the official-license `STANDARD` expiry grace period from 90 days to 15 days before new sessions are restricted.
- Message Stream prefix changed from $s to $stream with required name
- Message Queue prefix changed to $queue with required name
- Stream subscriptions require $stream/name/topic_filter syntax
- Retained message iteration resumes from last confirmed delivery
- CoAP Block-Wise Transfer protocol support
- JT/T 808 protocol 2019 with GBK character encoding
Full changelog
Enhancements
Core MQTT Functionalities
- #16637 Previously, if a session was taken over while in the middle of receiving several retained messages from a wildcard topic subscription, iteration over those retained messages would start over for the new client, repeating already delivered retained messages. Now, the new client will resume iteration from the last confirmed delivered message from the last session, reducing the number of duplicated retained messages.
Durable Storage
- #16704 Prevented RocksDB storage backing Durable Storage shards from preallocating large chunks of disk space by default.
  Previously, each shard consumed a significant amount of disk space immediately, which compounded due to multiple Durable Storage databases now being created by default (each consisting of 16 shards).
Message Queue and Streams
- #16551, #16714 Refined Message Stream and Message Queue interfaces.
  For stream subscriptions, the `$stream` prefix is now used. Streams are now named, and the name should be specified on subscribe: `SUBSCRIBE $stream/<name>/<topic_filter>` (or `SUBSCRIBE $stream/<name>` if the stream is known to exist). The starting point for stream consumption is specified using the `stream-offset` user subscription property.
  For message queue subscriptions, the `$queue` prefix is used. Message queues are also named, and the name should be specified on subscribe: `SUBSCRIBE $queue/<name>/<topic_filter>` (or `SUBSCRIBE $queue/<name>` if the queue is known to exist).
  Notes:
  - Stream and queue names may contain only alphanumeric characters, underscores, hyphens, and dots.
  - Previously created unnamed streams and queues obtain a name derived from their topic filter: the topic filter with `/` prepended.
  - The legacy `$q` queue interface (introduced in 6.0.0) and `$s` stream interface (introduced in 6.1.0) are kept for compatibility, but their use is discouraged.
  - If Message Queues are enabled, the `$queue` prefix can no longer be used for subscribing to shared subscriptions.
- #16820 Added shorter API path aliases `/queues/*` and `/streams/*` for the Message Queue and Message Stream management APIs.
  The previous `/message_queues/*` and `/message_streams/*` paths remain functional for backward compatibility but are no longer shown in the API documentation.
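As a client-side aid for the new `$stream` and `$queue` subscription syntax from #16551 and #16714, the sketch below builds the subscription topic and checks the name against the character rule from the notes. The helper is hypothetical, treats "alphanumeric" as ASCII letters and digits (an assumption), and does not cover the derived names of legacy unnamed streams and queues, which start with `/`.

```python
import re
from typing import Optional

# Assumed reading of the naming rule: ASCII letters, digits, "_", "-", ".".
_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def subscription_topic(kind: str, name: str, topic_filter: Optional[str] = None) -> str:
    """Build '$stream/<name>[/<topic_filter>]' or '$queue/<name>[/<topic_filter>]'."""
    if kind not in ("stream", "queue"):
        raise ValueError("kind must be 'stream' or 'queue'")
    if not _NAME.fullmatch(name):
        raise ValueError(f"invalid {kind} name: {name!r}")
    base = f"${kind}/{name}"
    return base if topic_filter is None else f"{base}/{topic_filter}"
```

For example, `subscription_topic("stream", "orders.v1", "sensor/#")` yields `$stream/orders.v1/sensor/#`, and omitting the topic filter yields the short form for a stream or queue that is known to exist.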
Gateway
- #16719 Added Block-Wise Transfer support for CoAP and LwM2M gateways.
  - Added block-wise settings: `enable`, `max_block_size`, `max_body_size`, and `exchange_lifetime`.
  - Improved `POST /gateways/coap/clients/:clientid/request` and LwM2M downlink handling for large block-wise messages.
- Added the `jt808.frame.parse_unknown_message` option, enabling the JT808 gateway to transparently forward unknown messages.
- Added JT/T 808 protocol 2019 support.
- Added GBK character encoding support for JT/T 808 gateway.
  The JT/T 808 protocol specifies GBK encoding for STRING type fields. A new `frame.string_encoding` configuration option is added:
  - `utf8` (default): Pass through strings as-is (backward-compatible)
  - `gbk`: Convert GBK-encoded strings from devices to UTF-8 for MQTT, and UTF-8 from MQTT to GBK for devices
  This affects both uplink parsing (GBK to UTF-8) and downlink serialization (UTF-8 to GBK), including string fields such as license plates, driver names, text messages, area names, and client parameters.
  MQTT payloads always use UTF-8 encoding regardless of this setting.
- Added support for custom `msg_sn` in JT/T 808 gateway downlink messages.
  When a downlink MQTT message payload contains a `msg_sn` field in the header, the gateway will use that value instead of the auto-generated channel sequence number. This allows external systems to control message sequencing for specific use cases.
- Fixed JT/T 808 gateway parameter setting (0x8103) and query response (0x0104) message handling for CAN bus ID parameters (0x0110~0x01FF), which should use BYTE[8] data type with base64 encoding in JSON instead of string type.
- Fixed JT/T 808 0x0702 driver identity report message parsing.
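The `gbk` setting of `frame.string_encoding` described above amounts to a transcoding step in each direction. The Python sketch below illustrates that conversion with the standard `gbk` codec; the function names are made up, and it says nothing about how the gateway itself implements the step.

```python
def uplink_to_utf8(device_bytes: bytes) -> str:
    """Device to MQTT: decode a GBK-encoded STRING field into text (serialized as UTF-8 in MQTT)."""
    return device_bytes.decode("gbk")


def downlink_to_gbk(text: str) -> bytes:
    """MQTT to device: encode text into the GBK bytes the terminal expects."""
    return text.encode("gbk")


plate = "京A12345"
on_the_wire = downlink_to_gbk(plate)
print(len(on_the_wire), len(plate.encode("utf-8")))  # 8 9
```

The same license plate takes 8 bytes in GBK but 9 in UTF-8, which is why passing strings through unchanged (`utf8`) produces garbled text on devices that follow the GBK requirement.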
Security
- #16447 Added a new `force_delete` query parameter to the following HTTP APIs for managing certificates:
  - `DELETE /certs/global/name/:name`
  - `DELETE /certs/ns/:ns/name/:name`
  When omitted or `false`, configurations in all namespaces are checked to see whether the managed bundle being deleted is referenced, and deletion fails if it is.
- #16461 Added support for TLS 1.3 session ticket resumption.
  EMQX now supports TLS 1.3 session resumption using stateless session tickets, allowing clients to resume TLS sessions without server-side session state storage.
  Node-level configuration:
  - `node.tls_stateless_tickets_seed` is the secret key seed for generating TLS 1.3 stateless session tickets.
  Listener-level configuration:
  - `listeners.ssl.<name>.ssl_options.session_tickets` enables TLS 1.3 session resumption using stateless session tickets. Possible values are `disabled` (default), `stateless`, and `stateless_with_cert` (includes certificate information).
  Session tickets are only generated when `node.tls_stateless_tickets_seed` is configured (non-empty) and `session_tickets` is enabled in listener SSL options.
  If `session_tickets` is enabled but `node.tls_stateless_tickets_seed` is empty, session tickets will not be generated and an error log will be emitted when starting the listener.
Access Control
- #16504 Added a new option to parameterize the data source from which to construct the dashboard username when creating a new user via OIDC SSO.
- #16741 Added configuration options `idp_signs_envelopes` and `idp_signs_assertions` to the SAML SSO backend to control signature verification behavior.
  Previously, SAML signature verification was not working correctly because the IdP certificate fingerprint was not being extracted from metadata and passed to esaml for verification.
  Both options default to `false` for backwards compatibility with existing configurations. Users who want to enable signature verification should explicitly set these to `true` when their IdP is configured to sign SAML responses.
- #16684 Enabled `mqtt.client_attrs_init` expressions to make use of `password` (for example, feeding it to `jwt_value`) to initialize client attributes.
- #16730 Redis authorization now supports a compatibility mode for EMQX 4.x ACL data.
  Set `compatibility_mode = v4` to enable legacy `%u`/`%c` placeholder conversion and legacy ACL access values `1|2|3` (mapped to subscribe/publish/all).
  By default, compatibility mode remains disabled, so existing Redis authz behavior is unchanged.
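The legacy conversion described in #16730 can be pictured as a small translation of each stored rule. In the sketch below, the access mapping comes from the entry itself, while the target placeholder spellings `${username}` and `${clientid}` and the function name are assumptions made for illustration.

```python
# Mapping stated in the entry: 1 = subscribe, 2 = publish, 3 = all.
_LEGACY_ACCESS = {"1": "subscribe", "2": "publish", "3": "all"}


def convert_v4_rule(topic: str, access) -> tuple:
    """Translate one EMQX 4.x style Redis ACL entry into (topic, action)."""
    # Assumed target placeholder syntax; %u = username, %c = client ID.
    converted = topic.replace("%u", "${username}").replace("%c", "${clientid}")
    try:
        action = _LEGACY_ACCESS[str(access)]
    except KeyError:
        raise ValueError(f"unknown legacy access value: {access!r}") from None
    return converted, action
```

A stored entry such as `devices/%c/up` with access `2` would thus be treated as a publish rule on the connecting client's own topic.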
Data Integration
- #16511 Added support for the IoTDB Table Model in the data integration.
- #16516 Added two new Action metrics: `aggregated_upload.success` and `aggregated_upload.failure`. These are only relevant for Aggregated Upload Actions (S3, Azure Blob Storage, Snowflake and S3Tables) and are incremented when an aggregated delivery succeeds or fails, respectively.
- #16658 Previously, when the server port was omitted in an EMQX Tables Connector, the port would default to 80. Now, it defaults to 4001.
  A more intelligible error message is returned when an EMQX Tables Connector is configured with SSL enabled but `cacertfile`, `certfile` or `keyfile` configurations are missing.
Rule Engine
- #16524 Enhanced base64 encoding and decoding functions in rule engine SQL with support for padding and URL-safe options.
  The `base64_encode` and `base64_decode` functions now support optional parameters to control encoding behavior:
  - `no_padding`: Encode or decode without padding characters (`=`). Useful when you need to remove padding from encoded strings or decode strings that don't have padding.
  - `urlsafe`: Use URL-safe base64 encoding/decoding. Replaces `+` with `-` and `/` with `_`, making the encoded string safe to use in URLs without encoding.
  You can use these options individually or combine them. When combining options, the order doesn't matter.
  Examples in rule SQL:
  Encode without padding:
    SELECT base64_encode(payload, 'no_padding') as encoded FROM "t/#"
  Encode with URL-safe characters:
    SELECT base64_encode(payload, 'urlsafe') as encoded FROM "t/#"
  Encode with both options (no padding and URL-safe):
    SELECT base64_encode(payload, 'no_padding', 'urlsafe') as encoded FROM "t/#"
  Decode URL-safe base64:
    SELECT base64_decode(payload, 'urlsafe') as decoded FROM "t/#"
  Decode unpadded URL-safe base64:
    SELECT base64_decode(payload, 'urlsafe', 'no_padding') as decoded FROM "t/#"
- #16533 Added two new Variform expression helper functions `json_value` and `jwt_value` to extract values from JSON data and JWT tokens using dot-separated key paths.
  The `json_value` function extracts values from JSON binary strings using a dot-separated path to navigate nested structures.
  The `jwt_value` function decodes JWT token payloads and extracts claim values using the same path syntax.
  For example, if `username` is a JSON object, you can access a field with `json_value(username, 'shop.floor')`; if `password` is a JWT with a customized claim, you can access the nested value with `jwt_value(password, 'client_attrs.unitid')`.
- #16539 Added support for keeping track of metric aliases when utilizing the `spb_decode` Rule Engine function.
  Now, after a device or edge of network (EoN) node publishes its `DBIRTH`/`NBIRTH` messages, alias mappings in said message will be stored and used when the client later uses `spb_decode` on a message matching the `DDATA`/`NDATA` topic patterns. The original names of the metrics will be added to the output of `spb_decode`.
  Note: when executing fallback actions, the mapping is not available in the environment they run in. This means that, if a fallback action republishes the undecoded `DDATA`/`NDATA` payload to a Sparkplug B `DDATA`/`NDATA` topic, the metric `name` fields will not be populated by the alias mapping.
- #16581 Added a new Rule SQL function: `spb_zip_kvs`.
  Given an already decoded, valid Sparkplug B message, it'll go through the metrics and "zip" each property name and its value together.
  - `properties` (and any nested `PropertySet` values) have their `keys` and `values` fields removed and the values of the two former fields zipped together and merged with the original map. Values that have the `PropertySet` or `PropertySetList` types are recursively transformed like this.
  - Values of `PropertySetList` type have their `propertyset` field removed and replaced by an array of `PropertySet`s, transformed following the above item's description.
  - If present, `dataset_value` field is transformed in a similar fashion: its `columns` and `rows` fields are removed and their values zipped together in an object merged with the original object. `types` and `num_of_columns` fields are removed from output.
  - Other values/fields are untouched.
  For example, given this input decoded Sparkplug B message:
    {
      "metrics": [
        {
          "properties": {
            "values": [
              {"int_value": 99},
              {"propertyset_value": {"values": [{"int_value": 999}], "keys": ["inner"]}},
              {"propertysets_value": {"propertyset": [
                {"values": [{"int_value": 1}], "keys": ["inner1"]},
                {"values": [{"int_value": 2}], "keys": ["inner2"]}
              ]}}
            ],
            "keys": ["leaf", "nested_prop", "nested_prop_list"]
          }
        },
        {
          "dataset_value": {
            "num_of_columns": 2,
            "types": [7, 12],
            "rows": [
              {"elements": [{"int_value": 3}, {"string_value": "3"}]},
              {"elements": [{"int_value": 4}, {"string_value": "4"}]}
            ],
            "columns": ["col1", "col2"]
          }
        }
      ]
    }
  Then, the output of `spb_zip_kvs` will be:
    {
      "metrics": [
        {
          "properties": {
            "nested_prop_list": {"propertysets_value": [
              {"inner1": {"int_value": 1}},
              {"inner2": {"int_value": 2}}
            ]},
            "nested_prop": {"propertyset_value": {"inner": {"int_value": 999}}},
            "leaf": {"int_value": 99}
          }
        },
        {
          "dataset_value": {
            "col2": {"elements": [{"int_value": 4}, {"string_value": "4"}]},
            "col1": {"elements": [{"int_value": 3}, {"string_value": "3"}]}
          }
        }
      ]
    }
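To cross-check what the `no_padding` and `urlsafe` options of `base64_encode` and `base64_decode` (#16524) should produce, the following Python sketch reproduces the described semantics with the standard library. It is an independent reference for comparing outputs, not the rule engine's implementation; the function names and the variadic option style are chosen here only to mirror the SQL examples.

```python
import base64


def b64_encode(data: bytes, *options: str) -> str:
    """Encode with optional 'urlsafe' and/or 'no_padding' behavior (order does not matter)."""
    encoder = base64.urlsafe_b64encode if "urlsafe" in options else base64.b64encode
    text = encoder(data).decode("ascii")
    return text.rstrip("=") if "no_padding" in options else text


def b64_decode(text: str, *options: str) -> bytes:
    """Decode, restoring stripped padding first when 'no_padding' is given."""
    if "no_padding" in options:
        text += "=" * (-len(text) % 4)
    decoder = base64.urlsafe_b64decode if "urlsafe" in options else base64.b64decode
    return decoder(text)


print(b64_encode(b"\xfb\xff"))                           # +/8=
print(b64_encode(b"\xfb\xff", "no_padding", "urlsafe"))  # -_8
```

The two-byte input is chosen because its standard encoding contains `+`, `/` and `=`, so every option visibly changes the result.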
REST API
- #16718 Improved the REST API Swagger spec.
  Previously, summaries and descriptions of spec fields were mixed together. Now, summaries are brief, simple and punctuation-free, while descriptions provide all the details.
- #16735 EMQX now supports plugin-defined HTTP API callbacks under `/api/v5/plugin_api/{plugin}/...`.
  This allows plugin authors to expose plugin-specific API endpoints through the dashboard API service, with consistent authentication and HTTP error handling.
Observability
-
#16656 Made system monitor reports such as `busy_port` and `long_schedule` more informative by including process labels for easier troubleshooting.
-
#16744 Added support for end-to-end tracing of messages published via HTTP API.
Performance
-
#16413 Improve subscription handling performance.
-
#16492 Slightly improve idle system memory usage.
-
#16757 Set `os_mon` to collect only system-wide memory statistics by default, reducing per-process memory scanning overhead.
Bug Fixes
Core MQTT Functionalities
-
#16480 Fixed an issue where WebSocket connections could crash after the peer closed the connection, typically observed under moderate load.
```
crasher:
  initial call: cowboy_tls:connection_process/4,
  error: {{case_clause,{error,closed}},[
    {cowboy_websocket_linger,websocket_send_close,2,[{file,"cowboy_websocket_linger.erl"},{line,752}]},
    {cowboy_websocket_linger,websocket_close,3,[{file,"cowboy_websocket_linger.erl"},{line,743}]},
    {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}
  ]}
  messages: [
    {ssl,{sslsocket,{gen_tcp,#Port<...>,...},[...]},<<130,130,27,93,145,101,251,93>>},
    {ssl_closed,{sslsocket,{gen_tcp,#Port<...>,...},[...]}}
  ],
  ...
```
-
#16515 Fixed a bug that caused WebSocket connections to crash when receiving broker messages larger than the client's advertised `Maximum-Packet-Size`.
-
#16553 Fixed an issue where not all retained messages would be delivered if a subscriber hit the retained message dispatch rate limit.
If the dispatch rate limit is reached while iterating over retained topics, then the client process will retry the iteration at a later time with exponential back-off (minimum 300 ms, maximum 10 s).
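The retry schedule can be pictured with a minimal Python sketch. Only the 300 ms minimum and the 10 s maximum come from this entry; the doubling factor and the function name `retry_delays_ms` are assumptions made for illustration, since the actual growth factor is not stated here.

```python
def retry_delays_ms(attempts, min_ms=300, max_ms=10_000, factor=2):
    """Return the delay before each retry, growing geometrically up to max_ms.

    `factor=2` is an assumed growth rate, not a documented EMQX value.
    """
    delays = []
    delay = min_ms
    for _ in range(attempts):
        delays.append(delay)
        delay = min(delay * factor, max_ms)  # cap at the documented maximum
    return delays


print(retry_delays_ms(7))  # [300, 600, 1200, 2400, 4800, 9600, 10000]
```

Under this assumption, a rate-limited subscriber reaches the 10 s ceiling after six retries and stays there until the iteration completes.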
The `retainer.flow_control.batch_deliver_number` configuration has been deprecated.
The `retainer.flow_control.batch_read_number` configuration no longer supports being set to 0 to mean "read all remaining retained messages at once". If set to 0, it defaults to 1000 messages.
-
#16569 Fixed a rare race condition that could cause the supporting `emqx_flapping` process for flapping detection to crash under high system load.
-
#16651 Fixed a rare connection process crash during shutdown caused by operating on an already closed socket, typically under high system stress.
Prior to this fix, this race condition typically resulted in an `error`-level log saying `{badmatch,{ok,{sock_error,closed}...`.
-
#16675 Fixed a timestamp ordering issue where `disconnected_at` could be later than `connected_at` during session takeover or discard scenarios.
Previously, `disconnected_at` was recorded too late (in `ensure_disconnected`), after the new session's `connected_at` was already set. This caused a race condition where `disconnected_at > connected_at`, making it difficult to track client presence state externally.
The fix records `disconnected_at` immediately when takeover begins or when discard is received, so it is always earlier than the new session's `connected_at`. This ensures correct timestamp ordering for external presence state tracking systems.
-
#16715 Fixed an issue where retained `$SYS` messages (for example, broker/node identity topics) were stored without expiry, which could leave stale node identifiers visible in Dashboard views after StatefulSet rotation.
Now, newly published retained `$SYS` messages include `Message-Expiry-Interval = 3600` (1 hour).
For already existing stale retained `$SYS` entries created before this change, you can manually clear them by publishing an empty retained message to the stale topic:

```
emqx eval 'emqx:publish(emqx_message:set_flag(retain, true, emqx_message:make(emqx_sys, <<"$SYS/brokers/[email protected]/sysdescr">>, <<>>))).'
```

Replace the topic in the command with the stale `$SYS/...` topic you want to remove.
-
#16731 Fixed a crash in `emqx ctl subscriptions list` that could happen when shared subscriptions were present.
Before this fix, listing subscriptions could fail for some clients and return no output.
After this fix, `emqx ctl subscriptions list` works reliably with both regular and shared subscriptions.
-
#16782 Fixed MQTT v5 protocol handling for invalid PUBLISH properties.
If a client sends a PUBLISH packet containing `Subscription-Identifier`, EMQX now treats it as a protocol error and disconnects the client.
Gateway
-
#16603 Fixed the CoAP Gateway when running in DTLS connection mode.
-
#16670 NATS gateway now enforces the max publish payload, honors the `echo` option (no local delivery), and improves publish/subscribe subject handling and related error messages.
Access Control
-
#16423 Added support for verifying the 'aud' (audience) claim in JWT authentication.
When the 'aud' claim is configured in verify_claims, the JWT token must include a valid 'aud' claim. The verification supports both string and array formats:
- If 'aud' is a string, it must exactly match the expected value.
- If 'aud' is an array, at least one element in the array must match the expected value.
- Empty string or empty array will fail verification.
- Missing 'aud' claim will fail verification when it's configured in verify_claims.
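The four rules above can be condensed into a small Python predicate. This is a sketch of the described matching logic only, not the EMQX authenticator; the function name `aud_claim_matches` and the plain-dict representation of decoded claims are assumptions.

```python
def aud_claim_matches(claims, expected):
    """Return True when the decoded JWT claims satisfy the expected audience.

    `claims` is assumed to be the already-decoded JWT payload as a dict.
    """
    if "aud" not in claims:
        return False  # missing claim fails when `aud` is being verified
    aud = claims["aud"]
    if isinstance(aud, str):
        # empty string fails; otherwise an exact match is required
        return aud != "" and aud == expected
    if isinstance(aud, list):
        # empty array fails; otherwise any one element may match
        return len(aud) > 0 and expected in aud
    return False  # any other type is treated as invalid in this sketch


print(aud_claim_matches({"aud": ["svc-a", "svc-b"]}, "svc-b"))  # True
print(aud_claim_matches({"aud": []}, "svc-b"))                  # False
```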
-
#16459 Fixed an issue in the SCRAM authentication HTTP API. Previously, an incorrect user ID was returned for the created user in the user creation API call.
Data Integration
-
#16507 Previously, when an MQTT Source's Connector recovered after losing its connection, topics would not be re-subscribed and the Source would stop working until the Connector itself was restarted. Now, the Source will re-subscribe upon reconnect.
-
#16542 Fixed an issue where Kafka producer connections could disconnect prematurely when Kafka was overloaded, causing excessive produce request retries.
The request timeout is now automatically set to at least twice the metadata request timeout (with a minimum of 30 seconds),
reducing unnecessary reconnections and retries when metadata requests take longer than expected.
This is especially beneficial when metadata request timeout is configured to a small value. -
#16622 Fixed an issue where, if an Action used async query mode and its Connector was disconnected after more than one health check, its Fallback Actions could be triggered twice.
-
#16657 Fixed an issue where, when importing configuration from an older node version into a newer one, values would not be upgraded according to newer code, leading to unexpected behavior.
One such example is importing an MQTT Connector with static clientids from 5.10.0 into 6.0.0. In 5.10.0, usernames and passwords could not be associated with particular static clientids, and this was represented internally in a certain way. Later versions added the capability of creating those associations, with a different internal representation. This subtle internal representation conversion was missing when importing such configurations in previous EMQX versions.
-
#16659 When using an older MQTT Connector configuration with static clientids (from 5.10.0 and earlier) on later EMQX versions, the username and password at the root of the configuration were ignored. This could cause trouble when upgrading and keeping the same configuration, as the MQTT clients would stop using the credentials.
Now, if there are username and/or password fields in the root Connector, those credentials are merged with any specific ones specified per clientid, the latter taking precedence.
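The precedence rule can be illustrated with a short Python sketch. The dictionary shapes and the function name `effective_credentials` are hypothetical and do not reflect the Connector's real configuration schema; only the "per-clientid values override root values" rule is taken from this entry.

```python
CREDENTIAL_FIELDS = ("username", "password")


def effective_credentials(root, per_clientid):
    """Merge root-level credentials with per-clientid ones.

    Per-clientid values take precedence; fields absent there fall back
    to the root Connector values.
    """
    merged = {k: root[k] for k in CREDENTIAL_FIELDS if k in root}
    merged.update({k: per_clientid[k] for k in CREDENTIAL_FIELDS if k in per_clientid})
    return merged


root = {"username": "bridge", "password": "root-secret"}
print(effective_credentials(root, {"username": "device-1"}))
# {'username': 'device-1', 'password': 'root-secret'}
```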
-
#16723 Fixed an issue with RabbitMQ Connector/Action/Source where, if some connection or channel processes died unexpectedly, the Connector/Action/Source would be reported as disconnected and never recover by itself without restarting it.
-
#16742 Fixed an issue with GreptimeDB TLS connection failures.
Durable Storage
-
#16512 Improved handling of recoverable errors in durable sessions.
Durable sessions now retry the creation of durable storage iterators when that operation fails due to a network issue.
Previously, the whole session would get disconnected.
Fixed a problem with the retry mechanism in the `emqx_ds_client` component.
Previously, the number of retry attempts on recoverable errors was limited.
Fixed problems with shared subscriptions:
- Fixed a problem with the shared subscription leader not coming up after a node restart.
- The shared subscription leader no longer advertises streams that reached the end of replay to the clients.
- Made the shared subscription leader state checkpoint transaction options configurable.
-
#16614 Improvements and bug fixes related to durable storage feature.
-
Improved handling of configuration inconsistencies between the nodes.
Previously, when a durable storage was created in a cluster where
nodes had different initial durable storage configuration, the
replicas would not converge. This change addresses this problem by
replicating the configuration of the shard leader node to the
replicas during initialization of the storage and subsequent
configuration changes.
Warning: this change is not backward-compatible. During a rolling
cluster upgrade the shards will pause until the majority of their
replicas are upgraded to the new version of EMQX, after which
downgrade to the previous versions of EMQX will become impossible.
-
Fixed an issue in the durable storage subscription mechanism.
Previously, a durable subscription created with a fresh iterator
could miss a stored message whose timestamp precisely matched
the timestamp of the iterator.
-
-
#16770 Improve stability of durable sessions during takeover and garbage collection.
Clustering
-
#16393 Improved the stability of the Cluster Link route replication under unstable network conditions.
-
#16465 Upgraded `gen_rpc` to `3.5.1`.
Prior to the `gen_rpc` upgrade, EMQX could experience a long tail of crash logs due to connect timeouts if a peer node was unreachable.
The new version of `gen_rpc` no longer has the long tail and converts crash logs to more readable `error` logs.
The frequent `"failed_to_connect_server"` log is also throttled to avoid spamming.
-
#16544 Improve robustness of cluster autoclean procedure.
Previously, if autoclean feature was disabled during initial start of the node, it would never activate after configuration change.
This fix resolves this issue. -
#16739 Improved recovery time of a cluster after a simultaneous restart of all nodes.
The built-in Mria database management system no longer waits for a full sync of an internal table used to generate transaction synchronization events.
Observability
-
#16537 Fixed a formatter crash when logging `gen_rpc` errors.
Prior to this fix, EMQX could report "FORMATTER CRASH" errors when `gen_rpc` logged certain error messages (e.g., transmission timeout errors).
The formatter now handles these error messages correctly without crashing.
-
#16661 Improved `topic_metrics` and `cluster_rpc` logging when an invalid topic is requested.
-
#16674 Ensure Erlang pid is printed as a log data field.
-
#16699 Previously, under certain race conditions, long and cryptic logs like the following could be printed:
```
2026-02-03T13:53:54.576326+00:00 [error] Generic server <0.11323236.0> terminating. Reason: {{badkey,'actions.success'},[{erlang,map_get,['actions.success',#{}],[{error_info,#{module => erl_erts_errors}}]},{emqx_metrics_worker,idx_metric,4,[{file,"emqx_metrics_worker.erl"},{line,683}]},{emqx_metrics_worker,inc,4,[{file,"emqx_metrics_worker.erl"},{line,322}]},{emqx_rule_runtime,do_eval_action_reply_t...
```
Now, more meaningful information is printed to help debug the issue.
Security
-
#16545 Fixed `node.cookie` handling of the `#` character. Previously, if the cookie contained `#`, only the prefix before `#` would take effect.
For example, if `abc#d` was configured, only `abc` was used as the cookie.
Added validation to reject problematic characters: backslash, single quote, double quote, and space.
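A pre-deployment check for cookie values could look like the following Python sketch. It encodes only the four rejected characters listed above; the function names are hypothetical, and the real validation inside EMQX may report errors differently.

```python
# Characters this entry says are rejected, mapped to readable names.
FORBIDDEN_COOKIE_CHARS = {
    "\\": "backslash",
    "'": "single quote",
    '"': "double quote",
    " ": "space",
}


def invalid_cookie_chars(cookie):
    """Return the sorted names of forbidden characters found in the cookie."""
    found = {FORBIDDEN_COOKIE_CHARS[ch] for ch in cookie if ch in FORBIDDEN_COOKIE_CHARS}
    return sorted(found)


def is_valid_cookie(cookie):
    """A cookie passes this sketch's check when no forbidden character appears."""
    return not invalid_cookie_chars(cookie)


print(is_valid_cookie("abc#d"))         # True: `#` is allowed after the fix
print(invalid_cookie_chars('a b"c'))    # ['double quote', 'space']
```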
-
#16664 Previously, it was possible to upload managed certificate files associated with non-existent managed namespaces. Now, namespace existence is checked before accepting the upload.
-
#16692 Fixed a CRL cache regression where `emqx_crl_cache:evict/1` did not fully clear internal URL state.
After eviction, the same CRL URL now re-registers correctly on next use, restores its refresh timer, and avoids repeated HTTP fetches per connection.
Plugin
-
#16784 Reduced noisy plugin startup warnings in single-node deployments.
EMQX no longer tries to fetch plugin config from the local node during cluster config sync, avoiding repeated `config_not_found_on_node` warnings at startup.
-
#16823 Fixed a Dashboard plugin management issue for preinstalled plugins.
When a plugin package is unpacked into `plugins/` before node startup, starting it from the Dashboard no longer causes `Plugin Config Not Found` on the plugin config page.
Miscellaneous
- #16620 Fix CRC32C dynamic library load issue on aarch64.
- TLS 1.3 stateless session ticket resumption with optional certificates
- SAML IdP signature verification configuration options
- IoTDB Table Model data integration support
Full changelog
Enhancements
Deployment
- #16491 Start releasing packages for macOS 15 (Sequoia)
Observability
-
#16135 Added two new metrics and corresponding rates for the `GET /monitor_current` HTTP API: `rules_matched` and `actions_executed`. They track the number of rules that matched and the action execution rate (i.e., success + failure), respectively.
-
#16324 Added support for end-to-end tracing of messages published via HTTP API.
Security
-
#16625 Added configuration options `idp_signs_envelopes` and `idp_signs_assertions` to the SAML SSO backend to control signature verification behavior.
Previously, SAML signature verification was not working correctly because the IdP certificate fingerprint was not being extracted from metadata and passed to esaml for verification.
Both options default to `false` for backwards compatibility with existing configurations. Users who want to enable signature verification should explicitly set these to `true` when their IdP is configured to sign SAML responses.
-
#16456 Added support for TLS 1.3 session ticket resumption.
EMQX now supports TLS 1.3 session resumption using stateless session tickets, allowing clients to resume TLS sessions without server-side session state storage.
Node-level configuration: `node.tls_stateless_tickets_seed` is the secret key seed for generating TLS 1.3 stateless session tickets.
Listener-level configuration: `listeners.ssl.<name>.ssl_options.session_tickets` enables TLS 1.3 session resumption using stateless session tickets.
Possible values are `disabled` (default), `stateless`, and `stateless_with_cert` (includes certificate information).
Session tickets are only generated when `node.tls_stateless_tickets_seed` is configured (non-empty) and `session_tickets` is enabled in listener SSL options.
If `session_tickets` is enabled but `node.tls_stateless_tickets_seed` is empty, session tickets will not be generated and an error log will be emitted when starting the listener.
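The interaction between the two settings can be summarized as a small decision function. This Python sketch only restates the rules in this entry; `session_ticket_plan` is a hypothetical helper, not an EMQX API, and the returned tuple shape is an illustrative choice.

```python
VALID_MODES = ("disabled", "stateless", "stateless_with_cert")


def session_ticket_plan(seed, session_tickets="disabled"):
    """Return (tickets_generated, log_level) for a listener.

    `seed` stands for node.tls_stateless_tickets_seed and `session_tickets`
    for the listener-level ssl_options.session_tickets value.
    """
    if session_tickets not in VALID_MODES:
        raise ValueError(f"unknown session_tickets value: {session_tickets!r}")
    enabled = session_tickets != "disabled"
    if enabled and not seed:
        # Enabled without a seed: no tickets, and an error log at listener start.
        return (False, "error")
    return (enabled, None)


print(session_ticket_plan("s3cr3t-seed", "stateless"))  # (True, None)
print(session_ticket_plan("", "stateless_with_cert"))   # (False, 'error')
```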
Gateway
-
#16220 Added the `jt808.frame.parse_unknown_message` option, enabling the JT808 gateway to transparently forward unknown messages.
-
#16596 Added support for JT/T 808 protocol 2019.
-
#16627 Add GBK character encoding support for JT/T 808 gateway.
The JT/T 808 protocol specifies GBK encoding for STRING type fields. A new `frame.string_encoding` configuration option is added:
- `utf8` (default): Pass through strings as-is (backward-compatible)
- `gbk`: Convert GBK-encoded strings from devices to UTF-8 for MQTT, and UTF-8 from MQTT to GBK for devices
This affects string fields including license plates, driver names, text messages, area names, and client parameters.
MQTT payloads always use UTF-8 encoding regardless of this setting.
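The effect of the two `frame.string_encoding` values can be reproduced with Python's built-in `gbk` codec. The helper names below are hypothetical and the sketch handles a single string field in isolation; it is meant to show the direction of each conversion, not the gateway's frame parsing.

```python
def device_to_mqtt(raw, string_encoding="utf8"):
    """Convert a STRING field received from a device into the MQTT-side bytes."""
    if string_encoding == "gbk":
        return raw.decode("gbk").encode("utf-8")
    return raw  # utf8 mode: pass through as-is


def mqtt_to_device(payload, string_encoding="utf8"):
    """Convert a UTF-8 string field from MQTT into the bytes sent to a device."""
    if string_encoding == "gbk":
        return payload.decode("utf-8").encode("gbk")
    return payload  # utf8 mode: pass through as-is


plate_from_device = "京A12345".encode("gbk")        # 8 bytes on the wire
as_mqtt = device_to_mqtt(plate_from_device, "gbk")  # 9 bytes of UTF-8
print(as_mqtt.decode("utf-8"))                      # 京A12345
```

A license plate such as the one above takes two bytes for the Chinese character in GBK and three in UTF-8, which is why passing GBK bytes through unchanged would produce invalid UTF-8 on the MQTT side.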
Data Integration
- #16511 Added support for the IoTDB Table Model in the data integration.
Bug Fixes
Core MQTT Functionalities
-
#16349 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.
-
#16514 Fixed a bug that caused WebSocket connections to crash when receiving broker messages larger than the client's advertised `Maximum-Packet-Size`.
Rule Engine
-
#16489 Fixed an issue where the following rule functions always returned `undefined`:
`msgid/0`, `qos/0`, `topic/0`, `topic/1`, `flags/0`, `flag/1`,
`clientid/0`, `username/0`, `peerhost/0`, `payload/0`, `payload/1`.
Note: This is a backward compatibility fix for EMQX v4. These functions are not documented in EMQX v5 and later. The encouraged usage is to directly reference fields from the rule evaluation context. For example, `SELECT clientid ...` instead of `SELECT clientid()`.
Data Integration
-
#16263 Previously, the Kafka consumer connector performed health checks by verifying partition leader connectivity for all partitions.
In a clustered deployment, each EMQX node is assigned only a subset of partitions, causing leader connections for unassigned partitions to remain idle.
Since Kafka closes idle connections after a timeout (10 minutes by default), this behavior could trigger false connectivity alarms.
The health check now verifies leader connectivity only for the partitions assigned to the current EMQX node, preventing unnecessary idle connections and false alarms.
-
-
#16336 Fixed a race condition which may cause timeout when testing connectivity or stopping a connector from the dashboard.
-
#16383 Previously, when using IoTDB Connector with its REST API driver, credentials would not be checked during health checks. Now, we send a no-op query during IoTDB connector health check. This enables early detection of misconfigured client credentials.
-
#16415 Upgraded Apache Pulsar client to 2.1.2.
When Pulsar producer action's `batch_size` is configured to `1`, the producer will now encode single messages instead of single-element batches.
This enables consumers to share load using Key Share strategy.
-
#16507 Previously, when an MQTT Source's Connector recovered after losing its connection, topics would not be re-subscribed and the Source would stop working until the Connector itself was restarted. Now, the Source will re-subscribe upon reconnect.
-
#16585 Fixed an issue with GreptimeDB TLS connection failures.
-
#16618 The Kafka request timeout is now automatically set to at least twice the metadata request timeout (with a minimum of 30 seconds),
reducing unnecessary reconnections and retries when metadata requests take longer than expected.
This is especially beneficial when metadata request timeout is configured to a small value. -
#16622 Fixed an issue where, if an Action used async query mode and its Connector was disconnected after more than one health check, its Fallback Actions could be triggered twice.
Clustering
-
#16269 Fixed an issue in the Cluster Link route replication protocol recovery sequence where re-bootstrapping was incorrectly skipped even though the remote side needed it.
-
#16317 Fixed an issue in Cluster Link garbage-collection logic that could accidentally remove live routes from the internal routing table in the process of cleaning up stale route replication state. This problem occurred only when multiple independent Cluster Links were set up, and some of these links went down for relatively long periods of time.
-
#16452 Upgraded `gen_rpc` to `3.5.1`.
Prior to the `gen_rpc` upgrade, EMQX could experience a long tail of crash logs due to connection timeouts if a peer node was unreachable.
The new version of `gen_rpc` no longer has the long tail and converts crash logs to more readable `error` logs.
The frequent `"failed_to_connect_server"` log is also throttled to avoid log spamming.
-
#16543 Improved robustness of cluster autoclean procedure.
Previously, if autoclean feature was disabled during initial start of the node, it would never activate after configuration change.
This fix resolves this issue.
Access Control
-
#16304 Fixed an issue where Multi-Factor Authentication (MFA) could not be enabled after upgrading EMQX from versions earlier than 5.3.0 due to incompatible login-user database records.
-
#16541 Fixed an issue where OIDC issuer URLs were automatically normalized with a trailing slash when saved to the configuration file, causing issuer mismatch errors when the OIDC provider's discovery document returned the issuer without a trailing slash.
Observability
-
#16418 Reduced the volume of logs generated when a resource exception occurs (`resource_exception`). These logs are now throttled, and some potentially large terms are redacted from them.
-
#16535 Fixed formatter crash when logging gen_rpc errors.
Prior to this fix, EMQX would crash with "FORMATTER CRASH" errors when gen_rpc logged certain error messages (e.g., transmission timeout errors). The formatter now handles these error messages correctly without crashing.
Gateway
-
#16609 Fixed JT/T 808 gateway parameter setting (0x8103) and query response (0x0104) message handling for CAN bus ID parameters (0x0110~0x01FF), which should use BYTE[8] data type with base64 encoding in JSON instead of string type.
-
#16606 Fixed CoAP Gateway working in connection mode over DTLS.
Breaking Changes
Deployment
- #16491 Stop releasing packages for macOS 13 (Ventura)
- base64_encode/decode with no_padding and urlsafe options
- json_value and jwt_value functions for nested value extraction
- Sparkplug B metric alias tracking in spb_decode
Full changelog
6.0.2
Release Date: 2026-01-16
Make sure to check the breaking changes and known issues before upgrading to 6.0.2.
Enhancements
Security
-
#16461 EMQX now supports TLS 1.3 session resumption using stateless session tickets, allowing clients to resume TLS connections without requiring server-side session state.
Configuration
-
Node-level: `node.tls_stateless_tickets_seed`
Secret key seed used to generate TLS 1.3 stateless session tickets.
-
Listener-level: `listeners.ssl.<name>.ssl_options.session_tickets`
Enables TLS 1.3 session resumption. Supported values:
- `disabled` (default)
- `stateless`
- `stateless_with_cert` (includes certificate information in the ticket)
Notes
- Session tickets are generated only when `node.tls_stateless_tickets_seed` is configured (non-empty) and `session_tickets` is enabled in listener SSL options.
- If `session_tickets` is enabled but `node.tls_stateless_tickets_seed` is empty, session tickets will not be generated and an error log will be emitted when starting the listener.
This PR also included a fix for the TLS 1.2 session resumption configuration. Previously, the `reuse_sessions` option for SSL listeners did not take effect, i.e. EMQX always tried to enable TLS 1.2 session resumption. It is now possible to turn it off. Please note that TLS 1.2 session resumption will be disabled by default starting with version 6.2.0.
Rule Engine
-
#16524 Enhanced base64 encoding and decoding functions in rule engine SQL with support for padding and URL-safe options.
The `base64_encode` and `base64_decode` functions now support optional parameters to control encoding behavior:
- `no_padding`: Encode or decode without padding characters (`=`). Useful when you need to remove padding from encoded strings or decode strings that do not have padding.
- `urlsafe`: Use URL-safe base64 encoding/decoding. Replaces `+` with `-` and `/` with `_`, making the encoded string safe to use in URLs without encoding.
These options can be used individually or combined in any order.
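To check what a rule should emit for a given payload, the two options can be mimicked with Python's standard `base64` module. The helpers `b64_encode` and `b64_decode` are hypothetical stand-ins written for this illustration; they follow the option semantics described above but are not the rule engine's implementation.

```python
import base64


def b64_encode(data, *options):
    """Encode bytes, honoring the 'urlsafe' and 'no_padding' options in any order."""
    if "urlsafe" in options:
        encoded = base64.urlsafe_b64encode(data)
    else:
        encoded = base64.b64encode(data)
    text = encoded.decode("ascii")
    return text.rstrip("=") if "no_padding" in options else text


def b64_decode(text, *options):
    """Decode a string, restoring padding first when 'no_padding' is given."""
    if "no_padding" in options:
        text += "=" * (-len(text) % 4)
    if "urlsafe" in options:
        return base64.urlsafe_b64decode(text)
    return base64.b64decode(text)


payload = b"\xfb\xff"
print(b64_encode(payload))                           # +/8=
print(b64_encode(payload, "urlsafe"))                # -_8=
print(b64_encode(payload, "no_padding", "urlsafe"))  # -_8
print(b64_decode("-_8", "urlsafe", "no_padding"))    # b'\xfb\xff'
```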
Examples in rule SQL:

Encode without padding:

```sql
SELECT base64_encode(payload, 'no_padding') as encoded FROM "t/#"
```

Encode with URL-safe characters:

```sql
SELECT base64_encode(payload, 'urlsafe') as encoded FROM "t/#"
```

Encode with both options (no padding and URL-safe):

```sql
SELECT base64_encode(payload, 'no_padding', 'urlsafe') as encoded FROM "t/#"
```

Decode URL-safe base64:

```sql
SELECT base64_decode(payload, 'urlsafe') as decoded FROM "t/#"
```

Decode unpadded URL-safe base64:

```sql
SELECT base64_decode(payload, 'urlsafe', 'no_padding') as decoded FROM "t/#"
```
-
#16533 Added two new variadic expression helper functions, `json_value` and `jwt_value`, for extracting values from JSON data and JWT tokens using dot-separated key paths.
- `json_value` extracts values from JSON binary strings by navigating nested objects with a dot-separated key path.
- `jwt_value` decodes the payload of a JWT and extracts claim values using the same dot-separated path syntax.
Examples:
- If `username` contains a JSON object, you can access a nested field with `json_value(username, 'shop.floor')`.
- If `password` contains a JWT with a customized claim, you can access a nested value with `jwt_value(password, 'client_attrs.unitid')`.
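The dot-separated path lookup can be sketched in Python as follows. These functions only share their names with the EMQX helpers for readability; they are illustrative, take a single path argument, return `None` for a missing key (an assumption, since the behavior on missing keys is not described here), and decode the JWT payload without verifying its signature.

```python
import base64
import json


def json_value(document, path):
    """Walk a JSON document along a dot-separated key path."""
    node = json.loads(document)
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # assumed behavior for a missing key
        node = node[key]
    return node


def jwt_value(token, path):
    """Decode the (unverified) JWT payload segment and look up a claim path."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore base64url padding
    return json_value(base64.urlsafe_b64decode(payload), path)


print(json_value('{"shop": {"floor": 3}}', "shop.floor"))  # 3
```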
-
#16539 Added support for tracking Sparkplug B metric aliases when using the `spb_decode` Rule Engine function.
After a device or Edge of Network (EoN) node publishes its `NBIRTH` or `DBIRTH` messages, EMQX records the alias-to-name mappings defined in those messages. When `spb_decode` is later applied to `NDATA` or `DDATA` messages from the same session, the original metric names are automatically restored and included in the decoded output.
Note: when executing fallback actions, the mapping is not available in the environment where they run. This means that, if a fallback action republishes the undecoded `DDATA`/`NDATA` payload to a Sparkplug B `DDATA`/`NDATA` topic, the metric `name` fields will not be populated by the alias mapping.
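The alias bookkeeping described above can be modeled with a small per-session tracker. This Python sketch is conceptual: the `AliasTracker` class is hypothetical, it works on already-decoded messages represented as dicts with `alias` and `name` metric fields, and it says nothing about how EMQX stores the mapping internally.

```python
class AliasTracker:
    """Remember alias-to-name mappings from BIRTH messages for one session."""

    def __init__(self):
        self._names = {}

    def record_birth(self, decoded_birth):
        """Store every alias/name pair found in an NBIRTH or DBIRTH message."""
        for metric in decoded_birth.get("metrics", []):
            if "alias" in metric and "name" in metric:
                self._names[metric["alias"]] = metric["name"]

    def restore_names(self, decoded_data):
        """Return a copy of an NDATA/DDATA message with metric names filled in."""
        restored = dict(decoded_data)
        restored["metrics"] = [self._restore(m) for m in decoded_data.get("metrics", [])]
        return restored

    def _restore(self, metric):
        if "name" in metric or metric.get("alias") not in self._names:
            return dict(metric)  # nothing to restore, or alias unknown
        return {**metric, "name": self._names[metric["alias"]]}


tracker = AliasTracker()
tracker.record_birth({"metrics": [{"name": "temp", "alias": 1, "int_value": 20}]})
print(tracker.restore_names({"metrics": [{"alias": 1, "int_value": 21}]}))
# {'metrics': [{'alias': 1, 'int_value': 21, 'name': 'temp'}]}
```

A fresh tracker with no recorded BIRTH message leaves metrics unchanged, which mirrors the fallback-action caveat in the note above.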
Durable Storage
-
#16136 Improved resource management and performance for durable storage.
Introduced a concept of a durable storage database group. Certain resources (such as memtable size and disk usage quota) can be shared between the group members.
Added the following new metrics (per DB group):
- `emqx_ds_disk_usage`: Total size of SST files
- `emqx_ds_write_buffer_memory_usage`: RocksDB memtable size
- `emqx_ds_total_trash_size`: Disk usage by trash SST files
Added the following group configurations:
- `durable_storage.db_groups.<group>.storage_quota`: Soft quota for the SST files size
- `durable_storage.db_groups.<group>.write_buffer_size`: Maximum memtable size
- `durable_storage.db_groups.<group>.rocksdb_nthreads_high` and `durable_storage.db_groups.<group>.rocksdb_nthreads_low`: Size of RocksDB thread pools.
Added a new alarm that is raised when the quota is exceeded: `db_storage_quota_exceeded:<DB>`. Please refer to the "Storage Quota" section of the documentation for more details.
Default session checkpoint interval has been changed to 15s.
-
#16286 Optimized the default durable storage settings to reduce CPU load. This PR disables subscriptions for DBs that don't use them.
Performance
- #16413 Improved subscription handling performance by reducing redundant monitoring of MQTT session processes.
Bug Fixes
Core MQTT Functionalities
-
#16354 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.
-
#16515 Fixed an issue where WebSocket connections could crash when the broker sent messages exceeding the client-advertised `Maximum-Packet-Size`.
-
#16569 Fixed a rare race condition that could cause the supporting `emqx_flapping` process for flapping detection to crash under high system load.
Data Integration
-
#16265 The health check now verifies leader connectivity only for the partitions assigned to the current EMQX node, preventing unnecessary idle connections and false alarms.
Previously, the Kafka source connector checked leader connectivity for all partitions. In clustered deployments, each node owns only a subset of partitions, leaving connections to unassigned partition leaders idle. Because Kafka closes idle connections after a timeout (10 minutes by default), this could result in false connectivity alarms.
-
#16542 Fixed an issue where Kafka producer connections could disconnect prematurely when Kafka was overloaded, leading to excessive produce request retries.
The produce request timeout is now automatically set to at least twice the metadata request timeout, with a minimum of 30 seconds. This reduces unnecessary reconnections and retries when metadata requests take longer than expected, especially when the metadata request timeout is configured to a small value.
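The lower bound described above amounts to a one-line calculation. The following Python sketch shows the arithmetic only; the function name `min_request_timeout_ms` and the use of milliseconds are assumptions, and how this bound interacts with other settings is not covered by this entry.

```python
def min_request_timeout_ms(metadata_request_timeout_ms, floor_ms=30_000):
    """Lower bound for the produce request timeout, per the rule above:
    at least twice the metadata request timeout, and never below 30 seconds."""
    return max(2 * metadata_request_timeout_ms, floor_ms)


print(min_request_timeout_ms(5_000))   # 30000: the 30 s floor applies
print(min_request_timeout_ms(20_000))  # 40000: twice the metadata timeout
```

With a small metadata timeout such as 5 s, the 30 s floor is what prevents premature disconnects, which is the case the entry calls out as benefiting most.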
-
#16352 Upgraded Apache Pulsar client to 2.1.2. When Pulsar producer action's `batch_size` is configured to `1`, the producer will now encode single messages instead of single-element batches. This should allow consumers to share load using Key Share strategy.
-
#16383 Improved the IoTDB Connector health check when using the REST API driver.
Previously, client credentials were not validated during health checks. The health check now sends a lightweight no-op query, allowing misconfigured credentials to be detected early.
-
#16507 Fixed an issue where an MQTT Source would stop receiving messages after its Connector reconnected.
Previously, when an MQTT Source’s Connector recovered from a connection loss, its topics were not re-subscribed, causing the Source to stop working until the Connector was restarted. The Source now automatically re-subscribes upon reconnect.
Clustering
-
#16269 Fixed an issue in the Cluster Linking route replication protocol recovery sequence where re-bootstrapping was incorrectly skipped even though the remote side needed it.
-
#16317 Fixed an issue in Cluster Linking garbage-collection logic that could incorrectly remove active routes from the internal routing table while cleaning up stale route replication state.
This issue could occur only in setups with multiple independent Cluster Links, where some links remained down for extended periods.
-
#16465 Upgraded `gen_rpc` to `3.5.1`.
Before the `gen_rpc` upgrade, EMQX could experience a long tail of crash logs due to a connect timeout if a peer node was unreachable. The new version of `gen_rpc` no longer has the long tail and converts crash logs to more readable error logs. Additionally, the frequent `"failed_to_connect_server"` log is throttled to avoid spamming.
-
#16544 Improved the robustness of the cluster autoclean procedure. Previously, if the autoclean feature was disabled during the initial startup of a node, it would not be activated after subsequent configuration changes.
Upgrade
- #16308 Fixed an issue where Multi-Factor Authentication (MFA) could not be enabled after upgrading EMQX from versions earlier than 5.3.0 due to incompatible login-user database records.
Configuration Management
-
#16397 Added TLS certificate and key file validation before listener startup.
EMQX now performs basic validation when parsing SSL listener configuration and emits error-level logs if invalid PEM files are detected (for example, `invalid_pem_file_ignored` and `bad_keyfile_ignored`). This makes troubleshooting easier as administrators can observe errors when starting/reconfiguring, instead of troubleshooting TLS handshake failures.
Access Control
-
#16423 Added support for verifying the JWT `aud` (audience) claim during authentication.
When the `aud` claim is configured in `verify_claims`, the JWT must include a valid `aud` value. Both string and array formats are supported:
- If `aud` is a string, it must exactly match the configured value.
- If `aud` is an array, at least one element must match the configured value.
- An empty string or empty array fails verification.
- The verification also fails if the `aud` claim is missing when it is configured in `verify_claims`.
-
#16459 Fixed an issue in the SCRAM authentication HTTP API. Previously, an incorrect user ID was returned for the created user in the user creation API call.
Observability
-
#16417 Reduced log volume for `resource_exception` events. Logs generated when a resource exception occurs are now throttled, and potentially large terms are redacted to prevent excessive log output.
-
#16537 Fixed a formatter crash triggered by certain `gen_rpc` error messages.
Previously, EMQX could crash with a “FORMATTER CRASH” error when `gen_rpc` logged specific errors (such as transmission timeouts). The formatter now safely handles these messages without crashing.
Minor fixes and improvements.
Full changelog
Enhancements
-
#16491 Start releasing packages for macOS 15 (Sequoia)
-
#15944 Improved the information returned when a resource is marked as `disconnected` for the following Connectors: LDAP, Syskeeper, IoTDB, Snowflake (aggregated), JWKS Authentication.
-
#15911 Now, for the HTTP Action, the HTTP request timeout is taken to be the same as `resource_opts.request_ttl`. Previously, it was a fixed, non-configurable value of 30 seconds.
-
#15845 Extended the static_clientids configuration of MQTT Connector to allow specifying usernames and passwords associated with each clientid.
Bug Fixes
Core MQTT Functionalities
-
#16349 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.
-
#16081 Fixed an issue where clients using extended authentication mechanisms and memory sessions could crash with a session_stepdown_request_exception error and a calling_self reason. For example:
2025-09-24T07:13:08.973954+08:00 [error] clientid: someclientid, msg: session_stepdown_request_exception, peername: 127.0.0.1:41782, username: admin, error: exit, reason: calling_self, stacktrace: [{gen_server,call,3,[{file,"gen_server.erl"},{line,1222}]},{emqx_cm,request_stepdown,4,[{file,"emqx_cm.erl"},{line,427}]},{emqx_cm,do_takeover_begin,2,[{file,"emqx_cm.erl"},{line,398}]},{emqx_cm,takeover_session,2,[{file,"emqx_cm.erl"},{line,384}]},{emqx_cm,takeover_session_begin,2,[{file,"emqx_cm.erl"},{line,305}]},{emqx_session_mem,open,4,[{file,"emqx_session_mem.erl"},{line,210}]},{emqx_session,open,3,[{file,"emqx_session.erl"},{line,263}]},{emqx_cm,'-open_session/4-fun-1-',4,[{file,"emqx_cm.erl"},{line,290}]},{emqx_cm_locker,trans,2,[{file,"emqx_cm_locker.erl"},{line,32}]},{emqx_channel,post_process_connect,2,[{file,"emqx_channel.erl"},{line,575}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,852}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,470}]},{emqx_connection,process_msgs,2,[{file,"emqx_connection.erl"},{line,462}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,406}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}], action: {takeover,'begin'}, ... -
#15872 Eliminated the unclean_terminate warning log emitted when a client is disconnected after a CONNACK with a non-zero reason code is sent.
-
#15902 Upgraded the MQTT client library to 1.13.8.
This improves MQTT bridge connectivity:
- The Connector now automatically reconnects when the peer broker does not reply with PINGRESP.
- TLS bridge failures are handled more promptly if the connection breaks while waiting for CONNACK.
-
#15884 Resolved an issue where, in rare cases, the global routing table could indefinitely retain routing information for nodes that had long since left the cluster.
This also fixes a race condition that could cause accumulating inconsistencies in the routing table and shared subscription state when a large number of shared subscribers disconnect simultaneously.
Clustering
-
#16452 Upgraded gen_rpc to 3.5.1.
Prior to this upgrade, EMQX could emit a long tail of crash logs caused by connect timeouts when a peer node was unreachable.
The new gen_rpc version no longer produces this long tail and converts the crash logs into more readable error logs.
The frequent "failed_to_connect_server" log is also throttled to avoid spamming.
Cluster Linking
- #16317 Fixed an issue in Cluster Link garbage-collection logic that could accidentally remove live routes from the internal routing table in the process of cleaning up stale route replication state. This problem occurred only when multiple independent Cluster Links were set up, and some of these links went down for relatively long periods of time.
- #16269 Fixed an issue in the Cluster Link route replication protocol recovery sequence where re-bootstrapping was incorrectly skipped even though the remote side needed it.
Data Integration
-
#16415 Upgraded the Apache Pulsar client to 2.1.2.
When the Pulsar producer action's batch_size is configured to 1, the producer now encodes single messages instead of single-element batches.
This should allow consumers to share load using the Key Share strategy.
-
#16383 When using the IoTDB Connector with the REST API driver, credentials were previously not validated during health checks. Health checks now issue a no-op query to IoTDB, ensuring that invalid or misconfigured client credentials are detected early.
-
#16336 Fixed a race condition that could cause a timeout when testing connectivity or stopping a connector from the dashboard.
-
#16263 The health check now verifies leader connectivity only for the partitions assigned to the current EMQX node, preventing unnecessary idle connections and false alarms.
Previously, the Kafka source connector checked leader connectivity for all partitions. In clustered deployments, each node owns only a subset of partitions, leaving connections to unassigned partition leaders idle. Because Kafka closes idle connections after a timeout (10 minutes by default), this could result in false connectivity alarms.
-
#16138 Fixed a Redis cluster failover issue. With this fix, failed PING responses now trigger a cluster topology refresh, ensuring that connector management promptly recovers and updates its view of the Redis cluster after failovers.
Previously, EMQX’s Redis cluster client only refreshed the cluster topology when regular queries (e.g., GET) failed. However, periodic PING commands did not trigger a refresh when they failed.
This could cause the connector to remain in a “connecting” state and keep using outdated topology information if no new queries were made after a failover.
-
#16043 Fixed log details for Kafka data integration when "not_all_kafka_partitions_connected" happened.
-
#15906 Upgraded Kafka producer library Wolff from 4.0.12 to 4.0.13, which adds handling for the record_list_too_large error in ProduceResponse.
-
#15866 Upgraded the Kafka producer library Wolff to 4.0.12 to improve handling of temporarily missing partitions in Kafka metadata responses.
In rare race conditions, Kafka may return an incomplete partition list.
Previously, this was only handled when a topic was recreated with fewer partitions, but not when partitions were temporarily missing.
This gap could cause the partition producer to stall and block shutdown indefinitely. -
#15836 Enriched the returned information when a Kafka Consumer Source fails to be added, for example, due to denied topic ACLs.
-
#15826 Now, if the Kafka broker returns an ACL denied response, the connection is considered healthy. Previously, if the user used in a Kafka Consumer Connector did not have permissions to read the special ____emqx_consumer_probe group used for health checks, the health check would fail.
-
#15827 Fixed atom and process leaks in the GreptimeDB driver.
Fixed a function_clause error that could arise if certain incorrect write syntaxes were used in GreptimeDB Actions.
-
#15910 Fixed an issue with connectors where a pool of workers could fail to recover from a failure if multiple workers crashed simultaneously in large worker pools.
Connectors affected and fixed:
- MySQL
- PostgreSQL
- Oracle
- SQLServer
- TDEngine
- Cassandra
- Dynamo
- HTTP
- Couchbase
- GCP PubSub
- Snowflake
Upgraded gun and related dependencies to 2.1.0.
Security and Authentication
-
#16237 Fixed an issue where logs related to OIDC SSO could still be emitted after OIDC SSO was disabled.
-
#16217 Fixed an issue where OIDC callback could fail to find the session during login in a multi-node cluster.
-
#15844 Added validation to forbid adding empty usernames to the built-in database authenticator. Such users cannot be deleted via the HTTP API later, since they mess up the API path.
If you have such a user and wish to delete it, run the following in an EMQX console:
mria:transaction(emqx_authn_shard, fun() -> mnesia:delete(emqx_authn_mnesia, {'mqtt:global',<<>>}, write) end). -
#15818 Corrected handling of {allow|deny, all} ACL rules.
Previously, these rules were internally translated to match #, which incorrectly failed to match topics prefixed with $ (e.g. $testtopic/1) due to MQTT spec restrictions.
Now, a special internal value is used to ensure {allow|deny, all} rules correctly match any topic, including $-prefixed ones.
-
#15899 Improved memory usage: authorization (authz) cache is now cleared immediately when a client disconnects, reducing unnecessary memory consumption.
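The {allow|deny, all} fix (#15818) above hinges on an MQTT rule: a topic filter that starts with a wildcard must not match topic names beginning with $. The Python sketch below illustrates why translating "all" to # missed such topics. The names topic_matches, rule_matches, and MATCH_ALL are hypothetical stand-ins for illustration; EMQX's actual internal value is not shown in these notes.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT topic matching (illustrative, not EMQX's matcher)."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    # MQTT rule: filters starting with a wildcard never match $-prefixed topics.
    if topic.startswith("$") and filter_levels[0] in ("#", "+"):
        return False
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard matches everything that remains
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Hypothetical sentinel standing in for the "special internal value".
MATCH_ALL = object()


def rule_matches(rule_topic, topic: str) -> bool:
    """An 'all' rule bypasses wildcard matching and accepts any topic."""
    if rule_topic is MATCH_ALL:
        return True
    return topic_matches(rule_topic, topic)
```

With this model, rule_matches("#", "$testtopic/1") is false, which mirrors the old behavior, while rule_matches(MATCH_ALL, "$testtopic/1") is true.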
Rule Engine
-
#16028 Fixed a memory leak in the rule engine jq function.
Previously, using the jq built-in function index (e.g. .key | index("name")) would result in a memory leak.
Observability
- #15967 Prevented rapid memory growth caused by Mnesia transaction blocking when cleaning up large volumes of audit logs.
- #15963 Reduced excessive audit log generation triggered by operations from the remote console.
- #15863 Fixed license quota alarm text.
Durable Storage
- #14674 Limited the number and size of RocksDB info log files created by EMQX durable storage.
Breaking Changes
- macOS 13 (Ventura) packages no longer released
- macOS 15 (Sequoia) package support added
- HTTP Action timeout configurable via resource_opts.request_ttl
- MQTT Connector static_clientids supports per-client credentials
Full changelog
Enhancements
-
#16491 Start releasing packages for macOS 15 (Sequoia)
-
#15911 Now, for the HTTP Action, the HTTP request timeout is taken to be the same as resource_opts.request_ttl. Previously, it was a fixed, non-configurable value of 30 seconds.
-
#15845 Extended the static_clientids configuration of MQTT Connector to allow specifying usernames and passwords associated with each clientid.
Bug Fixes
Core MQTT Functionalities
-
#16349 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.
-
#16081 Fixed an issue where clients using extended authentication mechanisms and memory sessions could crash with a session_stepdown_request_exception error and a calling_self reason. For example:
2025-09-24T07:13:08.973954+08:00 [error] clientid: someclientid, msg: session_stepdown_request_exception, peername: 127.0.0.1:41782, username: admin, error: exit, reason: calling_self, stacktrace: [{gen_server,call,3,[{file,"gen_server.erl"},{line,1222}]},{emqx_cm,request_stepdown,4,[{file,"emqx_cm.erl"},{line,427}]},{emqx_cm,do_takeover_begin,2,[{file,"emqx_cm.erl"},{line,398}]},{emqx_cm,takeover_session,2,[{file,"emqx_cm.erl"},{line,384}]},{emqx_cm,takeover_session_begin,2,[{file,"emqx_cm.erl"},{line,305}]},{emqx_session_mem,open,4,[{file,"emqx_session_mem.erl"},{line,210}]},{emqx_session,open,3,[{file,"emqx_session.erl"},{line,263}]},{emqx_cm,'-open_session/4-fun-1-',4,[{file,"emqx_cm.erl"},{line,290}]},{emqx_cm_locker,trans,2,[{file,"emqx_cm_locker.erl"},{line,32}]},{emqx_channel,post_process_connect,2,[{file,"emqx_channel.erl"},{line,575}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,852}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,470}]},{emqx_connection,process_msgs,2,[{file,"emqx_connection.erl"},{line,462}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,406}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}], action: {takeover,'begin'}, ... -
#15872 Eliminated the unclean_terminate warning log emitted when a client is disconnected after a CONNACK with a non-zero reason code is sent.
-
#15902 Upgraded the MQTT client library to 1.13.8.
This improves MQTT bridge connectivity:
- The Connector now automatically reconnects when the peer broker does not reply with PINGRESP.
- TLS bridge failures are handled more promptly if the connection breaks while waiting for CONNACK.
-
#15884 Resolved an issue where, in rare cases, the global routing table could indefinitely retain routing information for nodes that had long since left the cluster.
This also fixes a race condition that could cause accumulating inconsistencies in the routing table and shared subscription state when a large number of shared subscribers disconnect simultaneously.
Clustering
-
#16452 Upgraded gen_rpc to 3.5.1.
Prior to this upgrade, EMQX could emit a long tail of crash logs caused by connect timeouts when a peer node was unreachable.
The new gen_rpc version no longer produces this long tail and converts the crash logs into more readable error logs.
The frequent "failed_to_connect_server" log is also throttled to avoid spamming.
Security and Authentication
-
#15844 Added validation to forbid adding empty usernames to the built-in database authenticator. Such users cannot be deleted via the HTTP API later, since they mess up the API path.
If you have such a user and wish to delete it, run the following in an EMQX console:
mria:transaction(emqx_authn_shard, fun() -> mnesia:delete(emqx_authn_mnesia, {'mqtt:global',<<>>}, write) end). -
#15818 Corrected handling of {allow|deny, all} ACL rules.
Previously, these rules were internally translated to match #, which incorrectly failed to match topics prefixed with $ (e.g. $testtopic/1) due to MQTT spec restrictions.
Now, a special internal value is used to ensure {allow|deny, all} rules correctly match any topic, including $-prefixed ones.
-
#15899 Improved memory usage: authorization (authz) cache is now cleared immediately when a client disconnects, reducing unnecessary memory consumption.
Rule Engine
-
#16028 Fixed a memory leak in the rule engine jq function.
Previously, using the jq built-in function index (e.g. .key | index("name")) would result in a memory leak.
Durable Storage
- #14674 Limited the number and size of RocksDB info log files created by EMQX durable storage.
Breaking Changes
- #16491 Stop releasing packages for macOS 13 (Ventura)
- MQTT Streams with ordering guarantees and multiple consumption
- Namespaced metrics for messages, sessions, and integrations
- OAuth authentication for Kafka producers
Full changelog
Feature Highlights
EMQX 6.1.0 introduces MQTT Streams, enhanced namespace capabilities, new data integrations, and centralized certificate management.
MQTT Streams
MQTT Streams provide durable collections of messages identified by a topic filter, with explicit lifecycle management. Messages matching a stream's topic filter are automatically appended, enabling consumption with ordering guarantees and support for multiple consumers. Clients can subscribe to streams using the special topic format $s/<timestamp>/topic/filter to consume messages from a specific point in time.
Enhanced Namespace Capabilities
- Configurations for namespace and isolation settings are now grouped together in the dashboard.
- Expanded namespace functionality with namespaced metrics, authentication, and authorization.
- Namespaced metrics are now available for messages, sessions, and data integration operations, exposed via Prometheus endpoints.
- Built-in authentication and authorization backends now support namespace-specific users and rules, enabling better multi-tenant isolation.
- Added automatic topic isolation using client namespaces as mountpoints.
New Data Integrations
- AWS Timestream for InfluxDB connector
- EMQX Tables connector
- InfluxDB API v3 support for InfluxDB and AWS Timestream connectors
- OAuth authentication for Kafka and Confluent Producer connectors
- Parquet file support for Azure Blob Storage and S3 Actions in Aggregated mode
Certificate Management
Added centralized certificate management via HTTP API, allowing certificates to be managed independently and referenced in SSL options for listeners and connectors.
Enhancements
Message Queue and Streams
-
#16326 Implemented MQTT Streams.
MQTT Streams are durable collections of messages identified by a topic filter.
They have an explicit lifecycle, and any published message that matches the Stream's topic filter is automatically appended to the stream.
Streams allow consumption of messages with ordering guarantees and can be consumed multiple times.
To consume messages from a stream, clients can subscribe to a special topic of the form $s/<timestamp>/topic/filter, where topic/filter refers to an existing Stream. Subscribing with a timestamp allows consumption to begin at a specific point in time. The timestamp may be a Unix timestamp in microseconds or one of two special values: earliest or latest.
-
#16454 For Message Queues and Streams, a reconfigured garbage collection interval is now applied immediately. Previously, the new interval was applied only after the next garbage collection cycle.
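As a rough illustration of the stream subscription topic format $s/<timestamp>/topic/filter described for #16326, the following Python sketch builds such a topic string. The helper names stream_subscription_topic and unix_micros are hypothetical and exist only for this example; the sketch does not talk to a broker.

```python
from typing import Union


def unix_micros(seconds: float) -> int:
    """Convert a Unix timestamp in seconds to microseconds."""
    return int(seconds * 1_000_000)


def stream_subscription_topic(stream_filter: str,
                              start: Union[int, str] = "earliest") -> str:
    """Build a $s/<timestamp>/<topic filter> subscription topic (hypothetical helper)."""
    if isinstance(start, str):
        # Only the two special values named in the release notes are accepted.
        if start not in ("earliest", "latest"):
            raise ValueError("start must be 'earliest', 'latest', or microseconds")
    elif isinstance(start, bool) or not isinstance(start, int) or start < 0:
        raise ValueError("start must be a non-negative integer in microseconds")
    return f"$s/{start}/{stream_filter}"
```

A client that wants to replay a stream with the topic filter sensors/# from a given moment would subscribe to the string returned by stream_subscription_topic("sensors/#", unix_micros(1700000000)).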
Core MQTT Functionalities
- #16099 Added a new rule engine event: $events/client/ping. This is triggered when a client sends a PINGREQ packet.
Access Control
-
#16132 Added an HTTP API to manage certificates in a centralized manner.
-
#16154 Added support for referencing managed certificate files in SSL options of listeners and clients.
-
#16266 Added a new authorization.include_mountpoint configuration. When enabled, topics will be prefixed by the listener's mountpoint before being evaluated by authorization backends.
-
#16272 Added support for specifying namespaced rules when using the built-in authorization backend. Now, MQTT clients that belong to a namespace will consider only their namespaced rules when authorizing actions.
-
#16345 Added support for specifying namespaced users when using the built-in authentication backend. Now, MQTT clients that belong to a namespace will consider only their namespaced data when authenticating.
Data Integration
-
#15905 Now, for the HTTP Action, the HTTP request timeout is taken to be the same as resource_opts.request_ttl. Previously, it was a fixed, non-configurable value of 30 seconds.
-
#16169 Updated our parquer dependency to support encoding timestamp Iceberg types to Parquet files.
-
#16179 Added support for writing Parquet files when using the Aggregated mode in Azure Blob Storage and S3 Actions.
-
#16267 Added a new Connector and Action that appends data to AWS Timestream for InfluxDB.
-
#16290 Added support for OAuth authentication when using Kafka and Confluent Producer Connectors.
-
#16316 Changed the default batch size and time for multiple actions. Actions that previously supported batch operations had their defaults increased, so that now batching is the default behavior for them.
-
#16372 Added support for InfluxDB API v3 to InfluxDB and AWS Timestream Connectors.
-
#16396 Added a new Connector and Action that appends data to EMQX Tables.
Durable Storage
-
#16136 Improved resource management and performance for durable storage.
Introduced a concept of durable storage database group. Certain resources (such as memtable size and disk usage quota) can be shared between the group members.
Added the following new metrics (per DB group):
- emqx_ds_disk_usage: Total size of SST files
- emqx_ds_write_buffer_memory_usage: RocksDB memtable size
- emqx_ds_total_trash_size: Disk usage by trash SST files
Added the following group configurations:
- durable_storage.db_groups.<group>.storage_quota: Soft quota for the SST files size
- durable_storage.db_groups.<group>.write_buffer_size: Maximum memtable size
- durable_storage.db_groups.<group>.rocksdb_nthreads_high and durable_storage.db_groups.<group>.rocksdb_nthreads_low: Size of RocksDB thread pools
Added a new alarm that is raised when the quota is exceeded: db_storage_quota_exceeded:<DB>. Please refer to the "Storage Quota" section of the documentation for more details.
The default session checkpoint interval has been changed to 15s.
-
#16286 Optimized the default durable storage settings to reduce CPU load. This PR disables subscriptions for DBs that don't use them.
Namespace
-
#16211 Added initial support for namespaced metrics.
- Messages received
  - Count
  - Bytes
- Messages sent
  - Count
  - Bytes
- Number of sessions
- Data integration
  - Number of actions triggered
- DB records
  - Number of AuthN records
  - Number of AuthZ records
Clients in managed namespaces will bump the namespaced metrics above, as well as continue to bump the usual global metrics.
These metrics are exposed in Prometheus format to be scraped from the GET /prometheus/ns/stats endpoint. By specifying the ns=NAMESPACE query parameter, only data from NAMESPACE will be returned. Omitting this parameter causes data from all namespaces to be scraped. Namespaces are added as labels to metrics.
-
#16314 Now, global admin users will see resources from all namespaces (by default) when listing namespaced resources (connectors/sources/actions/rules). They may focus on one particular namespace when performing CRUD operations by passing the ns=NS query parameter. To list only the global namespace resources, they omit ns and pass the only_global=true query parameter. Namespaced resources now return the namespace field to denote where they come from, with namespace being null for global resources to distinguish them from a potential namespace called "global".
-
#16360 Added a GET /mt/ns/:ns/metrics endpoint that returns namespace-specific metrics in JSON format.
-
#16472 Added a new configuration option namespace_as_mountpoint to enable automatic topic isolation using client namespaces.
When enabled, EMQX uses the client's namespace (from client_attrs.tns) as a topic mountpoint if no mountpoint is configured on the listener.
Topics are automatically prefixed with the namespace for PUBLISH, SUBSCRIBE, UNSUBSCRIBE, and Will messages, and the prefix is stripped when delivering messages to clients.
This setting is ignored if the listener already has a mountpoint configured, ensuring existing configurations take precedence.
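The precedence and prefix/strip behavior described for namespace_as_mountpoint can be modeled with a small Python sketch. This is a simplified model rather than EMQX code: the function names are hypothetical, and the "/" separator appended to the namespace is an assumption, since the notes do not state the exact prefix format.

```python
from typing import Optional


def effective_mountpoint(listener_mountpoint: Optional[str],
                         namespace: Optional[str],
                         namespace_as_mountpoint: bool) -> Optional[str]:
    """Pick the mountpoint: a listener mountpoint always takes precedence."""
    if listener_mountpoint:
        return listener_mountpoint
    if namespace_as_mountpoint and namespace:
        return namespace + "/"  # assumption: namespace plus "/" as the prefix
    return None


def mount(mountpoint: Optional[str], topic: str) -> str:
    """Prefix a topic on PUBLISH/SUBSCRIBE/UNSUBSCRIBE/Will handling."""
    return topic if not mountpoint else mountpoint + topic


def unmount(mountpoint: Optional[str], topic: str) -> str:
    """Strip the prefix again before delivering a message to the client."""
    if mountpoint and topic.startswith(mountpoint):
        return topic[len(mountpoint):]
    return topic
```

In this model, a client in namespace tenant_a publishing to sensors/1 has its topic stored under the prefixed name and sees the original sensors/1 again on delivery, so tenants cannot observe each other's topics.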
Observability
-
#16135 Added two new metrics and corresponding rates for the GET /monitor_current HTTP API: rules_matched and actions_executed. They track the number of rules that matched and the action execution rate (i.e., success + failure), respectively.
-
#16213 Added MQTT client ID as a process label so crash logs (including max-heap and force-shutdown errors) now include the client ID for easier troubleshooting.
Performance
-
#16368 Upgraded the underlying runtime system from Erlang/OTP 27 to Erlang/OTP 28.
-
#16377 Reduced the number of pre-allocated metrics counters, which should contribute to reduced memory usage, especially in clusters using lots of namespaces.
MQTT over QUIC
-
#16133 MQTT over QUIC: Added support for connection probing using datagrams.
EMQX now supports zero-length datagram packets sent by clients to test connectivity. Clients can also send non-zero-length datagram packets, but they will be ignored by EMQX.
Bug Fixes
Core MQTT Functionalities
- #16344 Fixed a crash in MQTT v5 connections caused by a type mismatch when processing the request-response-information property.
Access Control
-
#16308 Fixed an issue where Multi-Factor Authentication (MFA) could not be enabled after upgrading EMQX from versions earlier than 5.3.0 due to incompatible login-user database records.
-
#16446 Fixed an issue with authenticator metrics when using SCRAM in which the 'Total' count would be incremented twice for each authentication attempt, and the 'Success' count would not be bumped.
Data Integration
-
#16265 The health check now verifies leader connectivity only for the partitions assigned to the current EMQX node, preventing unnecessary idle connections and false alarms.
Previously, the Kafka source connector checked leader connectivity for all partitions. In clustered deployments, each node owns only a subset of partitions, leaving connections to unassigned partition leaders idle. Because Kafka closes idle connections after a timeout (10 minutes by default), this could result in false connectivity alarms.
-
#16352 Upgraded the Apache Pulsar client to 2.1.2. When the Pulsar producer action's batch_size is configured to 1, the producer now encodes single messages instead of single-element batches. This should allow consumers to share load using the Key Share strategy.
-
#16383 Previously, when using IoTDB Connector with its RestAPI driver, credentials would not be checked during health checks. Now, we send a no-op query during IoTDB connector health-check. This enables early detection of misconfigured client credentials.
Message Queue
- #16270 Fixed a shutdown handling issue in the EMQX message queue consumer.
Clustering
-
#16453 Upgraded gen_rpc to 3.5.1.
Prior to this upgrade, EMQX could emit a long tail of crash logs caused by connect timeouts when a peer node was unreachable. The new gen_rpc version no longer produces this long tail and converts the crash logs into more readable error logs.
The frequent "failed_to_connect_server" log is also throttled to avoid spamming.
Cluster Linking
-
#16269 Fixed an issue in the Cluster Link route replication protocol recovery sequence where re-bootstrapping was incorrectly skipped even though the remote side needed it.
-
#16317 Fixed an issue in Cluster Link garbage-collection logic that could accidentally remove live routes from the internal routing table in the process of cleaning up stale route replication state. This problem occurred only when multiple independent Cluster Links were set up, and some of these links went down for relatively long periods of time.
Observability
-
#16417 Reduced the volume of logs generated when a resource exception occurs (resource_exception). These logs are now throttled, and some potentially large terms are redacted from them.
-
#16434 Previously, using the HTTP API to force deactivate an alarm would not clear it from all nodes. Now, clearing an alarm name will clear it from all nodes.
Gateway
- #16425 Improved the returned errors when creating or updating a Gateway via the HTTP API.
Miscellaneous
-
#16397 Added TLS certificate validation before listener start. Fail-fast if listener is misconfigured with invalid certificates.
-
#16311 Updated error codes to correct terminology from misspelled 'REST_FAILED' to 'RESET_FAILED'.
Breaking Changes
-
#16368 The internal regular expression engine has been upgraded to PCRE2, providing improved matching performance and stricter syntax enforcement.
If you use the regex_match, regex_replace, or regex_extract functions in Rule Engine SQL, some existing regular expressions that relied on lenient or undefined behavior may no longer compile or match as expected.
Key changes to be aware of include:
- Stricter escaping rules: Invalid or unnecessary escape sequences that were previously ignored are now treated as errors.
  - Broken: [\w-\.], escaping . inside a character class is unnecessary and no longer accepted; only metacharacters require escaping.
  - Broken: \x without valid hexadecimal digits (for example, \xGG) now causes a compilation error instead of being interpreted as a literal x.
- Stricter group name validation: Regular expressions with duplicate or empty named capture groups are no longer permitted.
Action required: Review and validate all Rule Engine SQL definitions that use regular expressions. For complex patterns, verify compatibility with a PCRE2-compliant tester (most online regex tools support PCRE2) or test thoroughly in a staging environment before upgrading.
- Fixed TLS connection race condition during certificate renewal
- Fixed OIDC SSO login in multi-node clusters
- S3 IAM role support without manual credentials
- Kafka API support expansion
- HTTP request timeout configuration
Full changelog
Enhancements
Core MQTT Functionalities
- #15773 Throttled client ID registration during reconnects.
- When a previous session cleanup is still in progress, new connections using the same client ID are now throttled. This prevents instability when clients reconnect aggressively.
- Affected clients receive reason code 137 (Server Busy) in the CONNACK with Reason-String "THROTTLED", and should retry after the cleanup completes.
- Fixed the reason code returned when another connection registers the same client ID; now correctly returns 137 instead of 133.
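On the client side, the retry advice for #15773 could look roughly like the Python sketch below. It is library-agnostic: connect is a hypothetical callable that performs one connection attempt and returns the CONNACK reason code, and the exponential backoff schedule is an assumption chosen for illustration, not a recommendation from the release notes.

```python
import time
from typing import Callable

SERVER_BUSY = 137  # MQTT v5 CONNACK reason code 0x89 (Server Busy)


def connect_with_retry(connect: Callable[[], int],
                       max_attempts: int = 5,
                       base_delay: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> int:
    """Retry a connect attempt while the broker answers Server Busy."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    reason_code = SERVER_BUSY
    for attempt in range(1, max_attempts + 1):
        reason_code = connect()
        # Stop on success, on any other failure, or when attempts run out.
        if reason_code != SERVER_BUSY or attempt == max_attempts:
            break
        sleep(delay)
        delay *= 2  # back off so the session cleanup has time to finish
    return reason_code
```

Backing off between attempts matters here because aggressive reconnects with the same client ID are exactly the pattern the throttling is meant to dampen.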
Data Integration
-
#15542 Upgraded our erlcloud library to 3.8.3.0. This allows one to set up an S3 Connector without specifying Access Key Id and Secret Access Key, so long as the EC2 instance EMQX is running in has the correct IAM permissions to read/write to the configured bucket(s).
-
#15585 Updated the brod client to version 4.4.4, expanding support for a wider range of Kafka APIs. This update addresses the deprecation of JoinGroups API versions v0-v1.
-
#15845 The static_clientids configuration for the MQTT Connector now supports specifying a username and password for each client ID. This is particularly useful for scenarios like connecting to Azure IoT Hub, where each device (client ID) requires a unique set of credentials. This enhancement helps ensure successful connections across multiple nodes in a clustered environment.
-
#15911 The HTTP request timeout for the HTTP Action is now configurable via the resource_opts.request_ttl setting. Previously, this timeout was fixed at 30 seconds and could not be adjusted.
Observability
-
#15499 Added a force deactivate alarm API endpoint to allow administrators to forcibly deactivate active alarms.
-
#15944 Improved the information returned when a resource is marked as disconnected for the following Connectors: LDAP, Syskeeper, IoTDB, Snowflake (aggregated), JWKS Authentication.
Performance
-
#15536 Disabled the node.global_gc_interval configuration by default.
-
#15539 Optimized Erlang VM parameters to improve performance and stability:
- Increased buffer size for distributed channels to 32 MB (+zdbbl 32768) to prevent busy_dist_port alarms during intensive Mnesia operations.
- Disabled scheduler busy-waiting (+sbwt none +sbwtdcpu none +sbwtdio none) to lower CPU usage reported by the operating system.
- Set scheduler binding type to db (+stbt db) to reduce message latency.
-
#15907 Improve system memory usage.
- Authorization (authz) cache is now cleared immediately when a client disconnects, reducing unnecessary memory consumption.
- Fields such as client ID, username, password, and topic are copied into new binaries (when more than 64 bytes) instead of being slices from the raw packet to reduce 'binary' part of memory usage in Erlang VM.
-
#15949 Changed the default value of the parse_unit option in listener configuration from chunk to frame. This change can significantly reduce CPU usage when the payload size exceeds the socket buffer (default is 4 KB).
Note: With parse_unit = frame, if a PUBLISH packet exceeds the maximum allowed size, EMQX will close the connection instead of sending a DISCONNECT packet.
-
#16165 Optimized the performance of the GET /clients_v2 API. Previously, when the cluster had around 50,000 clients or more, API calls to retrieve the client list could be extremely slow or even time out.
Bug Fixes
Core MQTT Functionalities
-
#15884 Resolved an issue where, in rare cases, the global routing table could indefinitely retain routing information for nodes that had long left the cluster.
-
#15518 Resolved a race condition that may lead to accumulating inconsistencies in the routing table and shared subscriptions state in the cluster when a large number of shared subscribers disconnect simultaneously.
-
#15872 Eliminated the unclean_terminate warning log emitted when a client is disconnected after a CONNACK with a non-zero reason code is sent.
Deployment
-
#15553 Fixed an issue in the Helm chart where deploying EMQX with default values started multiple replicas and caused all nodes except one to crash. The chart now defaults to a single replica, since clustered deployments require a Commercial License.
-
-
#15580 Added a new emqxLicenseSecretRef variable to the EMQX Enterprise Helm chart. This allows users to specify a Kubernetes Secret containing the EMQX license key, so the license is applied automatically.
This replaces the non-functional emqxLicenseSecretName variable, which created and mounted a secret file but did not pass the license to EMQX.
-
#15712 Fixed a node boot-up failure during rolling upgrades from older versions (before 5.9).
In previous EMQX versions (before 5.9), a bug in the ZIP timestamp encoder could store an invalid “seconds” value in archive entries (values corresponding to the 30th or 31st 2-second slot in DOS time format).
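For context on the "2-second slot" wording in #15712: the 16-bit DOS time field used in ZIP entries stores hours in bits 15-11, minutes in bits 10-5, and seconds divided by two in bits 4-0. The 5-bit seconds field can hold 0 to 31, but only 0 to 29 (0 to 58 seconds) are valid. The Python sketch below decodes such a field and flags the invalid slots; the function names are hypothetical and this is not the EMQX fix itself.

```python
from typing import Tuple


def decode_dos_time(dos_time: int) -> Tuple[int, int, int]:
    """Decode a 16-bit DOS time value into (hours, minutes, seconds)."""
    hours = (dos_time >> 11) & 0x1F
    minutes = (dos_time >> 5) & 0x3F
    seconds = (dos_time & 0x1F) * 2  # stored in 2-second units
    return hours, minutes, seconds


def has_valid_seconds(dos_time: int) -> bool:
    """Slots 30 and 31 would decode to 60 and 62 seconds, which are invalid."""
    return (dos_time & 0x1F) <= 29
```

For instance, a field with slot 29 decodes to 58 seconds and is valid, whereas slot 31 decodes to 62 seconds, the kind of value the older encoder could produce.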
-
#15863 Fixed the license quota alarm message to correctly reflect session quotas instead of live connections.
Security
-
#15581 Upgraded Erlang/OTP version from 26.2.5.2 to 26.2.5.14. This upgrade includes two TLS-related fixes from OTP that affect EMQX:
- Fixed a crash in TLS connections caused by a race condition during certificate renewal.
- Added support for RSA certificates signed with RSASSA-PSS parameters. Previously, such certificates could cause TLS handshakes to fail with a bad_certificate/invalid_signature error.
-
#16237 Fixed an issue where OIDC SSO–related logs might still be printed even after SSO was disabled.
-
#16217 Fixed an issue where the OIDC login callback could fail to locate the user session in multi-node cluster environments.
Access Control
-
#15818 Corrected handling of {allow|deny, all} ACL rules.
Previously, these rules were internally translated to match #, which incorrectly failed to match topics prefixed with $ (e.g. $testtopic/1) due to MQTT spec restrictions.
Now, a special internal value is used to ensure {allow|deny, all} rules correctly match any topic, including $-prefixed ones.
-
#15844 Added validation to forbid adding empty usernames to the built-in database authenticator. Such users cannot be deleted via the HTTP API later, because an empty username breaks the API path.
If you have such a user and wish to delete it, run the following in an EMQX console:
mria:transaction(emqx_authn_shard, fun() -> mnesia:delete(emqx_authn_mnesia, {'mqtt:global',<<>>}, write) end).
-
#15899 Improved memory management by ensuring that the authorization (authz) cache is cleared immediately when a client disconnects, reducing unnecessary memory consumption.
-
#16081 Fixed an issue where clients using extended authentication and memory-based sessions could crash with a session_stepdown_request_exception caused by a calling_self error.
Example error log:
2025-09-24T07:13:08.973954+08:00 [error] clientid: someclientid, msg: session_stepdown_request_exception, peername: 127.0.0.1:41782, username: admin, error: exit, reason: calling_self, stacktrace: [{gen_server,call,3,[{file,"gen_server.erl"},{line,1222}]},{emqx_cm,request_stepdown,4,[{file,"emqx_cm.erl"},{line,427}]},{emqx_cm,do_takeover_begin,2,[{file,"emqx_cm.erl"},{line,398}]},{emqx_cm,takeover_session,2,[{file,"emqx_cm.erl"},{line,384}]},{emqx_cm,takeover_session_begin,2,[{file,"emqx_cm.erl"},{line,305}]},{emqx_session_mem,open,4,[{file,"emqx_session_mem.erl"},{line,210}]},{emqx_session,open,3,[{file,"emqx_session.erl"},{line,263}]},{emqx_cm,'-open_session/4-fun-1-',4,[{file,"emqx_cm.erl"},{line,290}]},{emqx_cm_locker,trans,2,[{file,"emqx_cm_locker.erl"},{line,32}]},{emqx_channel,post_process_connect,2,[{file,"emqx_channel.erl"},{line,575}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,852}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,470}]},{emqx_connection,process_msgs,2,[{file,"emqx_connection.erl"},{line,462}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,406}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}], action: {takeover,'begin'}, ...
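To illustrate the {allow|deny, all} fix in #15818 above: under the MQTT specification, a topic filter that starts with a wildcard does not match topic names beginning with $. The following is a simplified matcher written only for illustration (the function name topic_matches is hypothetical, and this is not EMQX's implementation). It shows why a rule translated to # misses $testtopic/1.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT topic-filter match (illustrative, not EMQX code)."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    # MQTT rule: a filter that starts with a wildcard must not match
    # topic names that begin with '$'.
    if topic.startswith("$") and filter_levels[0] in ("#", "+"):
        return False

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard matches all remaining levels (including none).
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without a trailing '#', both sides must have the same number of levels.
    return len(filter_levels) == len(topic_levels)
```

In this sketch, topic_matches("#", "testtopic/1") is true while topic_matches("#", "$testtopic/1") is false, which is the gap that an "all" rule mapped to # would leave.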
Data Integration
-
#15616 Kafka connections are now considered healthy even if a topic_authorization_failed error is returned for the default probing topic.
-
#15826 Improved Kafka consumer connector health check behavior with restricted ACLs. Previously, Kafka Consumer Connector health checks could fail if the configured user lacked permission to access the internal ____emqx_consumer_probe consumer group used for the check. With this fix, if the Kafka broker returns an "ACL denied" response, EMQX will treat the connection as healthy.
-
#15827 Fixed atom and process leaks in the GreptimeDB driver.
Fixed a function_clause error that could arise if certain incorrect write syntaxes were used in GreptimeDB Actions.
-
#15836 Enriched the returned information when a Kafka Consumer Source fails to be added, for example, due to denied topic ACLs.
-
#15850 Fixed an issue where the MQTT bridge incorrectly showed a stale connection as Connected, and failed to re-establish the connection.
-
#15866 Upgraded Kafka producer library Wolff to 4.0.12 to improve handling of temporarily missing partitions in Kafka metadata responses.
In rare race conditions, Kafka may return an incomplete partition list. Previously, this was only handled when a topic was recreated with fewer partitions, but not when partitions were temporarily missing. This gap could cause the partition producer to stall and block shutdown indefinitely.
-
#15906 Upgraded Kafka producer library Wolff from 4.0.12 to 4.0.13, which adds handling for the record_list_too_large error in ProduceResponse.
-
#15902 Upgraded the MQTT client library to 1.13.8. This improves MQTT bridge connectivity:
- The Connector now reconnects automatically when the peer broker does not reply with PINGRESP.
- A bridge-over-TLS failure is handled more promptly if the connection breaks while waiting for CONNACK.
-
#15910 Fixed an issue with Connectors where a pool of workers could fail to recover from a failure if multiple workers crashed simultaneously in large worker pools.
Connectors affected and fixed:
- MySQL
- PostgreSQL
- Oracle
- SQLServer
- TDEngine
- Cassandra
- Dynamo
- HTTP
- Couchbase
- GCP PubSub
- Snowflake
Upgraded gun and related dependencies to 2.1.0.
-
#16010 Fixed an issue where a Republish Fallback Action could fail with a function_clause error if the originating rule's SQL did not include the metadata field from the rule environment.
Example error log:
[error] tag: RESOURCE, msg: failed_to_trigger_fallback_action, reason: {error,function_clause}, fallback_kind: republish, primary_action_resource_id: <<"action:type:name:connector:type:name">>, republish_topic: <<"republish/topic">>
-
#16043 Improved log details for Kafka data integration when a not_all_kafka_partitions_connected event occurs.
-
#16046 Fixed a potential out-of-memory (OOM) crash when loading or restarting a configuration containing a Connector with several hundred Actions.
-
#16138 Fixed a Redis cluster failover issue that could cause the Connector to remain stuck in a "connecting" state.
Previously, EMQX’s Redis cluster client only refreshed the cluster topology when regular queries (such as GET) failed. However, failures in periodic PING commands did not trigger a refresh. As a result, after a failover, the connector could continue using the outdated cluster topology if no other commands were issued, preventing recovery.
With this fix, failed PING responses now trigger a cluster topology refresh, ensuring that the connector can detect failovers and recover promptly.
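The behavior change described in #16138 can be modeled with a small sketch. The class and method names below are hypothetical and do not correspond to EMQX's Redis cluster client; the flag only contrasts the old and new handling of a failed PING.

```python
class ClusterClientSketch:
    """Hypothetical model of a cluster client (not EMQX's Redis client)."""

    def __init__(self, refresh_on_ping_failure: bool) -> None:
        self.refresh_on_ping_failure = refresh_on_ping_failure
        self.topology_refreshes = 0

    def handle_result(self, command: str, ok: bool) -> None:
        """Record the outcome of a command and refresh topology if needed."""
        if ok:
            return
        if command == "PING" and not self.refresh_on_ping_failure:
            # Old behavior: failed periodic PINGs did not trigger a refresh.
            return
        # Failed regular queries (and, after the fix, failed PINGs)
        # cause the cached cluster topology to be refreshed.
        self.topology_refreshes += 1
```

With refresh_on_ping_failure=False, an idle connector that only sends PING never refreshes its topology after a failover; with True, the first failed PING does.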
Rule Engine
-
#16028 Fixed a memory leak in the rule engine jq function.
Previously, using the jq built-in function index (e.g. .key | index("name")) resulted in a memory leak.
Smart Data Hub
-
#15706 Fixed an indexing issue that could cause Message Transformations and Schema Validations to behave inconsistently. Deleting one item could corrupt the topic index, so that a subsequent item remained active even after being disabled.
-
#15708 Fixed an issue where external schema registries were not reloaded after a node restart.
-
#15810 Introduced spb_{en,de}code functions to correct the handling of bytes_value metrics. The original sparkplug_{en,de}code functions did not base64 encode/decode bytes_value metric values as required by the Protobuf specification. The new spb_{en,de}code functions encode and decode such fields correctly. The old sparkplug_{en,de}code functions are now deprecated but retained for backward compatibility.
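As a rough illustration of the base64 handling mentioned in #15810: in Protobuf's JSON mapping, bytes fields are represented as base64 strings. The sketch below uses only Python's standard library; the metric layout and helper names are assumptions made for the example and do not reflect the actual spb_{en,de}code implementation.

```python
import base64
import json


def encode_metric_json(name: str, raw: bytes) -> str:
    """Serialize a metric to JSON, carrying the bytes value as base64 text."""
    return json.dumps(
        {"name": name, "bytes_value": base64.b64encode(raw).decode("ascii")}
    )


def decode_metric_json(doc: str) -> bytes:
    """Recover the raw bytes from the base64 text in the JSON document."""
    return base64.b64decode(json.loads(doc)["bytes_value"])
```

A round trip through these two helpers returns the original bytes unchanged, which is the property the corrected functions are meant to provide for bytes_value fields.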
Observability
-
#15639 Fixed incorrect counting of the packets.subscribe.auth_error metric.
-
#15785 Resolved a crash that occurred when MQTT usernames containing non-ASCII characters were used in formatting network congestion alarm messages.
-
#15963 Reduced excessive audit log entries generated during looped evaluations in the remote shell (remsh).
-
#15967 Fixed an issue where Mnesia transaction blocking during the cleanup of large volumes of audit logs could lead to rapid memory growth.
Gateway
-
#15679 Fixed incorrect global chain names for the ExProto, JT/T 808, GB/T 32960, and OCPP gateways. Built-in authentication data for these gateways was previously grouped under unknown:global, causing conflicts between gateways.
-
#15699 Fixed an issue where built-in authentication data for gateways (e.g., CoAP) was incorrectly removed when a node was stopped or restarted.
-
#15822 Fixed an issue where the OCPP connection would crash after sending a certain number of messages.
Rate Limit
- #15794 Improved the behavior of connection rate limit updates to ensure that changes (e.g., to burst rate or rate thresholds) are applied immediately after the listener configuration is updated. Previously, parts of the internal limiter state were not refreshed correctly, which could result in rate limits appearing stricter than configured.
ExHook
- #15683 Fixed ExHook TLS options so that gRPC clients can correctly verify the server hostname during the TLS handshake.
Breaking Changes
-
#15753 Listener connection rate limits (max_conn_rate and max_conn_burst) are now enforced per listener rather than per acceptor, restoring the behavior before 5.9.0.
As a result, configurations from versions 5.9.0 and 5.9.1 are incompatible: the specified rate values must be scaled up by the number of acceptors configured for each listener to preserve the same effective limits.
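The scaling described in #15753 is a simple multiplication. The helper below is a hypothetical migration aid, not part of EMQX; it assumes the rate is expressed as a plain number of connections per second, and the figures in the usage note are made up.

```python
def per_listener_limit(per_acceptor_value: int, acceptors: int) -> int:
    """Return the per-listener value that preserves a 5.9.0/5.9.1 effective limit.

    In 5.9.0 and 5.9.1 the configured value applied to each acceptor, so the
    effective limit of a listener was value * acceptors.
    """
    if per_acceptor_value < 0:
        raise ValueError("rate must be non-negative")
    if acceptors < 1:
        raise ValueError("a listener needs at least one acceptor")
    return per_acceptor_value * acceptors
```

For example, a listener with 16 acceptors that was configured with a rate of 100 in 5.9.1 effectively allowed 1600, so per_listener_limit(100, 16) gives the value to configure after upgrading. The same scaling applies to max_conn_burst.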
-
#16062 Fixed an issue where RocketMQ actions ignored the configured payload template and sent the entire rule output instead.
If you relied on the previous (incorrect) behavior, you may need to update your payload templates to ensure messages are formatted as expected.
-
#16284 Stopped releasing packages for macOS 13 and CentOS 7.
- TLS certificate garbage collection now preserves actively used certificates
- Fixed RSA signature verification with missing default configurations
- Message Queue configuration options and auto-creation
- GreptimeDB ingester v0.2.3 upgrade with row-based gRPC
- Optimized GET /clients_v2 API performance
Full changelog
Enhancements
Message Queue
-
#16080 Added a configuration option to disable the Message Queues feature. Disabling Message Queues can slightly reduce the resource usage in the cluster. When Durable Sessions are also disabled, EMQX avoids maintaining Durable Storage, further reducing administrative overhead and improving performance.
-
#16096 Added support for automatic creation of message queues when clients subscribe to non-existent $q/ topics. Configuration options are now available to enable auto-creation for both regular and last-value semantics queues.
-
#16097 Optimized message writing to regular message queues by replacing transactional appends with dirty append functions. For QoS 0 messages, asynchronous append operations are now used. These changes significantly improve the performance of message insertion into regular queues.
-
#16098 Added a maximum queue count configuration option to limit the total number of message queues in the system.
-
#16152 Introduced per-queue limits for maximum message count and total message size. Also added new metrics to monitor message append latency and help diagnose performance or queue-limiting issues.
Data Integration
-
#16121 Upgraded the GreptimeDB ingester client to v0.2.3, which fixes several bugs and introduces support for row-based gRPC protocol (the column-based protocol is now deprecated).
Additionally, updated the CI image to the latest stable version of GreptimeDB.
-
#16127 Fixed an invalid string value issue in the GreptimeDB connector, following the changes introduced in #16121.
Performance
-
#15949 Changed the default value of the parse_unit option in listener configuration from chunk to frame. This change can significantly reduce CPU usage when the payload size exceeds the socket buffer (default is 4 KB).
Note: With parse_unit = frame, if a PUBLISH packet exceeds the maximum allowed size, EMQX will close the connection instead of sending a DISCONNECT packet.
-
#16165 Optimized the performance of the GET /clients_v2 API. Previously, when the cluster had around 50,000 clients or more, API calls to retrieve the client list could be extremely slow or even time out.
Bug Fixes
Core MQTT Functionalities
-
#15884 Resolved an issue where, in rare cases, the global routing table could indefinitely retain routing information for nodes that had long left the cluster.
-
#15518 Resolved a race condition that may lead to accumulating inconsistencies in the routing table and shared subscriptions state in the cluster when a large number of shared subscribers disconnect simultaneously.
Upgrade
-
#16047 Added support for rolling upgrades from EMQX Enterprise base version 5.8.0 and newer to 6.0. During the upgrade, legacy configurations are automatically migrated to the new format supported in 6.0. Specifically, the deprecated bridges configuration root is converted into the new connectors, sources, and actions roots.
However, the GCP PubSub Consumer and Kafka Consumer sources will still require manual changes. If any source configuration still includes the deprecated topic_mapping field, it must be removed. Then, for each entry previously defined in topic_mapping, a separate "Source + Rule" pair must be created manually.
Security
-
#16156 Fixed an issue where some dependencies were missing default configurations compared to EMQX 5.10, potentially causing RSA signature verification failures. The missing defaults could lead to errors, such as the following log message:
{sign_unsupported,[[{rsa_padding,rsa_pkcs1_padding}]]}, [{jose_jwa_unsupported,verify,5,[{file,"src/jwa/jose_jwa_unsupported.erl"},{line,55}]}
-
#16175 Fixed an issue with periodic TLS certificate garbage collection. Previously, the garbage collection process incorrectly deleted certificate files that were actively used by configurations in managed namespaces.
Access Control
-
#16081 Fixed an issue where clients using extended authentication and memory-based sessions could crash with a session_stepdown_request_exception caused by a calling_self error.
Example error log:
2025-09-24T07:13:08.973954+08:00 [error] clientid: someclientid, msg: session_stepdown_request_exception, peername: 127.0.0.1:41782, username: admin, error: exit, reason: calling_self, stacktrace: [{gen_server,call,3,[{file,"gen_server.erl"},{line,1222}]},{emqx_cm,request_stepdown,4,[{file,"emqx_cm.erl"},{line,427}]},{emqx_cm,do_takeover_begin,2,[{file,"emqx_cm.erl"},{line,398}]},{emqx_cm,takeover_session,2,[{file,"emqx_cm.erl"},{line,384}]},{emqx_cm,takeover_session_begin,2,[{file,"emqx_cm.erl"},{line,305}]},{emqx_session_mem,open,4,[{file,"emqx_session_mem.erl"},{line,210}]},{emqx_session,open,3,[{file,"emqx_session.erl"},{line,263}]},{emqx_cm,'-open_session/4-fun-1-',4,[{file,"emqx_cm.erl"},{line,290}]},{emqx_cm_locker,trans,2,[{file,"emqx_cm_locker.erl"},{line,32}]},{emqx_channel,post_process_connect,2,[{file,"emqx_channel.erl"},{line,575}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,852}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,470}]},{emqx_connection,process_msgs,2,[{file,"emqx_connection.erl"},{line,462}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,406}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}], action: {takeover,'begin'}, ...
Clustering
-
#16123 Fixed a bug in the component managing Mria replication that could cause cluster joins to hang or remain incomplete in core-replicant clusters.
During cluster changes involving adding new core nodes, those new core nodes could sometimes fail to start replication-related processes required by replicants. As a result, upgraded or newly added replicants could hang during startup.
In Kubernetes deployments, this often caused readiness probes to fail, leading the controller to repeatedly restart the affected replicant pods.
This issue typically affected upgrade rollouts involving the addition of new core and replicant nodes. For example, adding two cores and two replicants (running a newer EMQX version) to an existing cluster with 2 cores and 2 replicants.
Rule Engine
-
#16028 Fixed a memory leak in the rule engine jq function.
Previously, using the jq built-in function index (e.g. .key | index("name")) resulted in a memory leak.
Data Integration
-
#16010 Fixed an issue where a Republish Fallback Action could fail with a function_clause error if the originating rule's SQL did not include the metadata field from the rule environment.
Example error log:
[error] tag: RESOURCE, msg: failed_to_trigger_fallback_action, reason: {error,function_clause}, fallback_kind: republish, primary_action_resource_id: <<"action:type:name:connector:type:name">>, republish_topic: <<"republish/topic">>
-
#16046 Fixed a potential out-of-memory (OOM) crash when loading or restarting a configuration containing a Connector with several hundred Actions.
-
#16140 Fixed a Redis cluster failover issue that could cause the Connector to remain stuck in a "connecting" state.
Previously, EMQX’s Redis cluster client only refreshed the cluster topology when regular queries (such as GET) failed. However, failures in periodic PING commands did not trigger a refresh. As a result, after a failover, the connector could continue using the outdated cluster topology if no other commands were issued, preventing recovery.
With this fix, failed PING responses now trigger a cluster topology refresh, ensuring that the connector can detect failovers and recover promptly.
MQTT Durable Sessions
-
#16105 Optimized durable storage performance. In particular, this change reduces the latency of CONNACK for clients using a durable session.
-
#16129 Durable storage transaction configuration can now be changed at runtime. Previously, changing this configuration required a node restart.
Observability
-
#15963 Reduced excessive audit log entries generated during looped evaluations in the remote shell (remsh).
-
#15967 Fixed an issue where Mnesia transaction blocking during the cleanup of large volumes of audit logs could lead to rapid memory growth.
-
#16060 Fixed a logger formatter crash that could occur for some debug-level log messages containing deeply nested terms with non-ASCII characters.
Example error log:
2025-09-29T06:55:34.120640+00:00 debug: FORMATTER CRASH: {report,#{request => #{messages => [#{role => <<"user">>,content => <<"{\"msg\": \"hello\"}">>}],system => <<"将输入的 JSON 数据中,值为数字的 value 相加起来,并输出,只需返回输出结果。"/utf8>>,model => <<"claude-3-haiku-20240307">>,max_tokens => 100},msg => emqx_ai_completion_request}}
2025-09-29T06:55:34.120780+00:00 [debug] formatter_crashed: emqx_logger_textfmt, config: #{time_offset => [],chars_limit => unlimited,depth => 100,single_line => true,template => ["[",level,"] ",msg,"\n"],with_mfa => false,timestamp_format => auto,payload_encode => text}, log_event: #{meta => #{line => 44,pid => <0.281254.0>,time => 1759128934120640,file => "emqx_ai_completion_anthropic.erl",gl => <0.4317.0>,mfa => {emqx_ai_completion_anthropic,call_completion,3},report_cb => fun logger:format_otp_report/1,matched => <<"t/1">>,namespace => global,clientid => <<"c_emqx">>,trigger => <<"t/1">>,rule_id => <<"r1sczoo0">>,rule_trigger_ts => [1759128934120]},msg => {report,#{request => #{messages => [#{role => <<"user">>,content => <<"{\"msg\": \"hello\"}">>}],system => <<"将输入的 JSON 数据中,值为数字的 value 相加起来,并输出,只需返回输出结果。"/utf8>>,model => <<"claude-3-haiku-20240307">>,max_tokens => 100},msg => emqx_ai_completion_request}},level => debug}, reason: {error,badarg,[{erlang,iolist_to_binary,[["[",[["messages",": ",[[91,[[35,123,[["role"," => ",[60,60,"\"user\"",62,62]],44,["content"," => ",[60,60,"\"{\\\"msg\\\": \\\"hello\\\"}\"",62,62]]],125]],93]]],", ",["system",": ","将输入的 JSON 数据中,值为数字的 value 相加起来,并输出,只需返回输出结果。"],", ",["model",": ","claude-3-haiku-20240307"],", ",["max_tokens",": ","100"]],"]"]],[{error_info,#{module => erl_erts_errors}}]},{emqx_trace_formatter,format_term,2,[{file,"emqx_trace_formatter.erl"},{line,126}]},{emqx_logger_textfmt,format_term,2,[{file,"emqx_logger_textfmt.erl"},{line,230}]},{emqx_logger_textfmt,try_encode_meta,4,[{file,"emqx_logger_textfmt.erl"},{line,206}]},{lists,foldl_1,3,[{file,"lists.erl"},{line,2151}]},{emqx_logger_textfmt,enrich_report,3,[{file,"emqx_logger_textfmt.erl"},{line,102}]},{emqx_logger_textfmt,format,2,[{file,"emqx_logger_textfmt.erl"},{line,24}]}]}
-
#16134 Fixed a backward compatibility issue that could prevent new Log Traces from being created in some cases.
Rate Limit
-
#16160 Improved the rate limiting algorithm for individual client connections. Previously, clients could temporarily exceed their publish rate limits, particularly just after connecting or after periods of inactivity.
This update makes the limiter behavior more predictable and consistent, ensuring rate limits are correctly enforced from the start of a connection.
Breaking Changes
-
#16061 Fixed an issue where RocketMQ actions ignored the configured payload template and sent the entire rule output instead.
If you relied on the previous (incorrect) behavior, you may need to update your payload templates to ensure messages are formatted as expected.
- RocketMQ actions now respect payload templates instead of sending entire rule output; existing templates may need updating
- Kafka API version deprecation updates
- GreptimeDB custom timestamp column support
Full changelog
Enhancements
Data Integration
-
#16183 EMQX now logs messages about dropped expired messages (buffer_worker_dropped_expired_messages) at the warning level, and throttles such messages per resource ID. This helps identify when specific external resources are not keeping up with incoming message rates, potentially leading to message drops.
-
#16206 Added the allow_auto_topic_creation configuration option to the Kafka Producer Connector. When enabled, EMQX allows Kafka to automatically create a topic if it doesn’t exist when a client sends a metadata fetch request.
-
#16209 Added support for specifying a custom timestamp column name (ts_column) parameter to GreptimeDB Connector.
Performance
-
#15949 Changed the default value of the parse_unit option in listener configuration from chunk to frame. This change can significantly reduce CPU usage when the payload size exceeds the socket buffer (default is 4 KB).
Note: With parse_unit = frame, if a PUBLISH packet exceeds the maximum allowed size, EMQX will close the connection instead of sending a DISCONNECT packet.
-
#16165 Optimized the performance of the GET /clients_v2 API. Previously, when the cluster had around 50,000 clients or more, API calls to retrieve the client list could be extremely slow or even time out.
Bug Fixes
Core MQTT Functionalities
-
#15884 Resolved an issue where, in rare cases, the global routing table could indefinitely retain routing information for nodes that had long left the cluster.
-
#15518 Resolved a race condition that may lead to accumulating inconsistencies in the routing table and shared subscriptions state in the cluster when a large number of shared subscribers disconnect simultaneously.
Access Control
-
#16081 Fixed an issue where clients using extended authentication and memory-based sessions could crash with a session_stepdown_request_exception caused by a calling_self error.
Example error log:
2025-09-24T07:13:08.973954+08:00 [error] clientid: someclientid, msg: session_stepdown_request_exception, peername: 127.0.0.1:41782, username: admin, error: exit, reason: calling_self, stacktrace: [{gen_server,call,3,[{file,"gen_server.erl"},{line,1222}]},{emqx_cm,request_stepdown,4,[{file,"emqx_cm.erl"},{line,427}]},{emqx_cm,do_takeover_begin,2,[{file,"emqx_cm.erl"},{line,398}]},{emqx_cm,takeover_session,2,[{file,"emqx_cm.erl"},{line,384}]},{emqx_cm,takeover_session_begin,2,[{file,"emqx_cm.erl"},{line,305}]},{emqx_session_mem,open,4,[{file,"emqx_session_mem.erl"},{line,210}]},{emqx_session,open,3,[{file,"emqx_session.erl"},{line,263}]},{emqx_cm,'-open_session/4-fun-1-',4,[{file,"emqx_cm.erl"},{line,290}]},{emqx_cm_locker,trans,2,[{file,"emqx_cm_locker.erl"},{line,32}]},{emqx_channel,post_process_connect,2,[{file,"emqx_channel.erl"},{line,575}]},{emqx_connection,with_channel,3,[{file,"emqx_connection.erl"},{line,852}]},{emqx_connection,process_msg,2,[{file,"emqx_connection.erl"},{line,470}]},{emqx_connection,process_msgs,2,[{file,"emqx_connection.erl"},{line,462}]},{emqx_connection,handle_recv,3,[{file,"emqx_connection.erl"},{line,406}]},{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,340}]}], action: {takeover,'begin'}, ...
Rule Engine
-
#16028 Fixed a memory leak in the rule engine jq function.
Previously, using the jq built-in function index (e.g. .key | index("name")) resulted in a memory leak.
Data Integration
-
#16010 Fixed an issue where a Republish Fallback Action could fail with a function_clause error if the originating rule's SQL did not include the metadata field from the rule environment.
Example error log:
[error] tag: RESOURCE, msg: failed_to_trigger_fallback_action, reason: {error,function_clause}, fallback_kind: republish, primary_action_resource_id: <<"action:type:name:connector:type:name">>, republish_topic: <<"republish/topic">>
-
#16043 Improved log details for Kafka data integration when a not_all_kafka_partitions_connected event occurs.
-
#16046 Fixed a potential out-of-memory (OOM) crash when loading or restarting a configuration containing a Connector with several hundred Actions.
-
#16138 Fixed a Redis cluster failover issue that could cause the Connector to remain stuck in a "connecting" state.
Previously, EMQX’s Redis cluster client only refreshed the cluster topology when regular queries (such as GET) failed. However, failures in periodic PING commands did not trigger a refresh. As a result, after a failover, the connector could continue using the outdated cluster topology if no other commands were issued, preventing recovery.
With this fix, failed PING responses now trigger a cluster topology refresh, ensuring that the connector can detect failovers and recover promptly.
-
#16212 Removed Kafka producer linger time when the buffer queue is in memory mode.
Observability
-
#15963 Reduced excessive audit log entries generated during looped evaluations in the remote shell (remsh).
-
#15967 Fixed an issue where Mnesia transaction blocking during the cleanup of large volumes of audit logs could lead to rapid memory growth.
Breaking Changes
-
#16062 Fixed an issue where RocketMQ actions ignored the configured payload template and sent the entire rule output instead.
If you relied on the previous (incorrect) behavior, you may need to update your payload templates to ensure messages are formatted as expected.
- Message Queues with offline storage and last-value retention
- Namespace feature with role-based access control
- Enhanced LDAP with extended ACL rules and client-side caching
Full changelog
Feature Highlights
EMQX Enterprise 6.0.0 is the first release of the EMQX Enterprise version 6 series, bringing significant architectural improvements and new capabilities.
Message Queue
The native Message Queue feature unifies real-time MQTT publish/subscribe with persistent asynchronous queuing. The server buffers messages that match a topic filter, retaining them even when subscribers are offline. Clients can consume these messages through the special $q/{topic} topic, ensuring reliable message delivery.
Message Queues support offline message storage, last-value retention, and flexible dispatch strategies, enhancing MQTT with both real-time and durable messaging capabilities.
Namespace
The Namespace feature improves multi-tenancy and observability with namespace-level roles in the Dashboard. Users are restricted to their own resources (e.g., Rules, Actions, and Connectors) with fine-grained permissions such as Administrator or Viewer, and roles can be managed via the Dashboard, API, or CLI, simplifying multi-tenant operations.
Session count tracking has also been optimized: counts refresh on demand when there are fewer than 1,000 connections, and every 5 seconds otherwise. During rolling upgrades from older versions, counts may temporarily appear inconsistent, but will stabilize once all nodes are updated.
MQTT Durable Sessions
Durable storage has been optimized by separating session data from the broker’s other metadata, significantly reducing RAM usage and improving storage efficiency.
New configuration options provide finer control over RocksDB memory usage and performance. In addition, the default serialization schema for stored messages has been updated to ASN.1, further enhancing efficiency.
New Data Integrations
- Google BigQuery
- AWS AlloyDB
- CockroachDB
- AWS Redshift
Enhanced Integration
-
AWS:
- Support for Instance Metadata Service v2 APIs from EC2 instances when using S3 or S3Tables data integration. This enables seamless access to S3 buckets without manual AWS credential configuration, leveraging IAM roles for better security.
- Parquet format support for S3 Tables Action.
-
RabbitMQ: Define custom Headers and Properties Templates in RabbitMQ Sink to enhance message routing and compatibility within RabbitMQ.
-
Snowflake: Snowpipe Streaming upload mode for Snowflake Action (preview feature).
-
RocketMQ: New key and tag template fields in Action, along with a key_dispatch option for the Produce Strategy, allowing greater customization of message metadata.
Elixir Support
All packages now ship with Elixir support through the Mix build system, opening EMQX to the Elixir community and enabling better tooling with IEx console.
Enhanced LDAP Support
LDAP authorization now supports extended ACL rules in JSON format, and LDAP authentication can fetch ACL rules directly from LDAP with client-side caching.
Improved Tracing
Configurable limits for maximum traces (trace.max_traces) and trace file sizes (trace.max_file_size).
After max_file_size is reached, the trace log will rotate to a new file instead of halting.
Cluster Management
New cluster.description configuration option allows users to set and display custom cluster descriptions in the EMQX Dashboard.
Enhancements
Message Queue
- #15789 Implemented Message Queues, which are collections of messages identified by topic_filter. Each queue has an explicit lifecycle and is automatically replenished with published messages matched with the queue's topic filter during the queue's lifetime. Clients can cooperatively consume messages from a queue by subscribing to a special topic in the format: $q/{topic}.
Core MQTT Functionalities
- #15805 Introduced a dedicated worker pool for handling sharded fanout message delivery.
Previously, the broker pool handled both subscription management and message dispatch, which could lead to scheduling contention. This change separates the fanout dispatch workload into its own pool to ensure more balanced and efficient handling of pub/sub operations.
Access Control
-
#15349 Optimized external resource management for authentication and authorization. Previously, EMQX could remain connected to a resource configured for a disabled authenticator or authorizer.
-
#15294 Enhanced LDAP authentication and authorization. LDAP authorization now supports extended ACL rules in JSON format. LDAP authentication can now fetch ACL rules from LDAP. These rules are cached in the client's metadata, so authorization is performed without additional LDAP queries.
-
#15730 Added support for overriding the client ID based on authentication results. If an authentication backend returns a clientid_override attribute upon successful authentication, it will replace the client’s original client ID.
The following backends now support clientid_override:
- HTTP
- JWT
- LDAP
- MongoDB
- MySQL
- Postgres
- Redis
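The override rule in #15730 can be pictured with a small sketch. The dictionary shape of the authentication result and the helper name are assumptions made for illustration; each backend returns the attribute in its own format.

```python
def effective_client_id(original_client_id: str, auth_result: dict) -> str:
    """Pick the client ID to use after a successful authentication.

    Assumes (for this sketch only) that the backend result is a dict and
    that a missing clientid_override key means "keep the original ID".
    """
    override = auth_result.get("clientid_override")
    if override is None:
        return original_client_id
    return override
```

For instance, a result of {"result": "allow", "clientid_override": "tenant-a:dev42"} would replace the client's original ID with tenant-a:dev42, while a result without the attribute leaves it untouched.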
-
#15820 Changed the default value of config authorization.no_match from allow to deny for better security defaults.
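To show what the new default in #15820 means in practice, here is a minimal sketch of first-match rule evaluation with a fallback. The rule tuple layout and the exact-topic comparison are simplifications assumed for the example; this is not EMQX's authorization code.

```python
from typing import List, Tuple

# (permission, action, topic); permission is "allow" or "deny",
# action is "publish", "subscribe", or "all".
Rule = Tuple[str, str, str]


def authorize(rules: List[Rule], action: str, topic: str, no_match: str = "deny") -> str:
    """Return the permission of the first matching rule, else the no_match value."""
    for permission, rule_action, rule_topic in rules:
        # Exact topic comparison keeps the sketch short; real rules use topic filters.
        if rule_action in (action, "all") and rule_topic == topic:
            return permission
    return no_match
```

With the new default, a request that no rule covers is denied. Deployments that relied on the old implicit allow need either explicit allow rules or authorization.no_match set back to allow.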
Clustering
- #15600 Introduced a new configuration option cluster.description that allows you to add a descriptive label to the EMQX cluster. This description can be updated via PUT /cluster, and retrieved with the GET /cluster API.
LLM-Based MQTT Data Processing
-
#15467 Exposed transport configuration options for AI Completion Providers. Users can now configure connection timeouts and the maximum number of connections to AI Completion Providers. This helps prevent checkout_timeout errors when message throughput is high and the provider is under load.
-
Flow designer supports integrating with the Google Gemini model.
-
#15631 Added a new API endpoint to list all models available for an AI provider.
-
#15467 Exposed transport options for AI Completion Providers. These options allow configuring connection timeouts and maximum connections to an AI Completion Provider.
- #15724 Introduced the openai_response type for AI Completion Providers and completion profiles to use OpenAI's response API.
Data Integration
-
#15418 EMQX supports data integration with BigQuery.
-
#15401 Added support for the Snowpipe Streaming upload mode in the Snowflake Action.
Note: Snowpipe Streaming is currently a preview feature and is only available for Snowflake accounts hosted on AWS.
- #15387 Added rate limiting to Kinesis Producer Connector and Action health checks to comply with AWS API quotas and improve cluster behavior.
- Health check calls to ListStreams and DescribeStream are now limited to 5/s and 10/s per Connector, respectively, matching AWS rate limits.
- A distributed limiter is coordinated by a core node in the cluster to enforce these limits consistently.
- If a health check is throttled or times out, the Connector or Action will now retain its previous status instead of being marked as disconnected.
Also introduced a new resource_opts.health_check_interval_jitter option, which adds a uniform random delay to resource_opts.health_check_interval to reduce the chance of multiple Actions under the same Connector running health checks at the same time.
- #15176 Upgraded the GreptimeDB Connector client and added support for an optional new parameter ttl to set the default time-to-live for automatically created tables.
- #15649 EMQX supports data integration with AWS AlloyDB, CockroachDB, and AWS Redshift.
- #15635 Added new key and tag template fields in the RocketMQ Action, allowing customization of the message's key and tag. Also introduced a new key_dispatch option for the Produce Strategy field.
- #15621 Now, access_key_id and secret_access_key are optional fields for the S3 Tables Connector. If omitted, they'll be obtained from the Instance Metadata Service v2 APIs from the EC2 instance where EMQX is deployed.
- #15628 Removed HStreamDB data integration.
- #15544 Added Arrow Flight SQL NIF driver support for Datalayers Integration.
- #15637 Added support for templating message headers and properties for the RabbitMQ Action.
- #15864 Removed the deprecated "Bridges V1" APIs and configuration schemas. All endpoints under /bridges/* and configuration entries under the bridges root key are no longer available, as data integrations have fully migrated to the "Connectors/Actions/Sources" model.
- #15583 Updated the brod client to version 4.4.4, expanding support for a wider range of Kafka APIs. This update addresses the deprecation of JoinGroups API versions v0-v1.
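The health check jitter introduced with #15387 above can be pictured with a small Python sketch. It only illustrates the idea of adding a uniform random delay to a fixed interval. The function name and the assumption that the delay is drawn from the range 0 to the jitter value are hypothetical and are not taken from the EMQX implementation.

```python
import random


def next_health_check_delay_ms(interval_ms: int, jitter_ms: int, rng: random.Random) -> float:
    """Base interval plus a uniform random delay drawn from [0, jitter_ms] (assumed range)."""
    if interval_ms <= 0 or jitter_ms < 0:
        raise ValueError("interval must be positive and jitter non-negative")
    return interval_ms + rng.uniform(0, jitter_ms)


# Three Actions under one Connector, each with its own random source,
# get different effective delays on top of the same 15 s interval.
delays = [
    next_health_check_delay_ms(15_000, 3_000, random.Random(seed))
    for seed in range(3)
]
```

With a jitter of zero every Action would fire on the same schedule. A non-zero jitter spreads the checks out, which matters when the Connector is subject to a shared per-second quota.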
Smart Data Hub
- #15525 Prevented deletion of internal schemas that are still in use. If a schema is referenced by a Schema Validation or Message Transformation, it can no longer be removed to avoid runtime errors and configuration inconsistencies.
Durable Storage
- #15463 Improved durable storage RAM usage and storage efficiency.
- Introduced the following configuration parameters for the durable storage to improve control over RocksDB memory usage and storage performance:
  - durable_storage.messages.rocksdb.write_buffer_size: RocksDB memtable size per shard.
  - durable_storage.messages.rocksdb.cache_size: RocksDB block size per shard.
  - durable_storage.messages.rocksdb.max_open_files: Limits the number of file descriptors used by RocksDB per shard.
  - durable_storage.messages.layout.wildcard_thresholds: Allows tuning wildcard thresholds for the wildcard_optimized_v2 storage layout.
- Additionally, the default serialization_schema for stored messages has been changed to asn1.
- #16044 Some configuration fields for durable sessions have been removed or renamed, and the old values are marked as deprecated:
- durable_sessions.heartbeat_interval has been renamed to durable_sessions.checkpoint_interval.
- durable_sessions.idle_poll_interval and durable_sessions.renew_streams_interval have been removed, as sessions are now fully event-driven.
- durable_sessions.session_gc_interval and durable_sessions.session_gc_batch_size have been removed as obsolete.
CLI
- #15399 The node_dump tool now exports the current system configuration in HOCON format, with sensitive information (such as passwords and secrets) automatically redacted for security.
Namespace
-
#15841 Improved the refresh rate of the session count for namespaced sessions.
- If a namespace has fewer than 1000 connections, its session count is now updated on demand.
- For namespaces with 1000 or more connections, the count is updated every 5 seconds.
During a rolling upgrade from versions prior to 6.0, session counts may appear inconsistent due to changes in the internal tracking tables. This is expected: as clients reconnect to upgraded nodes, the session counts will gradually stabilize and become accurate once all nodes are running version 6.0 or later.
Observability
- #15594 Introduced a new configuration option trace.max_traces to control the maximum number of active cluster-wide traces. This limit does not apply to node-local traces managed using emqx ctl trace.
This update also optimized the tracing implementation to eliminate potential atom leaks per created trace.
- #15556 Introduced a new configuration option trace.max_file_size to limit the maximum file size for each individual trace.
- #15650 Implemented automatic trace log rotation.
When a trace file size exceeds trace.max_file_size, EMQX no longer discards all subsequent events and emits an incomprehensible warning to stderr. Instead, portions of the oldest events are discarded while the most recent ones are retained.
This also implies that:
- EMQX now maintains multiple trace log files per active trace. The layout of the trace directory has changed accordingly.
- Trace API has been updated to reflect this behavior. The Log Stream API may return new errors, such as when a stream becomes stale due to a slow consumer.
-
#15904 Support viewing and updating of tracing configuration through Trace API.
Performance
- #15451 Introduced an experimental socket backend for TCP listeners, aimed at improving message processing latency and reducing compute resource usage. The feature can be enabled with the new tcp_backend listener option.
Build and Tooling
- #15484 Switched the build system to Elixir's Mix, enabling all packages to include native Elixir support. This change improves developer tooling, allows integration with Elixir dependencies when needed, and enables use of the IEx shell as a more powerful EMQX console.
License
- #15921 Introduced a license alarm for cluster-wide maximum transactions per second (TPS).
- Each node calculates TPS as the average number of MQTT messages sent and received over the past 10 seconds.
- The total cluster TPS is aggregated every 5 seconds.
- If the observed TPS exceeds the licensed limit, an alarm is triggered.
- The alarm remains active until a license with a higher TPS allowance is applied.
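The alarm logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not EMQX code: the class and function names are hypothetical, per-second counters stand in for whatever the broker samples internally, and the 5-second aggregation timer is left to the caller.

```python
from collections import deque


class NodeTpsWindow:
    """Per-node average of messages sent plus received over the last N seconds."""

    def __init__(self, window_seconds: int = 10):
        # Only the most recent `window_seconds` samples are kept.
        self.samples = deque(maxlen=window_seconds)

    def record_second(self, sent: int, received: int) -> None:
        self.samples.append(sent + received)

    def tps(self) -> float:
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)


def cluster_tps(nodes) -> float:
    """Cluster-wide TPS as the sum of per-node averages (called every 5 s in this sketch)."""
    return sum(node.tps() for node in nodes)


class TpsAlarm:
    """Latches on when observed TPS exceeds the limit; clears only on a higher limit."""

    def __init__(self, licensed_tps: float):
        self.licensed_tps = licensed_tps
        self.active = False

    def check(self, observed_tps: float) -> bool:
        if observed_tps > self.licensed_tps:
            self.active = True
        return self.active

    def apply_license(self, new_limit: float) -> None:
        if new_limit > self.licensed_tps:
            self.active = False
        self.licensed_tps = new_limit
```

Note that the alarm in this sketch stays active even when traffic later drops below the limit, which mirrors the last bullet above.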
MQTT over QUIC
- #15997 Added support for disabling QUIC stack loading by setting the environment variable QUICER_SKIP_NIF_LOAD=1.
Bug Fixes
Core MQTT Functionalities
-
#15396 Removed redundant cleanup operations for shared subscriptions of disconnected clients. These operations were prone to crashes under high disconnect volumes and could lead to inconsistencies in the global broker state.
- #15361 Fixed a function_clause error when parsing a malformed User-Property pair with invalid (too short) length.
- #15783 Ensured that any changes to connection rate limits take effect immediately after the listener update has completed. Previously, parts of the internal limiter state were not directly affected by configuration changes. For example, after increasing the burst rate, the effective rate limit could appear stricter than expected.
Access Control
- #15489 Fixed OIDC issuer URL validation in Single Sign-On (SSO) settings. Previously, issuer URLs containing a port number (for example, https://xxxxxxxx:8443/webman/sso/.well-known/openid-configuration) were rejected with a bad_port_number error. These URLs are now supported.
Rule Engine
- #15569 Fixed an issue where a Republish Rule Action could fail if the direct_dispatch template was empty or resolved to a non-boolean value. In these cases, the default value false is now used.
Data Integration
- #15522 Fixed an issue where the Snowflake Connector would fail to start correctly if username was not provided.
- #15476 Fixed a missing callback in emqx_connector_aggreg_delivery that caused a crash when formatting delivery process status for aggregated-mode Actions (e.g., Azure Blob Storage, Snowflake, S3 Tables).
This occurred during failures or when inspecting delivery processes with gen_server:format_status/1. The issue is now resolved, and more detailed delivery status information will be logged.
- #15394 Fixed a rare race condition where Action metrics could become inconsistent due to unexpected asynchronous replies.
- #15647 Fixed an issue where a MongoDB Connector was marked as Disconnected if the MongoDB account specified in the connector configuration lacked privileges to perform find queries on the foo collection.
- #15603 Fixed an issue in the MQTT bridge where a stale connection could be shown as Connected and would not automatically reconnect.
- #15383 Fixed a potential resource leak in the MQTT bridge. When a bridge failed to start, the topic index table was not properly cleaned up.
- #15786 Fixed a potential atom leak when probing RocketMQ Connectors.
- #15806 Improved validation for Oracle Actions during creation. Previously, in rare cases, an Action containing an invalid SQL statement could be added successfully.
- #15848 Improved error reporting for the Oracle Connector. When the connector becomes disconnected, its status now includes a more specific reason, making diagnostics easier.
- #15693 Fixed a resource leak in Postgres-based bridges. Under certain race conditions during pool initialization, deleting a Connector could leave its connection pool behind. This has been corrected to ensure connection pools are properly cleaned up.
- #15543 Fixed an issue in HTTP Server data integration when sending large payloads. If the payload size was 10 MB or more, the HTTP request could fail.
Smart Data Hub
- #15839 Fixed an encoding issue with Protobuf schemas that use map<_, _> fields.
Previously, schemas containing map<string, string> fields could fail to encode valid payloads, resulting in cryptic runtime errors.
Example schema:
syntax = "proto3"; message test { map<string, string> args = 1; }
Example rule:
SELECT schema_encode('xxx', json_decode(payload), 'test') as protobuf_test FROM "t/#"
Example payload that failed to be encoded:
{ "args": { "env": "stag" } }
Previous error similar to:
2025-06-17T06:59:22.725785+00:00 [warning] tag: RULE_SQL_EXEC, clientid: c_emqx, msg: SELECT_clause_exception, reason: {error,{gpb_type_error,{bad_unicode_string,[{value,env},{path,"test.args.key"}]}},[{'$schema_parser_xxx',mk_type_error,3,[{file,"$schema_parser_xxx.erl"},{line,437}]},{'$schema_parser_xxx','-v_map<string,string>/3-lc$^0/1-0-',3,[{file,"$schema_parser_xxx.erl"},{line,429}]},{'$schema_parser_xxx','v_map<string,string>',3,[{file,"$schema_parser_xxx.erl"},{line,429}]},{'$schema_parser_xxx',v_msg_test,3,[{file,"$schema_parser_xxx.erl"},{line,404}]},{'$schema_parser_xxx',encode_msg,3,[{file,"$schema_parser_xxx.erl"},{line,73}]},{emqx_schema_registry_serde,with_serde,2,[{file,"emqx_schema_registry_serde.erl"},{line,212}]}...
Observability
- #15931 Resolved a bug where spurious but harmless error logs could appear during node startup:
[error] Generic event handler emqx_alarm_handler crashed ... Reason: {aborted,{no_exists,[emqx_activated_alarm,runq_overload]}}
- #15973 Fixed a bug where an alarm activation timeout could crash the connection process under certain conditions.
MQTT over QUIC
- #15614 QUIC Listener: When TLS key logging (SSLKEYLOGFILE) is enabled, EMQX now dumps TLS keys even if the handshake fails.
Clustering
- #16021 Fixed issues that occasionally prevented the DS Raft backend from functioning correctly when an existing node joined a new cluster and subsequently became a member of DS replica sets.
Cluster Linking
- #15894 Previously, when listing all cluster links via GET /cluster/links, disabled links would be returned having an inconsistent status. Now they are returned as disconnected.
Performance
- #15696 Added connection rate limiting support for WebSocket (WS) and WebSocket Secure (WSS) listeners.
The max_conn_rate and max_conn_burst configuration options are now enforced: incoming connections exceeding the defined rate are immediately closed upon acceptance, consistent with existing TCP listener behavior.
Additionally, the behavior of max_connections has been updated. When the connection limit is exceeded, WS/WSS listeners now close connections immediately before any HTTP handshake, resulting in an abrupt socket close instead of returning an HTTP 429 response.
- #15854 Reduced the default active_n value from 100 to 10 to improve MQTT client responsiveness, especially under high message rates with small payloads.
The lower active_n introduces more backpressure at the TCP layer, stricter than the default Receive-Maximum of 32, which helps in the following scenarios:
- The client process is blocked by external authorization checks
- Data integration operations are delaying message handling
- The system is under heavy load or nearing resource limits
-
#15981 Prevented excessive memory growth caused by Mnesia transaction blocking during cleanup of large volumes of audit logs. This improves system stability and memory efficiency during heavy audit log maintenance operations.
Breaking Changes
Deprecated Packages
-
#15939 Stopped releasing packages for systems that have already reached end-of-life:
- Debian 10 (Buster)
- Enterprise Linux (CentOS) 7
- Ubuntu 18.04
- Ubuntu 20.04
- macOS 13 (Ventura)
-
#16050 Stopped releasing packages for Amazon Linux 2. It will reach end-of-life on June 30, 2026.
Durable Sessions
If the durable sessions feature was not enabled before, you can ignore this section.
In EMQX 6.0, the internal representation of durable sessions and their messages has changed.
Clusters previously running on version 5.x with durable sessions enabled must be recreated from a clean state when upgrading to 6.0.
For detailed upgrade instructions, see the rolling upgrade documentation.
- #15496 The state of durable sessions has been migrated from Mnesia to a new database built on EMQX durable storage.
- As a result, all durable session states created before 6.0.0 will be lost during the migration.
- This change resolves potential session state corruption caused by Mnesia’s limited transaction isolation (see #14039).
- It also improves the performance and scalability of durable sessions through sharding and a more efficient data representation.
Will Message Behavior
Authorization checks for durable sessions are now performed at the moment of client disconnection to determine whether the will message may be published.
Previously, these checks were deferred until after the configured Will-Delay-Interval had expired.
Configuration Changes
Durable Sessions
- The durable_storage.messages.n_sites parameter has been renamed to durable_storage.n_sites. This parameter is now common to all durable storage.
- durable_storage.sessions and durable_storage.timers have been added.
- #15734 Improved the reliability and throughput of durable sessions.
Durable Storage
- durable_storage.messages.n_sites has been renamed to durable_storage.n_sites, which now applies to all durable storage types.
- Added new configuration entries for durable_storage.sessions and durable_storage.timers.
RocketMQ
- #15635 The parameters.strategy field no longer accepts key templates (which previously implied the key_dispatch strategy).
Instead, set parameters.strategy = key_dispatch explicitly and specify the key template in parameters.key.
Rate Limit
- #15743 Listener connection rate limits (max_conn_rate and max_conn_burst) are now enforced per listener rather than per acceptor, restoring the behavior before 5.9.0. As a result, configurations from versions 5.9.0, 5.9.1, and 5.10.0 are incompatible: the specified rate values must be scaled up by the number of acceptors configured for each listener to preserve the same effective limits.
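The scaling required by #15743 is a plain multiplication, shown here as a short Python sketch. The function name and the example numbers are illustrative only, and the values are treated as bare connections-per-second figures rather than EMQX configuration strings.

```python
def per_listener_limit(per_acceptor_limit: float, acceptors: int) -> float:
    """Convert a limit that was applied per acceptor into the equivalent per-listener limit."""
    if acceptors < 1:
        raise ValueError("a listener needs at least one acceptor")
    return per_acceptor_limit * acceptors


# A listener that ran 16 acceptors with a per-acceptor rate of 100 conn/s and
# a burst of 20 needs these per-listener values to keep the same effective limits.
new_rate = per_listener_limit(100, 16)
new_burst = per_listener_limit(20, 16)
```

Leaving the old values in place after the upgrade would make the listener as a whole much stricter than before, by a factor equal to the acceptor count.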
- Authorization cache cleared on client disconnect reduces memory consumption
- Kinesis health check rate limiting complies with AWS API quotas
- MQTT bridge stale connection detection and automatic reconnection
Full changelog
Enhancements
Performance
-
#15907 Improved system memory usage. Fields such as client ID, username, password, and topic are now copied into new binaries (when larger than 64 bytes) instead of being kept as slices of the raw packet, which reduces the 'binary' part of memory usage in the Erlang VM.
-
#15899 Authorization (authz) cache is now cleared immediately when a client disconnects, reducing unnecessary memory consumption.
Observability
-
#15499 Added a force deactivate alarm API endpoint to allow administrators to forcibly deactivate active alarms.
-
#15364 Added HTTP header configuration items to the OpenTelemetry integration to adapt to collectors with HTTP authentication.
Access Control
-
#15294 Enhanced LDAP authentication and authorization.
LDAP authorization now supports an extended ACL rule format using JSON, in addition to the existing simple topic list. ACL rules can also be fetched from LDAP during authentication based on client information, and are cached in the client’s metadata to avoid repeated LDAP queries during authorization. -
#15349 Optimized external resource management for authentication and authorization. Previously, EMQX could remain connected to a resource configured for a disabled authentication or authorization provider.
Data Integration
-
#15360 Added support for writing data files in Parquet format for Amazon S3 Tables Action.
- #15387 Added rate limiting to Kinesis Producer Connector and Action health checks to comply with AWS API quotas and improve cluster behavior.
- Health check calls to ListStreams and DescribeStream are now limited to 5/s and 10/s per Connector, respectively, matching AWS rate limits.
- A distributed limiter is coordinated by a core node in the cluster to enforce these limits consistently.
- If a health check is throttled or times out, the Connector or Action will now retain its previous status instead of being marked as disconnected.
Also introduced a new resource_opts.health_check_interval_jitter option, which adds a uniform random delay to resource_opts.health_check_interval to reduce the chance of multiple Actions under the same Connector running health checks at the same time.
- #15542 Upgraded our erlcloud library to 3.8.3.0. This allows users to set up an S3 Connector without specifying an Access Key Id and Secret Access Key, so long as the EC2 instance EMQX is running on has the correct IAM permissions to read/write to the configured bucket(s).
- #15845 The static_clientids configuration for the MQTT Connector now supports specifying a username and password for each client ID. This is particularly useful for scenarios like connecting to Azure IoT Hub, where each device (client ID) requires a unique set of credentials. This enhancement helps ensure successful connections across multiple nodes in a clustered environment.
- #15944 Improved the information returned when a resource is marked as disconnected for the following Connectors: LDAP, Syskeeper, IoTDB, Snowflake (aggregated), JWKS Authentication.
- #15911 Now, for the HTTP Action, the HTTP request timeout is taken to be the same as resource_opts.request_ttl. Previously, it was a fixed, non-configurable value of 30 seconds.
- #15371 Added tags fields to the return of GET /actions_summary and GET /sources_summary, and to the fallback actions returned in GET /actions/:id.
CLI
- #15399 The node_dump tool now exports the current system configuration in HOCON format, with sensitive information (such as passwords and secrets) automatically redacted for security.
Bug Fixes
API
-
#15547 Resolved an issue where EMQX would fail to process HTTP requests with large bodies (e.g., 10MB) in the REST API.
- #15797 To improve compatibility with EMQX 4.x, the encoding parameter has been reintroduced in the batch publish HTTP API (/api/v5/publish/bulk) as an alias for payload_encoding. This change addresses migration issues for users relying on the original encoding parameter, and ensures existing integrations using EMQX v4 APIs can continue working without requiring software-level changes.
Observability
- #15785 Resolved a crash that occurred when MQTT usernames containing non-ASCII characters were used in formatting network congestion alarm messages.
Gateway
- #15342 Fixed a crash in the NATS gateway caused by client info override templates referencing undefined packet fields. The system now returns an empty binary instead of the undefined atom.
Core MQTT Functions
- #15361 Fixed a function_clause error when parsing a malformed User-Property pair with invalid (too short) length.
- #15396 Removed redundant cleanup operations for shared subscriptions of disconnected clients. These operations were prone to crashes under high disconnect volumes and could lead to inconsistencies in the global broker state.
-
#15416 Fixed occasional warning-level log events and crashes during session expiration of WebSocket connections. This issue was introduced by recent WebSocket performance improvements. It did not affect broker capacity, but produced log entries like the following:
error: {function_clause,[{gen_tcp,send,[closed,[]],[{file,"gen_tcp.erl"},{line,966}]},{cowboy_websocket_linger,commands,3,[{file,"cowboy_websocket_linger.erl"},{line,665}]},...message: {tcp,#Port<0.364>,<<136,130,...>>}, msg: emqx_session_mem_unknown_message
- #15872 Eliminated the unclean_terminate warning log that appeared when a client was disconnected after a CONNACK was sent with a non-zero reason code.
- #15518 Resolved a race condition that could lead to accumulating inconsistencies in the routing table and shared subscriptions state in the cluster when a large number of shared subscribers disconnect simultaneously.
Data Integration
-
#15394 Fixed a rare race condition where Action metrics could become inconsistent due to unexpected asynchronous replies.
- #15603 Fixed an issue in the MQTT bridge where a stale connection could be shown as Connected and would not automatically reconnect.
- #15826 Improved Kafka consumer connector health check behavior with restricted ACLs. Previously, Kafka Consumer Connector health checks could fail if the configured user lacked permission to access the internal ____emqx_consumer_probe consumer group used for the check. With this fix, if the Kafka broker returns an "ACL denied" response, EMQX will treat the connection as healthy.
- #15827 Fixed atom and process leaks in the GreptimeDB driver.
Fixed a function_clause error that could arise if certain incorrect write syntaxes were used in GreptimeDB Actions.
- #15836 Enriched the returned information when a Kafka Consumer Source fails to be added, for example, due to denied topic ACLs.
- #15850 Fixed an issue with the MQTT bridge when a stale connection was displayed as Connected and the connection was not re-established.
- #15866 Upgraded the Kafka producer library Wolff to 4.0.12 to improve handling of temporarily missing partitions in Kafka metadata responses.
In rare race conditions, Kafka may return an incomplete partition list.
Previously, this was only handled when a topic was recreated with fewer partitions, but not when partitions were temporarily missing.
This gap could cause the partition producer to stall and block shutdown indefinitely.
- #15906 Upgraded Kafka producer library Wolff from 4.0.12 to 4.0.13, which adds handling for the record_list_too_large error in ProduceResponse.
- #15902 Upgraded the MQTT client library to 1.13.8.
This improves MQTT bridge connectivity:
- The Connector now automatically reconnects when the peer broker does not reply with PINGRESP.
- A bridge-over-TLS failure is handled more promptly if the connection breaks while waiting for CONNACK.
-
#15910 Fixed an issue with Connectors where a pool of workers could fail to recover from a failure if multiple workers crashed simultaneously in large worker pools.
Connectors affected and fixed:
- MySQL
- PostgreSQL
- Oracle
- SQLServer
- TDEngine
- Cassandra
- Dynamo
- HTTP
- Couchbase
- GCP PubSub
- Snowflake
Upgraded gun and related dependencies to 2.1.0.
Deployment
-
#15553 Fixed an issue in the Helm chart where deploying EMQX with default values started multiple replicas and caused all nodes except one to crash. The chart now defaults to a single replica, since clustered deployments require a Commercial License.
-
#15712 Fixed a node boot-up failure during rolling upgrades from older versions (before 5.9).
In previous EMQX versions (before 5.9), a bug in the ZIP timestamp encoder could store an invalid “seconds” value in archive entries (values corresponding to the 30th or 31st 2-second slot in DOS time format).
-
#15863 Fixed license quota alarm text.
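The invalid "seconds" value mentioned in #15712 above follows from how the 16-bit DOS time field is laid out: 5 bits for the hour, 6 bits for the minute, and 5 bits for the seconds divided by two. The Python sketch below illustrates that layout and why slots 30 and 31 are out of range. The helper names are hypothetical and this is not the code used by EMQX.

```python
from typing import Tuple


def encode_dos_time(hour: int, minute: int, second: int) -> int:
    """Pack a time into the 16-bit DOS layout: hhhhh mmmmmm sssss (seconds // 2)."""
    return (hour << 11) | (minute << 5) | (second // 2)


def decode_dos_time(dos_time: int) -> Tuple[int, int, int]:
    """Unpack hour, minute and second from a 16-bit DOS time value."""
    hour = (dos_time >> 11) & 0x1F
    minute = (dos_time >> 5) & 0x3F
    second = (dos_time & 0x1F) * 2
    return hour, minute, second


def has_valid_seconds(dos_time: int) -> bool:
    """Slots 0..29 cover seconds 0..58; slots 30 and 31 would mean 60 and 62."""
    return (dos_time & 0x1F) <= 29


good = encode_dos_time(12, 34, 56)
# The 5-bit field can physically hold 30 or 31, which is what a faulty encoder may write.
bad = (12 << 11) | (34 << 5) | 30
```

A strict reader that validates the decoded time would accept `good` and reject `bad`, since the latter decodes to a seconds value of 60.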
Clustering
- #15788 Fixed etcd cluster discovery issue. Resolved an issue where EMQX nodes from different clusters could mistakenly join each other when using a shared etcd server. This was caused by a bug in the etcd client library.
Rate Limit
- #15794 Improved the behavior of connection rate limit updates to ensure that changes (e.g., to burst rate or rate thresholds) are applied immediately after the listener configuration is updated. Previously, parts of the internal limiter state were not refreshed correctly, which could result in rate limits appearing stricter than configured.
Smart Data Hub
- #15810 Introduced spb_{en,de}code functions to correct handling of bytes_value Metrics. Fixed an issue with the original sparkplug_{en,de}code functions, which did not base64 encode/decode bytes_value metric values as required by the Protobuf specification. To address this, new spb_{en,de}code functions have been introduced for correct encoding/decoding of such fields. The old sparkplug_{en,de}code functions are now deprecated but retained for backward compatibility.
Access Control
- #15818 Corrected handling of {allow|deny, all} ACL rules.
Previously, these rules were internally translated to match #, which incorrectly failed to match topics prefixed with $ (e.g. $testtopic/1) due to MQTT spec restrictions.
Now, a special internal value is used to ensure {allow|deny, all} rules correctly match any topic, including $-prefixed ones.
- #15844 Added validation to forbid adding empty usernames to the built-in database authenticator. Such users cannot be deleted via the HTTP API later, since they break the API path.
If you have such a user and wish to delete it, run the following in an EMQX console:
mria:transaction(emqx_authn_shard, fun() -> mnesia:delete(emqx_authn_mnesia, {'mqtt:global',<<>>}, write) end).
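For the {allow|deny, all} fix in #15818 above, the Python sketch below shows why translating "all" into the # filter misses $-prefixed topics, and how a dedicated sentinel avoids that. The matcher is a simplified illustration of MQTT topic filter matching, and the ALL_TOPICS sentinel is hypothetical, not the internal value EMQX uses.

```python
ALL_TOPICS = object()  # hypothetical stand-in for the "special internal value"


def mqtt_match(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT topic filter matching, including the $-topic restriction."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    # A filter starting with a wildcard must not match topics that begin with "$".
    if topic.startswith("$") and filter_levels[0] in ("#", "+"):
        return False
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard matches the rest, including the parent
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


def rule_matches(rule_topic, topic: str) -> bool:
    """An 'all' rule matches every topic; other rules use normal filter matching."""
    if rule_topic is ALL_TOPICS:
        return True
    return mqtt_match(rule_topic, topic)
```

With this sketch, `rule_matches("#", "$testtopic/1")` is false while `rule_matches(ALL_TOPICS, "$testtopic/1")` is true, which is the behavioral difference the fix describes.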
Breaking Changes
- #15752 Listener connection rate limits (max_conn_rate and max_conn_burst) are now enforced per listener rather than per acceptor, restoring the pre-5.9.0 behavior. As a result, configurations from versions 5.9.0, 5.9.1 and 5.10.0 are incompatible: specified rates must be scaled up by the number of acceptors configured for respective listeners.
- Fixed TLS connection race condition during certificate renewal
- Added support for RSA-PSS certificate signatures
- Client ID registration throttling prevents aggressive reconnect instability
- Erlang VM parameters tuned for message latency and CPU usage
- Global garbage collection disabled by default
Full changelog
Enhancements
Deployment
- #15813 Added package release for Debian 13 (Trixie), and updated Docker images to use Debian 13 as the base.
Core MQTT Functionalities
- #15773 Throttled client ID registration during reconnects.
- When a previous session cleanup is still in progress, new connections using the same client ID are now throttled. This prevents instability when clients reconnect aggressively.
- Affected clients receive reason code 137 (Server Busy) in the CONNACK with Reason-String "THROTTLED", and should retry after the cleanup completes.
- Fixed the reason code returned when another connection registers the same client ID; it now correctly returns 137 instead of 133.
Observability
- #15499 Added a force deactivate alarm API endpoint to allow administrators to forcibly deactivate active alarms.
Performance
- #15536 Disabled the node.global_gc_interval configuration by default to improve overall performance stability, as it caused CPU fluctuations and higher message latency while providing little benefit over Erlang’s built-in garbage collector.
- #15539 Optimized Erlang VM parameters to improve performance and stability:
- Increased buffer size for distributed channels to 32 MB (+zdbbl 32768) to prevent busy_dist_port alarms during intensive Mnesia operations.
- Disabled scheduler busy-waiting (+sbwt none +sbwtdcpu none +sbwtdio none) to lower CPU usage reported by the operating system.
- Set scheduler binding type to db (+stbt db) to reduce message latency.
Bug Fixes
Deployment
- #15580 Added a new emqxLicenseSecretRef variable to the EMQX Enterprise Helm chart. This allows users to specify a Kubernetes Secret containing the EMQX license key, so the license is applied automatically.
This replaces the non-functional emqxLicenseSecretName variable, which created and mounted a secret file but did not pass the license to EMQX.
Clustering
- #14778 Fixed an issue where a node could not join a running cluster if that node had broken symlinks in its data/certs or data/authz directories.
Security
- #15581 Upgraded Erlang/OTP version from 26.2.5.2 to 26.2.5.14. This upgrade includes two TLS-related fixes from OTP that affect EMQX:
- Fixed a crash in TLS connections caused by a race condition during certificate renewal.
- Added support for RSA certificates signed with RSASSA-PSS parameters. Previously, such certificates could cause TLS handshakes to fail with a bad_certificate/invalid_signature error.
Observability
- #15639 Fixed an issue where the packets.subscribe.auth_error metric was not incremented when subscription authentication failed.
Gateway
- #15679 Fixed incorrect global chain names for the ExProto gateways. Built-in authentication data for these gateways was previously grouped under unknown:global, causing conflicts between gateways.
- #15699 Fixed an issue where built-in authentication data for gateways (e.g., CoAP) was incorrectly removed when a node was stopped or restarted.
ExHook
- #15683 Fixed ExHook TLS options so that gRPC clients can correctly verify the server hostname during the TLS handshake.
- Fixed TLS connection race condition during certificate renewal
- Added support for RSA-PSS certificate signatures
- Client ID registration throttling prevents aggressive reconnect instability
- Erlang VM parameters tuned for message latency and CPU usage
- Global garbage collection disabled by default
Full changelog
Enhancements
Deployment
- #15813 Added package release for Debian 13 (Trixie), and updated Docker images to use Debian 13 as the base.
Core MQTT Functionalities
- #15773 Throttled client ID registration during reconnects.
- When a previous session cleanup is still in progress, new connections using the same client ID are now throttled. This prevents instability when clients reconnect aggressively.
- Affected clients receive reason code 137 (Server Busy) in the CONNACK with Reason-String "THROTTLED", and should retry after the cleanup completes.
- Fixed the reason code returned when another connection registers the same client ID; it now correctly returns 137 instead of 133.
Data Integration
- #15542 Upgraded our erlcloud library to 3.8.3.0. This allows users to set up an S3 Connector without specifying Access Key Id and Secret Access Key, so long as the EC2 instance EMQX is running in has the correct IAM permissions to read/write to the configured bucket(s).
- #15585 Updated the brod client to version 4.4.4, expanding support for a wider range of Kafka APIs. This update addresses the deprecation of JoinGroups API versions v0-v1.
Observability
- #15499 Added a force deactivate alarm API endpoint to allow administrators to forcibly deactivate active alarms.
Performance
- #15536 Disabled the `node.global_gc_interval` configuration by default to improve overall performance stability, as it caused CPU fluctuations and higher message latency while providing little benefit over Erlang’s built-in garbage collector.
- #15539 Optimized Erlang VM parameters to improve performance and stability:
  - Increased buffer size for distributed channels to 32 MB (`+zdbbl 32768`) to prevent `busy_dist_port` alarms during intensive Mnesia operations.
  - Disabled scheduler busy-waiting (`+sbwt none +sbwtdcpu none +sbwtdio none`) to lower CPU usage reported by the operating system.
  - Set scheduler binding type to db (`+stbt db`) to reduce message latency.
Bug Fixes
Deployment
- #15580 Added a new `emqxLicenseSecretRef` variable to the EMQX Enterprise Helm chart. This allows users to specify a Kubernetes Secret containing the EMQX license key, so the license is applied automatically.

  This replaces the non-functional `emqxLicenseSecretName` variable, which created and mounted a secret file but did not pass the license to EMQX.
Clustering
- #14778 Fixed an issue where a node could not join a running cluster if that node had broken symlinks in its `data/certs` or `data/authz` directories.
Security
- #15581 Upgraded Erlang/OTP version from 26.2.5.2 to 26.2.5.14. This upgrade includes two TLS-related fixes from OTP that affect EMQX:
  - Fixed a crash in TLS connections caused by a race condition during certificate renewal.
  - Added support for RSA certificates signed with RSASSA-PSS parameters. Previously, such certificates could cause TLS handshakes to fail with a `bad_certificate/invalid_signature` error.
Data Integration
- #15616 Kafka connections are now considered healthy even if a `topic_authorization_failed` error is returned for the default probing topic.
Smart Data Hub
- #15706 Fixed an indexing issue that could cause Message Transformations and Schema Validations to behave inconsistently. Deleting one item could corrupt the topic index, so that a subsequent item remained active even after being disabled.
- #15708 Fixed an issue where external schema registries were not reloaded after a node restart.
Observability
- #15639 Fixed an issue where the `packets.subscribe.auth_error` metric was not incremented when subscription authentication failed.
Gateway
- #15679 Fixed incorrect global chain names for the ExProto, JT/T 808, GB/T 32960, and OCPP gateways. Built-in authentication data for these gateways was previously grouped under `unknown:global`, causing conflicts between gateways.
- #15699 Fixed an issue where built-in authentication data for gateways (e.g., CoAP) was incorrectly removed when a node was stopped or restarted.
- #15822 Fixed an issue where the OCPP connection would crash after sending a certain number of messages.
ExHook
- #15683 Fixed ExHook TLS options so that gRPC clients can correctly verify the server hostname during the TLS handshake.
Minor fixes and improvements.
Full changelog
Bug Fixes
- #15383 Fixed a potential resource leak in the MQTT bridge. When the bridge failed to start, the topic index table was not properly cleaned up. This fix ensures that the index table is correctly deleted to prevent resource leaks.
- OpenTelemetry HTTP header configuration for authenticated collectors
- Snowflake Connector private key file path support
- Configuration key removal via emqx ctl conf remove command
Full changelog
Enhancements
- #15364 Added support for custom HTTP headers in the OpenTelemetry gRPC (over HTTP/2) integration. This enhancement enables compatibility with collectors that require HTTP authentication.

- #15160 Added the `DELETE /mt/bulk_delete_ns` API for multi-tenancy management, which allows deleting namespaces in bulk.

- #15158 Added new `emqx ctl conf remove x.y.z` command, which removes the configuration key path `x.y.z` from the existing configuration.

- #15157 Added support for specifying a private key file path for the Snowflake Connector instead of using a password.

  Users should specify a password, a private key, or neither (in which case the parameters are set in `/etc/odbc.ini`).

- #15043 Instrumented the DS Raft backend with basic metrics to provide insights into cluster status, database overview, shard replication, and replica transitions.
Bug Fixes
Data Integration
- #15331 Fixed an issue in the InfluxDB action where line protocol conversion failed if the `timestamp` in `WriteSyntax` was left blank and no timestamp field was provided in the rule. Now the system's current millisecond value is used instead, and millisecond precision is enforced.

- #15274 Improved the resilience of Postgres, Matrix, and TimescaleDB connectors by triggering a full reconnection on any health check failure. Previously, failed health checks could leave the connection in a broken state, causing operations to hang and potentially leading to out-of-memory issues.

- #15154 Fixed a rare race condition in Actions running in aggregated mode (e.g., S3, Azure Blob Storage, Snowflake) that could lead to a crash with errors like:

      ** Reason for termination ==
      ** {function_clause,[{emqx_connector_aggregator,handle_close_buffer,[...], ...

- #15147 Fixed an issue where some Actions failed to emit trace events during rule testing with simulated input data, even after request rendering.

  Affected Actions:
  - Couchbase
  - Snowflake
  - IoTDB (Thrift driver)

- #15383 Fixed a potential resource leak in the MQTT bridge. When the bridge failed to start, the topic index table was not properly cleaned up. This fix ensures that the index table is correctly deleted to prevent resource leaks.
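To illustrate the timestamp fallback described in #15331 above, here is a minimal sketch of building an InfluxDB line protocol record that defaults to the current time in milliseconds. It is not EMQX's implementation: the function name is hypothetical, only float fields are handled, and no escaping of measurement or field names is performed.

```python
import time
from typing import Dict, Optional


def to_line_protocol(measurement: str,
                     fields: Dict[str, float],
                     timestamp_ms: Optional[int] = None) -> str:
    """Render one line protocol record with a millisecond timestamp."""
    if timestamp_ms is None:
        # Fall back to the current system time, truncated to milliseconds.
        timestamp_ms = time.time_ns() // 1_000_000
    field_set = ",".join(f"{key}={float(value)}" for key, value in fields.items())
    return f"{measurement} {field_set} {timestamp_ms}"
```

When writing such a record, the precision passed to InfluxDB has to be milliseconds as well, otherwise the timestamp would be interpreted in a different unit.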
Smart Data Hub
- #15224 Fixed an issue where updating an External Schema Registry via the Dashboard would unintentionally overwrite the existing password with `******`. The password is now correctly preserved during updates.
- #15190 Enhanced Message Transformation by allowing hard-coded values for QoS and topic.
Observability
- #15299 Fixed a `badarg` error that occurred when exporting OpenTelemetry metrics.
Telemetry
- #15216 Fixed a crash in the `emqx_telemetry` process that could occur when plugins were activated.
Access Control
- #15184 Fixed the formatting of error messages returned when creating a blacklist fails.
Clustering
- #15180 Reduced the risk of deadlocks during channel registration by fixing improper handling of `badrpc` errors in the `ekka_locker` module. These errors previously led to false positives in lock operations, potentially causing inconsistent cluster state and deadlocks.
Security
- #15159 Improved handling of Certificate Revocation List (CRL) Distribution Point URLs by stopping their refresh after repeated failures (default: 60 seconds). This prevents excessive error logs from unreachable URLs and improves overall system stability.
- NATS Gateway accepts NATS client connections with MQTT transformation
- Subscription maximum QoS control via topic matching rules
- WebSocket performance improvements (20% CPU reduction)
Full changelog
Release Date: 2025-06-10
Make sure to check the breaking changes and known issues before upgrading to EMQX 5.10.0.
Enhancements
Core MQTT Functionalities
- #15118 Provided a new configuration option `mqtt.subscription_max_qos_rules` to control the maximum QoS level allowed per client subscription. This allows administrators to limit the QoS requested in SUBSCRIBE packets based on matching rules for specific topics. Currently, only a limited set of matching rules (predicates) is supported, based on the topic in the SUBSCRIBE packet.
- #15246 Improved WebSocket connection performance and resource consumption.
  - Reduced CPU usage by approximately 20% and slightly lowered memory consumption, according to synthetic benchmarks measuring 1-on-1 MQTT messaging performance.
  - Improved connection setup efficiency when the listener-wide connection limit is enabled, especially on nodes managing a large number of connections.
Deployment
- #14791 Added support for custom annotations on the EMQX StatefulSet in the Helm chart, enabling automated pod restarts on ConfigMap or Secret changes. This improves automation and reliability when managing EMQX on Kubernetes.
Access Control
- #15250 Improved LDAP bind authentication to correctly extract the `is_superuser` flag from LDAP entry attributes. Previously, the `is_superuser` value was always set to `false`, even when the LDAP entry included a valid `isSuperuser` attribute.

- #15249 Improved the LDAP authentication and authorization.
  - Validation for the LDAP `filter`/`base_dn` settings was added.
  - Fixed various variable interpolation issues.
Rule Engine
- #15001 Added the `ai_completion` function to the Rule Engine SQL, which allows using AI services to process data.

- #15201 Added the `base_url` option to the AI completion provider configuration.

- #15188 Rule event topics now have namespaces.
  | Previous event topic                    | New event topic                         |
  | :-------------------------------------- | :-------------------------------------- |
  | `$events/client_connected`              | `$events/client/connected`              |
  | `$events/client_disconnected`           | `$events/client/disconnected`           |
  | `$events/client_connack`                | `$events/client/connack`                |
  | `$events/client_check_authz_complete`   | `$events/auth/check_authz_complete`     |
  | `$events/client_check_authn_complete`   | `$events/auth/check_authn_complete`     |
  | `$events/session_subscribed`            | `$events/session/subscribed`            |
  | `$events/session_unsubscribed`          | `$events/session/unsubscribed`          |
  | `$events/message_delivered`             | `$events/message/delivered`             |
  | `$events/message_acked`                 | `$events/message/acked`                 |
  | `$events/message_dropped`               | `$events/message/dropped`               |
  | `$events/delivery_dropped`              | `$events/message/delivery_dropped`      |
  | `$events/message_transformation_failed` | `$events/message_transformation/failed` |
  | `$events/schema_validation_failed`      | `$events/schema_validation/failed`      |

  Previous event topics are kept for backwards compatibility.
- #15175 Added support for matching event topics in Rule Engine using wildcards. Now, it's possible to use `$events/#`, `$events/sys/+` and similar for matching multiple events at once.
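The namespaced topics from #15188 are what make the wildcard matching in #15175 useful. The sketch below shows which of the new topic names a given filter would select. It assumes that Rule Engine event topics follow standard MQTT wildcard semantics (`+` for one level, `#` for the remaining levels); the `topic_matches` helper is illustrative and not part of EMQX.

```python
# Old-to-new event topic names, copied from the table for #15188.
RENAMED_EVENT_TOPICS = {
    "$events/client_connected": "$events/client/connected",
    "$events/client_disconnected": "$events/client/disconnected",
    "$events/client_connack": "$events/client/connack",
    "$events/client_check_authz_complete": "$events/auth/check_authz_complete",
    "$events/client_check_authn_complete": "$events/auth/check_authn_complete",
    "$events/session_subscribed": "$events/session/subscribed",
    "$events/session_unsubscribed": "$events/session/unsubscribed",
    "$events/message_delivered": "$events/message/delivered",
    "$events/message_acked": "$events/message/acked",
    "$events/message_dropped": "$events/message/dropped",
    "$events/delivery_dropped": "$events/message/delivery_dropped",
    "$events/message_transformation_failed": "$events/message_transformation/failed",
    "$events/schema_validation_failed": "$events/schema_validation/failed",
}


def topic_matches(topic_filter: str, topic: str) -> bool:
    """MQTT-style match: '+' matches one level, '#' matches all remaining levels."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            return True
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    return len(filter_levels) == len(topic_levels)


def select_events(topic_filter: str) -> list:
    """Return the new-style event topics selected by a filter."""
    return [new for new in RENAMED_EVENT_TOPICS.values() if topic_matches(topic_filter, new)]
```

For instance, `select_events("$events/client/+")` picks out the three client events, whereas the flat legacy names such as `$events/client_connected` can only be matched individually or by `$events/#`.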
Smart Data Hub
- #15174 Added support for uploading Protobuf source file bundles for Schema Registry.

  For example, assuming that the Protobuf source file bundle is at `/tmp/bundle.tar.gz` and has the following file structure, with `a.proto` being the root Protobuf schema file:

      .
      ├── a.proto
      ├── c.proto
      └── nested
          └── b.proto

  Then, to create a new schema using that bundle via the HTTP API:

      curl -v http://127.0.0.1:18083/api/v5/schema_registry_protobuf/bundle \
        -XPOST \
        -H "Authorization: Bearer xxxx" \
        -F bundle=@/tmp/bundle.tar.gz \
        -F name=my_cool_schema \
        -F root_proto_file=a.proto
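A bundle with the file layout described in #15174 can be produced with any tar tool. The following sketch uses Python's standard `tarfile` module. It assumes that archive member paths should be relative to the directory containing the root schema file (so that `a.proto` sits at the top of the archive); the helper name `build_bundle` is hypothetical.

```python
import pathlib
import tarfile


def build_bundle(src_dir: pathlib.Path, out_path: pathlib.Path) -> pathlib.Path:
    """Pack every .proto file under src_dir into a gzip-compressed tar archive."""
    with tarfile.open(out_path, "w:gz") as tar:
        for path in sorted(src_dir.rglob("*.proto")):
            # Store paths relative to src_dir, e.g. "nested/b.proto".
            tar.add(path, arcname=path.relative_to(src_dir).as_posix())
    return out_path
```

The resulting file can then be passed to the `bundle` form field of the HTTP API request.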
Data Integration
- #15248 EMQX now supports data integration with Doris, writing data using SQL statements.

- #15218 Added support for IAM authentication in Kafka Producer and Consumer Connectors when connecting to Amazon MSK (Managed Streaming for Apache Kafka). When EMQX runs on AWS EC2, it uses the AWS SDK to generate OAuth Bearer tokens for Kafka clients.

- #15157 Added support for specifying a private key file path for the Snowflake Connector instead of using a password.

  Users should specify a password, a private key, or neither (in which case the parameters are set in `/etc/odbc.ini`).

- #14983 EMQX now supports data integration with S3Tables.

  Current limitations:
  - Only S3Tables catalogs are supported (hence table data and metadata must live in S3).
  - Only Iceberg table format version 2 is supported.
  - Only the following partition transform functions are supported:
    - `identity`
    - `void`
    - `bucket[N]`
  - Data files are written only in Avro.
- #15331 Fixed an issue in the InfluxDB action where the line protocol conversion failed when the `timestamp` in `WriteSyntax` was left blank and there was no timestamp field in the rule. Now the system's current millisecond value is used instead, and millisecond precision is enforced.

- #15348 Made `middlebox_comp_mode` configurable for SSL clients. The `middlebox_comp_mode` option, which was previously always enabled (`true`) for all TLS 1.3 connections, is now configurable. By default, it remains `true` to maintain compatibility with most network environments.

  In rare cases where TLS fails with an error such as `unexpected_message, TLS client: In state hello_retry_middlebox_assert ...`, try setting `middlebox_comp_mode` to `false`.
Multi-Tenancy
- #15253 Added two new multi-tenancy APIs: `GET /mt/ns_list_details` and `GET /mt/ns_list_managed_details`. Both work similarly to their existing counterparts, but return extra metadata associated with the namespace besides its name.
- #15160 Added the `DELETE /mt/bulk_delete_ns` API for multi-tenancy management, which allows deleting namespaces in bulk.
CLI
- #15158 Added new `emqx ctl conf remove x.y.z` command, which removes the configuration key path `x.y.z` from the existing configuration.
Gateway
- #15138 Introduced NATS Gateway for accepting NATS client connections over TCP/TLS, WS/WSS transport protocols.

  For example, the NATS gateway will transform the following NATS message into an MQTT message with the topic `sub/t` and payload `hello`, while supporting seamless integration with existing EMQX features such as the rule engine, data integration, and more:

      PUB sub.t 5
      hello
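The transformation in the #15138 example can be sketched as follows. This is an illustration only and not the gateway's actual code: it assumes that NATS subject tokens separated by `.` map to MQTT topic levels separated by `/` (which is what the single example above suggests), and it ignores the optional reply-to subject and any wildcard handling.

```python
from typing import Tuple


def nats_pub_to_mqtt(frame: bytes) -> Tuple[str, bytes]:
    """Parse a NATS PUB frame and return the (MQTT topic, payload) pair."""
    header, _, rest = frame.partition(b"\r\n")
    parts = header.decode("ascii").split()
    # PUB <subject> [reply-to] <#bytes>
    if len(parts) not in (3, 4) or parts[0].upper() != "PUB":
        raise ValueError("not a NATS PUB frame")
    subject, size = parts[1], int(parts[-1])
    payload = rest[:size]
    if len(payload) != size:
        raise ValueError("payload shorter than declared size")
    # Assumption: '.'-separated subject tokens become '/'-separated topic levels.
    return subject.replace(".", "/"), payload
```

Calling `nats_pub_to_mqtt(b"PUB sub.t 5\r\nhello\r\n")` yields the topic `sub/t` with the payload `hello`, matching the example above.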
Durable Storage
- #15043 Instrumented the DS Raft backend with basic metrics to provide insights into cluster status, database overview, shard replication, and replica transitions.
Bug Fixes
Access Control
- #15184 Fixed an issue where the error message format was incorrect when creating a new banned list record failed.
Clustering
- #15304 Fixed a core node discovery problem affecting replicant nodes when using the `static` discovery strategy.

  Previously, the replicants could ignore core nodes not explicitly listed in the `static_seeds` list. This could lead to an inconsistent cluster view and load imbalance.

- #15180 Fixed an issue in `ekka_locker` where RPC (`badrpc`) errors were not handled correctly, causing false-positive lock successes. This could lead to inconsistent lock states and deadlocks in clustered deployments.
Security
- #15159 Improved CRL Distribution Point (CDP) handling: If a CDP URL fails to refresh continuously (default timeout: 60 seconds), it will now be evicted and excluded from further refresh attempts to prevent repeated error logs.
Rule Engine
- #15247 Fixed an issue where `function_clause` error logs would be printed when attempting to call `emqx ctl conf remove dashboard.sso.<BACKEND_NAME>`.
Smart Data Hub
- #15285 Added `content-type` header to External HTTP Schema requests.
- #15224 Fixed an issue where updating an External Schema Registry via the dashboard would inadvertently change the password to `******`.
- #15190 Added support for setting hard-coded QoS and topic in message transformation.
Data Integration
- #15274 Now, any health check failure for Postgres, Matrix and TimescaleDB Connectors will trigger a full reconnection. Prior to this change, there were situations where the connection would become unusable and attempts to use it would hang, potentially leading to out-of-memory issues.

- #15234 Added trace events for rule testing both when the Action is not installed yet and for Republish Fallback actions. These will now appear in the frontend while testing Rules with simulated input data.

- #15219 Reduced the amount of logs generated by the Clickhouse Connector when a health check timeout occurs. Also, when a health check timeout occurs for this Connector, it is now marked as `connecting` instead of `disconnected`, meaning that a full reconnect attempt will no longer be triggered by such timeouts.

- #15154 Fixed a rare race condition in Actions that run in aggregated mode (S3, Azure Blob Storage, Snowflake) that could result in crash logs similar to the following:

      ** Reason for termination ==
      ** {function_clause,[{emqx_connector_aggregator,handle_close_buffer,[...], ...

- #15147 When running Rule tests with simulated input data, some Actions would not emit trace events after rendering requests. This has been fixed.

  Affected Actions:
  - Couchbase
  - Snowflake
  - IoTDB (Thrift driver)

- #15306 Fixed an issue where a Connector's health check response would always trigger health checks for all dependent Actions and Sources, regardless of their actual state.
Multi-Tenancy
- #15242 Fixed an issue where, upon node restart after configuring limiters for multi-tenancy, errors like the following would be logged while initializing limiters:

      2025-05-15T16:45:13.276895+08:00 [error] clientid: ns3mqttx_620053b2_100, msg: hook_callback_exception, peername: 127.0.0.1:39364, username: ns3, reason: {limiter_group_not_found,{mt_tenant,<<"ns3">>}}, stacktrace: [{emqx_limiter,connect,1,[{file,"emqx_limiter.erl"},{line,134}]}
Observability
- #15299 Fixed a `badarg` error when exporting OpenTelemetry metrics.
Telemetry
- #15216 Fixed a crash of the `emqx_telemetry` process when plugins are activated.
Breaking Changes
- #15289 Added a new `resource_opts.health_check_timeout` configuration to all Connectors, Actions and Sources, with a default value of 60 seconds. If a health check takes longer than this to return a response, the Connector/Action/Source will be deemed `disconnected`.

  Note: since the default is 60 seconds, a Connector/Action/Source that previously took longer than that to return a healthy response will now be deemed disconnected in such situations.

- #15286 Configuration option `broker.routing.storage_schema` is now deprecated and ignored. Legacy `v1` routing storage schema is no longer supported, and EMQX will refuse to start in a cluster running older versions that still use it.

- #15239 The type for `multi_tenancy.default_max_sessions` is now either `infinity` or a positive integer. Previously, `0` would be accepted.

- #15156 Schema validation was added to the `dashboard.sso.oidc.issuer` field. Now, this value is checked to be a valid URL.
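Before upgrading, existing configuration values can be checked against the two stricter schemas from #15239 and #15156. The sketch below is a rough pre-flight check, not EMQX's validation logic: the function names are hypothetical, and treating a "valid URL" as one with an `http` or `https` scheme and a host is an assumption, so the real schema check may be stricter or looser.

```python
from urllib.parse import urlsplit


def valid_default_max_sessions(value) -> bool:
    """Accept the string 'infinity' or a positive integer; 0 is rejected."""
    if value == "infinity":
        return True
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def valid_issuer_url(value: str) -> bool:
    """Assumed URL check: http(s) scheme plus a non-empty host part."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

A configuration that still sets `multi_tenancy.default_max_sessions` to `0` would fail the first check and needs to be changed before the upgrade.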