starrocks

Data Warehouses & Analytics

An open-source, sub‑second query engine for real‑time analytics that runs directly on data lakehouse formats without moving or rewriting data.

Java · Latest 4.0.10 · 25d ago

Features

  • Native vectorized SQL engine delivering 5–10× faster multi‑dimensional queries
  • Full ANSI SQL support with MySQL protocol compatibility for seamless client integration
  • Cost‑based optimizer (CBO) that auto‑selects intelligent materialized views and execution plans
  • Real‑time upsert/delete operations via primary‑key model while maintaining low‑latency reads

Security Response History

7 CVEs

CVE             KEV  Severity  CVSS  Disclosed   Patched (this tool)  Time to patch  Ecosystem median
CVE-2025-24813  yes  critical  9.8   2025-04-01  2026-01-06           9mo            9mo
CVE-2023-44487  yes  medium    7.5   2023-10-10  2026-01-06           2y 3mo         2y 3mo
CVE-2021-45046  yes  critical  9.0   2023-05-01  2026-01-06           2y 8mo         2y 9mo
CVE-2017-12617  yes  high      8.1   2022-03-25  2026-01-06           3y 10mo        3y 10mo
CVE-2017-12615  yes  high      8.1   2022-03-25  2026-01-06           3y 10mo        3y 10mo
CVE-2020-1938   yes  critical  9.8   2022-03-03  2026-01-06           3y 10mo        3y 10mo
CVE-2021-44228  yes  critical  10.0  2021-12-10  2026-01-06           4y 1mo         4y 2mo
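
The patch-lag figures listed above (for example 9mo or 2y 3mo) can be reproduced from the disclosed and patched dates. The sketch below uses one convention that happens to match every row: whole 30-day months, rounded down. The page does not state how it actually computes these values, so the rule and the function name time_to_patch are assumptions.

```python
from datetime import date

def time_to_patch(disclosed: date, patched: date) -> str:
    """Format the gap between two dates as 'Ny Mmo'.

    Assumption: a month is counted as 30 days and partial months are
    dropped. This rule is not taken from the page; it is one convention
    that reproduces the figures listed for this tool.
    """
    months = (patched - disclosed).days // 30
    years, months = divmod(months, 12)
    return f"{years}y {months}mo" if years else f"{months}mo"

patched = date(2026, 1, 6)
print(time_to_patch(date(2025, 4, 1), patched))    # row for CVE-2025-24813
print(time_to_patch(date(2021, 12, 10), patched))  # row for CVE-2021-44228
```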

Recent releases

4.0.10 · Breaking risk · patches CVE-2017-12615, CVE-2017-12617, CVE-2020-1938, +4 more
Breaking changes
  • StarRocks no longer permits queries against insert‑only ACID Hive tables; such queries now produce an explicit error instead of silently returning extra rows.
Security fixes
  • CVE-2026-42198 (pgjdbc) — upgraded org.postgresql:postgresql to 42.7.11
  • CVE-2026-5598 (BouncyCastle) — upgraded BouncyCastle to 1.84
  • netty CVE — upgraded netty to 4.1.133.Final
Full changelog

4.0.10

Release Date: May 9, 2026

Behavior Changes

  • Cloud storage credentials are now redacted in error messages produced by INSERT INTO FILES, preventing accidental exposure of secrets in error logs and SHOW LOAD output. #71245
  • StarRocks no longer permits queries against insert-only ACID Hive tables in Hive catalog. Previously such queries could silently return more rows than actually visible because INSERT OVERWRITE operations were not recognized. Affected tables now return an explicit error instead of incorrect results. #71460
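
To illustrate the credential redaction described in the first item, here is a minimal sketch of scrubbing secrets from an error message before it is logged. It is not StarRocks code: the regular expression, the redact helper, and the "***" placeholder are assumptions, and the property names in the sample message are used only as example keys.

```python
import re

# Hypothetical pattern: any "key" = "value" pair whose key ends in
# access_key, secret_key, or password.
_SECRET = re.compile(
    r'("(?:[\w.]*(?:access_key|secret_key|password))"\s*=\s*)"[^"]*"',
    re.IGNORECASE,
)

def redact(message: str) -> str:
    """Replace the value of every secret-looking property with a placeholder."""
    return _SECRET.sub(r'\1"***"', message)

error = ('INSERT INTO FILES failed: "path" = "s3://bucket/out/", '
         '"aws.s3.access_key" = "AKIAEXAMPLE", '
         '"aws.s3.secret_key" = "s3cr3t"')
print(redact(error))
```

Non-secret properties such as the path are left untouched, so the message stays useful for debugging.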

Improvements

  • Added an Avro schema cache in Iceberg PartitionData construction to remove redundant Jackson ObjectMapper allocations during partition load on tables with many partitions. #72215
  • Optimized CatalogRecycleBin.getAdjustedRecycleTimestamp to avoid rebuilding the table-id map on every call, reducing recycle-bin cleanup and tablet scheduling overhead. #72128
  • OlapTableSink.createLocation now batches tablet-location lookups in shared-data mode, removing per-tablet StarOS RPCs that previously stalled the planner critical section. #72041
  • Java UDAF instances are now loaded and initialized once per query and reused across pipeline driver instances, removing the linear driver-preparation overhead at high pipeline_dop. #72038
  • Added BE metrics starrocks_be_staros_shard_info_fallback_total and starrocks_be_staros_shard_info_fallback_failed_total to track when the StarOS worker falls back to fetching shard info from starmgr because the local cache missed. #71620
  • File-bundle writes now prefer a tablet-local aggregator so the bundled tablet metadata path does not require cross-node shard-info lookups. #71613
  • Audit log entries now include the queried tables and views referenced by each query. #71596
  • INSERT INTO FILES CSV export now supports csv.enclose and csv.escape properties for controlling field quoting and escaping. #71589
  • Added LDAP direct bind authentication via DN pattern, removing the requirement for an admin search account in single-tenant LDAP setups. #71559
  • Added the starrocks_fe_tablet_num metric for shared-data clusters to match the shared-nothing metric set. #71444
  • star_mgr_meta_sync_interval_sec is now runtime-mutable via ADMIN SET FRONTEND CONFIG; the new interval takes effect on the next sync cycle without an FE restart. #71675
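
The csv.enclose and csv.escape properties mentioned above control how fields that contain the delimiter or the quote character are written. The sketch below uses Python's csv module only to show what enclosing and escaping do in general. It does not reproduce StarRocks' exact output, and the export_row helper and sample row are made up.

```python
import csv
import io

def export_row(row, enclose='"', escape="\\"):
    """Write one CSV row, enclosing every field and escaping embedded quotes."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        quotechar=enclose,      # plays the role of csv.enclose
        escapechar=escape,      # plays the role of csv.escape
        doublequote=False,      # escape embedded quotes instead of doubling them
        quoting=csv.QUOTE_ALL,
        lineterminator="\n",
    )
    writer.writerow(row)
    return buf.getvalue()

line = export_row(["1", "a,b", 'say "hi"'])
print(line)
```

Reading the line back with the same enclose and escape characters recovers the original three fields, which is the property a consumer of the export relies on.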

Bug Fixes

The following issues have been fixed:

  • A race in shared-data combined txn log mode where INSERT into per-partition coordinator dispatch could classify legitimate txn logs as orphan and drop them, leaving the transaction stuck in non-VISIBLE state. #72237
  • An issue where _incremental_open_node_channel channels in shared-data combined txn log mode silently dropped txn logs because the legacy "sender_id == 0 collects all logs" rule did not apply to incremental channels. #71992
  • An issue where RuntimeProfile::to_thrift() could crash BE with std::bad_optional_access when another thread reset counter min/max values during profile serialization. #72904
  • An inconsistency in flat JSON merge results when one side contributed empty values. #72973
  • An issue where CREATE TABLE for an Iceberg table failed with "Multiple entries with same key: format-version" when the user explicitly specified format-version in PROPERTIES. #72828
  • A CompactionScheduler.startCompaction lock scope that held a DB-wide READ lock across single-table critical work, blocking concurrent DDL on other tables in the same database. Switched to IS on DB plus READ on the target table. #72178
  • An issue where StarMgrMetaSyncer.syncTableMetaInternal and syncTableColocationInfo held DB READ/WRITE locks across external StarOS RPCs, freezing CREATE/DROP/ALTER/RENAME on every table in the database for the duration of each RPC. #72108
  • An issue where StarMgrMetaSyncer.getAllPartitionShardGroupId held the DB READ lock for full iteration over all cloud-native tables and physical partitions, stalling FE threads waiting for the DB write lock on large catalogs. #71614
  • A redundant DB READ lock in getTableNamesViewWithLock. The underlying nameToTable is a ConcurrentHashMap, so the enclosing lock added contention without correctness benefit. #72042
  • A DB WRITE lock in the read-only /api/{db}/{table}/_count REST endpoint that was unnecessary for computing proximateRowCount(). #72053
  • A batch publish deadlock caused by partition version gaps. Operations such as tablet split, schema change, and alter jobs reserved versions by advancing nextVersion without a matching publish. #71483
  • A deadlock in shared-nothing mode when warming up the LRU cache for rowset metadata while the cache was full. #71459
  • A PipelineTimerTask that could remain stuck in waitUtilFinished due to incorrect ordering between consumer registration and finished signaling. #72058
  • A race condition in ConnectorSinkPassthroughExchanger::accept that crashed BE with SIGSEGV via out-of-bounds vector access on _writer_count. #71848
  • A use-after-free in LoadChannel::get_load_replica_status caused by destruction of a temporary shared_ptr. #71843
  • A use-after-free in the information schema sink due to a missing reference count increment in async RPC closure handling. #71513
  • A BE crash in reverse(DecimalV3) caused by improper handling of decimal value width. #71834
  • A BE crash when UNNEST produced columns whose define-expression carried an ARRAY type, which was incompatible with global dictionary generation downstream. #72027
  • An NPE in FE when creating an Iceberg external table with invalid transform argument order such as bucket(4, region); FE now returns a normal analyzer error. #71917
  • An issue where Iceberg manifest data file cache entries were missing column statistics when the first query against a table did not request stats (for example SELECT *). #71913
  • An issue where the Iceberg min/max optimization was silently skipped when the table was partitioned by bucket(col, N) because PruneHDFSScanColumnRule injected a placeholder materialized column. #71863
  • An issue where AggregateJoinPushDownRule failed to rewrite materialized views over Iceberg base tables because Table.getId() was compared instead of identity, and connector-table ids can shift across plan rebuilds. #71856
  • An issue where INSERT OVERWRITE into Hive dynamic partitions failed when the metastore listed a partition whose location no longer existed on the file system; the missing partition directory is now created before commit. #71810
  • A Parquet scanner failure (Illegal converting from arrow type(dictionary) ...) when Arrow returned dictionary-typed columns, including dictionaries nested inside arrays, structs, and maps. #71855
  • An issue where stale scan ranges from earlier batches persisted across ColocatedBackendSelector.Assignment incremental batches, causing files to be re-deployed and re-scanned. #71789
  • An issue where PruneShuffleColumnRule did not update the Join outputProperty after pruning Exchange shuffle columns, leading to incorrect downstream distribution. #72003
  • Incorrect shuffle distribution caused by a missing project node when PushDownJoinOnExpressionToChildProject was disabled during the first stage of multi-stage MV rewrite. #71075
  • Duplicate Apply attachments in ReplaceSubqueryRewriteRule when predicate normalization made the same scalar-subquery placeholder appear multiple times. #71155
  • A short-circuit issue in EventScheduler where a finished join probe could prevent the pipeline from transitioning to the finished state. #71740
  • An issue where AWS assume-role configured via aws.s3.iam_role_arn was not applied to JNI scanners (RCFile / Avro / SequenceFile / Hudi), causing S3 403 errors. #71422
  • An issue where Oracle JDBC predicate pushdown produced invalid SQL because date literals did not match the Oracle NLS format; literals are now emitted as date '...'. #71412
  • An issue in shared-data mode where a follower FE forwarded DDL to the leader and waited only for FE journal replay, missing the StarMgr journal and producing "no queryable replica" errors for queries that immediately followed table creation. #71263
  • An issue where get_tablet_stats for Primary Key tablets repeatedly reloaded the entire TabletMetadata for every segment via get_del_vec_in_meta(). #71672
  • An Arrow Flight issue where empty result sets returned r as the column name because the placeholder name was emitted instead of the actual schema. #71534
  • An issue where parallel_clone_task_per_path updates did not include the store-path count when resizing the CLONE thread pool. #71484
  • An issue where the resource group user classifier rejected digit-leading usernames that CREATE USER allowed. The classifier now uses the same validation rule as CREATE USER. #71470
  • An issue where HttpServerHandler.channelInactive skipped unregisterConnection when isRegistered() was false, leaking connection-map entries for early-failing requests. #72006
  • An issue where Java UDF JNI calls (NewObject, NewArray, NewStringUTF, etc.) did not check for exceptions or null returns, leading to silent failures or undefined behavior. #71734
  • An issue where be_tablets.DATA_SIZE reported total_disk_size (including rowset-embedded indexes and the persistent PK index for lake PK tablets) instead of rowset column data bytes. #70735
  • A noisy "Failed to batch drop tablets" warning printed by StarMgrMetaSyncer even when there were no shards to delete. #72209
  • CVE-2026-42198 (pgjdbc) and CVE-2026-5598 (BouncyCastle): bumped org.postgresql:postgresql to 42.7.11 and BouncyCastle to 1.84. #72797
  • CVE in netty: upgraded netty to 4.1.133.Final. #72905
  • Cleaned broker CVEs by upgrading netty / jetty / awssdk / jackson dependencies in the broker. #72184
  • Upgraded jetty-http to 9.4.58.v20250814 to address known CVEs in the previous jetty-http version. #71762
  • Temporarily masked CVE-2026-2332 to unblock the build, since jetty 9.x is EOL and no upstream fix is published. #71914
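
Several of the lock fixes in the list above (#72178, #72108, #71614) share one pattern: do not hold a database-wide lock while waiting on slow work such as an external RPC. A minimal sketch of that pattern follows. It uses a plain threading.Lock rather than StarRocks' read, write, and intention locks, and Database, slow_rpc, and sync_table_meta are hypothetical names.

```python
import threading
import time

class Database:
    def __init__(self):
        self.lock = threading.Lock()
        self.tables = {"t1": 101, "t2": 102}  # table name -> shard group id

def slow_rpc(shard_group_id):
    """Stand-in for an external call that may take a long time."""
    time.sleep(0.01)
    return f"meta-for-{shard_group_id}"

def sync_table_meta(db):
    # Hold the lock only long enough to copy the state the RPCs need.
    with db.lock:
        snapshot = dict(db.tables)
    # The lock is released here, so other work on the database is not
    # blocked while the RPCs are in flight.
    return {name: slow_rpc(gid) for name, gid in snapshot.items()}

db = Database()
result = sync_table_meta(db)
print(result)
```

The trade-off is that the snapshot can go stale while the calls run, so a real implementation has to tolerate or re-check changes made in the meantime.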
4.1.0 · New feature
Security fixes
  • CVE-2026-33870
  • CVE-2026-33871
  • CVE-2025-27821
Notable features
  • Range-based data distribution with automatic tablet splitting and merging for multi-tenant workloads
  • Large-capacity tablet support phase 1 with parallel compaction and MemTable finalization
  • Fast Schema Evolution V2 with second-level DDL execution
Full changelog

4.1.0

Release Date: April 21, 2026

Shared-data Architecture

  • New Multi-Tenant Data Management

    Shared-data clusters now support range-based data distribution and automatic splitting and merging of tablets. Tablets can be split automatically when they become oversized or turn into hotspots, without requiring schema changes, SQL modifications, or data re-ingestion. This improves usability and directly addresses data skew and hotspot issues in multi-tenant workloads. #65199 #66342 #67056 #67386 #68342 #68569 #66743 #67441 #68497 #68591 #66672 #69155

  • Large-Capacity Tablet Support (Phase 1)

    Supports significantly larger per-tablet data capacity for shared-data clusters, with a long-term target of 100 GB per tablet. Phase 1 focuses on enabling parallel Compaction and parallel MemTable finalization within a single Lake tablet, reducing ingestion and Compaction overhead as tablet size grows. #66586 #68677

  • Fast Schema Evolution V2

    Shared-data clusters now support Fast Schema Evolution V2, which enables second-level DDL execution for schema operations, and further extends the support to materialized views. #65726 #66774 #67915

  • [Beta] Inverted Index on shared-data

    Enables built-in inverted indexes for shared-data clusters to accelerate text filtering and full-text search workloads. #66541

  • Cache Observability

    Query-level cache hit ratio is now exposed in audit logs and the monitoring system for better cache transparency and latency diagnosis. Additional Data Cache metrics include memory and disk quota usage, and page cache statistics. #63964

  • Added segment metadata filter for Lake tables to skip irrelevant segments based on sort key range during scans, reducing I/O for range-predicate queries. #68124

  • Supports fast cancel for Lake DeltaWriter, reducing latency for cancelled ingestion jobs in shared-data clusters. #68877

  • Added support for interval-based scheduling for automated cluster snapshots. #67525

  • Supports pipeline execution for MemTable flush and merge, improving ingestion throughput for cloud-native tables in shared-data clusters. #67878

  • Supports dry_run mode for repairing cloud-native tables, allowing users to preview repair actions before execution. #68494

  • Added a thread pool for publish transactions in shared-nothing clusters, improving publish throughput. #67797

  • Supports dynamically modifying the datacache.enable property for cloud-native tables. #69011
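
The segment metadata filter listed above can be pictured as range-overlap pruning: each segment records the smallest and largest sort key it holds, and a scan skips segments whose range cannot intersect the predicate. The sketch below shows that idea only. The Segment type and prune_segments function are hypothetical and say nothing about how StarRocks stores or evaluates this metadata.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    name: str
    min_key: int  # smallest sort key stored in the segment
    max_key: int  # largest sort key stored in the segment

def prune_segments(segments, lo, hi):
    """Keep segments whose [min_key, max_key] overlaps the range [lo, hi]."""
    return [s for s in segments if s.max_key >= lo and s.min_key <= hi]

segments = [
    Segment("seg0", 0, 99),
    Segment("seg1", 100, 199),
    Segment("seg2", 200, 299),
]
# A predicate like "k BETWEEN 150 AND 220" only needs seg1 and seg2.
kept = prune_segments(segments, 150, 220)
print([s.name for s in kept])
```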

Data Lake Analytics

  • Iceberg DELETE Support

    Supports writing position delete files for Iceberg tables, enabling DELETE operations on Iceberg tables directly from StarRocks. The support covers the full pipeline of Plan, Sink, Commit, and Audit. #67259 #67277 #67421 #67567

  • TRUNCATE for Hive and Iceberg Tables

    Supports TRUNCATE TABLE on external Hive and Iceberg tables. #64768 #65016

  • Incremental materialized view on Iceberg

    Extends the support for incremental materialized view refresh to Iceberg append-only tables, enabling query acceleration without full table refresh. #65469 #62699

  • VARIANT Type for Semi-Structured Data in Iceberg

    Supports the VARIANT data type in Iceberg Catalog for flexible, schema-on-read storage and querying of semi-structured data. Supports read, write, type casting, and Parquet integration. #63639 #66539

  • Iceberg v3 Support

    Added support for Iceberg v3 default value feature and row lineage. #69525 #69633

  • Iceberg Table Maintenance Procedures

    Added support for rewrite_manifests procedure and extended expire_snapshots and remove_orphan_files procedures with additional arguments for finer-grained table maintenance. #68817 #68898

  • Iceberg $properties Metadata Table

    Added support for querying Iceberg table properties via the $properties metadata table. #68504

  • Supports reading file path and row position metadata columns from Iceberg tables. #67003

  • Supports reading _row_id from Iceberg v3 tables, and supports global late materialization for Iceberg v3. #62318 #64133

  • Supports creating Iceberg views with custom properties, and displays properties in SHOW CREATE VIEW output. #65938

  • Supports querying Paimon tables with a specific branch, tag, version, or timestamp. #63316

  • Supports complex types (ARRAY, MAP, STRUCT) for Paimon tables. #66784

  • Supports Paimon views. #56058

  • Supports TRUNCATE for Paimon tables. #67559

  • Supports Partition Transforms with parentheses syntax when creating Iceberg tables. #68945

  • Supports ALTER TABLE REPLACE PARTITION COLUMN for Iceberg tables. #70508

  • Supports Iceberg global shuffle based on Transform Partition for improved data organization. #70009

  • Supports dynamically enabling global shuffle for Iceberg table sink. #67442

  • Introduced a Commit queue for Iceberg table sink to avoid concurrent Commit conflicts. #68084

  • Added host-level sorting for Iceberg table sink to improve data organization and reading performance. #68121

  • Enabled additional optimizations in ETL execution mode by default, improving performance for INSERT INTO SELECT, CREATE TABLE AS SELECT, and similar batch operations without explicit configuration. #66841

  • Added commit audit information for INSERT and DELETE operations on Iceberg tables. #69198

  • Supports enabling or disabling view endpoint operations in Iceberg REST Catalog. #66083

  • Optimized cache lookup efficiency in CachingIcebergCatalog. #66388

  • Supports EXPLAIN on various Iceberg catalog types. #66563

  • Supports partition projection for AWS Glue Catalog tables. #67601

  • Added resource share type support for AWS Glue GetDatabases API. #69056

  • Supports Azure ABFS/WASB path mapping with endpoint injection (azblob/adls2). #67847

  • Added a database metadata cache for JDBC catalog to reduce remote RPC overhead and impact of external system failures. #68256

  • Added schema_resolver property for JDBC catalog to support custom schema resolution. #68682

  • Supports column comments for PostgreSQL tables in information_schema. #70520

  • Improved Oracle and PostgreSQL JDBC type mapping. #70315 #70566
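
For the Iceberg DELETE support listed above: in the Iceberg table format, a position delete file lists (file_path, pos) pairs, and readers drop the rows at those positions when scanning the named data file. The sketch below models that read-side merge with in-memory lists. The function name and sample data are illustrative, and it does not represent StarRocks' writer or scanner.

```python
def apply_position_deletes(data_files, position_deletes):
    """Yield rows that are not marked deleted.

    data_files: mapping of file path -> list of rows (list index = position)
    position_deletes: iterable of (file_path, pos) pairs
    """
    deleted = {}
    for path, pos in position_deletes:
        deleted.setdefault(path, set()).add(pos)
    for path, rows in data_files.items():
        dead = deleted.get(path, set())
        for pos, row in enumerate(rows):
            if pos not in dead:
                yield row

data_files = {
    "data/a.parquet": ["r0", "r1", "r2"],
    "data/b.parquet": ["r3", "r4"],
}
deletes = [("data/a.parquet", 1), ("data/b.parquet", 0)]
live = list(apply_position_deletes(data_files, deletes))
print(live)
```

The data files themselves are never rewritten, which is why a DELETE can commit by adding a small delete file.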

Query Engine

  • Recursive CTE

    Supports Recursive Common Table Expressions (CTEs) for hierarchical traversals, graph queries, and iterative SQL computations. #65932

  • Improved Skew Join v2 rewrite with statistics-based skew detection, histogram support, and NULL-skew awareness. #68680 #68886

  • Improved COUNT DISTINCT over windows and added support for fused multi-distinct aggregations. #67453

  • Supports explicit skew hint for window functions, with automatic optimization of window functions with skewed partition keys by splitting into UNION. #68739 #67944

  • Supports materialization hints for CTEs. #70802

  • Enabled Global Lazy Materialization by default, improving query performance by deferring column reads until needed. #70412

  • Supports EXPLAIN and EXPLAIN ANALYZE for INSERT statements in Trino Parser. #70174

  • Supports EXPLAIN for query queue visibility. #69933
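
As an illustration of the hierarchical traversal that recursive CTEs enable, the sketch below runs a WITH RECURSIVE query through Python's built-in sqlite3 module. SQLite is used only because it ships with Python. StarRocks' syntax and any settings it requires may differ, and the employees table is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'ceo', NULL), (2, 'vp', 1), (3, 'lead', 2), (4, 'dev', 3);
""")

# Walk the reporting chain downward from the root, tracking depth.
rows = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, c.depth + 1
        FROM employees e JOIN chain c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth
""").fetchall()
print(rows)
```

The anchor query seeds the result with the root row, and the recursive member keeps joining children onto it until no new rows appear.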

Functions and SQL Syntax

  • Added the following functions:
    • array_top_n: Returns the top N elements from an array ranked by value. #63376
    • arrays_zip: Combines multiple arrays element-wise into an array of structs. #65556
    • json_pretty: Formats a JSON string with indentation. #66695
    • json_set: Sets a value at a specified path within a JSON string. #66193
    • initcap: Converts the first letter of each word to uppercase. #66837
    • sum_map: Sums MAP values across rows with the same key. #67482
    • current_timezone: Returns the current session timezone. #63653
    • current_warehouse: Returns the name of the current warehouse. #66401
    • sec_to_time: Converts the number of seconds to a TIME value. #62797
    • ai_query: Calls an external AI model from SQL for inference workloads. #61583
    • min_n / max_n: Aggregate functions that return the top N minimum/maximum values. #63807
    • regexp_position: Returns the position of a regular expression match in a string. #67252
    • is_json_scalar: Returns whether a JSON value is a scalar. #66050
    • get_json_scalar: Extracts a scalar value from a JSON string. #68815
    • raise_error: Raises a user-defined error in SQL expressions. #69661
    • uuid_v7: Generates time-ordered UUID v7 values. #67694
    • STRING_AGG: Syntactic sugar for GROUP_CONCAT. #64704
  • Provides the following function or syntactic extensions:
    • Supports a lambda comparator in array_sort for custom sort ordering. #66607
    • Supports USING clause for FULL OUTER JOIN with SQL-standard semantics. #65122
    • Supports DISTINCT aggregation over framed window functions with ORDER BY/PARTITION BY. #65815 #65030 #67453
    • Supports ARRAY type in lead/lag/first_value/last_value window functions. #63547
    • Supports VARBINARY for count distinct-like aggregate functions. #68442
    • Supports MULTIPLY/DIVIDE for interval operations. #68407
    • Supports date and string type casting in IN expressions. #61746
    • Supports WITH LABEL syntax for BEGIN/START TRANSACTION. #68320
    • Supports WHERE/ORDER/LIMIT clauses in SHOW statements. #68834
    • Supports ALTER TASK statements for task management. #68675
    • Supports SQL UDF creation via CREATE FUNCTION ... AS <sql_body>. #67558
    • Supports loading UDFs from S3. #64541
    • Supports named parameters in Scala functions. #66344
    • Supports multiple compression formats (GZIP/SNAPPY/ZSTD/LZ4/DEFLATE/ZLIB/BZIP2) for CSV file exports. #68054
    • Supports STRUCT_CAST_BY_NAME SQL mode for name-based struct field matching. #69845
    • Supports last_query_id() in ANALYZE PROFILE for easy query profile analysis. #64557
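
To make a few of the new functions concrete, the sketch below gives Python stand-ins for array_top_n, arrays_zip, and sec_to_time. They are written only from the one-line descriptions above. Argument order, NULL handling, unequal array lengths, and negative inputs are not modeled, and the descending order used for array_top_n is an assumption.

```python
def array_top_n(values, n):
    """Largest n elements, highest first (the ordering is an assumption)."""
    return sorted(values, reverse=True)[:n]

def arrays_zip(*arrays):
    """Combine equal-length arrays element-wise into tuples ("structs")."""
    return list(zip(*arrays))

def sec_to_time(seconds):
    """Format a non-negative number of seconds as HH:MM:SS."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

print(array_top_n([3, 9, 1, 7], 2))
print(arrays_zip([1, 2], ["a", "b"]))
print(sec_to_time(3661))
```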

Management & Observability

  • Supports warehouses, cpu_weight_percent, and exclusive_cpu_weight attributes for resource groups to improve multi-warehouse CPU resource isolation. #66947
  • Introduces the information_schema.fe_threads system view to inspect the FE thread state. #65431
  • Supports SQL Digest Blacklist to block specific query patterns at the cluster level. #66499
  • Supports Arrow Flight Data Retrieval from nodes that are otherwise inaccessible due to network topology constraints. #66348
  • Introduces the REFRESH CONNECTIONS command to propagate global variable changes to existing connections without reconnecting. #64964
  • Added built-in UI functions to analyze query profiles and view formatted SQL, making query tuning more accessible. #63867
  • Implements ClusterSummaryActionV2 API endpoint to provide a structured cluster overview. #68836
  • Added a global read-only system variable @@run_mode to query the current cluster run mode (shared-data or shared-nothing). #69247
  • Enabled query_queue_v2 by default for improved query queue management. #67462
  • Supports user-level default warehouse for Stream Load and Merge Commit operations. #68106 #68616
  • Added skip_black_list session variable to bypass backend blacklist verification when needed. #67467
  • Added enable_table_metrics_collect option for the metrics API. #68691
  • Added impersonate user support for query detail HTTP API. #68674
  • Added table_query_timeout as a table-level property. #67547
  • Added FE profile logging with configurable latency threshold. #69396
  • Supports adding FE observer nodes. #67778
  • Supports Merge Commit information in information_schema.loads for better load job visibility. #67879
  • Supports showing tablet status in cloud-native tables for better troubleshooting. #69616
  • Added per-catalog-type query metrics for external catalog observability. #70533
  • Added Debian (.deb) packaging support for FE and BE. #68821

Security

  • [CVE-2026-33870] [CVE-2026-33871] Replaced AWS bundle and bumped Netty to 4.1.132.Final. #71017
  • [CVE-2025-27821] Upgraded Hadoop to v3.4.2. #68529
  • [CVE-2025-54920] Upgraded spark-core_2.12 to 3.5.7. #70862

Bug Fixes

The following issues have been fixed:

  • Fixed data loss after tablet split by skipping data file deletion for range distribution tablets. #71135
  • Fixed a memory leak in DefaultValueColumnIterator for complex types. #71142
  • Fixed a memory leak caused by shared_ptr cycle between BatchUnit and FetchTaskContext. #71126
  • Fixed use-after-free in parallel segment/rowset loading on error path. #71083
  • Fixed potential hash table data loss in aggregation spill set_finishing. #70851
  • Fixed double-free crash in SystemMetrics due to concurrent getline access. #71040
  • Fixed crash in SpillMemTableSink when eager merge consumes all blocks. #69046
  • Fixed NPE in visitDictionaryGetExpr when dictionary backing table is dropped. #71109
  • Fixed NPE when analyzing generated columns in Stream Load/Broker Load if a referenced column is missing. #71116
  • Fixed NPE when auto-created partition is dropped by TTL cleaner. #68257
  • Fixed NPE in IcebergCatalog.getPartitionLastUpdatedTime when snapshot is expired. #68925
  • Fixed incorrect predicate rewrite for outer join with constant-side column reference. #67072
  • Fixed PK tablet rowset meta loss caused by GC race during disk re-migration (A→B→A). #70727
  • Fixed DB read lock leak in SharedDataStorageVolumeMgr. #70987
  • Fixed incorrect query results after modifying CHAR column length in shared-data clusters. #68808
  • Fixed MV refresh bug in the case of multiple tables. #61763
  • Fixed incorrect MV recycle time if force refreshed. #68673
  • Fixed all-null value handling bug in sync MV. #69136
  • Fixed duplicate column id error when querying MV after fast schema change ADD COLUMN. #71072
  • Fixed IVM refresh recording incomplete PCT partition metadata. #71092
  • Fixed low-cardinality rewrite NPE caused by shared DecodeInfo. #68799
  • Fixed low-cardinality join predicate type mismatch. #68568
  • Fixed a segfault in the Parquet Page Index Filter when null_counts is empty. #68463
  • Fixed JSON flatten array and object conflict on identical paths. #68804
  • Fixed Iceberg cache weigher inaccuracies. #69058
  • Fixed Iceberg table cache memory limit. #67769
  • Fixed Iceberg delete column nullability issue. #68649
  • Fixed Azure ABFS/WASB FileSystem cache key to include container. #68901
  • Fixed deadlock when the HMS connection pool is full. #68033
  • Fixed incorrect length for VARCHAR field type in Paimon Catalog. #68383
  • Fixed Paimon catalog refresh crash with ClassCastException on ObjectTable. #70224
  • Fixed PaimonView resolving table references against default_catalog instead of the Paimon catalog. #70217
  • Fixed FULL OUTER JOIN USING with constant subqueries. #69028
  • Fixed join on clause bug with CTE scope. #68809
  • Fixed missing partition predicate in short-circuit point lookup. #71124
  • Fixed ConnectContext memory leaks by using bindScope() pattern. #68215
  • Fixed memory leak in CatalogRecycleBin.asyncDeleteForTables for shared-nothing clusters. #68275
  • Fixed an issue where the Thrift accept thread exited when it encountered any exception. #68644
  • Fixed UDF resolution in routine load column mappings. #68201
  • Fixed DROP FUNCTION IF EXISTS ignoring ifExists flag. #69216
  • Fixed scan result error when dict page is too large. #68258
  • Fixed range partition overlap. #68255
  • Fixed query queue allocation time and pending timeout. #65802
  • Fixed array_map crash when processing null literal array. #70629
  • Fixed stack overflow for to_base64. #70623
  • Fixed optimizer timeout issue. #70605
  • Fixed case-insensitive username normalization for LDAP authentication. #67966
  • Mitigated SSRF risk for API proc_file. #68997
  • Masked user auth strings in audit and SQL redaction. #70360

Behavior Changes

  • ETL execution mode optimizations are now enabled by default. This benefits INSERT INTO SELECT, CREATE TABLE AS SELECT, and similar batch workloads without explicit configuration changes. #66841
  • The third argument of lag/lead window functions now supports column references in addition to constant values. #60209
  • FULL OUTER JOIN USING now follows SQL-standard semantics: the USING column appears once in the output instead of twice. #65122
  • Global Lazy Materialization is now enabled by default. #70412
  • query_queue_v2 is now enabled by default. #67462
  • SQL transactions are gated behind the session variable enable_sql_transaction by default. #63535
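
The FULL OUTER JOIN USING change above can be seen in a small model of the join: under SQL-standard semantics the USING column is emitted once, taking its value from whichever side has a row (equivalent to COALESCE over the two sides). The sketch below is a plain-Python model that assumes unique keys on each side. It is not StarRocks code, and full_outer_join_using is a hypothetical helper.

```python
def full_outer_join_using(left, right):
    """Join two {key: value} mappings on their key.

    Returns rows of (key, left_value, right_value). The key appears once
    per row, and a side with no match contributes None.
    """
    keys = sorted(set(left) | set(right))
    return [(k, left.get(k), right.get(k)) for k in keys]

orders = {1: "book", 2: "pen"}
refunds = {2: "pen-refund", 3: "ink-refund"}
rows = full_outer_join_using(orders, refunds)
print(rows)
```

Queries that relied on the old two-column output, for example by selecting the key from one specific side, may need to be adjusted.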
4.0.9 · Breaking risk
Breaking changes
  • VARBINARY columns in nested types (ARRAY, MAP, STRUCT) are now correctly encoded as binary in MySQL result sets; previously, raw bytes were emitted
Security fixes
  • CVE-2026-33870
  • CVE-2026-33871
  • CVE-2025-54920
Notable features
  • Added session variables to control VARBINARY encoding behavior in MySQL protocol responses
  • Added Force Drop recovery mechanism for stuck materialized views in error state
  • Added support for dumping query execution plans when a query encounters an exception
Full changelog

4.0.9

Release Date: April 16, 2026

Behavior Changes

  • When VARBINARY columns appear inside nested types (ARRAY, MAP, or STRUCT), StarRocks now correctly encodes the values in binary format in MySQL result sets. Previously, raw bytes were emitted directly, which could break text-protocol parsing for null bytes or non-printable characters. This change may affect downstream clients or tools that process VARBINARY data inside nested types. #71346
  • Routine Load jobs now automatically pause when a non-retryable error is encountered, such as a row causing the Primary Key size limit to be exceeded. Previously, the job would retry indefinitely because such errors were not recognized as non-retryable by the FE transaction status handler. #71161
  • SHOW CREATE TABLE and DESC statements now display the Primary Key columns for Paimon external tables. #70535
  • Cloud-native tablet metadata fetch operations (such as get_tablet_stats and get_tablet_metadatas) now use a dedicated thread pool instead of the shared UPDATE_TABLET_META_INFO pool. This prevents metadata fetch contention from impacting repair and other tasks. The new thread pool size is configurable via a new BE parameter. #70492
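
The Routine Load change above comes down to classifying errors before deciding whether to retry. The sketch below shows that control flow in isolation. The exception classes, the run_job helper, the retry limit, and the state strings are all illustrative assumptions, not StarRocks internals.

```python
class RetryableError(Exception):
    """Transient failure: trying again may succeed."""

class NonRetryableError(Exception):
    """Permanent failure: trying again cannot succeed."""

def run_job(task, max_retries=3):
    """Run task, retrying transient failures and pausing on permanent ones.

    Returns the final job state: "RUNNING" or "PAUSED".
    """
    for _ in range(max_retries):
        try:
            task()
            return "RUNNING"
        except RetryableError:
            continue          # transient: try again
        except NonRetryableError:
            return "PAUSED"   # permanent: stop immediately
    return "PAUSED"           # retries exhausted

attempts = []

def oversized_row():
    attempts.append(1)
    raise NonRetryableError("primary key size limit exceeded")

state = run_job(oversized_row)
print(state, len(attempts))
```

Without the NonRetryableError branch, the oversized row would be retried on every pass and the job would never make progress.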

Improvements

  • Added session variables to control the encoding behavior of VARBINARY values in MySQL protocol responses, providing fine-grained control over binary result encoding in client connections. #71415
  • Added a snapshot_meta.json marker file to cluster snapshots to support integrity validation before snapshot restoration. #71209
  • Added warning logs for silently swallowed exceptions in WarehouseManager to improve observability of silent failures. #71215
  • Added metrics for Iceberg metadata table queries to support performance monitoring and diagnosis. #70825
  • The regexp_replace() function now supports constant folding during FE query planning, reducing planning overhead for queries with constant string arguments. #70804
  • Added categorized metrics for Iceberg time travel queries to improve monitoring and performance analysis. #70788
  • Added log output when update compaction is suspended, improving visibility into compaction lifecycle. #70538
  • SHOW COLUMNS now returns column comments for PostgreSQL external tables. #70520
  • Added support for dumping query execution plans when a query encounters an exception, improving diagnosability of runtime failures. #70387
  • Tablet deletion during DDL operations is now batched, reducing write lock contention on tablet metadata. #70052
  • Added a Force Drop recovery mechanism for synchronous materialized views that are stuck in an error state and cannot be dropped through normal means. #70029
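
Constant folding, as mentioned for regexp_replace() above, means evaluating a call at planning time when all of its arguments are already known. The sketch below shows the idea on a toy expression tree. The Const, Column, and RegexpReplace classes and the fold function are hypothetical, and Python's re module stands in for the engine's regex implementation, whose syntax may differ.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:
    value: str

@dataclass(frozen=True)
class Column:
    name: str

@dataclass(frozen=True)
class RegexpReplace:
    subject: object
    pattern: object
    replacement: object

def fold(expr):
    """Replace a call whose arguments are all constants with its result."""
    if isinstance(expr, RegexpReplace):
        args = (expr.subject, expr.pattern, expr.replacement)
        if all(isinstance(a, Const) for a in args):
            return Const(re.sub(expr.pattern.value,
                                expr.replacement.value,
                                expr.subject.value))
    return expr  # depends on row data, so it must run at execution time

folded = fold(RegexpReplace(Const("a1b22"), Const(r"\d+"), Const("#")))
kept = fold(RegexpReplace(Column("c"), Const(r"\d+"), Const("#")))
print(folded, kept)
```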

Bug Fixes

The following issues have been fixed:

  • An issue where the profile START_TIME and END_TIME were not displayed in the session timezone. #71429
  • A shared-object mutation bug in PushDownAggregateRewriter when processing CASE-WHEN/IF expressions, which could cause incorrect query results. #71309
  • A use-after-free bug in ThreadPool::do_submit triggered when thread creation fails. #71276
  • An issue where information_schema.tables did not properly escape special characters in equality predicates, causing incorrect results. #71273
  • An issue where the materialized view scheduler continued to run after the materialized view became inactive. #71265
  • A task signature collision in UpdateTabletSchemaTask across concurrent ALTER jobs that could cause schema update tasks to be skipped. #71242
  • An issue where row count estimation produced NaN values for histograms that contained only MCV (Most Common Values) entries. #71241
  • A missing dependency on the AWS S3 Transfer Manager in the AWS SDK integration. #71230
  • An issue where TaskManager scheduler callbacks did not verify whether the current node is the leader, potentially causing duplicate task execution on follower nodes. #71156
  • A thread-local context pollution issue where ConnectContext information was not cleared after a leader-forwarded request completed. #71141
  • An issue where the partition predicate was missing in short-circuit point lookups, causing incorrect query results. #71124
  • A NullPointerException when analyzing generated columns during Stream Load or Broker Load if a column referenced by the generated column expression was absent from the load schema. #71116
  • A use-after-free bug in the error handling path of parallel segment and rowset loading. #71083
  • An issue where delvec orphan entries were left behind when a write operation preceded compaction in the same publish batch. #71049
  • An issue where queries appeared in the current_queries result via HTTP loopback when checking query progress internally. #71032
  • CVE-2026-33870 and CVE-2026-33871. #71017
  • A read lock leak in SharedDataStorageVolumeMgr. #70987
  • An issue where the input and result columns of the locate() function shared the same NullColumn reference inside BinaryColumns, causing incorrect results. #70957
  • An issue where safe tablet deletion checks were incorrectly applied during ALTER operations in share-nothing mode. #70934
  • A race condition in _all_global_rf_ready_or_timeout that could prevent global runtime filters from being applied correctly. #70920
  • An int32 overflow in the ACCUMULATED metric macro that caused metric values to silently overflow. #70889
  • Incorrect aggregation results in dictionary-encoded merge GROUP BY queries. #70866
  • CVE-2025-54920. #70862
  • A potential data loss issue in aggregation spill caused by incorrect hash table state handling during set_finishing. #70851
  • An issue where the content-length header was not reset when proxy_pass_request_body is disabled. #70821
  • An issue where the spill directory for load operations was cleaned up in the object destructor rather than during DeltaWriter::close(), potentially causing premature deletion of spill data. #70778
  • An issue where INSERT INTO ... BY NAME from FILES() did not correctly push down the schema for partial column sets. #70774
  • An issue where connector scan nodes did not reset the scan range source on query retry, causing incorrect results upon retry. #70762
  • A potential rowset metadata loss for Primary Key model tablets caused by a GC race during disk re-migration of the form A→B→A. #70727
  • An issue where a query-scoped warehouse hint leaked the ComputeResource object in ConnectContext, potentially affecting subsequent queries on the same connection. #70706
  • An issue where redundant conjuncts in MySqlScanNode and JDBCScanNode caused BE errors related to VectorizedInPredicate type mismatches. #70694
  • A missing libssl-dev dependency in the Ubuntu runtime environment. #70688
  • An issue where Iceberg manifest cache completeness was not validated on read, leading to incorrect scan results when the cache was partially populated. #70675
  • A duplicate closure reference in _tablet_multi_get_rpc that could cause use-after-free. #70657
  • Partial manifest cache writes in the Iceberg ManifestReader that could result in incomplete cache entries and incorrect scan behavior. #70652
  • A crash in array_map() when processing arrays that contain null literal elements. #70629
  • A stack overflow in the to_base64() function when processing large inputs. #70623
  • An issue where INSERT INTO ... BY NAME from FILES() used positional column mapping instead of name-based mapping, causing data to be written to incorrect columns. #70622
  • An issue where NOT NULL constraints were incorrectly pushed down into the schema inferred from FILES(), causing load failures for nullable columns. #70621
  • An issue where precise external materialized view refresh did not fall back correctly for Iceberg-like connectors. #70589
  • A num_short_key_columns mismatch when constructing a partial tablet schema, which could cause data read errors. #70586
  • A BE crash that occurred when the child iterator was exhausted in MaskMergeIterator. #70539
  • An issue where materialized view refresh jobs repeatedly refreshed partitions whose corresponding Iceberg snapshots had expired. #70523
  • An issue where starlet configuration parameters could not be set. #70482
  • An issue where the lock-free materialized view rewrite path incorrectly fell back to live metadata, causing inconsistent rewrite behavior. #70475
  • An issue in JoinHashTable::merge_ht where dummy rows were not skipped for expression-based join key columns, causing incorrect join results. #70465
  • An incorrect equality comparison in InformationFunction that could produce wrong results in certain queries. #70464
  • A column type mismatch in the __iceberg_transform_bucket internal function. #70443
  • An issue where Iceberg materialized view refresh failed when Iceberg snapshot timestamps were non-monotonic. #70382
  • An issue where user authentication credentials were exposed in audit logs and SQL redaction output. #70360
  • A CN crash that occurred when scanning an empty tablet with physical split enabled. #70281
  • An issue where the VARCHAR column length was not preserved after a redundant CAST was eliminated during query optimization. #70269
  • An issue where brpc connection retry logic did not correctly handle a wrapped NoSuchElementException, causing connection failures after the retry attempt. #70203
  • An issue where null fractions for outer join columns were not preserved during statistics estimation, leading to suboptimal query plans. #70144
  • A memory tracker leak in connector sink operations running on poller threads. #70121
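
One of the fixes above (#70622) concerns INSERT INTO ... BY NAME from FILES() mapping columns by position instead of by name. The difference is easy to see in a toy model. The sketch below is illustrative only: `map_positional` and `map_by_name` are hypothetical helpers, not StarRocks functions, and filling absent columns with `None` is an assumption about how a partial column set might be treated.

```python
def map_positional(target_cols, row):
    """Assign the i-th source value to the i-th target column."""
    return dict(zip(target_cols, row))


def map_by_name(target_cols, source_cols, row):
    """Assign each source value to the target column that shares its name."""
    by_name = dict(zip(source_cols, row))
    # Assumption: a target column missing from the source becomes None (NULL).
    return {col: by_name.get(col) for col in target_cols}


target = ["id", "name", "city"]
source = ["city", "id", "name"]  # same columns, different order in the file
row = ("Paris", 1, "Ada")

positional = map_positional(target, row)
named = map_by_name(target, source, row)
```

Here `positional` puts "Paris" into `id`, which is the kind of misplacement the fix addresses, while `named` routes every value to the column with the matching name.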
3.5.15 Breaking risk
Breaking changes
  • Division by zero and date parse failures now return errors instead of failing silently
  • Invalid dates in INSERT VALUES are now rejected when the FORBID_INVALID_DATE mode is set
  • REFRESH EXTERNAL TABLE FORCE option removed
Notable features
  • Expression partition columns hidden from DESC and SHOW CREATE TABLE
  • EXPLAIN and EXPLAIN ANALYZE support for INSERT statements
4.0.8 Breaking risk
Breaking changes
  • With sql_mode DIVISION_BY_ZERO or FAIL_PARSE_DATE set, division by zero and date parse failures now return errors instead of failing silently
  • With sql_mode FORBID_INVALID_DATE set, invalid dates in INSERT VALUES are now rejected
  • Client ID removed from audit logs
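
The sql_mode changes listed above amount to a switch between lenient and strict handling of bad input. The following sketch models that switch in plain Python. It is not StarRocks code: `divide` and `parse_date` are hypothetical helpers, the `strict` flag stands in for the relevant sql_mode setting, and returning `None` for the lenient case is an assumption about what a "silent failure" looks like to the client.

```python
from datetime import date
from typing import Optional


def divide(a: float, b: float, strict: bool) -> Optional[float]:
    """Divide a by b; on division by zero, raise if strict, else return None."""
    if b == 0:
        if strict:
            raise ZeroDivisionError("division by zero")
        return None  # lenient mode: the failure is silent
    return a / b


def parse_date(text: str, strict: bool) -> Optional[date]:
    """Parse a YYYY-MM-DD string; on an invalid date, raise if strict, else return None."""
    try:
        return date.fromisoformat(text)
    except ValueError:
        if strict:
            raise
        return None  # lenient mode: the invalid date is swallowed
```

Under the lenient setting, `divide(1, 0, strict=False)` and `parse_date("2026-02-30", strict=False)` both return `None`; with `strict=True` the same calls raise, which mirrors the move from silent failures to explicit errors.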


About

Stars: 11,743
Forks: 2,434
Languages: Java, C++, Python

Install & Platforms

Platforms
linux

Community & Support

Alternative to

ClickHouse, Apache Druid
