Amazon Redshift introduces row-level operations for Apache Iceberg tables

Amazon Redshift now supports row-level UPDATE, DELETE, and MERGE operations on Apache Iceberg tables, so rows can be modified directly from Redshift.

Amazon Redshift now lets users run row-level UPDATE, DELETE, and MERGE statements against Apache Iceberg tables. Customers who use Iceberg to build interoperable data lakes can run these data manipulation language (DML) statements directly from Amazon Redshift, without moving data to an external processing engine.

Previously, changing individual rows in an Iceberg table required a separate engine, which added complexity and latency to data workflows. UPDATE, DELETE, and MERGE (UPSERT) statements now work on both partitioned and unpartitioned Iceberg tables, including those stored in S3. The supported Iceberg partition transforms are identity, bucket, truncate, year, month, day, and hour.
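To make the partition transforms listed above concrete, here is a minimal Python sketch of what most of them compute, following the definitions in the Apache Iceberg table specification. It is an illustration only, not Redshift code: the function names are hypothetical, timestamps are assumed to be timezone-aware UTC values, and the bucket transform is left out because it depends on a 32-bit Murmur3 hash that the standard library does not provide.

```python
from datetime import datetime, timezone

# Reference point for the time-based transforms (Unix epoch, UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def identity(value):
    """identity: the partition value is the source value itself."""
    return value


def truncate(value, width):
    """truncate: strings keep their first `width` characters,
    integers are rounded down to a multiple of `width`."""
    if isinstance(value, str):
        return value[:width]
    return value - (value % width)


def year(ts):
    """year: whole years elapsed since 1970."""
    return ts.year - 1970


def month(ts):
    """month: whole months elapsed since 1970-01."""
    return (ts.year - 1970) * 12 + (ts.month - 1)


def day(ts):
    """day: whole days elapsed since 1970-01-01."""
    return (ts - EPOCH).days


def hour(ts):
    """hour: whole hours elapsed since 1970-01-01 00:00 UTC."""
    return int((ts - EPOCH).total_seconds() // 3600)


# Hypothetical sample value used only for illustration.
sample = datetime(2024, 3, 15, 10, 30, tzinfo=timezone.utc)
partition_values = {
    "year": year(sample),
    "month": month(sample),
    "day": day(sample),
    "hour": hour(sample),
}
```

Rows whose transformed values are equal land in the same partition, which is why a row-level UPDATE or DELETE that filters on the partition source column can be limited to the affected partitions.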

MERGE combines insert and update logic in a single statement, which suits common data integration patterns such as change data capture and slowly changing dimensions. Tables modified by Redshift remain compatible with other Iceberg-compliant engines such as Amazon EMR and Amazon Athena, preserving cross-engine interoperability. AWS Lake Formation permissions also apply to Iceberg write operations.
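The upsert behavior described above can be sketched in a few lines of plain Python. This is an in-memory illustration of the general "matched rows are updated, unmatched rows are inserted" semantics, not Redshift SQL; the function name, the row layout, and the sample customer data are all hypothetical. Engines differ in how they treat several source rows that match the same target row, so this sketch simply lets the last source row win.

```python
def merge_rows(target, source, key):
    """Return a new table with `source` rows merged into `target`.

    target: dict mapping key value -> row (a dict of column -> value)
    source: list of rows (dicts) carrying the changes to apply
    key:    name of the column used as the match condition
    """
    # Copy the target so the caller's data is left untouched.
    merged = {k: dict(row) for k, row in target.items()}
    for row in source:
        k = row[key]
        if k in merged:
            # Matched: overwrite the columns supplied by the source row.
            merged[k].update(row)
        else:
            # Not matched: insert the source row as a new target row.
            merged[k] = dict(row)
    return merged


# Hypothetical change-data-capture batch: one changed row, one new row.
customers = {
    1: {"id": 1, "name": "Ana", "city": "Lisbon"},
    2: {"id": 2, "name": "Ben", "city": "Oslo"},
}
changes = [
    {"id": 2, "city": "Bergen"},                  # existing customer moved
    {"id": 3, "name": "Chen", "city": "Taipei"},  # new customer
]
result = merge_rows(customers, changes, key="id")
```

In this example customer 2 keeps the name "Ben" but moves to "Bergen", customer 3 is added, and customer 1 is unchanged. The actual MERGE syntax supported for Iceberg tables is documented in the Amazon Redshift Database Developer Guide.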

The feature is available in all AWS Regions where Amazon Redshift is available. To get started, see the "Writing to Apache Iceberg tables" section of the Amazon Redshift Database Developer Guide, which documents the SQL syntax for these operations.