Amazon 2026-05-13
Product Launch Impact: Major Strength: Too Weak Conf: 0%

AWS Redshift RG Instances Integrate Data Lake Query, Kill Spectrum Fees to Lock In Users

Summary

AWS has launched Redshift RG instances, powered by Graviton and featuring an integrated data lake query engine. AWS claims they run up to 2.2x faster than RA3 at a 30% lower per-vCPU price. The engine executes S3 data lake queries directly on cluster nodes, eliminating the separate Redshift Spectrum service and its $5/TB scanning fee. This simplifies the architecture but ties users more closely to the Redshift compute lifecycle.

Key Takeaways

On May 12, 2026, AWS announced the Amazon Redshift RG instance family, powered by AWS Graviton processors. AWS claims that RG instances deliver up to 2.2x the performance of current RA3 instances on data warehouse workloads, at a 30% lower per-vCPU price.

The key architectural change is the integrated data lake query engine. It executes SQL queries against Amazon S3 data lakes (Apache Iceberg tables and Apache Parquet files) directly on the Redshift cluster's compute nodes. This replaces the standalone Amazon Redshift Spectrum service and eliminates its $5/TB scanning fee. Queries now stay within the user's VPC and use existing IAM roles.

Migration paths include Elastic Resize (10-15 min downtime) and Snapshot and Restore. Existing external tables, schemas, and query syntax (including Spectrum queries) require no modification. AWS specifically highlights this engine as ideal for handling the high concurrency and low latency demands of AI agent-driven workloads.

Why It Matters

On the surface, this is a performance and cost optimization. The real strategy is to move the control and billing point for data lake queries from the independent Spectrum layer into the Redshift RG instance lifecycle. This is a defensive move against Snowflake and Databricks, which offer cross-engine data lake query capabilities. By embedding the engine in the cluster nodes, AWS ties query optimization and performance to Graviton hardware, creating a combined hardware and software lock-in.

AWS downplays the hidden cost: although the $5/TB scanning fee disappears, every data lake query now consumes RG compute that the user pays for, and the cluster must be provisioned and running whenever a query arrives. For infrequent bulk scans of cold data, this model can cost more than paying per scan. The 10-15 minute downtime for Elastic Resize is also hard to accept for AI inference pipelines that require 99.99%+ availability.
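To put the availability concern in numbers, the following is a minimal sketch of the downtime-budget arithmetic. It assumes availability is measured over a 365-day year and that each Elastic Resize counts fully as downtime. The function names are illustrative and are not part of any AWS tooling.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; assumes a 365-day measurement window


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def resizes_within_budget(availability_pct: float, resize_minutes: float) -> int:
    """Whole resize operations that fit in the yearly budget.

    Assumption (hypothetical): resizes are the only source of downtime.
    """
    return int(downtime_budget_minutes(availability_pct) // resize_minutes)


if __name__ == "__main__":
    budget = downtime_budget_minutes(99.99)
    print(f"99.99% budget: {budget:.2f} min/year")
    for minutes in (10, 15):  # downtime range quoted in the article
        count = resizes_within_budget(99.99, minutes)
        share = minutes / budget
        print(f"{minutes}-min resize: {count} per year, {share:.0%} of budget each")
```

Under these assumptions, a 99.99% target leaves about 52.6 minutes of downtime per year, so a single resize would consume roughly 19% to 29% of the annual budget before any other outage is counted.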

The integrated engine's performance is likely to depend heavily on optimizations such as S3 Express One Zone; without them, tail latency on data lake queries can become a bottleneck. Merging query execution with warehouse compute also means that data lake queries compete with warehouse workloads for memory and CPU, which can cause unpredictable performance jitter. Strategically, the architecture wraps the open table format ecosystem (e.g., Apache Iceberg) inside Redshift, weakening the engine portability those formats are meant to provide.

PRO Decision

[Vendors/Competitors] Snowflake and Databricks should immediately publish targeted benchmarks that compare resource isolation and performance predictability against Redshift RG under mixed workloads (concurrent data warehouse and data lake queries). Press the point that RG instances cannot prevent CPU and memory contention between lake and warehouse queries. Promote open table formats (e.g., Apache Iceberg) as the key to engine portability and the way to avoid lock-in to any single instance family.

[Enterprises/CIOs] Audit the cost claims independently rather than taking the elimination of Spectrum fees at face value. Calculate your data lake query density: high-frequency, low-latency queries may favor RG, while infrequent bulk scans of historical data will likely be cheaper on a pay-per-scan model. Ask AWS for an SLA covering performance isolation under mixed workloads. Test the real-world impact of Elastic Resize downtime on production pipelines. Assess cross-cloud portability: heavy use of the integrated engine will couple your data lake query logic tightly to Redshift and sharply increase migration costs.

[Investors] Treat this move as a vendor consolidation risk. AWS is folding data lake query revenue into Redshift compute by eliminating the separately billed Spectrum service. This is a short-term positive for AWS earnings, but it intensifies competition with Snowflake and Databricks. Monitor the Apache Iceberg community: if it delivers an engine-agnostic query optimizer, AWS's lock-in strategy would be severely undermined.
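The query-density calculation recommended for enterprises above can be sketched as a simple break-even comparison. Only the $5/TB fee comes from the article; the node-hour count and hourly rate below are hypothetical placeholders, not published RG prices, and the model ignores anything beyond scan fees and the compute attributable to lake queries.

```python
SPECTRUM_FEE_PER_TB = 5.00  # USD per TB scanned, the Spectrum fee quoted in the article


def per_scan_cost(tb_scanned: float) -> float:
    """Monthly cost of lake queries under a pay-per-scan model."""
    return tb_scanned * SPECTRUM_FEE_PER_TB


def lake_compute_cost(node_hours: float, node_hourly_rate: float) -> float:
    """Monthly cost of the cluster compute attributed to lake queries."""
    return node_hours * node_hourly_rate


def break_even_tb(node_hours: float, node_hourly_rate: float) -> float:
    """TB scanned per month at which both models cost the same."""
    return lake_compute_cost(node_hours, node_hourly_rate) / SPECTRUM_FEE_PER_TB


def cheaper_model(tb_scanned: float, node_hours: float, node_hourly_rate: float) -> str:
    """Name the cheaper model for a given monthly scan volume."""
    scan = per_scan_cost(tb_scanned)
    compute = lake_compute_cost(node_hours, node_hourly_rate)
    if scan < compute:
        return "pay-per-scan"
    if compute < scan:
        return "cluster compute"
    return "tie"


if __name__ == "__main__":
    # Hypothetical inputs: 2 nodes running all month (730 h each) at $3.00/h.
    hours, rate = 2 * 730, 3.00
    print(f"Break-even: {break_even_tb(hours, rate):.0f} TB/month")
    for tb in (100, 2000):
        print(f"{tb} TB/month -> {cheaper_model(tb, hours, rate)}")
```

With these placeholder inputs the break-even sits at 876 TB per month: a workload scanning 100 TB would be cheaper per scan, while one scanning 2,000 TB would favor cluster compute. Substitute your own node counts, rates, and scan volumes before drawing conclusions.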

Source: Amazon Press Center