TPU - AI Infrastructure Intelligence Search

Google Cloud Other 2026-07-29

Google Cloud Launches Managed Distillation, Slashing TCO for Enterprise Reasoning AI

Google Cloud GA's AlphaEvolve evolutionary code search API and unveils a managed distillation service to train custom Gemini 2.5 Flash models from Gemini 3.1 Pro outputs, enabling specialized reasoning at Flash-tier speed and cost. New scientific AI tools also launched.

Google Cloud Other 2026-07-28

Google Cloud Q2收入增长82% AI基础设施需求成核心驱动力

...

Google Other 2026-07-28

谷歌CEO皮查伊证实Gemini 4已投入训练有望年底发布

...

Anthropic Other 2026-07-26

Anthropic partners with SpaceX for 220K GPUs, shifting AI compute ecosystem beyond hyperscalers

Anthropic signs a deal with SpaceX for 300MW capacity and 220,000 NVIDIA GPUs at Colossus 1 data center, with exploration of orbital AI compute. This diversifies Anthropic's compute sources beyond hyperscalers and doubles Claude Code usage limits.

NVIDIA Other 2026-07-25

Tiered AI Chip Market Emerges as US Allows H200 Exports to China with 25% Levy

The US Commerce Department approved NVIDIA H200 exports to China with a 25% sales tax, while maintaining a ban on Blackwell. This formalizes a tiered AI chip market, making H200 the best available imported chip for China, but the performance gap and added tax burden increase deployment costs and complexity for Chinese AI infrastructure.

Google Other 2026-07-24

Google Begins Gemini 4 Pre-training with 4M+ Context and Monthly Releases

Alphabet confirms start of largest pre-training run for Gemini 4, featuring 4M+ context and native multimodality with near-monthly releases. 2026 capex raised to $195-205B, Google Cloud Q2 up 82%, signaling full-stack AI acceleration.

Microsoft Other 2026-07-24

Microsoft launches MAI-Image-2.5-Pro and MAI-Voice-2-Flash, deepening in-house AI ecosystem

Microsoft introduces MAI-Image-2.5-Pro and MAI-Voice-2-Flash, proprietary models now powering Bing, PowerPoint, OneDrive, and Dynamics 365, replacing third-party models. Claims up to 84% GPU cost reduction and 2x faster voice, signaling a strategic shift to in-house AI.

Google Cloud Other 2026-07-23

Google Cloud Unveils GKE AI Security Blueprint with Three-Layer Defense

Google Cloud launches a security blueprint for AI workloads on GKE, featuring a three-layer defense: Confidential GKE Nodes for hardware memory encryption, open-source k8s-aibom for AI bill of materials, and Model Armor for prompt injection and data leak detection.

AMD Other 2026-07-23

AMD Invests $5B in Anthropic, Secures 2GW MI450 Deployment, Reshaping AI Compute Ecosystem

AMD and Anthropic announce a strategic partnership: Anthropic will deploy up to 2GW of AMD Instinct MI450 GPUs, with AMD investing up to $5B in Anthropic. They will collaborate on ROCm optimization and Claude workload tuning, marking AMD's transition from chip vendor to AI ecosystem investor and accelerating multi-sourcing in AI compute.

Google Other 2026-07-20

Google's Frozen v2 Chip Hardwires Gemini Architecture for 6-10x TPU Efficiency, Set for 2028

Google is developing Frozen v2, a dedicated AI chip that hardwires the Gemini model architecture into silicon for 6-10x energy efficiency per token over current TPUs. It is a new product line, planned for 2028, with weight update flexibility but a frozen architecture. This validates the industry shift from general-purpose GPUs to dedicated ASICs for AI inference.

Meta Other 2026-07-20

Meta Launches Muse Spark 1.1 API at 25% Competitor Price, Ends Open-Source Era

Meta releases Muse Spark 1.1, a multimodal reasoning model with 1M token context window, and launches its first paid API at 25% of competitors' price. This ends the Llama open-source era, signaling a strategic shift to proprietary API monetization and aggressive market share capture.

Google Cloud Other 2026-07-20

Google DeepMind AlphaEvolve GA: AI Self-Evolution for Data Center and Algorithm Optimization

On July 19, 2026, Google DeepMind announced the GA of AlphaEvolve, a Gemini-based multi-agent evolution system for algorithmic discovery, mathematical discovery, and data center efficiency optimization, already used in Borg and Orca, aiming to reduce Capex in massive AI compute investments.

Other Other 2026-07-20

Moonshot AI Launches Kimi K3: 2.8T Parameter Open-Source MoE Model at $3/$15

Moonshot AI unveils Kimi K3, a 2.8T parameter open-source MoE model with 896 experts (16 active), native vision, and 1M context. API pricing at $3/$15 input/output per million tokens undercuts rivals. Open weights release July 27. GPU capacity exhausted within 2 days. Arena score 1679 tops Fable 5.

NVIDIA Other 2026-07-16

NVIDIA-Nokia Alliance Redefines RAN Ecosystem with GPU-Based AI Acceleration

NVIDIA and Nokia are jointly developing AI-powered RAN technology, using NVIDIA GPUs to accelerate baseband processing and AI algorithms for beamforming and spectrum optimization. Targeting commercial deployment by 2027 and 2x spectral efficiency by 2028, this partnership marks a fundamental shift from dedicated RAN hardware to GPU-based, software-defined AI networks.

NVIDIA Other 2026-07-16

NVIDIA CUDA 13.3 Introduces clmad for Hardware-Accelerated Carryless Multiplication on GPUs

NVIDIA CUDA 13.3 adds the clmad hardware instruction for carryless multiply-accumulate on Ampere+ GPUs. GHASH throughput reaches 6.3 TB/s on B200, up to 18.8x faster than bitsliced. Sum-check protocol accelerates 3-13x. The instruction also benefits CRC, Reed-Solomon, and post-quantum cryptography.

Intel Other 2026-07-15

Intel 18A Yield Hits 85%, Secures Orders from NVIDIA, OpenAI, Reshaping Foundry Landscape

Intel reports 18A process yield improvement to 85%, from 65% last quarter, nearing TSMC N2's 90%. Secured foundry deals with NVIDIA, AMD, OpenAI, etc. EMIB advanced packaging yield reaches 98%, used in NVIDIA Feynman, Google TPU. This marks a strategic inflection in AI chip manufacturing.

Google Other 2026-07-15

Google Deeply Integrates Gemini Enterprise Telemetry with BigQuery for AI Governance

Google Cloud enables streaming Gemini Enterprise app telemetry (prompts, responses, activity logs) into BigQuery for real-time analysis. Leveraging BigQuery's AI capabilities (Conversational Analytics, auto-schema), it automates auditing, compliance, and insights for large-scale AI deployments, driving data-driven AI observability.

Meta Other 2026-07-14

Meta Expands Hyperion to 5GW with $50B Investment, Pioneering Local-First AI Infrastructure

Meta expands its Louisiana Hyperion data center to 5GW capacity, raising total investment from $10B to $50B. Partnering with Entergy to build 10 power plants and 240 miles of transmission lines, and utilizing JV and financing structures, Meta pioneers a local-first model that reshapes the collaboration between AI infrastructure, energy, and capital.

TSMC Other 2026-07-13

TSMC CoWoS Capacity to Reach 200k Wafers by 2027, Diversifying from GPU to CPU and ASIC

TSMC targets 200k wpm CoWoS capacity by 2027, narrowing supply-demand gap from 20% to 10%. Customer base diversifies from NVIDIA GPU to include AI server CPUs (MediaTek, AMD) and ASICs (Broadcom). CoPoS panel-level packaging enters pilot production in 2027.

Other Other 2026-07-12

WhiteFiber and DriveNets Achieve 111.2 Tbps Cross-DC AI Fabric, Breaking Power Constraints

WhiteFiber announces Project Redwood, partnering with DriveNets Ethernet AI fabric (FSE, VOQ, deep buffers), WEKA storage, and NVIDIA H200 GPUs, achieving 111.2 Tbps bandwidth and 0.9ms latency over 83km dark fiber, treating two geographically separated GPU clusters as a single logical supercluster. Commercialization planned for Q3 2026.

Reports

Filter