GPU - AI Infrastructure Intelligence Search

Amazon Other 2026-07-10

AWS Sells Trainium 3 Externally, Challenging NVIDIA's AI Training Chip Dominance

AWS begins external sales of its Trainium 3 AI training chip, fabricated on TSMC 3nm process, delivering 2.52 PFLOPS per chip. Early customers include Anthropic and Uber. This move directly challenges NVIDIA's dominance and marks AWS's strategic shift from cloud provider to chip vendor.

AMD Other 2026-07-10

AMD's Experimental Topological Ghost Protocol Boosts MI300X Inference 10x

AMD introduces the experimental Topological Ghost Protocol (TGP) on MI300X GPUs, achieving 431 tokens/sec with a 100% success rate in high-concurrency inference, a 10x improvement over standard vLLM. TGP uses KV-cache recycling and segmented state management. It is still experimental but could redefine AI inference benchmarks.

AMD Other 2026-07-10

Towards Feature Complete Triton Support in JAX-Triton - ROCm Blogs

...

OpenAI Other 2026-07-09

OpenAI Reopens with GPT-oss Models: Apache 2.0 License Hides Cloud Offload Control

OpenAI launches GPT-oss-120b and GPT-oss-20b under Apache 2.0 license, capable of running on a single 80GB GPU. However, a built-in cloud offload mechanism routes complex queries to proprietary models, masking a strategic control point shift behind the open-source facade.

Google Other 2026-07-09

Google Gemini 3.5 Pro Rebuilds from Scratch: 2M Token Context Window Reshapes AI Frontier

Google DeepMind targets July 17 for Gemini 3.5 Pro, a full architectural rewrite of its pretraining stack to overcome deficits in math reasoning, SVG generation, and image quality. Reported specs, not yet confirmed by Google, include a 2M-token context window, a Deep Think reasoning layer, and multi-step autonomous workflows.

NVIDIA Other 2026-07-09

SambaNova Closes $1.1 Billion Funding Round at $11 Billion Valuation: A New Inference Chip Landscape Takes Shape

...

NVIDIA Other 2026-07-08

NVIDIA Rigel Core: Single-Threaded CPU as the New Control Plane for Agentic AI

NVIDIA unveils Rosa CPU architecture with custom Rigel core (Arm v9.2), targeting single-threaded performance for Agentic AI workloads, paired with Feynman GPU (1.6nm, 50 PFLOPS) in 2028. This shifts CPU design from core-count scaling to serial-latency optimization, directly challenging AMD EPYC and Intel Xeon dominance.

NVIDIA Other 2026-07-07

NVIDIA Vera CPU Adopted by Perplexity/OpenAI/Anthropic/Oracle; AI Agent Performance Validated at 1.5-1.9x Speedup

...

NVIDIA Other 2026-07-07

NVIDIA Vera CPU: Max Single-Threaded Performance at Scale for Agentic AI

NVIDIA launches Vera CPU, a max single-threaded CPU at scale for agentic AI. With Olympus cores delivering 1.8x sustained per-core performance over x86, 1.2TB/s LPDDR5X bandwidth, and 3.4TB/s core-to-core bandwidth, Vera integrates into NVIDIA's unified AI factory architecture, aiming to lock users into its ecosystem.

NVIDIA Other 2026-07-07

AI Innovators Adopt NVIDIA Vera — Why Max Single-Threaded CPU at Scale Matters

...

Cisco Other 2026-07-07

Cisco Locks AI Data Center Security Control Plane with Silicon One and Hypershield

Cisco launches next-gen security for AI data centers, deeply integrating Splunk SIEM with its Silicon One 51.2Tbps chip and Hypershield architecture to push security policies to the network edge. This move aims to shift the security control plane from standalone appliances to its proprietary ASIC and management platform, creating hardware lock-in.

MediaTek Other 2026-07-07

MediaTek and Alibaba Cloud Deploy Tongyi Qianwen LLM on Dimensity Chips

MediaTek partners with Alibaba Cloud to deploy a small version of the Tongyi Qianwen LLM on Dimensity 9300/8300 mobile platforms, enabling offline multi-turn conversations. This move aims to capture edge AI inference control via NPU optimization and SDK integration, directly challenging Qualcomm.

Amazon Other 2026-07-07

AWS Boosts Trainium3 ASIC Shipments, Accelerating Custom AI Chip Ecosystem Against NVIDIA

Amazon AWS has notified its supply chain to increase Q3 2026 shipments of Trainium3-based ASIC servers by 20-30%. This reflects growing confidence in its custom AI chips and a strategic push to reduce reliance on NVIDIA GPUs. AWS also partnered with OpenAI to develop a Stateful Runtime Environment on Bedrock.

NVIDIA Other 2026-07-07

NVIDIA Denies Kyber NVL144 Delay, But 78-Layer PCB Bottleneck Exposes AI Hardware Physics Limit

NVIDIA officially denies reports of Kyber NVL144 rack delay to 2028, but SemiAnalysis revelations about a 78-layer ultra-high-density PCB midplane bottleneck and Rubin Ultra cancellation expose hard physical limits in signal integrity and manufacturing, opening a strategic window for AMD and Google.

Amazon Other 2026-07-06

AWS boosts Trainium 3 shipments, accelerating ASIC substitution for NVIDIA GPUs

Supply chain sources indicate Amazon AWS has instructed vendors to increase Trainium 3 shipments for Q3 2026 by 20-30%. This signals strong confidence in its custom ASIC strategy to reduce dependence on NVIDIA GPUs, leveraging superior cost and power efficiency for cloud AI training.

NVIDIA Other 2026-07-06

NVIDIA Kyber NVL144 Delayed to 2028: Midplane PCB Manufacturing Becomes AI Scaling Bottleneck

SemiAnalysis reports that NVIDIA's Kyber NVL144 is delayed by more than 12 months, to 2028, because of manufacturing challenges with its 78-layer Orthogonal Backplane. The interim NVL72x2 solution is cancelled due to operational burdens, and the 4-die Rubin Ultra is also scrapped, leaving a product gap in NVIDIA's scaling roadmap.

Anthropic Other 2026-07-06

Anthropic Starts Custom AI Chip Development, Talks Samsung 2nm, Aims for Compute Independence

Anthropic has initiated its own AI chip development and is in talks with Samsung for 2nm foundry services. The move aims to reduce reliance on NVIDIA GPUs, optimize inference costs, and strengthen its technology moat ahead of a potential IPO. It joins OpenAI, Google, and others in the custom ASIC race, signaling a shift from software to hardware competition.

AMD Other 2026-07-06

AMD Unveils Zen 6/7 CPU and MI400/500 GPU Roadmap, Targets NVIDIA Rubin with HBM4 and 2nm

AMD unveiled its Zen 6/7 CPU and MI400/500 GPU roadmap at its 2026 Financial Analyst Day, featuring TSMC's 2nm process and HBM4 memory. The MI400 series offers 432GB of memory, 19.6TB/s of bandwidth, and 40 PFLOPS of FP4 performance, directly targeting NVIDIA's Vera Rubin architecture with an annual cadence to disrupt the AI hardware monopoly.

Google Cloud Other 2026-07-06

Google Cloud Launches Blackwell GPU Confidential VM & Open-Source Prompt Encryption SDK, Redefining AI Security

Google Cloud upgrades its confidential computing portfolio with Blackwell GPU-based confidential VMs (Confidential G4 VMs, in preview), an open-source Prompt Encryption SDK, and an enhanced Confidential Space featuring Intel Trust Authority and Hopper GPU support. The update addresses TEE vulnerability CVE-2026-33697 and aims to bolster the security of AI inference and cross-organization training.

NVIDIA Other 2026-07-04

NVIDIA RTX 5080 Founders Edition Graphics Card to Go on Limited Sale at BW2026, Priced at 8,299 Yuan

...
