AI infrastructure - AI Infrastructure Intelligence Search

AMD Other 2026-07-26

AMD Launches Helios Rack-Scale AI Platform with MI400 GPUs, Targeting Inference TCO

At Advancing AI 2026, AMD unveiled the Helios rackscale platform integrating 72 MI455X GPUs and 18 EPYC Venice CPUs per rack, delivering 2.9 exaflops FP4 inference and 31TB HBM4 memory. The MI430X offers 288 TFLOPS FP64 for HPC. AMD claims up to 30% more inference tokens per dollar vs. competitors.

Meta Other 2026-07-26

Meta Transforms into AI Compute Landlord with $10B Anthropic Lease Deal

Meta is in early talks with Anthropic for a $10 billion compute lease deal, marking its strategic shift from internal AI infrastructure user to commercial landlord. This move monetizes Meta's massive capex and creates an inter-cloud leasing model, fundamentally reshaping the AI compute supply chain.

Cisco Other 2026-07-24

Cisco Proposes Logically Air-Gapped Model with eBPF, Shifting Security to Kernel

Cisco introduces a logically air-gapped governance model using eBPF and Cilium to create a software-defined cryptographic perimeter at the kernel level. Integrating Cisco Secure Workload with Isovalent, it aims to provide data residency and regulatory compliance for containerized, virtualized, and bare-metal environments without sacrificing cloud agility.

NVIDIA Other 2026-07-22

NVIDIA and Wistron Open US Factory for GB300 and Vera Rubin AI Superchips

Wistron opens its first US manufacturing facility in Fort Worth, producing NVIDIA GB300 Grace Blackwell Ultra and Vera Rubin superchips. The $700M plant aims for tens of thousands of boards monthly, marking NVIDIA's strategic shift to domestic AI hardware production.

Hewlett Packard Enterprise Other 2026-07-19

AMD and HPE Launch Helios Open AI Infrastructure to Rival NVIDIA Ecosystem

AMD and HPE expand partnership to launch Helios, an open-stack AI infrastructure platform integrating EPYC CPUs, Instinct MI455X GPUs, Pensando networking, and ROCm software. Each rack delivers up to 2.9 exaFLOPS FP4, built on OCP principles with Juniper switches, targeting simplified deployment and energy efficiency.

Microsoft Azure Other 2026-07-16

Microsoft Azure Cuts 200-400 Jobs in China Amid Cloud Growth, Reshapes Geopolitical Compliance

Microsoft Azure is cutting 200-400 jobs in China even as its cloud business grows 40% YoY, offering some employees relocation to Canada. This signals a strategic shift to move compliance and operational control out of China, reshaping enterprise multi-cloud and data sovereignty decisions.

NVIDIA Other 2026-07-16

NVIDIA Jetson Thor T3000/T2000: Blackwell GPU Crashes Edge AI Cost Barrier

NVIDIA unveils Jetson Thor T3000 and T2000 modules. The T3000 packs a Blackwell GPU and 8-core Neoverse CPU, delivering 865 FP4 TFLOPS at half the power of the T5000. New Jetson Agent Skills automate memory optimization, aiming to scale deployment of humanoid robots and edge AI.

NVIDIA Other 2026-07-07

NVIDIA Vera CPU: Max Single-Threaded Performance at Scale for Agentic AI

NVIDIA launches Vera CPU, a max single-threaded CPU at scale for agentic AI. With Olympus cores delivering 1.8x sustained per-core performance over x86, 1.2TB/s LPDDR5X bandwidth, and 3.4TB/s core-to-core bandwidth, Vera integrates into NVIDIA's unified AI factory architecture, aiming to lock users into its ecosystem.

Huawei Other 2026-06-25

Huawei Pushes Token-Based Billing at MWC Shanghai 2026: Shifting Carrier Monetization from Bytes to AI Inference Value

At MWC Shanghai 2026, Huawei urged carriers to shift from byte-based to token-based billing for AI workloads, showcasing a 372% token throughput improvement in long-sequence inference via its AI Inference Acceleration Solution. It also highlighted the Upper-6 GHz band as critical for AI wearables requiring 20 Mbps uplink, aiming to reposition 5G-A networks as AI compute delivery infrastructure.

Anthropic Other 2026-06-25

Anthropic Alleges Largest AI Distillation Attack by Alibaba-Linked Operators, Exposing API Security Gaps

Anthropic alerted U.S. senators that Alibaba-linked operators conducted the largest known distillation attack, generating 28.8 million model exchanges via 25,000 fraudulent accounts to harvest Claude's frontier capabilities. The incident exposes a critical vulnerability in AI API security, forcing a rethinking of inference endpoint protection and usage monitoring.

OpenAI Other 2026-06-25

Oracle Defense Ecosystem Cohort 3: Offline AI on Roving Edge Devices Goes Operational

Oracle announced the third cohort of its Defense Ecosystem at the Brussels summit, adding 10 companies. Concurrently, Whitespace's Saga AI system deployed on Oracle Roving Edge Devices during Royal Navy's Operation HIGHMAST, running classified AI workloads completely offline, proving sovereign edge AI is operational.

NVIDIA Other 2026-06-25

Qualcomm Dragonfly: 250-core CPU, HBC memory, UALink interconnects target AI inference TCO

Qualcomm unveils full data center portfolio: Dragonfly C1000 250-core Oryon CPU (>5GHz, PCIe Gen7, CXL), HBC near-memory compute (133TB/s Gen1, 18x-54x effective BW), AI300 inference accelerator (UALink/ESUN scale-up), and 800G/1.6T connectivity. Multi-year Meta CPU deal. Commercial sampling 2027-2028. Targets inference TCO with tokens-per-watt leadership.

OpenAI Other 2026-06-25

OpenAI and Broadcom Unveil Jalapeno Inference ASIC, Reshaping AI Hardware Landscape

OpenAI, in collaboration with Broadcom, has developed Jalapeno, a custom LLM inference accelerator. The chip uses a multi-chip module with HBM3E memory and achieved tape-out in just nine months. Designed for OpenAI's model stack, it aims to reduce inference costs and dependency on NVIDIA GPUs, with initial deployment planned for late 2026.

Cisco Other 2026-06-24

Cisco Live US & InfoComm 2026 : la collaboration entre dans l’ère agentique

...

NVIDIA Other 2026-06-24

NVIDIA and AWS Default GPU Vector Search with cuVS, G7 Instances Deliver 4.6x Inference

NVIDIA and AWS collaborate to embed cuVS as default GPU-accelerated vector search in OpenSearch Serverless, delivering 10x faster indexing at 1/4 cost. New EC2 G7 instances with RTX PRO 4500 Blackwell GPUs achieve up to 4.6x inference performance. AWS achieves GB300 Exemplar Cloud status for training.

NVIDIA Other 2026-06-23

NVIDIA Unveils 45°C Liquid Cooling for Rubin Chips, Slashes Water Use 100%

NVIDIA announces a liquid cooling system for its Rubin GPUs running 45°C coolant (hotter than a hot tub), using dry coolers in a closed loop to cut electricity and eliminate water evaporation (100% reduction). However, chillers may still be needed in hot climates, and chip longevity impacts remain unaddressed.

NVIDIA Other 2026-06-23

NVIDIA Launches Agent Toolkit: Nemotron Models, OpenShell Runtime for Specialized AI Agents

NVIDIA unveils Agent Toolkit, an open modular foundation with Nemotron models, NemoClaw blueprints, and OpenShell runtime, enabling enterprises to build secure, specialized AI agents. It targets life sciences, cybersecurity, and industrial workflows, aiming to turn frontier models into domain-specific digital coworkers.

Anthropic Other 2026-06-23

Micron-Anthropic Deal Locks AI Memory Demand, But Stock Price Already Priced In

Micron signed a long-term supply contract with Anthropic covering HBM, DRAM, and SSDs, with joint analysis of memory subsystems for AI workloads. Micron also participated in Anthropic's Series H. This aims to transform memory from a commodity to an AI infrastructure asset, but the stock has already run up, requiring proof of sustained scarcity premium.

NVIDIA Other 2026-06-23

NVIDIA Dominates TOP500 with Full-Stack Lock-in: Grace CPU, InfiniBand, and GPU Integration

NVIDIA powers 81% of TOP500 supercomputers, with Grace CPU adoption rising to 26 systems and Quantum InfiniBand connecting 376. The full-stack strategy (GPU+CPU+networking) shifts procurement from open components to single-vendor lock-in; top 8 Green500 systems use NVIDIA GPUs.

NVIDIA Other 2026-06-23

NVIDIA's AI Agents and Digital Twins Reshape Telecom Network Control Plane

At DTW Ignite 2026, NVIDIA showcases its AI agent platform integrating NeMo synthetic data, NemoClaw secure runtime, OpenShell sandbox, and RTX PRO 6000-accelerated digital twins, aiming for autonomous telecom operations. Partners include SoftBank, Amdocs, NTT DATA, etc., moving from task automation to full autonomy.

Reports

Filter

AMD Launches Helios Rack-Scale AI Platform with MI400 GPUs, Targeting Inference TCO

Meta Transforms into AI Compute Landlord with $10B Anthropic Lease Deal

Cisco Proposes Logically Air-Gapped Model with eBPF, Shifting Security to Kernel

NVIDIA and Wistron Open US Factory for GB300 and Vera Rubin AI Superchips

AMD and HPE Launch Helios Open AI Infrastructure to Rival NVIDIA Ecosystem

Microsoft Azure Cuts 200-400 Jobs in China Amid Cloud Growth, Reshapes Geopolitical Compliance

NVIDIA Jetson Thor T3000/T2000: Blackwell GPU Crashes Edge AI Cost Barrier

NVIDIA Vera CPU: Max Single-Threaded Performance at Scale for Agentic AI

Huawei Pushes Token-Based Billing at MWC Shanghai 2026: Shifting Carrier Monetization from Bytes to AI Inference Value

Anthropic Alleges Largest AI Distillation Attack by Alibaba-Linked Operators, Exposing API Security Gaps

Oracle Defense Ecosystem Cohort 3: Offline AI on Roving Edge Devices Goes Operational

Qualcomm Dragonfly: 250-core CPU, HBC memory, UALink interconnects target AI inference TCO

OpenAI and Broadcom Unveil Jalapeno Inference ASIC, Reshaping AI Hardware Landscape

Cisco Live US & InfoComm 2026 : la collaboration entre dans l’ère agentique

NVIDIA and AWS Default GPU Vector Search with cuVS, G7 Instances Deliver 4.6x Inference

NVIDIA Unveils 45°C Liquid Cooling for Rubin Chips, Slashes Water Use 100%

NVIDIA Launches Agent Toolkit: Nemotron Models, OpenShell Runtime for Specialized AI Agents

Micron-Anthropic Deal Locks AI Memory Demand, But Stock Price Already Priced In

NVIDIA Dominates TOP500 with Full-Stack Lock-in: Grace CPU, InfiniBand, and GPU Integration

NVIDIA's AI Agents and Digital Twins Reshape Telecom Network Control Plane