AI Infra - AI Infrastructure Intelligence Search

NVIDIA Other 2026-07-22

NVIDIA and Wistron Open US Factory for GB300 and Vera Rubin AI Superchips

Wistron opens its first US manufacturing facility in Fort Worth, producing NVIDIA GB300 Grace Blackwell Ultra and Vera Rubin superchips. The $700M plant aims for tens of thousands of boards monthly, marking NVIDIA's strategic shift to domestic AI hardware production.

Hewlett Packard Enterprise Other 2026-07-19

AMD and HPE Launch Helios Open AI Infrastructure to Rival NVIDIA Ecosystem

AMD and HPE expand partnership to launch Helios, an open-stack AI infrastructure platform integrating EPYC CPUs, Instinct MI455X GPUs, Pensando networking, and ROCm software. Each rack delivers up to 2.9 exaFLOPS FP4, built on OCP principles with Juniper switches, targeting simplified deployment and energy efficiency.

Microsoft Azure Other 2026-07-16

Microsoft Azure Cuts 200-400 Jobs in China Amid Cloud Growth, Reshapes Geopolitical Compliance

Microsoft Azure is cutting 200-400 jobs in China even as its cloud business grows 40% YoY, offering some employees relocation to Canada. This signals a strategic shift to move compliance and operational control out of China, reshaping enterprise multi-cloud and data sovereignty decisions.

NVIDIA Other 2026-07-16

NVIDIA Jetson Thor T3000/T2000: Blackwell GPU Crashes Edge AI Cost Barrier

NVIDIA unveils Jetson Thor T3000 and T2000 modules. The T3000 packs a Blackwell GPU and 8-core Neoverse CPU, delivering 865 FP4 TFLOPS at half the power of the T5000. New Jetson Agent Skills automate memory optimization, aiming to scale deployment of humanoid robots and edge AI.

NVIDIA Other 2026-07-07

NVIDIA Vera CPU: Max Single-Threaded Performance at Scale for Agentic AI

NVIDIA launches Vera CPU, a max single-threaded CPU at scale for agentic AI. With Olympus cores delivering 1.8x sustained per-core performance over x86, 1.2TB/s LPDDR5X bandwidth, and 3.4TB/s core-to-core bandwidth, Vera integrates into NVIDIA's unified AI factory architecture, aiming to lock users into its ecosystem.

Huawei Other 2026-06-25

Huawei Pushes Token-Based Billing at MWC Shanghai 2026: Shifting Carrier Monetization from Bytes to AI Inference Value

At MWC Shanghai 2026, Huawei urged carriers to shift from byte-based to token-based billing for AI workloads, showcasing a 372% token throughput improvement in long-sequence inference via its AI Inference Acceleration Solution. It also highlighted the Upper-6 GHz band as critical for AI wearables requiring 20 Mbps uplink, aiming to reposition 5G-A networks as AI compute delivery infrastructure.

Anthropic Other 2026-06-25

Anthropic Alleges Largest AI Distillation Attack by Alibaba-Linked Operators, Exposing API Security Gaps

Anthropic alerted U.S. senators that Alibaba-linked operators conducted the largest known distillation attack, generating 28.8 million model exchanges via 25,000 fraudulent accounts to harvest Claude's frontier capabilities. The incident exposes a critical vulnerability in AI API security, forcing a rethinking of inference endpoint protection and usage monitoring.

OpenAI Other 2026-06-25

Oracle Defense Ecosystem Cohort 3: Offline AI on Roving Edge Devices Goes Operational

Oracle announced the third cohort of its Defense Ecosystem at the Brussels summit, adding 10 companies. Concurrently, Whitespace's Saga AI system deployed on Oracle Roving Edge Devices during Royal Navy's Operation HIGHMAST, running classified AI workloads completely offline, proving sovereign edge AI is operational.

NVIDIA Other 2026-06-25

Qualcomm Dragonfly: 250-core CPU, HBC memory, UALink interconnects target AI inference TCO

Qualcomm unveils full data center portfolio: Dragonfly C1000 250-core Oryon CPU (>5GHz, PCIe Gen7, CXL), HBC near-memory compute (133TB/s Gen1, 18x-54x effective BW), AI300 inference accelerator (UALink/ESUN scale-up), and 800G/1.6T connectivity. Multi-year Meta CPU deal. Commercial sampling 2027-2028. Targets inference TCO with tokens-per-watt leadership.

OpenAI Other 2026-06-25

OpenAI and Broadcom Unveil Jalapeno Inference ASIC, Reshaping AI Hardware Landscape

OpenAI, in collaboration with Broadcom, has developed Jalapeno, a custom LLM inference accelerator. The chip uses a multi-chip module with HBM3E memory and achieved tape-out in just nine months. Designed for OpenAI's model stack, it aims to reduce inference costs and dependency on NVIDIA GPUs, with initial deployment planned for late 2026.

Cisco Other 2026-06-24

Cisco Live US & InfoComm 2026 : la collaboration entre dans l’ère agentique

...

NVIDIA Other 2026-06-24

NVIDIA and AWS Default GPU Vector Search with cuVS, G7 Instances Deliver 4.6x Inference

NVIDIA and AWS collaborate to embed cuVS as default GPU-accelerated vector search in OpenSearch Serverless, delivering 10x faster indexing at 1/4 cost. New EC2 G7 instances with RTX PRO 4500 Blackwell GPUs achieve up to 4.6x inference performance. AWS achieves GB300 Exemplar Cloud status for training.

NVIDIA Other 2026-06-23

NVIDIA Unveils 45°C Liquid Cooling for Rubin Chips, Slashes Water Use 100%

NVIDIA announces a liquid cooling system for its Rubin GPUs running 45°C coolant (hotter than a hot tub), using dry coolers in a closed loop to cut electricity and eliminate water evaporation (100% reduction). However, chillers may still be needed in hot climates, and chip longevity impacts remain unaddressed.

NVIDIA Other 2026-06-23

NVIDIA Launches Agent Toolkit: Nemotron Models, OpenShell Runtime for Specialized AI Agents

NVIDIA unveils Agent Toolkit, an open modular foundation with Nemotron models, NemoClaw blueprints, and OpenShell runtime, enabling enterprises to build secure, specialized AI agents. It targets life sciences, cybersecurity, and industrial workflows, aiming to turn frontier models into domain-specific digital coworkers.

Anthropic Other 2026-06-23

Micron-Anthropic Deal Locks AI Memory Demand, But Stock Price Already Priced In

Micron signed a long-term supply contract with Anthropic covering HBM, DRAM, and SSDs, with joint analysis of memory subsystems for AI workloads. Micron also participated in Anthropic's Series H. This aims to transform memory from a commodity to an AI infrastructure asset, but the stock has already run up, requiring proof of sustained scarcity premium.

NVIDIA Other 2026-06-23

NVIDIA Dominates TOP500 with Full-Stack Lock-in: Grace CPU, InfiniBand, and GPU Integration

NVIDIA powers 81% of TOP500 supercomputers, with Grace CPU adoption rising to 26 systems and Quantum InfiniBand connecting 376. The full-stack strategy (GPU+CPU+networking) shifts procurement from open components to single-vendor lock-in; top 8 Green500 systems use NVIDIA GPUs.

NVIDIA Other 2026-06-23

NVIDIA's AI Agents and Digital Twins Reshape Telecom Network Control Plane

At DTW Ignite 2026, NVIDIA showcases its AI agent platform integrating NeMo synthetic data, NemoClaw secure runtime, OpenShell sandbox, and RTX PRO 6000-accelerated digital twins, aiming for autonomous telecom operations. Partners include SoftBank, Amdocs, NTT DATA, etc., moving from task automation to full autonomy.

ARM Other 2026-06-23

Arm servers capture >45% data center revenue, x86 ecosystem under AI-driven assault

IDC reports Q1 2026 global server revenue hit a record $122.6B, with Arm-based servers capturing >45% share (x86 at 52%). Accelerated servers (GPU/ASIC/FPGA) generated >70% revenue. Nvidia's Grace CPU (NVL72) and hyperscaler custom Arm chips drive the shift; x86 still leads in unit volume but faces supply constraints.

Anthropic Other 2026-06-22

Micron-Anthropic Deal: Memory Co-Architecture Locks in AI Supply Chain

Micron and Anthropic sign a strategic agreement covering joint memory/storage architecture design, multi-year supply, Claude adoption, and investment. This ties frontier AI model demands directly to infrastructure design, aiming to optimize token economics and power efficiency, but essentially locks in supply and restructures the ecosystem.

NVIDIA Other 2026-06-22

Dell PowerEdge XE8812: Liquid-Cooled Density Trap with NVIDIA Vera Rubin NVL4

Dell launches PowerEdge XE8812 with NVIDIA Vera Rubin NVL4, delivering 144 GPUs per rack, 300kW+ power, and 100% direct liquid cooling. It offers a generational leap in memory and compute density for HPC and AI, but deeply locks users into Dell's PowerRack, iDRAC, and ORv3 ecosystem from chip to rack.

Reports

Filter

NVIDIA and Wistron Open US Factory for GB300 and Vera Rubin AI Superchips

AMD and HPE Launch Helios Open AI Infrastructure to Rival NVIDIA Ecosystem

Microsoft Azure Cuts 200-400 Jobs in China Amid Cloud Growth, Reshapes Geopolitical Compliance

NVIDIA Jetson Thor T3000/T2000: Blackwell GPU Crashes Edge AI Cost Barrier

NVIDIA Vera CPU: Max Single-Threaded Performance at Scale for Agentic AI

Huawei Pushes Token-Based Billing at MWC Shanghai 2026: Shifting Carrier Monetization from Bytes to AI Inference Value

Anthropic Alleges Largest AI Distillation Attack by Alibaba-Linked Operators, Exposing API Security Gaps

Oracle Defense Ecosystem Cohort 3: Offline AI on Roving Edge Devices Goes Operational

Qualcomm Dragonfly: 250-core CPU, HBC memory, UALink interconnects target AI inference TCO

OpenAI and Broadcom Unveil Jalapeno Inference ASIC, Reshaping AI Hardware Landscape

Cisco Live US & InfoComm 2026 : la collaboration entre dans l’ère agentique

NVIDIA and AWS Default GPU Vector Search with cuVS, G7 Instances Deliver 4.6x Inference

NVIDIA Unveils 45°C Liquid Cooling for Rubin Chips, Slashes Water Use 100%

NVIDIA Launches Agent Toolkit: Nemotron Models, OpenShell Runtime for Specialized AI Agents

Micron-Anthropic Deal Locks AI Memory Demand, But Stock Price Already Priced In

NVIDIA Dominates TOP500 with Full-Stack Lock-in: Grace CPU, InfiniBand, and GPU Integration

NVIDIA's AI Agents and Digital Twins Reshape Telecom Network Control Plane

Arm servers capture >45% data center revenue, x86 ecosystem under AI-driven assault

Micron-Anthropic Deal: Memory Co-Architecture Locks in AI Supply Chain

Dell PowerEdge XE8812: Liquid-Cooled Density Trap with NVIDIA Vera Rubin NVL4