Reports
AI-generated structured vendor updates
NVIDIA and Telecom Operators Build AI Grids to Redistribute AI Inference
NVIDIA is partnering with global telecom operators like AT&T and Comcast to transform existing distributed network sites into 'AI Grids' for edge AI inference. This initiative aims to deploy AI compute closer to users and data, reducing latency and cost per token. It represents a strategic shift for telcos from being data carriers to distributed AI computing platforms.
NVIDIA Partners with Telecom Operators to Build Distributed AI Inference Grid
NVIDIA is collaborating with telecom operators to turn 100,000 global network sites and 100 GW of backup power into a distributed AI computing platform for low-latency inference. The AI grid has been validated in IoT and cloud gaming scenarios, achieving sub-500 ms latency and a 50% cost reduction.
Samsung and AMD Deepen AI Hardware Collaboration with HBM4 Supply and Foundry Services
Samsung will be the primary HBM4 supplier for AMD's next-generation MI455X GPU, delivering memory rated at 13 Gbps. The partners will also develop DDR5 solutions for 6th-generation EPYC CPUs and explore Samsung's foundry services for future AMD products.
HPE Unveils AI Grid Solution for AI WAN Fabric with NVIDIA
HPE announced a collaboration with NVIDIA to launch the AI Grid Solution for securely scaling edge AI. The solution turns the WAN into an AI WAN fabric that connects distributed inference sites with AI factories, with consistent policy and predictable performance. It lets service providers evolve from selling connectivity to delivering AI services.
NVIDIA Releases Open-Source Models and NemoClaw Stack for Local AI Agent Deployment
NVIDIA has launched the open-source Nemotron 3 Super 120B and Nano 4B models, plus the NemoClaw software stack, which optimizes OpenClaw on NVIDIA devices. The stack enables local model deployment for enhanced security, privacy, and cost avoidance. NVIDIA is also partnering with Unsloth on a web interface that simplifies model fine-tuning.
NVIDIA cuDF Accelerates Spark Data Processing for Enterprise A/B Testing
NVIDIA accelerates Apache Spark workflows on Google Kubernetes Engine using the cuDF GPU DataFrame library and CUDA-X libraries, delivering a 4x performance gain and a 76% cost reduction for Snap. The solution lets Spark applications migrate without code changes and processes over 10 PB of data.
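As a rough sanity check on how the two headline figures relate, the sketch below assumes that job cost is simply runtime multiplied by an hourly cluster price, and that the 4x gain is a runtime speedup. Neither assumption is stated in the report, and the function name is illustrative. Under that model, the two figures together imply that the GPU cluster's hourly price is close to the CPU cluster's.

```python
def implied_price_ratio(speedup: float, cost_reduction: float) -> float:
    """Return (GPU hourly price) / (CPU hourly price) implied by the two figures.

    Assumed model (not from the report): cost = runtime * hourly_price.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    if not 0 <= cost_reduction < 1:
        raise ValueError("cost_reduction must be in [0, 1)")
    # GPU cost / CPU cost, e.g. a 76% reduction leaves 24% of the original cost.
    cost_ratio = 1.0 - cost_reduction
    # cost_ratio = (gpu_price * cpu_runtime / speedup) / (cpu_price * cpu_runtime)
    # Solving for gpu_price / cpu_price gives cost_ratio * speedup.
    return cost_ratio * speedup


if __name__ == "__main__":
    ratio = implied_price_ratio(speedup=4.0, cost_reduction=0.76)
    print(f"Implied GPU/CPU hourly price ratio: {ratio:.2f}")
```

If the real pricing model differs (for example, per-query billing or reserved capacity), the implied ratio changes accordingly.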
Project Rheo: NVIDIA Shifts Robot Training Control from Real Hospitals to Simulation
NVIDIA unveils Project Rheo, a blueprint combining Isaac Sim, GR00T VLA models, and synthetic data generation for hospital robotics. Developers train Physical AI policies in digital twins, covering loco-manipulation (surgical tray pick-and-place) and precision bimanual tasks (trocar assembly), and use Cosmos Transfer 2.5 for cross-scene generalization.
Cisco Expands Secure AI Factory with NVIDIA to Edge and Security
Cisco is expanding its Secure AI Factory with NVIDIA to enable AI deployment from data centers to edge sites. The update adds security capabilities such as firewall policy enforcement on DPUs and AI Defense integration, and offers flexible architecture options to accelerate production scaling.
Intel Xeon 6 Selected as Host CPU for NVIDIA DGX Rubin, Enhancing AI Inference Infrastructure
Intel Xeon 6 has been chosen as the host CPU for the NVIDIA DGX Rubin NVL8 AI system, delivering 3x memory bandwidth and full-path confidential computing. The collaboration highlights the CPU's architectural role in data orchestration and security for AI inference workloads.
HPE Deepens AI Factory Partnership with NVIDIA, Unveils Full-Stack Supercomputing Solutions
At GTC 2026, HPE announced enhancements to its NVIDIA AI Computing portfolio, introducing full-stack solutions for large-scale AI factories and supercomputers. The offerings integrate compute, GPUs, networking, liquid cooling, software, and services to improve deployment efficiency and time-to-insight.
HPE Alletra MP X10000 Becomes First NVIDIA-Certified Object Storage Platform for Enterprise AI
HPE announces its Alletra Storage MP X10000 is the first object-based platform certified by NVIDIA for enterprise AI. This signifies the extension of AI performance certification standards from the compute layer to the data layer, aiming to address data access bottlenecks in large-scale AI training, fine-tuning, and inference.
NVIDIA Warp: Differentiable Physics Simulation for AI Training on GPU
NVIDIA Warp is a framework for GPU-accelerated, differentiable physics simulation. It enables writing high-performance kernels in Python, with automatic differentiation, and integrates with PyTorch/JAX. The 2D Navier-Stokes example demonstrates end-to-end optimization, reducing the cost of generating training data for physics AI.
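To illustrate what "differentiable simulation" and "end-to-end optimization" mean in practice, here is a minimal sketch in plain Python. It does not use Warp's API or the Navier-Stokes example. It substitutes a toy point-mass simulation and forward-mode automatic differentiation with dual numbers, and every name in it (`Dual`, `simulate`, `fit_launch_speed`) is hypothetical. The idea it shows is the same: because the derivative flows through every simulation step, a parameter can be tuned by gradient descent on a loss computed from the final state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value plus its derivative with respect to one chosen input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

    __rmul__ = __mul__


def _lift(x):
    """Wrap plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def simulate(v0, steps=100, dt=0.01, gravity=-9.81):
    """Explicit Euler integration of a point mass launched upward.

    Returns the final height as a Dual, so its derivative with respect
    to the launch speed is carried through every time step.
    """
    x, v = _lift(0.0), _lift(v0)
    for _ in range(steps):
        x = x + v * dt
        v = v + gravity * dt
    return x


def fit_launch_speed(target, v0=0.0, lr=0.25, iters=50):
    """Gradient descent on (final_height - target)^2 through the simulator."""
    for _ in range(iters):
        x_final = simulate(Dual(v0, 1.0))  # seed d(v0)/d(v0) = 1
        err = x_final - target
        loss = err * err
        v0 -= lr * loss.dot  # loss.dot is d(loss)/d(v0)
    return v0


if __name__ == "__main__":
    v = fit_launch_speed(target=2.0)
    print(f"launch speed: {v:.4f}, final height: {simulate(v).val:.4f}")
```

Frameworks aimed at large simulations typically use reverse-mode differentiation instead, since it scales better when there are many parameters and a single scalar loss. Forward mode is used here only because it fits in a few lines.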
NVIDIA and Thinking Machines Lab Form Gigawatt-Scale AI Infrastructure Partnership
NVIDIA and Thinking Machines Lab announced deployment of at least one gigawatt of next-gen Vera Rubin systems for cutting-edge AI model training. This collaboration sets a new benchmark for hyperscale AI compute demand, signaling a move towards gigawatt-scale AI infrastructure.
NVIDIA Launches RTX PRO Server Virtualization for Game Development AI Infrastructure
NVIDIA introduces RTX PRO Server, a centralized virtualized GPU platform built on the RTX PRO 6000 GPU and vGPU software. It uses MIG technology together with vGPU to share a single GPU among up to 48 user instances, improving resource utilization and team collaboration. The solution integrates AI training with graphics workflows for dynamic resource allocation and unified cross-region development.
NVIDIA Enhances AI Video Generation Platform via ComfyUI Optimization and Hardware Synergy
NVIDIA announced major updates for local AI video generation at GDC, featuring a simplified ComfyUI interface, native NVFP4/FP8 format support delivering 2.5x performance gains, and RTX Video Super Resolution nodes for efficient 4K upscaling. These optimizations lower the barrier to entry and improve efficiency through tight software-hardware integration.
TSMC Shifts to System-Level Foundry Services via Technology Platform Strategy
TSMC introduces a technology platform strategy combining advanced processes and 3D packaging to deliver customized semiconductor solutions for mobile, HPC, automotive, and IoT. This marks a shift from pure-play foundry to system-level solutions, enhancing customer lock-in and service barriers through vertical integration.
AMD Releases Complete ROCm Technical Documentation to Strengthen AI Development Ecosystem
AMD released comprehensive ROCm technical documentation covering installation, system optimization, and performance tuning guides, with specialized optimization for MI300X GPUs. The documentation supports multiple programming models including HIP and OpenCL, improving GPU utilization efficiency for AI/HPC workloads.
NVIDIA Extends CUDA Tile Programming Model to Julia Language
NVIDIA introduces its CUDA Tile high-level GPU programming model to the Julia ecosystem via the cuTile.jl package. The move aims to lower the barrier to high-performance GPU kernel development by abstracting low-level thread and memory management behind a tile-based data model, while keeping syntax and performance close to the Python version.
Apple Introduces M5 Chip with Enhanced AI Compute Capabilities
Apple launches a new MacBook Air with its in-house M5 chip, claiming the world's fastest CPU cores and a 4x AI processing boost over M4. The laptop features neural accelerators, Wi-Fi 7 support, and base storage doubled to 512 GB.
Apple M5 Chips Integrate Neural Accelerators for Enhanced Local AI Inference
Apple launches the M5 Pro and M5 Max chips, built on a Fusion architecture that integrates a dual-die SoC and places a neural accelerator in each GPU core for a 4x AI performance boost. Unified memory offers up to 614 GB/s of bandwidth and up to 128 GB of capacity, optimized for local LLM processing and AI model training.
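To put the memory figures in context for local LLM use, the sketch below applies a common rule of thumb: in single-stream text generation every weight is read roughly once per token, so memory bandwidth divided by model size gives an upper bound on tokens per second. The 70B-parameter, 4-bit model is a hypothetical example rather than something from the report, and the estimate ignores KV-cache traffic, activations, and compute limits.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (billions of params * bytes per param)."""
    return params_billions * bits_per_weight / 8


def max_decode_tokens_per_s(bandwidth_gb_s: float,
                            params_billions: float,
                            bits_per_weight: float) -> float:
    """Bandwidth-bound ceiling: each token requires one pass over all weights."""
    return bandwidth_gb_s / weight_size_gb(params_billions, bits_per_weight)


def fits_in_memory(params_billions: float,
                   bits_per_weight: float,
                   ram_gb: float) -> bool:
    """True if the weights alone fit in RAM (no headroom for cache or OS)."""
    return weight_size_gb(params_billions, bits_per_weight) <= ram_gb


if __name__ == "__main__":
    # Figures from the report: 614 GB/s bandwidth, 128 GB unified memory.
    # Hypothetical model: 70B parameters quantized to 4 bits per weight.
    print(f"weights: {weight_size_gb(70, 4):.0f} GB")
    print(f"fits in 128 GB: {fits_in_memory(70, 4, 128)}")
    print(f"ceiling: {max_decode_tokens_per_s(614, 70, 4):.1f} tokens/s")
```

Real throughput will be lower than this ceiling. The calculation is mainly useful for comparing configurations, not for predicting measured performance.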