Reports
AI-generated structured vendor updates
Cisco Deepens Nutanix Partnership, Extending HCI to AI and Edge
Cisco announced multiple advancements in its partnership with Nutanix, focusing on integrating the Nutanix Cloud Platform into Cisco AI PODs, Cisco Unified Edge, and FlashStack. The goal is to provide a unified, validated blueprint and operational model for both AI and traditional workloads from core to edge.
Microsoft Partners with Domestic Operators to Build Sovereign AI Infrastructure in Japan
Microsoft announced a $10B investment in Japan over four years, with a key pillar being a collaboration with Sakura Internet and SoftBank. This partnership will offer GPU-based AI compute services through Azure, managed by domestic providers to ensure data residency within Japan. This addresses the demand for sovereign AI infrastructure for sensitive workloads.
Anthropic Locks in Multi-Gigawatt Next-Gen TPU Capacity with Google and Broadcom
Anthropic has signed a new agreement with Google and Broadcom to secure multiple gigawatts of next-generation TPU capacity, expected online starting 2027. This expansion aims to power frontier Claude models and meet surging global customer demand. The partnership significantly expands Anthropic's $50 billion U.S. compute infrastructure commitment.
NVIDIA and Google Optimize Gemma 4 for Enhanced Local AI Agent Infrastructure
NVIDIA announced a collaboration with Google to optimize the Gemma 4 series of open models for its RTX, DGX Spark, and Jetson platforms. The move aims to extend high-performance, multimodal AI inference from the cloud to edge devices and personal workstations, supporting the full model range (2B to 31B parameters) for local AI agents.
NVIDIA Optimizes Gemma 4 Models for Local Agentic AI Acceleration
NVIDIA collaborates with Google to optimize the Gemma 4 family of models for efficient performance across a range of NVIDIA hardware, from edge devices to high-performance GPUs. These models support various tasks including reasoning, coding, and agent capabilities, making them suitable for local agentic AI applications.
Google Launches Gemma 4 Open Models, Targeting Edge Inference and AI Agent Architecture
Google introduces the Gemma 4 open model family, with four sizes from 2B to 31B parameters, emphasizing breakthrough intelligence-per-parameter and native support for agentic workflows, multimodality, and long context. The small models are engineered for edge devices, aiming to bring frontier reasoning to mobile and IoT scenarios.
Google Launches Gemma 4 Open Model Family
Google introduces the Gemma 4 open model family in four size variants, optimized for edge and mobile devices. The series supports multimodal processing, long context windows, and 140+ languages, and is released under the Apache 2.0 license.
Cisco Launches Validated AI Infrastructure Solution
Cisco introduced validated AI infrastructure designs in collaboration with NVIDIA and Red Hat, offering pre-integrated AI POD solutions that address the compatibility and security challenges enterprises face when building AI infrastructure themselves. The solution covers the complete compute, networking, storage, and AI software stack and scales modularly.
AMD Announces Breakthrough MLPerf Inference 6.0 Results, Showcasing Multinode Scaling and Multimodal Capabilities
AMD's MLPerf Inference 6.0 submission, powered by Instinct MI355X GPUs, surpassed 1 million tokens per second for the first time on models like Llama 2 70B and GPT-OSS-120B. The results highlight efficient multinode scaling, rapid enablement of new workloads (e.g., text-to-video model Wan-2.2-t2v), and reproducible performance across a broad partner ecosystem.
Intel Demonstrates AI Performance with Xeon 6 and Arc Pro GPUs in MLPerf Inference
Intel showcased the performance of its Xeon 6 CPUs and Arc Pro B-Series GPUs in the MLPerf Inference v6.0 benchmarks, particularly in handling large language models (LLMs). The results indicate that a system with four Arc Pro B70 GPUs can process 120B parameter models, delivering up to 1.8x higher inference performance in multi-GPU setups.
Qualcomm Launches NPU-Integrated Wearable Platform to Advance On-Device AI and Personal AI Ecosystem
Qualcomm unveiled the Snapdragon Wear Elite platform, its first wearable platform with an integrated NPU designed for on-device AI, capable of supporting up to two-billion-parameter models. It marks a strategic shift from smartphone-centric to agent-centric computing, leveraging wearables for continuous context and enabling intelligence to flow across a user's device ecosystem.
Cisco Proposes Unified AI Fabric Architecture for Training/Inference Traffic
Cisco introduces a unified AI fabric architecture that uses N9000 switches to route both training and inference traffic, addressing the resource inefficiencies of dual-fabric setups. The solution features silicon-level low latency, real-time telemetry, and automated policy tuning, and targets neocloud providers undergoing platform transformation.
NVIDIA Collaborates with Energy Leaders to Position AI Factories as Smart Grid Assets
NVIDIA, in collaboration with Emerald AI, proposes treating large-scale AI data centers (AI factories) as flexible, intelligent grid assets rather than static power loads. This architecture integrates accelerated computing, power networking, and control to enhance grid reliability and optimize energy efficiency. Several major energy companies plan to collaborate on this architecture to support AI workloads and accelerate power connection.
NVIDIA Collaborates with Energy Leaders on AI Factory-Grid Integration Architecture
NVIDIA and Emerald AI introduced a new architecture treating AI factories as intelligent grid assets, combining accelerated computing, real-time energy orchestration and reference designs. The Vera Rubin DSX-based approach enables dynamic grid response and has gained support from multiple energy providers.
AWS and TGS Strategic Partnership for Energy AI and HPC Transformation
TGS selected AWS as its preferred cloud provider, leveraging AWS HPC and generative AI for energy exploration solutions. The collaboration includes modernizing the TGS Imaging AnyWare platform and deploying a multimodal Subsurface Foundation Model with AWS Nitro security.
Cisco Launches Nexus Hyperfabric AI with 800G Switch and HGX B300 GPU Integration
Cisco introduces its Nexus Hyperfabric AI infrastructure, which integrates 800G Ethernet switches and NVIDIA HGX B300 GPUs and is offered in both fully integrated and flexible 'bring-your-own' deployment models. The solution aligns with NVIDIA's Cloud Partner program to streamline AI infrastructure deployment and operations.
Nokia and Stelia Collaborate to Integrate Open Networking with AI Platform for Distributed AI
Nokia has partnered with AI platform company Stelia to deeply integrate open-standards-based networking technology with an enterprise AI platform. This move aims to address performance, governance, and security challenges in deploying production-grade AI across distributed environments, ensuring high-throughput, low-latency data flow.
Intel and CrowdStrike Deepen AI PC Security Integration for Enhanced Endpoint Threat Detection
Intel and CrowdStrike expanded their collaboration to integrate the Falcon platform with Intel AI PC hardware, leveraging on-device AI acceleration across the CPU, GPU, and NPU along with chip-level telemetry. The solution aims to enable real-time threat detection and intrusion prevention without performance loss, addressing generative AI data leakage risks at enterprise scale.
NVIDIA Demonstrates AI Factories as Flexible Grid Assets for Peak Demand Management
NVIDIA, in collaboration with EPRI, National Grid, and Emerald AI, demonstrated how AI factories powered by Blackwell GPU clusters can dynamically adjust power consumption in response to grid signals. This allows them to act as 'shock absorbers' during peak demand while maintaining performance for high-priority AI workloads.
NVIDIA and Emerald AI Demonstrate Dynamic Energy Adjustment in AI Factories
NVIDIA partners with Emerald AI to demonstrate grid-responsive energy management on a cluster of 96 Blackwell Ultra GPUs. The demonstration uses the NVIDIA System Management Interface (nvidia-smi) for real-time power telemetry and Emerald AI Conductor to dynamically adjust energy use while maintaining performance for high-priority AI workloads.
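For readers who want a concrete picture of the power telemetry side, the following is a minimal Python sketch of reading per-GPU power draw through nvidia-smi. It is an illustration only and not the method used in the demonstration: the summary does not describe how Emerald AI Conductor consumes telemetry, and the names GpuPower, parse_power_csv, total_draw_watts, and read_gpu_power are hypothetical helpers introduced here. The sketch assumes nvidia-smi is on the PATH and relies on its CSV query output for the index, power.draw, and power.limit fields.

```python
import subprocess
from dataclasses import dataclass
from typing import List, Optional

# Query per-GPU index, current power draw, and power limit as bare CSV values.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,power.draw,power.limit",
    "--format=csv,noheader,nounits",
]


@dataclass
class GpuPower:
    """Hypothetical record type for one GPU's power reading (watts)."""
    index: int
    draw_watts: Optional[float]
    limit_watts: Optional[float]


def _to_watts(field: str) -> Optional[float]:
    """Convert a CSV field to float; non-numeric values such as [N/A] become None."""
    try:
        return float(field)
    except ValueError:
        return None


def parse_power_csv(text: str) -> List[GpuPower]:
    """Parse nvidia-smi CSV output (one GPU per line) into GpuPower records."""
    readings = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        readings.append(
            GpuPower(int(parts[0]), _to_watts(parts[1]), _to_watts(parts[2]))
        )
    return readings


def total_draw_watts(readings: List[GpuPower]) -> float:
    """Sum the power draw over GPUs that reported a numeric value."""
    return sum(r.draw_watts for r in readings if r.draw_watts is not None)


def read_gpu_power() -> List[GpuPower]:
    """Run nvidia-smi once and return the parsed readings (requires an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_power_csv(result.stdout)
```

On a machine with NVIDIA GPUs, calling read_gpu_power() at a fixed interval and passing the result to total_draw_watts() would give a node-level power figure that a separate controller could compare against a grid signal. The control logic itself is outside the scope of this sketch.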