inference - AI Infrastructure Intelligence Search

Google Other 2026-04-29

Google Opens TPU Hardware to On-Prem, 8th-Gen Chips Target Nvidia

Google announced 8th-gen TPUs (8t for training with 3x performance over Ironwood, 8i for inference with 80% better perf/dollar) and plans to deliver TPU hardware directly to customer data centers. It also closed its Wiz acquisition to bolster AI security. The on-prem plan marks a strategic pivot from cloud-only provider to hardware supplier.

Cisco Other High Signal 2026-04-28

Cisco Leverages Hardware Refresh Cycle to Drive AI-Ready Data Center Architecture

Cisco argues that the core impediment to enterprise AI strategy is data center infrastructure. It advocates integrating AI readiness into routine hardware refresh cycles, emphasizing proactive operations, security embedded in the network fabric, end-to-end observability, and high-performance networking as foundational for AI infrastructure.

Microsoft Other High Signal 2026-04-28

Microsoft Scales Azure Local to Thousands of Nodes for Sovereign Private Cloud

Microsoft announced that its Azure Local platform now scales to support deployments of thousands of servers within a single sovereign boundary, providing infrastructure for large-scale sovereign private clouds. The platform operates in connected, intermittently connected, or fully disconnected environments and integrates hardware like Intel Xeon 6 processors, aiming to meet the combined demands for scale, control, and compliance from national infrastructure, regulated workloads, and on-premises AI inference.

AMD Other High Signal 2026-04-27

AMD Extends Edge AI Architecture to Space, Defining Orbital Computing Paradigm

AMD's CTO proposes applying the core principles of 'performance-per-watt' and 'mission-critical reliability' from terrestrial edge AI to space computing. The company is providing a repeatable platform foundation for in-orbit satellite intelligence and future orbital data centers through heterogeneous computing, open software stacks, and modular system design.

AMD Other High Signal 2026-04-27

AMD Highlights AI PC as Critical Infrastructure for Enterprise Agentic AI in IDC White Paper

AMD released an IDC white paper indicating that over 80% of enterprises are planning, piloting, or deploying AI PCs to support scaled Agentic AI. The report highlights high-performance NPUs and on-device AI processing as critical for enabling real-time, secure workflows, signaling a shift in enterprise AI infrastructure from cloud to endpoint.

NVIDIA Other High Signal 2026-04-24

NVIDIA Internalizes GPT-5.5 Powered AI Agents at Scale, Defining New Enterprise AI Infrastructure Paradigm

NVIDIA announced that more than 10,000 of its employees now use GPT-5.5 via the Codex app, running on NVIDIA GB200 NVL72 infrastructure. This demonstrates the technical feasibility of 'transformative' productivity gains from frontier model inference in enterprise workflows. It also provides a reference architecture for deploying AI agents with auditable, isolated security via dedicated cloud VMs.

NVIDIA Other High Signal 2026-04-22

NVIDIA and Google Cloud Deepen Collaboration to Build Cloud Infrastructure for AI Factories and Physical AI

NVIDIA and Google Cloud have announced an expanded collaboration, introducing new Vera Rubin and Blackwell GPU-powered instances to build "AI factories" scaling to nearly a million GPUs. The integration of Gemini, Nemotron, and other platforms aims to accelerate production deployment of agentic and physical AI, such as robotics and digital twins.

Google Other 2026-04-22

Google Cloud Next '26: Agent Gateway Seizes Control Plane, TPU 8i Locks Inference

At Google Cloud Next '26, Google announced 8th-gen TPUs (8t for training, 8i for inference), Agent Platform with Agent Gateway, Agent Identity, Agent-to-Agent Orchestration, Agentic Data Cloud, and Agentic Defense integrating Wiz. The move shifts control from infrastructure to agent orchestration, locking enterprises into a vertically integrated stack.

Anthropic Other High Signal 2026-04-21

Anthropic Signs $100B+ Deal with AWS to Lock in Decade of AI Compute

Anthropic signed a new agreement with Amazon AWS, committing over $100 billion over the next decade to secure up to 5GW of AI compute capacity and deeply integrate the Claude Platform into AWS. This move aims to address explosive demand for its Claude models and solidify its position as a key AI model provider on AWS.

Cisco Other High Signal 2026-04-16

Cisco and NVIDIA Elevate Network to AI Media Processing Control Plane

Cisco and NVIDIA deepen collaboration with a validated design based on the open-standard Media Exchange Layer (MXL). This integration merges Cisco's IP media fabric with NVIDIA's Holoscan platform, transforming the network from a transport layer into an active processing layer that supports real-time AI inference, enabling low-latency, multilingual AI-driven live media production for broadcasters.

Microsoft Other High Signal 2026-04-16

Microsoft Activates Fairwater Hyperscale AI Datacenter Ahead of Schedule, Setting New Infrastructure Standard

Microsoft announced the early activation of its Fairwater datacenter in Wisconsin, positioned as the world's most powerful AI facility. It integrates hundreds of thousands of NVIDIA GB200 GPUs into a single seamless cluster via massive fiber interconnect, targeting unprecedented compute scale for next-generation AI training and inference workloads.

NVIDIA Other High Signal 2026-04-15

NVIDIA Shifts AI Infrastructure Metric from FLOPS to Cost Per Token

NVIDIA advocates for "cost per token" as the primary economic metric for AI infrastructure, replacing "FLOPS per dollar." This shift moves the focus from computational inputs to business outputs, requiring full-stack optimization across hardware, software, and networking to lower enterprise AI inference TCO.
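
The metric described above can be illustrated with a minimal sketch. It assumes cost per token is approximated as the hourly cost of a serving system divided by the tokens it produces in that hour. The function name and all figures below are hypothetical and are not taken from NVIDIA's material.

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Dollars per one million generated tokens for one serving system.

    Assumption: cost per token ~= hourly system cost / tokens served per hour.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# Hypothetical systems: B costs more per hour but sustains higher throughput.
system_a = cost_per_million_tokens(hourly_cost_usd=10.0, tokens_per_second=2_000)
system_b = cost_per_million_tokens(hourly_cost_usd=15.0, tokens_per_second=5_000)

print(f"System A: ${system_a:.2f} per million tokens")
print(f"System B: ${system_b:.2f} per million tokens")
```

With these made-up inputs, the system that is more expensive per hour is still cheaper per token, which is the kind of output-side comparison an input-side metric such as FLOPS per dollar does not capture.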

Cisco Other High Signal 2026-04-14

Cisco Partners with Industrial Automation Leaders to Position Factory Floor as Unified AI Compute Platform

At Hannover Messe, Cisco, in partnership with Rockwell Automation and others, posits that the factory floor is evolving into a unified compute platform integrating control, visualization, and AI inference. The core is the Cisco Unified Edge architecture, which consolidates traditionally siloed PLCs, HMIs, SCADA, and AI workloads (e.g., vision inspection, predictive maintenance) to enable a shift from insight to real-time, closed-loop action.

Meta Other High Signal 2026-04-14

Meta-Broadcom Multi-Year 2nm AI Chip Partnership, Initial 1GW+ Deployment

Meta and Broadcom announced a multi-year, multi-generation strategic partnership to co-develop MTIA (Meta Training and Inference Accelerator) chips through 2029. Initial deployment exceeds 1GW, with multi-gigawatt expansion planned. The chip is described as the industry's first 2nm AI compute accelerator and is based on Broadcom's XPU platform. Meta has planned MTIA 300/400/450/500 iterations for recommendation, ranking, and large-scale inference. Broadcom CEO Hock Tan will step down from Meta's board and transition to a strategic advisor role.

Intel Other High Signal 2026-04-13

Intel, Nokia, and Dell Introduce Dedicated UPF Appliance for Far Edge

At MWC 2026, Intel, Nokia, and Dell previewed a far-edge UPF appliance powered by Intel Xeon 6 SoC. The solution aims to deliver high-performance, low-power 5G core user plane processing for telcos in space- and power-constrained far-edge environments, with integrated AI capabilities.

Intel Other High Signal 2026-04-09

Intel and Google Deepen Collaboration to Define Core of Heterogeneous AI Infrastructure

Intel and Google announced a multiyear collaboration to advance next-generation AI and cloud infrastructure. The core is reinforcing the central role of CPUs and custom IPUs in heterogeneous AI systems, optimizing performance and efficiency through multi-generational Xeon processors, and expanding co-development of ASIC-based IPUs to improve efficiency and predictable performance at hyperscale.

Intel Other High Signal 2026-04-09

Intel and Google Deepen Collaboration on CPU and IPU for Heterogeneous AI Infrastructure

Intel and Google announced a multi-year collaboration to advance next-generation AI and cloud infrastructure through aligned Xeon processor roadmaps and expanded co-development of custom ASIC-based IPUs. This reinforces the central role of CPUs in AI system orchestration and the critical value of IPUs in offloading infrastructure tasks to improve efficiency at hyperscale.

Intel Other High Signal 2026-04-08

Intel and SambaNova Announce Heterogeneous Inference Architecture for Agentic AI

Intel and SambaNova have announced a collaborative blueprint for Agentic AI production workloads. The heterogeneous design combines GPUs, SambaNova RDUs, and Intel Xeon 6 processors to address performance, efficiency, and software compatibility issues, with availability expected in H2 2026.

ARM Other 2026-04-07

Arm Partners with Monash University Malaysia to Advance Semiconductor Talent for AI Era

Arm announced a collaboration with Monash University Malaysia's School of Engineering, donating IC design development boards and appointing an executive as a guest lecturer. The initiative aims to cultivate semiconductor talent with hands-on Arm architecture and modern system design experience for the AI era.

NVIDIA Other High Signal 2026-04-05

NVIDIA Advances Physical AI Integration in Robotics

NVIDIA showcased physical AI breakthroughs for robotics, accelerating deployment via Isaac Sim simulation and Jetson Orin edge modules. In one case study, Aigen uses synthetic data training and open-world foundation models to enable solar-powered robots for precision weeding, reducing herbicide use by 90%.
