AI inference - AI Infrastructure Intelligence Search

ARM Other High Signal 2026-05-07

Arm Reports Record Results, AGI CPU Emerges as New AI Infrastructure Focal Point

Arm reported record FY2026 revenue of $4.92B, marking three consecutive years of growth above 20%. The highlight is the Arm AGI CPU, designed for agentic AI, which has drawn over $2B in customer demand and backing from Meta, AWS, Google, and others.

Intel Other Medium Signal 2026-05-06

Intel at Computex 2026 Emphasizes CPU's Critical Role in AI Compute

Intel will outline its vision for the AI-driven computing era at Computex 2026, centering on the resurgence of the CPU as a critical AI engine. It emphasizes CPU-GPU/accelerator synergy to build efficient, scalable AI systems atop the broad x86 ecosystem.

Intel Other Medium Signal 2026-05-04

Intel Appoints Leadership to Integrate Client Computing and Physical AI

Intel appointed Alex Katouzian as EVP/GM of Client Computing and Physical AI Group, and named Pushkar Ranade as CTO. This move aims to align traditional PC business with physical AI systems (robotics, autonomous machines) and advance frontier technologies like quantum computing.

Cisco Other High Signal 2026-05-01

Cisco Report Reveals Fundamental Impact of Agentic AI on WAN Traffic Patterns

Cisco released a research report based on real-world network traffic data that quantifies, for the first time, the disruptive impact of agentic AI on WAN traffic patterns, symmetry, and critical paths. The report predicts that AI inference traffic will make up 25% of total network traffic by 2035.

Intel Other High Signal 2026-04-30

Intel Collaborates with ChatPPT to Launch Hybrid AI PC Edition, Driving AI Workload Localization

Intel partnered with AI app ChatPPT to launch a hybrid AI PC edition using Intel's AI Super Builder technology. This version offloads certain AI workloads (e.g., formatting) from the cloud to the local PC, reducing cloud token costs by over 50%, boosting usage duration by 32%, and enhancing data privacy.

NVIDIA Other High Signal 2026-04-30

NVIDIA Releases Enterprise AI Factory Reference Architectures, Standardizing On-Premises AI Infrastructure

NVIDIA has released Enterprise AI Factory Reference Architectures, offering three standardized configurations from RTX PRO to NVL72 for on-premises deployments. This architecture integrates compute, networking, storage, and software, aiming to transform AI infrastructure from experimental setups into predictable, scalable industrial operational platforms.

AMD Other High Signal 2026-04-29

AMD and Liquid AI Discuss Efficient AI Architecture from Silicon to Systems

AMD's CTO and Liquid AI's CEO discuss the evolution of AI architecture, emphasizing efficiency as key to extending AI from the cloud to edge and endpoint devices. They argue that co-design from silicon to systems enables low-power, responsive AI inference, supporting always-on agents and multi-model orchestration.

Cisco Other High Signal 2026-04-28

Cisco Leverages Hardware Refresh Cycle to Drive AI-Ready Data Center Architecture

Cisco argues that the core impediment to enterprise AI strategy is data center infrastructure. It advocates integrating AI readiness into routine hardware refresh cycles, emphasizing proactive operations, security embedded in the network fabric, end-to-end observability, and high-performance networking as foundational for AI infrastructure.

Microsoft Other High Signal 2026-04-28

Microsoft Scales Azure Local to Thousands of Nodes for Sovereign Private Cloud

Microsoft announced that its Azure Local platform now scales to support deployments of thousands of servers within a single sovereign boundary, providing infrastructure for large-scale sovereign private clouds. The platform operates in connected, intermittently connected, or fully disconnected environments and integrates hardware like Intel Xeon 6 processors, aiming to meet the combined demands for scale, control, and compliance from national infrastructure, regulated workloads, and on-premises AI inference.

Google Other 2026-04-22

Google Cloud Next '26: Agent Gateway Seizes Control Plane, TPU 8i Locks Inference

At Google Cloud Next '26, Google announced 8th-gen TPUs (8t for training, 8i for inference), Agent Platform with Agent Gateway, Agent Identity, Agent-to-Agent Orchestration, Agentic Data Cloud, and Agentic Defense integrating Wiz. The move shifts control from infrastructure to agent orchestration, locking enterprises into a vertically integrated stack.

Cisco Other High Signal 2026-04-16

Cisco and NVIDIA Elevate Network to AI Media Processing Control Plane

Cisco and NVIDIA have deepened their collaboration with a validated design based on the open-standard Media Exchange Layer (MXL). The integration merges Cisco's IP media fabric with NVIDIA's Holoscan platform, turning the network from a transport layer into an active processing layer that supports real-time AI inference. For broadcasters, this enables low-latency, multilingual, AI-driven live media production.

NVIDIA Other High Signal 2026-04-15

NVIDIA Shifts AI Infrastructure Metric from FLOPS to Cost Per Token

NVIDIA advocates for "cost per token" as the primary economic metric for AI infrastructure, replacing "FLOPS per dollar." This shift moves the focus from computational inputs to business outputs, requiring full-stack optimization across hardware, software, and networking to lower enterprise AI inference TCO.
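The arithmetic behind a cost-per-token figure can be sketched as follows. This is a minimal illustration, not NVIDIA's methodology: the helper name `cost_per_million_tokens`, the hourly cost, and the throughput values are all hypothetical, chosen only to show why the same hardware spend yields a lower cost per token when full-stack optimization raises throughput.

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Hypothetical helper: USD spent per one million generated tokens.

    hourly_cost_usd   -- assumed all-in hourly cost of the serving system
    tokens_per_second -- assumed sustained output throughput of that system
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd * 1_000_000 / tokens_per_hour


# Illustrative numbers only (assumptions, not vendor data):
# the same $36/hour system before and after a throughput improvement.
baseline = cost_per_million_tokens(hourly_cost_usd=36.0, tokens_per_second=10_000)
optimized = cost_per_million_tokens(hourly_cost_usd=36.0, tokens_per_second=20_000)

print(f"baseline:  ${baseline:.2f} per 1M tokens")
print(f"optimized: ${optimized:.2f} per 1M tokens")
```

In this sketch the hardware cost (the input) is unchanged, yet doubling delivered throughput halves the cost per token (the output), which is the distinction the metric is meant to capture.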

Intel Other High Signal 2026-04-13

Intel, Nokia, and Dell Introduce Dedicated UPF Appliance for Far Edge

At MWC 2026, Intel, Nokia, and Dell previewed a far-edge UPF appliance powered by Intel Xeon 6 SoC. The solution aims to deliver high-performance, low-power 5G core user plane processing for telcos in space- and power-constrained far-edge environments, with integrated AI capabilities.

Intel Other High Signal 2026-04-09

Intel and Google Deepen Collaboration to Define Core of Heterogeneous AI Infrastructure

Intel and Google announced a multiyear collaboration to advance next-generation AI and cloud infrastructure. It reinforces the central role of CPUs and custom IPUs in heterogeneous AI systems, optimizes performance and efficiency across multiple generations of Xeon processors, and expands co-development of ASIC-based IPUs to improve efficiency and deliver predictable performance at hyperscale.

Intel Other High Signal 2026-04-09

Intel and Google Deepen Collaboration on CPU and IPU for Heterogeneous AI Infrastructure

Intel and Google announced a multiyear collaboration to advance next-generation AI and cloud infrastructure through aligned Xeon processor roadmaps and expanded co-development of custom ASIC-based IPUs. This reinforces the central role of CPUs in AI system orchestration and the critical value of IPUs in offloading infrastructure tasks to improve efficiency at hyperscale.

Intel Other High Signal 2026-04-08

Intel and SambaNova Announce Heterogeneous Inference Architecture for Agentic AI

Intel and SambaNova have announced a collaborative blueprint for agentic AI production workloads. The heterogeneous design combines GPUs, SambaNova RDUs, and Intel Xeon 6 processors to address performance, efficiency, and software compatibility issues, with availability expected in H2 2026.

ARM Other 2026-04-07

Arm Partners with Monash University Malaysia to Advance Semiconductor Talent for AI Era

Arm announced a collaboration with Monash University Malaysia's School of Engineering, donating IC design development boards and appointing an executive as a guest lecturer. The initiative aims to cultivate semiconductor talent with hands-on Arm architecture and modern system design experience for the AI era.

NVIDIA Other High Signal 2026-04-03

NVIDIA and Google Optimize Gemma 4 for Enhanced Local AI Agent Infrastructure

NVIDIA announced a collaboration with Google to deeply optimize the Gemma 4 series of open models for its RTX, DGX Spark, and Jetson platforms. The move aims to extend high-performance, multimodal AI inference from the cloud to edge devices and personal workstations, providing full-stack model support (2B to 31B) for local AI agents.

NVIDIA Other Medium Signal 2026-04-03

NVIDIA Optimizes Gemma 4 Models for Local Agentic AI Acceleration

NVIDIA collaborates with Google to optimize the Gemma 4 family of models for efficient performance across a range of NVIDIA hardware, from edge devices to high-performance GPUs. These models support various tasks including reasoning, coding, and agent capabilities, making them suitable for local agentic AI applications.

AMD Other High Signal 2026-04-02

AMD Announces Breakthrough MLPerf Inference 6.0 Results, Showcasing Multinode Scaling and Multimodal Capabilities

AMD's MLPerf Inference 6.0 submission, powered by Instinct MI355X GPUs, surpassed 1 million tokens per second for the first time on models like Llama 2 70B and GPT-OSS-120B. The results highlight efficient multinode scaling, rapid enablement of new workloads (e.g., text-to-video model Wan-2.2-t2v), and reproducible performance across a broad partner ecosystem.
