inference - AI Infrastructure Intelligence Search

AMD Other High Signal 2026-03-18

AMD and NAVER Cloud Collaborate on Sovereign AI Infrastructure in Korea

AMD and NAVER Cloud announced a strategic collaboration to accelerate sovereign AI infrastructure in Korea. NAVER Cloud will expand deployment of AMD EPYC "Venice" CPUs and gain early access to next-gen Instinct MI455X GPUs, with joint optimization of AI services and software stacks on AMD platforms.

AMD Other High Signal 2026-03-18

AMD and Samsung Deepen Collaboration, Locking HBM4 Supply and Exploring Foundry Partnership

AMD and Samsung signed an MOU, designating Samsung as the primary HBM4 supplier for the next-gen Instinct MI455X GPU and collaborating on DDR5 memory optimized for 6th Gen EPYC CPUs. The companies will also explore opportunities for Samsung to provide foundry services for future AMD products.

NVIDIA Other High Signal 2026-03-18

NVIDIA and Telecom Operators Build AI Grids to Redistribute AI Inference

NVIDIA is partnering with global telecom operators like AT&T and Comcast to transform existing distributed network sites into 'AI Grids' for edge AI inference. This initiative aims to deploy AI compute closer to users and data, reducing latency and cost per token. It represents a strategic shift for telcos from being data carriers to distributed AI computing platforms.

Hewlett Packard Enterprise Other High Signal 2026-03-17

HPE Launches AI Grid with NVIDIA to Unify Distributed Inference Clusters

HPE announced the AI Grid at NVIDIA GTC, an end-to-end solution built on NVIDIA's reference architecture to securely connect distributed AI factories and inference clusters into a single intelligent system. It enables service providers to deploy and operate thousands of edge inference sites, meeting the predictable, low-latency infrastructure requirements of AI-native applications.

Hewlett Packard Enterprise Other High Signal 2026-03-17

HPE Unveils AI Grid Solution for AI WAN Fabric with NVIDIA

HPE announced a collaboration with NVIDIA to launch the AI Grid Solution, securely scaling edge AI. The solution transforms WAN into an AI WAN fabric, connecting distributed inference sites with AI factories for consistent policy and predictable performance. It enables service providers to evolve from connectivity to AI services.

Cisco Other High Signal 2026-03-17

Cisco Expands Secure AI Factory with NVIDIA to Edge and Security

Cisco expanded its Secure AI Factory with NVIDIA to support AI deployment from data centers to edge sites. The update adds security capabilities such as firewall policy enforcement on DPUs and AI Defense integration, and offers flexible architecture options intended to accelerate production scaling.

Hewlett Packard Enterprise Other High Signal 2026-03-16

HPE Alletra MP X10000 Becomes First NVIDIA-Certified Object Storage Platform for Enterprise AI

HPE announces its Alletra Storage MP X10000 is the first object-based platform certified by NVIDIA for enterprise AI. This signifies the extension of AI performance certification standards from the compute layer to the data layer, aiming to address data access bottlenecks in large-scale AI training, fine-tuning, and inference.

NVIDIA Other High Signal 2026-03-14

NVIDIA Releases Cosmos World Model Suite, Enhancing Synthetic Data and Reasoning for Physical AI

NVIDIA has released significant updates to its Cosmos World Foundation Models (WFM) suite, including Transfer 2.5, Predict 2.5, and Reason 2. These models are designed to accelerate the generation of high-fidelity, physics-aware synthetic data and support downstream fine-tuning and reasoning for physical AI systems like robotics and autonomous vehicles, addressing the bottleneck of real-world data scarcity.

Trend Micro Other High Signal 2026-03-03

Trend Micro Report Highlights AI Supply Chain Risks and Model Attack Surfaces

Trend Micro's 'Fault Lines in the AI Ecosystem' report systematically analyzes security risks in the AI supply chain, including training data poisoning, third-party plugin vulnerabilities, and model theft attacks. It indicates that enterprise AI security boundaries have expanded from traditional IT infrastructure to the model layer and data pipelines.

Cisco Other High Signal 2026-02-10

Cisco Launches AI Infrastructure Chip and AgenticOps Platform to Strengthen Unified Architecture Strategy

Cisco introduced the Silicon One G300 chip and the AgenticOps platform to optimize AI cluster network performance and job completion time, while simplifying hybrid cloud operations through the unified Nexus One management plane. Its updated AI Defense solution focuses on AI supply chain governance and runtime protection.

Cisco Other High Signal 2026-02-10

Cisco Launches G300 Chip and Systems for AI Agent-Era Data Center Networking

Cisco introduced the 102.4 Tbps Silicon One G300 switching chip, along with liquid-cooled N9000/8000 systems offering a claimed 70% improvement in energy efficiency, support for 1.6T optics, and an upgrade to the Nexus One unified management plane.

NVIDIA Other 2026-01-23

NVFP4 + TeaCache Drive 10x FLUX.2 Inference Speedup, Locking Blackwell Ecosystem

NVIDIA and BFL optimize FLUX.2 on DGX B200/B300 using NVFP4 4-bit quantization, TeaCache step skipping, CUDA Graphs, and torch.compile, achieving 6.3x (single GPU) to 10.2x (dual GPU) latency reduction vs H200, with 40% memory savings. The stack is tightly coupled to TensorRT-LLM visualgen and Blackwell hardware.

OpenAI Other Medium Signal 2026-01-14

OpenAI Partners with Cerebras to Enhance AI Inference Infrastructure

OpenAI partners with Cerebras to add 750MW of high-speed AI compute, targeting reduced inference latency and improved real-time performance for ChatGPT workloads. This underscores OpenAI's strategy of investing in specialized AI hardware for large-scale model services.

NVIDIA Other 2025-11-08

NVIDIA Launches Interactive AI Agent for GPU-Accelerated Data Science with Nemotron Nano-9B

NVIDIA unveils an interactive AI agent powered by Nemotron Nano-9B-v2 and CUDA-X libraries, enabling natural language orchestration of ML workflows. It achieves 3x-43x GPU acceleration over CPU for data processing, model training, and hyperparameter optimization.

Microsoft Other Medium Signal 2025-02-27

Microsoft Launches Phi-4 SLM Series to Enhance Edge AI and Multimodal Reasoning

Microsoft introduced the Phi-4 family of small language models (SLMs), featuring the 5.6B-parameter Phi-4-multimodal, which can process speech, vision, and text. The models are now available in Azure AI Foundry, Hugging Face, and NVIDIA's API Catalog, and are optimized for edge computing.

Google Other High Signal 2020-10-11

Google Cloud Integrates MCP with Apigee and Advances Agentic Platform to Evolve Enterprise APIs for AI Agents

Google Cloud announced the general availability of Model Context Protocol (MCP) in Apigee and the advancement of its Agentic Platform, aiming to transform traditional enterprise APIs into secure, governed tools for AI agents at scale. This move integrates API governance, security layers, and AI inference infrastructure, providing core platform capabilities for enterprises shifting from API-driven to agent-driven architectures.
