Reports
AI-generated structured vendor updates
Google Launches Gemma 4 Open Models, Targeting Edge Inference and AI Agent Architecture
Google introduces the Gemma 4 open model family, with four sizes from 2B to 31B parameters, emphasizing breakthrough intelligence-per-parameter and native support for agentic workflows, multimodality, and long context. The small models are engineered for edge devices, aiming to bring frontier reasoning to mobile and IoT scenarios.
Google Launches Gemma 4 Open Model Family
Google introduces the Gemma 4 open model family in four size variants, optimized for edge and mobile devices. The series supports multimodal processing, long context windows, and 140+ languages, and is released under the Apache 2.0 license.
AMD Announces Breakthrough MLPerf Inference 6.0 Results, Showcasing Multinode Scaling and Multimodal Capabilities
AMD's MLPerf Inference 6.0 submission, powered by Instinct MI355X GPUs, surpassed 1 million tokens per second for the first time on models like Llama 2 70B and GPT-OSS-120B. The results highlight efficient multinode scaling, rapid enablement of new workloads (e.g., text-to-video model Wan-2.2-t2v), and reproducible performance across a broad partner ecosystem.
Cisco Discloses Memory Poisoning Attack Method in AI Coding Assistants
Cisco's security team discovered and validated a persistent memory poisoning attack targeting AI coding assistants such as Claude Code, demonstrating how tampering with MEMORY.md system files can persistently manipulate AI behavior. The finding prompted Anthropic to remove system prompt privileges from user memory files in v2.1.50.
Intel Demonstrates AI Performance with Xeon 6 and Arc Pro GPUs in MLPerf Inference
Intel showcased the performance of its Xeon 6 CPUs and Arc Pro B-Series GPUs in the MLPerf Inference v6.0 benchmarks, particularly on large language models (LLMs). The results indicate that a system with four Arc Pro B70 GPUs can process 120B-parameter models, delivering up to 1.8x higher inference performance in multi-GPU setups.
Arm Expands into Silicon Products with First Self-Designed AGI CPU
Arm is expanding its compute platform into production silicon for the first time, launching the self-designed Arm AGI CPU for AI data centers and agentic workloads. It targets over 2x performance per rack versus x86 platforms and is backed by lead partner Meta, customers like OpenAI, and a broad OEM/ODM ecosystem.
NVIDIA Introduces Physical AI Data Factory Blueprint, Transforming Compute into Synthetic Data
At GTC, NVIDIA introduced the Physical AI Data Factory Blueprint, an open reference architecture designed to transform compute into large-scale, high-quality synthetic training data. Built on Cosmos world models and the OSMO operator, it addresses the bottleneck of scaling real-world data, aiming to serve as the data engine for next-gen autonomous systems and robots.
Arm Launches AGI CPU for Agentic AI Infrastructure Era
Arm introduces the Arm AGI CPU, its first silicon product, built on Neoverse and designed for agentic AI infrastructure. Optimized for massively parallel workloads, it supports 272 cores per blade in a 1OU design, delivering 8,160 cores per rack and over 2x performance vs. x86 systems.
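The two density figures in this summary can be cross-checked with simple arithmetic. The sketch below assumes the per-rack figure is an exact multiple of the per-blade figure. The function name `blades_per_rack` is illustrative, and the resulting blade count is inferred from the quoted numbers, not stated in the announcement.

```python
# Figures quoted in the summary above.
CORES_PER_BLADE = 272
CORES_PER_RACK = 8160


def blades_per_rack(cores_per_rack: int, cores_per_blade: int) -> int:
    """Return the implied blade count, insisting on an exact division."""
    blades, remainder = divmod(cores_per_rack, cores_per_blade)
    if remainder:
        raise ValueError("rack core count is not a whole number of blades")
    return blades


print(blades_per_rack(CORES_PER_RACK, CORES_PER_BLADE))  # prints 30
```

Under that assumption, the quoted figures imply 30 blades per rack.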
Arm Neoverse Reshapes Control Layer in AI Infrastructure
Arm introduces Neoverse infrastructure CPU cores optimized for cloud, AI, and HPC workloads, adopted by NVIDIA, AWS, Microsoft, and Google for their AI platforms, delivering performance gains and energy efficiency. This architecture enables high-density AI workload deployment in cloud and edge environments with enhanced multi-tenant security.
NVIDIA Donates GPU Dynamic Resource Allocation Driver to Kubernetes Community
NVIDIA donated its GPU Dynamic Resource Allocation (DRA) driver to the CNCF, making it an upstream Kubernetes project. This move aims to shift the core control point of GPU orchestration from proprietary vendor layers to the open-source community, and drive standardization in collaboration with major cloud providers.
NVIDIA IGX Thor: 8x Edge AI Compute with ConnectX-7 Network Lock-In
NVIDIA launches the IGX Thor edge AI platform with a Blackwell GPU, up to 5,581 FP4 TFLOPS, dual 200GbE RDMA via ConnectX-7, and ISO 26262 safety. Pin compatibility with Jetson Thor and a 10-year lifecycle enable seamless migration but create vendor lock-in through proprietary networking and GPU dependencies.
Arm and NVIDIA Drive Localization Revolution in AI Workstations
Arm and NVIDIA jointly launch DGX Spark AI workstations based on GB10 Grace Blackwell chips, with eight major OEMs releasing products simultaneously. The solution features a unified memory architecture that supports 200B-parameter models locally, with third-party tests showing 41% faster rendering and 3.2x AI processing speed versus x86 alternatives, enabling seamless cloud-to-edge toolchain migration.
SK Hynix Jumps to TSMC 3nm for HBM4E Logic Die to Counter Samsung's 4nm Lead
SK Hynix plans to use TSMC's 3nm process for the logic die in its 7th-gen HBM4E, a leap from the 12nm used in HBM4. This aims to reverse the performance gap with Samsung (which used 4nm logic in HBM4) and deliver higher bandwidth and power efficiency for next-gen AI chips like NVIDIA's Vera Rubin Ultra.
AMD and NAVER Cloud Collaborate on Sovereign AI Infrastructure in Korea
AMD and NAVER Cloud announced a strategic collaboration to accelerate sovereign AI infrastructure in Korea. NAVER Cloud will expand deployment of AMD EPYC "Venice" CPUs and gain early access to next-gen Instinct MI455X GPUs, with joint optimization of AI services and software stacks on AMD platforms.
AMD and Samsung Deepen Collaboration, Locking HBM4 Supply and Exploring Foundry Partnership
AMD and Samsung signed an MOU, designating Samsung as the primary HBM4 supplier for the next-gen Instinct MI455X GPU and collaborating on DDR5 memory optimized for 6th Gen EPYC CPUs. The companies will also explore opportunities for Samsung to provide foundry services for future AMD products.
Project Rheo: NVIDIA Shifts Robot Training Control from Real Hospitals to Simulation
NVIDIA unveils Project Rheo, a blueprint combining Isaac Sim, GR00T VLA models, and synthetic data generation for hospital robotics. Developers train Physical AI policies in digital twins, covering loco-manipulation (surgical tray pick-and-place) and precision bimanual tasks (trocar assembly), with Cosmos Transfer 2.5 for cross-scene generalization.
NVIDIA Warp: Differentiable Physics Simulation for AI Training on GPU
NVIDIA Warp is a framework for GPU-accelerated, differentiable physics simulation. It enables writing high-performance kernels in Python, with automatic differentiation, and integrates with PyTorch/JAX. The 2D Navier-Stokes example demonstrates end-to-end optimization, reducing the cost of generating training data for physics AI.
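To make "differentiable simulation" concrete, here is a minimal sketch of the idea in plain Python. It does not use Warp or its API, and it swaps in forward-mode automatic differentiation via a hand-written `Dual` number class rather than whatever differentiation scheme Warp applies. The toy drag model, all names (`Dual`, `simulate`, `optimise`), and all parameter values are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one chosen input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)
    __rmul__ = __mul__


def _lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def simulate(v0, steps=100, dt=0.01, drag=0.5):
    """Explicit-Euler 1D particle with linear drag; returns final position."""
    x, v = Dual(0.0), _lift(v0)
    for _ in range(steps):
        x = x + v * dt
        v = v - v * (drag * dt)
    return x


def optimise(target=1.0, lr=0.5, iters=50):
    """Gradient descent on the initial velocity so the particle ends at target."""
    v0 = 0.0
    for _ in range(iters):
        x = simulate(Dual(v0, 1.0))    # seed d(v0)/d(v0) = 1
        err = x.val - target
        v0 -= lr * 2.0 * err * x.der   # d/dv0 of (x - target)^2
    return v0


best_v0 = optimise()
```

Because the derivative is carried through every simulation step, the optimizer gets an exact gradient of the final position with respect to the initial velocity. This is the same end-to-end pattern the summary describes, at toy scale and on the CPU.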
NVIDIA Extends CUDA Tile Programming Model to Julia Language
NVIDIA brings its CUDA Tile high-level GPU programming model to the Julia ecosystem via the cuTile.jl package. The move aims to lower the barrier to high-performance GPU kernel development by abstracting low-level thread and memory management behind a tile-based data model, while keeping syntax and performance close to the Python version.
Trend Micro Report Highlights AI Supply Chain Risks and Model Attack Surfaces
Trend Micro's 'Fault Lines in the AI Ecosystem' report systematically analyzes security risks in the AI supply chain, including training data poisoning, third-party plugin vulnerabilities, and model theft attacks. It indicates that enterprise AI security boundaries have expanded from traditional IT infrastructure to the model layer and data pipelines.
AMD Launches Gaming PC Certification Framework to Strengthen Platform Strategy
AMD introduces the Advantage Gaming Desktops certification program, which requires OEMs to adopt AMD's 3A platform combining processors, GPUs, and software technologies. The program sets hardware performance standards, including Ryzen 7/9 processors and Radeon RX 7000 GPUs, with integrated software optimization.