Scaling - AI Infrastructure Intelligence Search

Google Cloud Other 2026-06-17

ASUS Launches NVIDIA GB300 Deskside AI Supercomputer, Shifting Control from Cloud to On-Prem

ASUS launches the ExpertCenter Pro ET900N G3, powered by NVIDIA's GB300 Grace Blackwell Ultra Desktop Superchip, delivering 20 PFLOPS and 748GB of coherent memory for near-trillion-parameter models. Concurrently, Coherent expands its InP fab in Texas for optical interconnects, and NVIDIA plans a $20-25B debt offering, signaling a systemic shift of AI control from cloud to localized enterprise hardware.
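
As a rough check on the "near-trillion parameter" claim, the sketch below estimates the memory needed for model weights alone at a few precisions and compares it with the 748GB figure. It assumes 1GB = 10^9 bytes and ignores KV cache, activations, and runtime overhead. The function names and the 1-trillion-parameter example are illustrative and do not come from ASUS or NVIDIA.

```python
# Back-of-the-envelope weight footprint: parameters x bits per parameter.
# Assumptions (not from the announcement): 1 GB = 1e9 bytes; only weights
# are counted, so real deployments need additional headroom.

COHERENT_MEMORY_GB = 748  # figure quoted in the summary above


def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Return the size of the weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def fits_in_memory(num_params: float, bits_per_param: int,
                   memory_gb: float = COHERENT_MEMORY_GB) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weight_footprint_gb(num_params, bits_per_param) <= memory_gb


if __name__ == "__main__":
    params = 1e12  # hypothetical 1-trillion-parameter model
    for bits in (16, 8, 4):
        size = weight_footprint_gb(params, bits)
        verdict = "fits" if fits_in_memory(params, bits) else "does not fit"
        print(f"{bits}-bit weights: {size:.0f} GB, {verdict} in {COHERENT_MEMORY_GB} GB")
```

Under these assumptions, a model of that size fits only with 4-bit weights, which is consistent with the summary's emphasis on low-precision formats.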

NVIDIA Other 2026-06-17

NVIDIA & Coherent Expand 6-Inch InP Fab, Locking AI Optical Interconnect Supply Chain

Coherent breaks ground on the world's first 6-inch indium phosphide fab in Texas, backed by $2B from NVIDIA and multi-billion-dollar purchase commitments. The facility produces lasers, transceivers, and pluggable optics for silicon photonics interconnects, enabling NVIDIA's Vera Rubin Ultra NVL576 576-GPU clusters and signaling a mass shift from copper to optical backbones in AI data centers.

Huawei Other 2026-06-17

Huawei's LogicFolding: 3D Stacking Rewrites AI Chip Rules

Huawei's Tau Scaling Law and LogicFolding architecture boost transistor density by 55% and power efficiency by 41% via vertical logic stacking, targeting a 1.4nm-class process by 2031. Ascend 920/910C chips are now used for DeepSeek V4-Pro post-training, signaling real-world AI workload deployment and challenging NVIDIA's dominance in China.

NVIDIA Other 2026-06-17

NVIDIA and Coherent Scale 6-Inch InP Fab, Optical Interconnect Becomes AI Infrastructure's New Bottleneck Breaker

NVIDIA invests $2B in Coherent's expanded 6-inch indium phosphide fab in Texas and commits to multi-billion-dollar purchases, scaling production of lasers and optical modules for AI interconnects. This addresses copper's distance and power limitations in large GPU clusters (e.g., Vera Rubin Ultra NVL576), pushing co-packaged optics into volume manufacturing.

AMD Other 2026-06-17

AMD MLPerf 6.0: MI350 GPUs Achieve 3.5x Leap with MXFP4, Debut Multi-Node Training

AMD submitted its most comprehensive MLPerf Training 6.0 results, including its first multi-node training run (FLUX.1 on 512 GPUs) and an MXFP4 training recipe. MI355X delivers a 3.5x generational leap over MI300X on Llama 2-70B, within 5% of NVIDIA B200. Ten ecosystem partners validated reproducibility.

NVIDIA Other 2026-06-16

NVIDIA Blackwell Sweeps MLPerf: NVLink and NVFP4 Redefine AI Training Economics

NVIDIA Blackwell dominates MLPerf Training 6.0, submitting across all seven benchmarks including MoE workloads. GB300 NVL72 delivers up to 1.6x faster training than GB200, with fifth-gen NVLink unifying 72 GPUs as one giant GPU. NVFP4 low-precision training and massive scale (8,192 GPUs) set new industry standards.

Hewlett Packard Enterprise Other 2026-06-16

HPE Nonstop Embeds Agentic AI for Fraud: Control Shifts to Proprietary Inference Engine

HPE integrates Lusis TANGO AIF into Nonstop Compute, embedding Random Forest and deep learning models for real-time, adaptive anti-fraud operations. The solution offers self-healing infrastructure and linear scalability, shifting fraud detection from rule-based engines to AI-driven inference within the proprietary Nonstop environment.

Hewlett Packard Enterprise Other 2026-06-16

HPE Expands Self-Driving Networks: AI Control Plane Unifies Juniper & Aruba, Locks Management Stack

HPE integrates Juniper networking into its AI Data Center Solution, expanding self-driving networks across edge, campus, DC, and AI factories. New capabilities include Mist support for CX switches, Marvis AIOps in Aruba Central, and QFX switches optimized for inferencing. A unified SASE platform aims to simplify operations via agentic AI automation, consolidating control under a single AI management plane.

Cisco Other 2026-06-16

Cisco Security Portfolio Moves to AWS Marketplace: Ecosystem Lock-in Accelerates, Multi-Cloud Neutrality Questioned

Cisco announces availability of its full SaaS security portfolio (Duo, Secure Access, Identity Intelligence, Hybrid Mesh Firewall) on AWS Marketplace, with deep integration with Amazon Bedrock and SageMaker for AI security and zero-trust agent management. This move simplifies procurement and accelerates deployment but deepens AWS dependency, potentially sacrificing multi-cloud flexibility.

AMD Other 2026-06-15

AMD Open-Sources AI Software Stack on Vultr, Taking on NVIDIA CUDA Ecosystem

AMD launches a suite of open-source, modular enterprise AI software components on Vultr Marketplace, including AMD Inference Microservices (AIMs), AI Workbench, Resource Manager, and Solution Blueprints. This aims to provide production-grade AI infrastructure without vendor lock-in, directly challenging NVIDIA's CUDA ecosystem.

NVIDIA Other 2026-06-15

NVIDIA Bets on World-Action Models: Control Shifts from VLM to Video Backbones

NVIDIA's blog introduces World-Action Models (WAMs) as a paradigm shift from VLM-based VLAs. WAMs leverage pretrained video/world-model backbones to jointly predict future states and robot actions, aiming to bridge the language-to-action grounding gap. This could redefine robot foundation model training but raises concerns about inference cost and latency.

Anthropic Other 2026-06-15

DXC and Anthropic Forge Multi-Year Alliance: Claude-Certified Engineers for Mission-Critical AI

DXC Technology and Anthropic announce a multi-year global partnership, making DXC a Global Premier partner in the Claude Partner Network. They will train tens of thousands of Claude-certified engineers to deploy Claude models in mission-critical environments via the DXC OASIS platform, using a 'Customer Zero' internal validation approach.

Cloudflare Other 2026-06-15

Cloudflare Absorbs Ensemble AI: Architectural Model Compression Reshapes Edge Inference Economics

Cloudflare integrates key Ensemble AI talent, bringing NdLinear and NdLinear-LoRA, architectural model compression techniques that preserve multidimensional activations to reduce parameters and compute. This aims to slash inference costs on Workers AI, boost GPU utilization, and accelerate global edge AI deployment.

NVIDIA Other 2026-06-13

NVIDIA GB300 NVL72 Delivers 20x Agentic Coding Efficiency, Setting New Inference Benchmark

NVIDIA's GB300 NVL72 achieves 20x more concurrent coding agents per megawatt than H200 on the new AA-AgentPerf benchmark, leveraging 72-GPU NVLink fabric, MXFP4 kernels, and MoE optimizations. This first standardized agentic inference benchmark redefines data center capacity planning for AI agents.

Anthropic Other 2026-06-11

Anthropic Locks Regulated Industries via DXC: Claude-Certified Engineers and OASIS Platform as New Control Points

Anthropic forms a global alliance with DXC Technology, training tens of thousands of Claude-certified forward-deployed engineers to embed Claude into mission-critical systems for banks, airlines, and regulated industries. DXC's OASIS platform defaults to Claude, with over 95% of its code generated by Claude, creating deep dependency.

NVIDIA Other 2026-06-11

NVIDIA Halos OS: A Certified Safety OS That Seizes Control of Autonomous Driving

NVIDIA introduces Halos OS, a full-stack safety system comprising the ASIL D-certified Halos Core, a standardized Halos SDK, AI guardrails in Halos Applications, and a cloud-based Safety Evaluation Framework. Built on DRIVE Hyperion, it aims to embed safety into L4 robotaxis from the ground up.

NVIDIA Other 2026-06-11

NVIDIA Optimizes Google's DiffusionGemma for 1,000 tok/s Parallel Text Generation

NVIDIA optimizes Google DeepMind's DiffusionGemma, a diffusion-based text model generating 256 tokens per step in parallel. On a single H100, it achieves 1,000 tok/s, with deployment via NIM and NeMo. This breaks the sequential token bottleneck, slashing serving costs and latency for real-time AI.

AMD Other 2026-06-11

AMD, Dell, Cambridge Launch UK Sovereign AI Lab to Challenge NVIDIA's CUDA Dominance with Open ROCm

AMD, Dell, and the University of Cambridge launch the Sovereign AI Innovation Lab (SAIL) in the UK, deploying the Zenith supercomputer with 5th Gen EPYC CPUs and Instinct MI355X GPUs, plus the Sunrise fusion AI system. The lab promotes open, interoperable AI infrastructure based on AMD ROCm, challenging NVIDIA's CUDA lock-in and offering long-term technology choice for national AI initiatives.

ARM Other 2026-06-10

Arm's Neural Dawn: Dedicated Neural Accelerators Redefine Mobile GPU Roadmap

Arm and Sumo Digital unveil Neural Dawn, the first mobile game to use Unreal Engine MegaLights. Running on next-gen Mali GPUs with integrated dedicated neural accelerators, it delivers desktop-class ray-traced lighting within mobile power limits, signaling a shift from traditional to AI-native graphics pipelines.

Google Other 2026-06-10

Google Lightning Engine: 4.9x Spark Performance with Ecosystem Lock-in Risks

Google Cloud launches Lightning Engine GA for Apache Spark, delivering up to 4.9x faster performance via vectorized native execution on Gluten/Velox. Optimized Cloud Storage and BigQuery connectors boost throughput, but the premium tier and deep integration create vendor lock-in risks.
