server - AI Infrastructure Intelligence

Huawei Other 2026-06-25

Huawei Pushes Token-Based Billing at MWC Shanghai 2026: Shifting Carrier Monetization from Bytes to AI Inference Value

At MWC Shanghai 2026, Huawei urged carriers to shift from byte-based to token-based billing for AI workloads, showcasing a 372% token throughput improvement in long-sequence inference via its AI Inference Acceleration Solution. It also highlighted the Upper-6 GHz band as critical for AI wearables requiring 20 Mbps uplink, aiming to reposition 5G-A networks as AI compute delivery infrastructure.

Huawei Other 2026-06-25

Huawei Unveils AI-Centric Network with Token Monetization, UCM Caching Breaks Long-Context Barriers

At MWC Shanghai 2026, Huawei unveiled an AI-native network architecture integrating service, network, and compute, shifting from traffic-centric to intelligence-centric operations. The Unified Cache Manager (UCM) extends KV cache to petabyte-scale external storage, achieving 372% token throughput gains on GLM-5.1 at 128K sequence lengths. Token monetization frameworks and agentic operations enable carriers to charge for AI inference capacity and personalize services.

NVIDIA Other 2026-06-25

Qualcomm Dragonfly: 250-core CPU, HBC memory, UALink interconnects target AI inference TCO

Qualcomm unveils a full data center portfolio: the Dragonfly C1000 250-core Oryon CPU (>5GHz, PCIe Gen7, CXL), HBC near-memory compute (133TB/s Gen1, 18x-54x effective bandwidth), the AI300 inference accelerator (UALink/ESUN scale-up), and 800G/1.6T connectivity. The company also has a multi-year CPU deal with Meta. Commercial sampling is slated for 2027-2028. The portfolio targets inference TCO, with a claim of tokens-per-watt leadership.

NVIDIA Other 2026-06-24

NVIDIA and AWS Default GPU Vector Search with cuVS, G7 Instances Deliver 4.6x Inference

NVIDIA and AWS are collaborating to embed cuVS as the default GPU-accelerated vector search in OpenSearch Serverless, delivering 10x faster indexing at one-quarter of the cost. New EC2 G7 instances with RTX PRO 4500 Blackwell GPUs achieve up to 4.6x inference performance. AWS also achieves GB300 Exemplar Cloud status for training.

NVIDIA Other 2026-06-23

NVIDIA Unveils 45°C Liquid Cooling for Rubin Chips, Slashes Water Use 100%

NVIDIA announces a liquid cooling system for its Rubin GPUs that runs on 45°C coolant (hotter than a hot tub) and uses closed-loop dry coolers to cut electricity use and eliminate evaporative water loss (a 100% reduction). However, chillers may still be needed in hot climates, and the effect on chip longevity remains unaddressed.

Anthropic Other 2026-06-23

Micron-Anthropic Deal Locks AI Memory Demand, But Stock Price Already Priced In

Micron signed a long-term supply contract with Anthropic covering HBM, DRAM, and SSDs, with joint analysis of memory subsystems for AI workloads. Micron also participated in Anthropic's Series H. The deal aims to turn memory from a commodity into an AI infrastructure asset, but the stock has already run up, so further gains would require proof of a sustained scarcity premium.

AMD Other 2026-06-23

AMD MI430X GPU Delivers >200 TFLOPS Native FP64, Reshaping HPC-AI Convergence Baseline

AMD powers 4 of the top 10 TOP500 supercomputers and previews the MI430X GPU with >200 TFLOPS of native FP64. The part targets AI-for-science workloads and makes double-precision compute a key metric for converged HPC-AI infrastructure, directly challenging NVIDIA and Intel.

NVIDIA Other 2026-06-23

NVIDIA's AI Agents and Digital Twins Reshape Telecom Network Control Plane

At DTW Ignite 2026, NVIDIA showcases its AI agent platform, which integrates NeMo synthetic data, the NemoClaw secure runtime, the OpenShell sandbox, and RTX PRO 6000-accelerated digital twins, with the aim of autonomous telecom operations. Partners include SoftBank, Amdocs, and NTT DATA, among others. The stated direction is a move from task automation to full autonomy.

Amazon Other 2026-06-23

AWS Lambda MicroVMs: Stateful Isolated Sandboxes via Firecracker Snapshots

AWS launches Lambda MicroVMs, leveraging Firecracker for VM-level isolation, near-instant launch/resume, and stateful execution. Users build images from Dockerfiles in S3, launch from pre-initialized snapshots, and suspend/resume automatically, enabling multi-tenant AI code sandboxes and interactive analytics.

ARM Other 2026-06-23

Arm servers capture >45% data center revenue, x86 ecosystem under AI-driven assault

IDC reports that Q1 2026 global server revenue hit a record $122.6B, with Arm-based servers capturing a >45% share (x86 at 52%). Accelerated servers (GPU/ASIC/FPGA) generated >70% of revenue. Nvidia's Grace CPU (NVL72) and hyperscaler custom Arm chips drive the shift; x86 still leads in unit volume but faces supply constraints.

NVIDIA Other 2026-06-23

Nvidia Vera Rubin CPU: 10-Wide Core Redefines CPU for Agentic Computing

At GTC Taipei 2026, Nvidia unveiled the Vera Rubin CPU with a custom 10-wide fetch/decode/execute pipeline, claiming world-leading IPC and bandwidth. Designed for agentic computing, it complements Nvidia GPUs. Nvidia also announced a partnership with Microsoft to reinvent the PC as a Personal AI and committed to returning 50% of free cash flow to shareholders.

NVIDIA Other 2026-06-22

Dell PowerEdge XE8812: Liquid-Cooled Density Trap with NVIDIA Vera Rubin NVL4

Dell launches PowerEdge XE8812 with NVIDIA Vera Rubin NVL4, delivering 144 GPUs per rack, 300kW+ power, and 100% direct liquid cooling. It offers a generational leap in memory and compute density for HPC and AI, but deeply locks users into Dell's PowerRack, iDRAC, and ORv3 ecosystem from chip to rack.

NVIDIA Other 2026-06-22

NVIDIA Rubin 100% Liquid Cooling at 45°C Slashes Cooling Energy 40%

NVIDIA Rubin generation achieves 100% liquid cooling with coolant up to 45°C, eliminating fans and cold aisles. The DSX reference design uses closed-loop dry coolers, reducing cooling energy ~40% and water consumption to near zero. Rack density triples, marking a fundamental shift in AI factory cooling.

Amazon Other 2026-06-21

AWS Seizes Agent Control Plane with MCP Gateway and AgentCore

AWS launches managed web search for Bedrock AgentCore, autonomous agents in Amazon Quick, subagent MicroVM orchestration with LangChain, and MCP Gateway, shifting enterprise AI agents from prototypes to governed infrastructure with cloud-native control planes and execution isolation.

Cisco Other 2026-06-18

Cisco Leverages NVIDIA Spectrum Silicon and Nexus One to Reshape AI Network Control Plane

Cisco launches N9100 switches with NVIDIA Spectrum-6/4 silicon, delivering 102.4T throughput. It also introduces Nexus One unified management plane spanning NX-OS and SONiC, and extends Hybrid Mesh Firewall to BlueField DPUs for AI workload security offload, aiming for a turnkey AI fabric control plane.

Google Other 2026-06-18

Google AI Studio Starter Tier: Pre-wired Serverless Stack Trades Control for Zero-Friction Deployment

Google introduces Starter Tier for AI Studio, a pre-wired stack of Cloud Run, Firestore, Cloud SQL for PostgreSQL, and Firebase Authentication, deployable without a payment method. It locks users to a single region, limited APIs, and shared quotas, but offers zero-downtime upgrade to full GCP, aiming to lower AI deployment barriers while deepening ecosystem lock-in.

AMD Other 2026-06-18

AMD MEXT Acquisition Turns NAND Flash into DRAM-Class Memory, Halving AI Inference Cost

AMD acquires MEXT, whose technology makes cheap NAND flash behave like expensive DRAM, doubling to quadrupling usable memory capacity while halving costs. This targets inference and agentic AI memory bottlenecks. AMD also signs a 30MW AI compute deployment deal with Rackspace, rolling out from 2026 to 2028.

Amazon Other 2026-06-18

Tesco's £100M Lawsuit Exposes VMware Lock-In, Accelerates Enterprise Virtualization Exodus

Tesco is suing Broadcom over a 237% price hike that followed the termination of VMware perpetual licenses and affects ~40,000 workloads. The case undermines enterprise trust in software licensing and may trigger a mass migration to Nutanix, Red Hat OpenShift Virtualization, and Proxmox, reshaping the virtualization ecosystem.

NVIDIA Other 2026-06-17

NVIDIA & Coherent Expand 6-Inch InP Fab, Locking AI Optical Interconnect Supply Chain

Coherent breaks ground on the world's first 6-inch indium phosphide fab in Texas, backed by $2B from NVIDIA and multi-billion purchase commitments. The facility produces lasers, transceivers, and pluggable optics for silicon photonics interconnects, enabling NVIDIA's Vera Rubin Ultra NVL576 576-GPU clusters and signaling a mass shift from copper to optical backbones in AI data centers.

Huawei Other 2026-06-17

Huawei's LogicFolding: 3D Stacking Rewrites AI Chip Rules

Huawei's Tau Scaling Law and LogicFolding architecture boost transistor density by 55% and power efficiency by 41% via vertical logic stacking, targeting 1.4nm-class by 2031. Ascend 920/910C chips are now used for DeepSeek V4-Pro post-training, signaling real-world AI workload deployment and challenging Nvidia's dominance in China.
