Reports
AI-generated structured vendor updates
Anthropic Alleges Largest AI Distillation Attack by Alibaba-Linked Operators, Exposing API Security Gaps
Anthropic alerted U.S. senators that Alibaba-linked operators conducted the largest known distillation attack, generating 28.8 million model exchanges via 25,000 fraudulent accounts to harvest Claude's frontier capabilities. The incident exposes a critical vulnerability in AI API security, forcing a rethinking of inference endpoint protection and usage monitoring.
Huawei Unveils AI-Centric Network with Token Monetization, UCM Caching Breaks Long-Context Barriers
At MWC Shanghai 2026, Huawei unveiled an AI-native network architecture integrating service, network, and compute, shifting from traffic-centric to intelligence-centric operations. The Unified Cache Manager (UCM) extends KV cache to petabyte-scale external storage, achieving 372% token throughput gains on GLM-5.1 at 128K sequence lengths. Token monetization frameworks and agentic operations enable carriers to charge for AI inference capacity and personalize services.
Google Cloud Multi-Agent Architecture Shifts Control from Human to Autonomous Verification
Google Cloud introduces agent-scale data management with multi-agent verification to reduce human oversight. Deploys six Gemini agents with Nokia for autonomous network operations. Amazon plans to commercialize Trainium chips, intensifying AI hardware competition against Google TPU and Nvidia GPU.
Qualcomm Dragonfly: 250-core CPU, HBC memory, UALink interconnects target AI inference TCO
Qualcomm unveils full data center portfolio: Dragonfly C1000 250-core Oryon CPU (>5GHz, PCIe Gen7, CXL), HBC near-memory compute (133TB/s Gen1, 18x-54x effective BW), AI300 inference accelerator (UALink/ESUN scale-up), and 800G/1.6T connectivity. Multi-year Meta CPU deal. Commercial sampling 2027-2028. Targets inference TCO with tokens-per-watt leadership.
TSMC Hikes Advanced Node Prices 5-10%, Squeezing AI Chip Margins
TSMC informs clients of 5-10% price hikes across all advanced nodes (7nm+), affecting 74% of wafer revenue. Apple, Nvidia, AMD, and others face higher costs, potentially raising AI infrastructure prices.
NVIDIA Launches Agent Toolkit: Nemotron Models, OpenShell Runtime for Specialized AI Agents
NVIDIA unveils Agent Toolkit, an open modular foundation with Nemotron models, NemoClaw blueprints, and OpenShell runtime, enabling enterprises to build secure, specialized AI agents. It targets life sciences, cybersecurity, and industrial workflows, aiming to turn frontier models into domain-specific digital coworkers.
AMD MI430X GPU Delivers >200 TFLOPS Native FP64, Reshaping HPC-AI Convergence Baseline
AMD powers 4 of top 10 TOP500 supercomputers and previews MI430X GPU with >200 TFLOPS native FP64. This targets AI-for-science workloads, making double-precision compute a key metric for converged HPC-AI infrastructure, directly challenging NVIDIA and Intel.
ASML CEO Validates Musk's Terafab, Reshaping AI Chip Supply Chain
ASML's CEO publicly acknowledges tracking Elon Musk's planned terawatt-scale AI supercomputer Terafab, comparing it to Korean DRAM megaprojects. This signals that the sole EUV lithography supplier is allocating capacity, potentially transforming AI chip supply chain and vertical integration.
Nvidia Vera Rubin CPU: 10-Wide Core Redefines CPU for Agentic Computing
At GTC Taipei 2026, Nvidia unveiled the Vera Rubin CPU with a custom 10-wide fetch/decode/execute pipeline, claiming world-leading IPC and bandwidth. Designed for agentic computing, it complements Nvidia GPUs. Nvidia also announced a partnership with Microsoft to reinvent the PC as a Personal AI and committed to returning 50% of free cash flow to shareholders.
AMD MEXT Acquisition Turns NAND Flash into DRAM-Class Memory, Halving AI Inference Cost
AMD acquires MEXT, whose technology makes cheap NAND flash behave like expensive DRAM, doubling to quadrupling usable memory capacity while halving costs. This targets inference and agentic AI memory bottlenecks. AMD also signs a 30MW AI compute deployment deal with Rackspace, rolling out from 2026 to 2028.
ASML CEO's EUV Supply Warning Signals a Physical Ceiling on AI Chip Expansion
ASML CEO Fouquet confirms talks with Musk on Terafab but stresses supply constraints. EUV lithography, the sole tool for advanced AI chips, cannot scale quickly. With TSMC, Samsung, Intel, and Musk all vying for limited machines, AI chip capacity allocation becomes a zero-sum game, capping the entire AI infrastructure buildout.
ASUS Launches NVIDIA GB300 Deskside AI Supercomputer, Shifting Control from Cloud to On-Prem
ASUS launches the ExpertCenter Pro ET900N G3, powered by NVIDIA's GB300 Grace Blackwell Ultra Desktop Superchip, delivering 20 PFLOPS and 748GB of coherent memory for near-trillion parameter models. Concurrently, Coherent expands InP fab in Texas for optical interconnects, and NVIDIA plans a $20-25B debt offering, signaling a systemic shift of AI control from cloud to localized enterprise hardware.
Google Cloud Embeds Legal Verifiability into AI Agents via SPIFFE and Kakunin
Google Cloud introduces SPIFFE-based Agent Identity for Gemini Enterprise and Vertex AI, then overlays Kakunin's compliance layer to map internal SPIFFE identifiers to X.509 certificates generated in AWS KMS, with all state changes committed to WORM audit logs. This converts secure cloud workloads into legally auditable market participants to meet EU AI Act and MiCA accountability mandates.
NVIDIA & Coherent Expand 6-Inch InP Fab, Locking AI Optical Interconnect Supply Chain
Coherent breaks ground on the world's first 6-inch indium phosphide fab in Texas, backed by $2B from NVIDIA and multi-billion purchase commitments. The facility produces lasers, transceivers, and pluggable optics for silicon photonics interconnects, enabling NVIDIA's Vera Rubin Ultra NVL576 576-GPU clusters and signaling a mass shift from copper to optical backbones in AI data centers.
Cisco AI Defense Adds Agent Harness Red Teaming for Agentic AI Security
Cisco introduces Agent Validation in AI Defense: Explorer Edition, a dedicated red-teaming capability for agentic AI systems. It autonomously probes agent harness attack surfaces, including tool routes, indirect content channels, and persistent state, providing verified findings beyond chat-based security assessments.
NVIDIA and Coherent Scale 6-Inch InP Fab, Optical Interconnect Becomes AI Infrastructure's New Bottleneck Breaker
NVIDIA invests $2B and commits multi-billion purchases to Coherent's expanded 6-inch indium phosphide fab in Texas, scaling production of lasers and optical modules for AI interconnects. This addresses copper's distance and power limitations in large GPU clusters (e.g., Vera Rubin Ultra NVL576), pushing co-packaged optics into volume manufacturing.
AMD MLPerf 6.0: MI350 GPUs Achieve 3.5x Leap with MXFP4, Debut Multi-Node Training
AMD submitted its most comprehensive MLPerf Training 6.0 results, including first multi-node training (FLUX.1 on 512 GPUs) and MXFP4 training recipe. MI355X delivers 3.5x generational leap over MI300X on Llama 2-70B, within 5% of NVIDIA B200. 10 ecosystem partners validated reproducibility.
HBM Bottleneck Reshapes AI Infrastructure: Asian Memory Makers Gain Leverage Over Nvidia
SK Hynix, Samsung, and Micron have crossed $1 trillion market cap as HBM becomes the hard limit in AI infrastructure. Asian suppliers now account for 90% of Nvidia's production costs, shifting the bottleneck from GPU compute to stacked memory and advanced packaging.
AMD and Rackspace Deploy 30MW Governed AI Stack: Ecosystem Restructuring from Silicon to Outcomes
AMD and Rackspace sign a definitive agreement to deploy 30MW of AMD AI compute (Instinct GPUs including MI355X, EPYC CPUs) across Rackspace's data centers, creating a governed enterprise AI stack with single accountability from silicon to outcomes, targeting regulated industries.
AMD Acquires MEXT: AI-Predicted Flash Nears DRAM Performance to Cut AI Memory TCO
AMD acquires MEXT, an AI-driven memory optimization startup. MEXT's predictive technology makes NAND Flash behave like DRAM, expanding effective memory capacity for AI workloads and lowering TCO. The tech will be integrated across AMD's data center portfolio (EPYC, Instinct) to address memory bottlenecks in large models.