Memory - AI Infrastructure Intelligence Search

AMD Other 2026-07-26

AMD Launches Helios Rack-Scale AI Platform with MI400 GPUs, Targeting Inference TCO

At Advancing AI 2026, AMD unveiled the Helios rack-scale platform, which integrates 72 MI455X GPUs and 18 EPYC Venice CPUs per rack and delivers 2.9 exaflops of FP4 inference with 31TB of HBM4 memory. The MI430X offers 288 TFLOPS of FP64 for HPC. AMD claims up to 30% more inference tokens per dollar than competitors.
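To put the rack-level figures in per-GPU terms, the short sketch below divides the quoted totals by the 72 GPUs in a rack. It assumes decimal units (1 TB = 1000 GB, 1 exaflop = 1000 petaflops) and an even split across GPUs; the function and constant names are illustrative and do not come from AMD.

```python
# Rack-level figures as quoted in the summary above.
GPUS_PER_RACK = 72
RACK_HBM4_TB = 31.0
RACK_FP4_EXAFLOPS = 2.9


def per_gpu_share(rack_total: float, gpus: int = GPUS_PER_RACK) -> float:
    """Evenly divide a rack-level total across the GPUs (an assumption)."""
    return rack_total / gpus


# Convert TB -> GB and exaflops -> petaflops using decimal prefixes.
hbm_per_gpu_gb = per_gpu_share(RACK_HBM4_TB) * 1000
fp4_per_gpu_pflops = per_gpu_share(RACK_FP4_EXAFLOPS) * 1000

print(f"HBM4 per GPU: about {hbm_per_gpu_gb:.0f} GB")
print(f"FP4 per GPU: about {fp4_per_gpu_pflops:.1f} PFLOPS")
```

These are derived estimates only; actual per-GPU specifications may be rounded differently.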

NVIDIA Other 2026-07-26

NVIDIA and SK Group Lock HBM4 Supply and Launch Sovereign AI Factory Model with $500B+ Deal

NVIDIA and SK Group announced a $500B+ AI partnership that includes a 2GW AI factory using Vera Rubin and HBM4, a long-term HBM4 supply lock, and a $1B NVIDIA investment in Naver (with $9B from Brookfield). Samsung and Broadcom signed a $200B deal. This signals a new era of sovereign AI infrastructure and deeply locked supply chains.

NVIDIA Other 2026-07-16

NVIDIA Jetson Thor T3000/T2000: Blackwell GPU Crashes Edge AI Cost Barrier

NVIDIA unveils Jetson Thor T3000 and T2000 modules. The T3000 packs a Blackwell GPU and 8-core Neoverse CPU, delivering 865 FP4 TFLOPS at half the power of the T5000. New Jetson Agent Skills automate memory optimization, aiming to scale deployment of humanoid robots and edge AI.

NVIDIA Other 2026-07-16

NVIDIA CUDA 13.3 Introduces clmad for Hardware-Accelerated Carryless Multiplication on GPUs

NVIDIA CUDA 13.3 adds the clmad hardware instruction for carryless multiply-accumulate on Ampere+ GPUs. GHASH throughput reaches 6.3 TB/s on B200, up to 18.8x faster than a bitsliced implementation. The sum-check protocol speeds up by 3-13x. The instruction also benefits CRC, Reed-Solomon, and post-quantum cryptography.
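For readers unfamiliar with the operation, carryless multiplication treats each integer as a polynomial over GF(2), so partial products are combined with XOR instead of addition. The following is a minimal software reference in Python, not the CUDA instruction itself: the function names are hypothetical, and the accumulate step is assumed to be an XOR with a third operand, which the summary above does not specify.

```python
def clmul(a: int, b: int) -> int:
    """Carryless (GF(2) polynomial) product of two non-negative integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # XOR replaces addition, so no carries propagate
        a <<= 1          # multiply the polynomial a by x
        b >>= 1          # move on to the next coefficient of b
    return result


def clmad_reference(a: int, b: int, acc: int) -> int:
    """Hypothetical multiply-accumulate: carryless product XORed into acc."""
    return clmul(a, b) ^ acc


# (x + 1) * (x + 1) = x^2 + 1 over GF(2), i.e. 0b11 * 0b11 = 0b101
print(bin(clmul(0b11, 0b11)))
```

An ordinary integer multiply would give 0b1001 for the same inputs; the difference is the carry that GF(2) arithmetic discards, which is why GHASH and CRC need this operation rather than a normal multiply.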

NVIDIA Other 2026-07-14

NVIDIA's HVDC Power Shift Reshapes AI Data Center Energy Efficiency and Supply Chain

NVIDIA is driving a shift from AC to HVDC power systems for AI data centers, aiming to reduce conversion losses and improve efficiency. This move will reshape the entire supply chain for servers, power equipment, and cooling, but faces challenges in safety and standardization. It signals a generational change in AI infrastructure power delivery.

Other Other 2026-07-14

MemGhost Attack: Persistent False Memory Injection in AI Agents via Email

Researchers unveil MemGhost, a stealth memory injection attack that plants persistent false memories into AI agents via a single email without user notification. It exploits the persistent memory feature, highlighting critical security gaps and driving demand for memory auditing.

Meta Other 2026-07-13

Meta Iris Chip to Mass Produce in September: 6-Month Cadence Threatens NVIDIA GPU Hegemony

Reuters confirms Meta's Iris AI chip mass production in September, targeting 2.5GW by end-2026 and 14GW by 2027. Meta's 6-month MTIA generation cadence directly challenges NVIDIA's annual GPU cycle, signaling a hyperscaler shift from GPU dependency to custom ASIC sovereignty.

TSMC Other 2026-07-13

TSMC Hikes Sub-7nm Prices 8-12%, Extends Lead Times to 26 Weeks, Triggering AI Chip Cost Inflation

TSMC raises sub-7nm wafer prices by 8-12% and extends lead times to 26 weeks, effective July 2026. New v2.1 directive mandates EDA tool validation for PDK access. This directly inflates AI chip TCO, delays new product launches, and solidifies TSMC's control over the AI supply chain.

Intel Other 2026-07-12

Intel Bets on 3D-Stacked AI Chips: Full-Stack Integration of 18A-PT, Foveros Direct 3D, and EMIB-T

...

Samsung Electronics Other 2026-07-10

Samsung GAIA AI PC Chip Samples with Memory-Centric NPU, Targeting 50 TOPS

Samsung launches the GAIA AI PC processor, built on a 4nm process with a memory-centric NPU. It integrates the LPDDR5X controller with the NPU for near-memory computing, achieving a 40% energy efficiency improvement and 50 TOPS. The chip is certified for Microsoft Copilot+ PC, and Lenovo is to adopt it in Q4 2026.

AMD Other 2026-07-10

Towards Feature Complete Triton Support in JAX-Triton - ROCm Blogs

...

NVIDIA Other 2026-07-07

NVIDIA Vera CPU: Max Single-Threaded Performance at Scale for Agentic AI

NVIDIA launches the Vera CPU, designed for maximum single-threaded performance at scale for agentic AI. Its Olympus cores deliver 1.8x sustained per-core performance over x86, alongside 1.2TB/s of LPDDR5X bandwidth and 3.4TB/s of core-to-core bandwidth. Vera integrates into NVIDIA's unified AI factory architecture, aiming to lock users into its ecosystem.

NVIDIA Other 2026-07-07

AI Innovators Adopt NVIDIA Vera — Why Max Single-Threaded CPU at Scale Matters

...

Qualcomm Other 2026-07-02

Qualcomm Enters AI Inference with Dragonfly C1000 CPU and HBC Near-Memory Compute

Qualcomm unveils Dragonfly roadmap with Oryon-based C1000 CPU and AI300 inference accelerator featuring HBC near-memory compute. Meta and Microsoft are early adopters. The strategy targets AI inference TCO reduction and memory wall breakthrough, bypassing Nvidia's training dominance.

Qualcomm Other 2026-06-25

Qualcomm HBC Gen 1 Stacks LPDDR to 133 TB/s, Challenging HBM Dominance

Qualcomm announces HBC Gen 1, a 3D-stacked LPDDR memory with an integrated compute die, achieving 133 TB/s bandwidth and 6x energy efficiency over HBM. It is aimed at replacing HBM in AI accelerators and is slated to ship with AI250 in mid-2027, but supply chain and feasibility remain uncertain.

Huawei Other 2026-06-25

Huawei Unveils AI-Centric Network with Token Monetization, UCM Caching Breaks Long-Context Barriers

At MWC Shanghai 2026, Huawei unveiled an AI-native network architecture integrating service, network, and compute, shifting from traffic-centric to intelligence-centric operations. The Unified Cache Manager (UCM) extends KV cache to petabyte-scale external storage, achieving 372% token throughput gains on GLM-5.1 at 128K sequence lengths. Token monetization frameworks and agentic operations enable carriers to charge for AI inference capacity and personalize services.

Google Cloud Other 2026-06-25

Google Cloud Multi-Agent Architecture Shifts Control from Human to Autonomous Verification

Google Cloud introduces agent-scale data management with multi-agent verification to reduce human oversight. It is deploying six Gemini agents with Nokia for autonomous network operations. Amazon plans to commercialize Trainium chips, intensifying AI hardware competition against Google TPU and Nvidia GPU.

NVIDIA Other 2026-06-25

Qualcomm Dragonfly: 250-core CPU, HBC memory, UALink interconnects target AI inference TCO

Qualcomm unveils full data center portfolio: Dragonfly C1000 250-core Oryon CPU (>5GHz, PCIe Gen7, CXL), HBC near-memory compute (133TB/s Gen1, 18x-54x effective BW), AI300 inference accelerator (UALink/ESUN scale-up), and 800G/1.6T connectivity. Multi-year Meta CPU deal. Commercial sampling 2027-2028. Targets inference TCO with tokens-per-watt leadership.

Cisco Other 2026-06-25

Cisco Launches AI Troubleshooting Agent for Industrial Networks, Shifting Control Plane

Cisco launches AI Troubleshooting for Industrial Networks, an ambient agent on Cisco Cloud Control. It monitors switch syslogs, uses deterministic logic to diagnose physical and network faults, and provides OT technicians with actionable fix steps, aiming to reduce MTTD and MTTR by minimizing escalations to network experts.

OpenAI Other 2026-06-25

OpenAI and Broadcom Unveil Jalapeno Inference ASIC, Reshaping AI Hardware Landscape

OpenAI, in collaboration with Broadcom, has developed Jalapeno, a custom LLM inference accelerator. The chip uses a multi-chip module with HBM3E memory and achieved tape-out in just nine months. Designed for OpenAI's model stack, it aims to reduce inference costs and dependency on NVIDIA GPUs, with initial deployment planned for late 2026.
