Filter

×
Active Filters Clear All
Keyword: Compute ×
208 Total Reports
1/11 Page
NVIDIA Other 2026-06-16

Lexar Offloads AI Models to SSD: DRAM Cut 40%, Latency Remains Hurdle

Lexar unveils AI Storage Core SSD with a custom SPU DRAM-less controller and software stack, offloading LLMs to NAND Flash. It runs Qwen 3.5 122B on 32GB DRAM at 15.6 tokens/s (3x improvement), but TTFM latency of 2-8 seconds hinders real-time use.

NVIDIA Other 2026-06-16

NVIDIA Blackwell Sweeps MLPerf: NVLink and NVFP4 Redefine AI Training Economics

NVIDIA Blackwell dominates MLPerf Training 6.0, submitting across all seven benchmarks including MoE workloads. GB300 NVL72 delivers up to 1.6x faster training than GB200, with fifth-gen NVLink unifying 72 GPUs as one giant GPU. NVFP4 low-precision training and massive scale (8,192 GPUs) set new industry standards.

NVIDIA Other 2026-06-16

SiMa.ai Palette Neat: Natural-Language Agentic Environment Dismantles NVIDIA's GPU Moat

SiMa.ai launches open-source Palette Neat, an agentic development environment for Physical AI, paired with its sub-10W Modalix SoM. It uses natural language to abstract compute complexity, slashing dev cycles from months to days. Pin-compatible with NVIDIA SoM, it targets breaking the GPU ecosystem lock-in.

AMD Other 2026-06-16

AMD Critical RCE Vulnerability Disclosed After 124 Days, Sparks AI Infrastructure Security Crisis

Security researcher mr.bruh publicly disclosed a critical remote code execution (RCE) vulnerability in AMD processors after 124 days without a fix, with AMD refusing a $10,000 bounty. The flaw affects AI servers running AMD EPYC and Instinct, likened to a Log4j moment for AI infrastructure, forcing enterprises to reassess chip-level security response and supply chain risk.

MediaTek Other 2026-06-16

MediaTek Doubles AI ASIC Target to $2B, Challenges Broadcom in Data Center Custom Silicon

MediaTek doubles its 2026 AI ASIC revenue target to $2B, leveraging Google hyperscaler deals and the NVIDIA RTX Spark chip (featuring MediaTek's N1X Arm CPU). It aims for 10-15% of the $70-80B custom AI chip market by 2027, directly challenging Broadcom's dominance.

AMD Other 2026-06-16

AMD and Rackspace Deploy 30MW Governed AI Stack: Ecosystem Restructuring from Silicon to Outcomes

AMD and Rackspace sign a definitive agreement to deploy 30MW of AMD AI compute (Instinct GPUs including MI355X, EPYC CPUs) across Rackspace's data centers, creating a governed enterprise AI stack with single accountability from silicon to outcomes, targeting regulated industries.

Google Cloud Other 2026-06-16

Apple Rebuilds Siri with Google Gemini, Cuts Legacy Hardware Support

Apple rebuilds Siri using Google Gemini-derived capabilities, introducing five new AFM 3 foundation models (including a 20B-parameter multimodal on-device model). The move is paired with the sharpest hardware support cut in watchOS 27, limiting to S9/S10 chips, signaling a strategic shift from vertical integration to hybrid AI partnerships and accelerated hardware refresh cycles.

NVIDIA Other 2026-06-16

AMD Ryzen 10000 Series to Swap iGPU for NPU: AI Boost at Cost of Basic Display

Leaks suggest AMD's next-gen Zen 6 desktop CPU 'Olympic Ridge' will replace the integrated GPU with an NPU, targeting >40 TOPS for Copilot+ AI PC certification. It also upgrades the client I/O die to support CUDIMM/CAMM and EXPO 1.2 for faster DDR5. The trade-off boosts local AI but forces nearly all users to rely on a discrete GPU for basic display.

NVIDIA Other 2026-06-16

ASML, TSMC, imec Demo 300mm 2D-Material Transistors at 50nm Pitch

imec, ASML, and TSMC demonstrate the first 300mm wafer integration of MoS2/WS2/WSe2-based n and pFETs with 50nm contacted poly pitch (CPP) using single-patterning EUV lithography, achieving 94% operational yield. This lab-to-fab breakthrough paves the way for 2D channel materials to extend Moore's Law.

AMD Other 2026-06-15

AMD Acquires MEXT: AI-Predicted Flash Nears DRAM Performance to Cut AI Memory TCO

AMD acquires MEXT, an AI-driven memory optimization startup. MEXT's predictive technology makes NAND Flash behave like DRAM, expanding effective memory capacity for AI workloads and lowering TCO. The tech will be integrated across AMD's data center portfolio (EPYC, Instinct) to address memory bottlenecks in large models.

NVIDIA Other 2026-06-15

NVIDIA's Desktop DGX Station with GB300 Shifts Control from Cloud to Local Hardware

ASUS launches ExpertCenter Pro ET900N G3, built on NVIDIA DGX Station GB300 architecture with GB300 Grace Blackwell Ultra chip, 748GB coherent memory, and 20 PFLOPS AI performance. This deskside AI supercomputer enables local LLM fine-tuning, inference, and agentic AI workflows via NVLink-C2C and the full NVIDIA AI software stack including NemoClaw.

Research Other 2026-06-15

Z.ai GLM-5.2 Ships Usable 1M-Token Context, No Benchmarks, Two Thinking Levels

Z.ai releases GLM-5.2 with a claim of usable 1M-token context and two thinking-effort levels. No standard benchmarks are provided, raising concerns about real-world performance. The model targets replacing chunking-based RAG with native long-context reasoning.

MediaTek Other 2026-06-15

Compute Futures Market: Financializing GPU Capacity Could Reshape AI Infrastructure Procurement

Carmen Li is building a GPU pricing index and spot marketplace via Silicon Data and Compute Exchange, aiming to launch compute futures. Backed by DRW, this initiative targets GPU price volatility by standardizing compute trading, potentially creating a trillion-dollar asset class and transforming AI compute procurement.

Cloudflare Other 2026-06-15

Cloudflare Absorbs Ensemble AI: Architectural Model Compression Reshapes Edge Inference Economics

Cloudflare integrates key Ensemble AI talent, bringing NdLinear and NdLinear-LoRA—architectural model compression techniques that preserve multidimensional activations to reduce parameters and compute. This aims to slash inference costs on Workers AI, boost GPU utilization, and accelerate global edge AI deployment.

NVIDIA Other 2026-06-14

NVIDIA & SK hynix Deepen Memory Co-Engineering: Custom HBM for Vera Rubin and Jetson Thor

NVIDIA and SK hynix have announced a multiyear partnership to co-develop next-generation custom memory for NVIDIA's AI factory ecosystem, including Vera Rubin supercomputers, Vera CPUs, RTX Spark PCs, and Jetson Thor robotic platforms. SK hynix will also use NVIDIA CUDA-X libraries and Omniverse to accelerate semiconductor design and build fab digital twins.

NVIDIA Other 2026-06-13

NVIDIA AgentPerf Benchmark: Blackwell Ultra Delivers 20x More Agents per Megawatt vs Hopper

NVIDIA and Artificial Analysis unveil AgentPerf, the first benchmark for agentic AI workloads. Results show the GB300 NVL72 platform delivers up to 20x more concurrent agents per megawatt than the HGX H200 when running DeepSeek V4 Pro, using real coding agent trajectories to measure throughput and responsiveness.

NVIDIA Other 2026-06-11

NVIDIA Halos OS: A Certified Safety OS That Seizes Control of Autonomous Driving

NVIDIA introduces Halos OS, a full-stack safety system comprising ASIL D certified Halos Core, standardized Halos SDK, AI guardrails in Halos Applications, and cloud-based Safety Evaluation Framework. Built on DRIVE Hyperion, it aims to embed safety into L4 robotaxis from the ground up.

Microsoft Other 2026-06-11

Microsoft & NVIDIA RTX Spark Brings 1 Petaflop AI to Windows, Reshaping Local Inference

At Computex 2026, Microsoft unveiled RTX Spark, an Arm-based AI superchip co-developed with NVIDIA and MediaTek, delivering up to 1 petaflop AI performance and 128GB unified memory for local 120B parameter models. Intel Arc G3 and Qualcomm Snapdragon X2 series also launched, accelerating the Windows AI PC ecosystem.

NVIDIA Other 2026-06-11

NVIDIA Locks Local AI Inference Control with DiffusionGemma Parallel Generation

NVIDIA optimizes Google DeepMind's DiffusionGemma open model, which generates 256 tokens in parallel for 4x speedup over autoregressive models. Achieves 1000 tokens/sec on H100, 150 tokens/sec on DGX Spark, running fully locally with no cloud cost. This reinforces NVIDIA GPU's centrality in compute-bound local AI inference.

AMD Other 2026-06-11

AMD, Dell, Cambridge Launch UK Sovereign AI Lab to Challenge NVIDIA's CUDA Dominance with Open ROCm

AMD, Dell, and the University of Cambridge launch the Sovereign AI Innovation Lab (SAIL) in the UK, deploying Zenith supercomputer with 5th Gen EPYC and Instinct MI355X GPUs, plus the Sunrise fusion AI system. The lab promotes open, interoperable AI infrastructure based on AMD ROCm, challenging NVIDIA's CUDA lock-in and offering long-term technology choice for national AI initiatives.