compression - AI Infrastructure Intelligence Search

MediaTek Other 2026-06-17

Databricks Margin Warning: AI Agents Break the Old Software Pricing Model

Databricks hit $6.9B annualized revenue but gross margin slipped from 80%+ to 74%, with CEO expecting further decline. AI agents drive exponentially more queries than humans, inflating cloud and compute costs. This reveals the fragility of human-activity-based pricing in the agent era.

NVIDIA Other 2026-06-17

NVIDIA RTX Remix 1.5: RTX IO Shrinks Game Sizes, AI Agents Reshape Modding

NVIDIA releases RTX Remix 1.5, featuring RTX IO compression that slashes Half-Life 2 RTX from 80GB to 50GB and reduces CPU overhead. The update also introduces AI agent integration via 'RTX Remix Skills,' allowing AI coding agents to automate complex modding tasks, lowering the barrier for non-programmers.

NVIDIA Other 2026-06-15

NVIDIA Bets on World-Action Models: Control Shifts from VLM to Video Backbones

NVIDIA's blog introduces World-Action Models (WAMs) as a paradigm shift from VLM-based VLAs. WAMs leverage pretrained video/world-model backbones to jointly predict future states and robot actions, aiming to bridge the language-to-action grounding gap. This could redefine robot foundation model training but raises concerns about inference cost and latency.

Research Other 2026-06-15

Z.ai GLM-5.2 Ships Usable 1M-Token Context, No Benchmarks, Two Thinking Levels

Z.ai releases GLM-5.2 with a claim of usable 1M-token context and two thinking-effort levels. No standard benchmarks are provided, raising concerns about real-world performance. The model targets replacing chunking-based RAG with native long-context reasoning.

Cloudflare Other 2026-06-15

Cloudflare Absorbs Ensemble AI: Architectural Model Compression Reshapes Edge Inference Economics

Cloudflare integrates key Ensemble AI talent, bringing NdLinear and NdLinear-LoRA—architectural model compression techniques that preserve multidimensional activations to reduce parameters and compute. This aims to slash inference costs on Workers AI, boost GPU utilization, and accelerate global edge AI deployment.

NVIDIA Other 2026-06-08

NVIDIA's UK Sovereign AI Play: From Chip Vendor to National Infrastructure Controller

NVIDIA partners with the UK government to deploy sovereign AI infrastructure via Isambard-AI (5,400 GH200 superchips) and the Sovereign AI Fund, backing local startups. This move establishes a national AI control plane, locking compute into NVIDIA's ecosystem and bypassing traditional hyperscalers like AWS and Azure.

NVIDIA Other 2026-05-27

NVIDIA Vera CPU Benchmark Crushes x86: Memory Bandwidth Hegemony for Agentic AI

Phoronix benchmarks show NVIDIA Vera CPU with 88 custom Olympus cores (Armv9.2), 1.2TB/s LPDDR5X bandwidth, and 450W TDP outperforming Intel/AMD x86 across agentic AI workloads. It achieves 1.5x overall performance vs 128-core x86, 90% STREAM TRIAD efficiency, and 20-second Linux kernel compilation.

Cisco Other High Signal 2026-04-30

Cisco Publishes Model Provenance Constitution, Defining Weight-Level Derivation Standards

Cisco published the 'Model Provenance Constitution' to provide a normative definition for AI model supply chain safety. The standard strictly hinges on the verifiable derivation history of model weights, clearly delineating five types of provenance links (e.g., direct descent, distillation) and eight exclusions (e.g., independent reproduction), aiming to resolve industry inconsistencies in model provenance definitions.

Reports

Filter