Compute - AI Infrastructure Intelligence Search

NVIDIA Other 2026-07-16

NVIDIA Jetson Thor T3000/T2000: Blackwell GPU Crashes Edge AI Cost Barrier

NVIDIA unveils Jetson Thor T3000 and T2000 modules. The T3000 packs a Blackwell GPU and 8-core Neoverse CPU, delivering 865 FP4 TFLOPS at half the power of the T5000. New Jetson Agent Skills automate memory optimization, aiming to scale deployment of humanoid robots and edge AI.

NVIDIA Other 2026-07-16

NVIDIA CUDA 13.3 Introduces clmad for Hardware-Accelerated Carryless Multiplication on GPUs

NVIDIA CUDA 13.3 adds the clmad hardware instruction for carryless multiply-accumulate on Ampere+ GPUs. GHASH throughput reaches 6.3 TB/s on B200, up to 18.8x faster than bitsliced implementations. The sum-check protocol speeds up by 3-13x. The instruction also benefits CRC, Reed-Solomon, and post-quantum cryptography.
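
For readers unfamiliar with the operation, here is a minimal software sketch of what "carryless multiply-accumulate" means in general: the operands are multiplied as polynomials over GF(2), so partial products are combined with XOR instead of addition, and the result is then XORed into an accumulator. The function names `clmul` and `clmul_accumulate` are hypothetical, and the operand widths and exact semantics of the clmad instruction are not given in the summary above, so this illustrates the arithmetic only, not NVIDIA's interface.

```python
def clmul(a: int, b: int) -> int:
    """Carryless product: multiply a and b as polynomials over GF(2)."""
    if a < 0 or b < 0:
        raise ValueError("operands must be non-negative")
    result = 0
    while b:
        if b & 1:
            # XOR replaces addition, so no carries propagate between bits.
            result ^= a
        a <<= 1
        b >>= 1
    return result


def clmul_accumulate(a: int, b: int, acc: int) -> int:
    """Hypothetical software model of a carryless multiply-accumulate."""
    return clmul(a, b) ^ acc


if __name__ == "__main__":
    # (x + 1) * (x + 1) = x^2 + 1 over GF(2); ordinary 3 * 3 would be 9.
    print(bin(clmul(0b11, 0b11)))                  # 0b101
    print(bin(clmul_accumulate(0b11, 0b11, 0b1)))  # 0b100
```

Bit-by-bit loops like this one are what a single hardware instruction would replace. GHASH, CRC, and Reed-Solomon all rely on this kind of GF(2) polynomial multiplication.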

Google Other 2026-07-15

Google Deeply Integrates Gemini Enterprise Telemetry with BigQuery for AI Governance

Google Cloud enables streaming of Gemini Enterprise app telemetry (prompts, responses, activity logs) into BigQuery for real-time analysis. Using BigQuery's AI capabilities (Conversational Analytics, auto-schema), it automates auditing, compliance, and insights for large-scale AI deployments, supporting data-driven AI observability.

Intel Other 2026-07-12

Intel Bets on 3D-Stacked AI Chips: Full-Stack Integration of 18A-PT, Foveros Direct 3D, and EMIB-T

...

Meta Other 2026-07-12

Meta Invests $9.17B in Canada AI Data Center, Iris AI Chip Mass Production Begins MTIA Roadmap

Meta announced a $9.17B AI data center in Canada with 1GW capacity, and its first in-house AI chip, Iris, will enter mass production in September, kicking off the four-generation MTIA roadmap. Meta targets 14GW of compute by 2027, using 6-month chip iterations to challenge NVIDIA's annual cadence and reduce GPU dependency.

Anthropic Other 2026-07-12

Anthropic Locks 3.5GW TPU Compute with Broadcom, Signaling Shift to Custom AI ASICs

Broadcom's Q2 FY2026 filing reveals a 3.5GW TPU compute deal with Anthropic starting 2027. This marks a strategic shift from general-purpose GPUs to custom ASICs for AI workloads, with OpenAI and Meta making similar multi-GW commitments, signaling a fundamental change in AI infrastructure.

AMD Other 2026-07-10

Towards Feature Complete Triton Support in JAX-Triton - ROCm Blogs

...

NVIDIA Other 2026-07-09

SambaNova Closes $1.1B Funding Round at $11B Valuation: A New Inference Chip Landscape Takes Shape

...

NVIDIA Other 2026-07-07

NVIDIA Vera CPU Adopted by Perplexity/OpenAI/Anthropic/Oracle: AI Agent Performance Validated at 1.5-1.9x Speedup

...

NVIDIA Other 2026-07-07

NVIDIA Vera CPU: Max Single-Threaded Performance at Scale for Agentic AI

NVIDIA launches the Vera CPU, designed for maximum single-threaded performance at scale for agentic AI. With Olympus cores delivering 1.8x sustained per-core performance over x86, 1.2TB/s LPDDR5X bandwidth, and 3.4TB/s core-to-core bandwidth, Vera integrates into NVIDIA's unified AI factory architecture, aiming to lock users into its ecosystem.

NVIDIA Other 2026-07-07

AI Innovators Adopt NVIDIA Vera — Why Max Single-Threaded CPU at Scale Matters

...

Anthropic Other 2026-07-06

Anthropic Australia 1.4GW data center

...

NVIDIA Other 2026-07-02

NVIDIA AI Compute Partnership: Revenue Share and Credit Backstop to Lock Cloud Providers into DSX AI Factories

NVIDIA launches AI Compute Partnership with revenue sharing and credit backstop, shifting from hardware sales to recurring service revenue. Initial projects include 40K GB300 chips for Sharon AI and 170K GPUs for Firmus, totaling 200K+ high-end chips. NVIDIA is becoming the 'central bank' of AI compute, squeezing cloud brokers.

Qualcomm Other 2026-07-02

Qualcomm Enters AI Inference with Dragonfly C1000 CPU and HBC Near-Memory Compute

Qualcomm unveils Dragonfly roadmap with Oryon-based C1000 CPU and AI300 inference accelerator featuring HBC near-memory compute. Meta and Microsoft are early adopters. The strategy targets AI inference TCO reduction and memory wall breakthrough, bypassing Nvidia's training dominance.

Cloudflare Other 2026-07-01

Announcing the Monetization Gateway: charge for any resource behind Cloudflare via x402

...

Huawei Other 2026-06-25

Huawei Pushes Token-Based Billing at MWC Shanghai 2026: Shifting Carrier Monetization from Bytes to AI Inference Value

At MWC Shanghai 2026, Huawei urged carriers to shift from byte-based to token-based billing for AI workloads, showcasing a 372% token throughput improvement in long-sequence inference via its AI Inference Acceleration Solution. It also highlighted the Upper-6 GHz band as critical for AI wearables requiring 20 Mbps uplink, aiming to reposition 5G-A networks as AI compute delivery infrastructure.

Qualcomm Other 2026-06-25

Qualcomm HBC Gen 1 Stacks LPDDR to 133 TB/s, Challenging HBM Dominance

Qualcomm announces HBC Gen 1, a 3D-stacked LPDDR memory with an integrated compute die, achieving 133 TB/s bandwidth and 6x the energy efficiency of HBM. It aims to replace HBM in AI accelerators and is slated to ship with the AI250 in mid-2027, but supply chain and feasibility remain uncertain.

Anthropic Other 2026-06-25

Anthropic Alleges Largest AI Distillation Attack by Alibaba-Linked Operators, Exposing API Security Gaps

Anthropic alerted U.S. senators that Alibaba-linked operators conducted the largest known distillation attack, generating 28.8 million model exchanges via 25,000 fraudulent accounts to harvest Claude's frontier capabilities. The incident exposes a critical vulnerability in AI API security, forcing a rethinking of inference endpoint protection and usage monitoring.

OpenAI Other 2026-06-25

Oracle Defense Ecosystem Cohort 3: Offline AI on Roving Edge Devices Goes Operational

Oracle announced the third cohort of its Defense Ecosystem at the Brussels summit, adding 10 companies. Concurrently, Whitespace's Saga AI system was deployed on Oracle Roving Edge Devices during the Royal Navy's Operation HIGHMAST, running classified AI workloads completely offline and showing that sovereign edge AI is operational.

Huawei Other 2026-06-25

Huawei Unveils AI-Centric Network with Token Monetization, UCM Caching Breaks Long-Context Barriers

At MWC Shanghai 2026, Huawei unveiled an AI-native network architecture integrating service, network, and compute, shifting from traffic-centric to intelligence-centric operations. The Unified Cache Manager (UCM) extends KV cache to petabyte-scale external storage, achieving 372% token throughput gains on GLM-5.1 at 128K sequence lengths. Token monetization frameworks and agentic operations enable carriers to charge for AI inference capacity and personalize services.
