11 Total Reports
NVIDIA Other 2026-06-09

NVIDIA NVFP4: Native 4-Bit Training Boosts Throughput 1.73x, Locks Blackwell Ecosystem

NVIDIA introduces NVFP4, a native 4-bit format on Blackwell, enabling lossless mixed-precision pretraining in JAX/MaxText. It achieves a 1.73x throughput gain over FP8 on Llama 3.1 405B (GB300). Techniques such as micro-block scaling and the Random Hadamard Transform boost performance but lock users into NVIDIA hardware.
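To make "micro-block scaling" concrete, here is a minimal pure-Python sketch of block-scaled 4-bit quantization. It is an illustration under stated assumptions, not NVIDIA's implementation: the block size of 16, the E2M1 value grid, and the helper names (`quantize_block`, `quantize`, `dequantize`) are assumptions for this example, and the per-block scale is kept as an ordinary Python float rather than a low-precision hardware format.

```python
# Magnitudes representable by a 4-bit E2M1 float (sign handled separately).
# Assumption: this grid and the block size are illustrative choices.
FP4_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK_SIZE = 16


def quantize_block(block):
    """Return (scale, codes) for one block; codes are signed grid values."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    scale = amax / FP4_GRID[-1]  # map the largest magnitude onto 6.0
    codes = []
    for x in block:
        # Round the scaled magnitude to the nearest representable value.
        mag = min(FP4_GRID, key=lambda g: abs(abs(x) / scale - g))
        codes.append(mag if x >= 0 else -mag)
    return scale, codes


def quantize(values, block_size=BLOCK_SIZE):
    """Split a flat list into blocks and quantize each with its own scale."""
    return [quantize_block(values[i:i + block_size])
            for i in range(0, len(values), block_size)]


def dequantize(blocks):
    """Rebuild approximate values from per-block scales and codes."""
    return [scale * c for scale, codes in blocks for c in codes]


# One block of small values next to a block containing a large outlier.
values = [0.01 * i for i in range(16)] + [100.0] + [1.0] * 15
blocks = quantize(values)
restored = dequantize(blocks)
```

Because each block carries its own scale, the outlier in the second block does not degrade the resolution available to the small values in the first block, which is the motivation for scaling at micro-block granularity.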

NVIDIA Other 2026-06-04

NVIDIA Nemotron 3 Ultra: A MoE-Based Control Plane for Cost-Efficient AI Agent Orchestration

NVIDIA launches Nemotron 3 Ultra, a 550B-parameter MoE model (55B active) purpose-built for AI agent orchestration. Featuring Multi-Teacher On-Policy Distillation (MOPD) and a Hybrid Mamba-Transformer architecture, it achieves 5x throughput and 30% cost savings on tasks like SWE-bench, signaling a shift of reasoning control to a layered agent system.

NVIDIA Product Launch High Signal 2026-04-27

NVIDIA Rubin Delayed, Blackwell to Account for 71% of High-End GPU Shipments in 2026

NVIDIA has lowered its Rubin GPU production target from 2M to 1.5M units because of HBM4 memory validation delays. TrendForce data shows Blackwell's share of high-end GPU shipments rising from 61% to 71% in 2026, consolidating its dominance. Micron exits the Rubin HBM4 supply chain, and SK hynix is expected to hold a 70% share. Analysts maintain overweight ratings, viewing the impact as limited. The Rubin delay may extend SK hynix's HBM3E market dominance.

NVIDIA Other High Signal 2026-04-24

NVIDIA Internalizes GPT-5.5 Powered AI Agents at Scale, Defining New Enterprise AI Infrastructure Paradigm

NVIDIA announced that more than 10,000 employees now use GPT-5.5 through the Codex app, running on NVIDIA GB200 NVL72 infrastructure. This demonstrates the technical feasibility of 'transformative' productivity gains from frontier model inference in enterprise workflows. It also provides a reference architecture for deploying AI agents with auditable, isolated security via dedicated cloud VMs.

NVIDIA Product Launch High Signal 2026-04-23

NVIDIA Deploys OpenAI Codex: 10,000+ Employees Using GPT-5.5

More than 10,000 NVIDIA employees are using OpenAI Codex with GPT-5.5 on the GB200 NVL72 platform, with a reported 35x inference cost reduction.

NVIDIA Other High Signal 2026-04-22

NVIDIA and Google Cloud Deepen Collaboration to Build Cloud Infrastructure for AI Factories and Physical AI

NVIDIA and Google Cloud have announced an expanded collaboration, introducing new Vera Rubin and Blackwell GPU-powered instances to build "AI factories" scaling to nearly a million GPUs. The integration of Gemini, Nemotron, and other platforms aims to accelerate production deployment of agentic and physical AI, such as robotics and digital twins.

Microsoft Other High Signal 2026-04-16

Microsoft Activates Fairwater Hyperscale AI Datacenter Ahead of Schedule, Setting New Infrastructure Standard

Microsoft announced the early activation of its Fairwater datacenter in Wisconsin, positioned as the world's most powerful AI facility. It integrates hundreds of thousands of NVIDIA GB200 GPUs into a single seamless cluster via massive fiber interconnect, targeting unprecedented compute scale for next-generation AI training and inference workloads.

TSMC Financial News High Signal 2026-04-16

TSMC Q1 Earnings: Advanced Packaging Capacity Bottleneck to Persist, Constraining AI Chip Supply Through 2027

TSMC Q1 earnings show HPC crossing 60% revenue share for the first time. CoWoS advanced packaging capacity will remain tight through 2027, which means the real AI chip supply bottleneck is packaging, not process technology.

Amazon Partnership High Signal 2026-04-15

AWS Signs $38B AI Cloud Partnership with OpenAI

OpenAI signs a 7-year, $38B deal with AWS, deploying thousands of NVIDIA GB200/GB300 GPUs. This is OpenAI's first major infrastructure diversification beyond Azure.

NVIDIA Other High Signal 2026-03-24

NVIDIA Donates GPU Dynamic Resource Allocation Driver to Kubernetes Community

NVIDIA donated its GPU Dynamic Resource Allocation (DRA) driver to the CNCF, making it an upstream Kubernetes project. This move aims to shift the core control point of GPU orchestration from proprietary vendor layers to the open-source community, and drive standardization in collaboration with major cloud providers.

NVIDIA Other 2026-01-23

NVFP4 + TeaCache Drive 10x FLUX.2 Inference Speedup, Locking Blackwell Ecosystem

NVIDIA and BFL optimize FLUX.2 on DGX B200/B300 using NVFP4 4-bit quantization, TeaCache step skipping, CUDA Graphs, and torch.compile, achieving 6.3x (single GPU) to 10.2x (dual GPU) latency reduction vs H200, with 40% memory savings. The stack is tightly coupled to TensorRT-LLM visualgen and Blackwell hardware.
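The "step skipping" idea can be sketched in a few lines of pure Python. This is a simplified illustration, not the TeaCache or TensorRT-LLM code: it accumulates the relative L1 change of the raw step inputs and reuses a cached residual while that change stays under a threshold. The function names, the `threshold` parameter, and the use of raw inputs instead of timestep-modulated ones are all assumptions made for this example.

```python
def rel_l1(current, previous):
    """Relative L1 distance between two equal-length vectors."""
    num = sum(abs(c - p) for c, p in zip(current, previous))
    den = sum(abs(p) for p in previous) or 1.0
    return num / den


def run_with_step_skipping(inputs, expensive_block, threshold):
    """Apply expensive_block per step, reusing its cached residual while
    the accumulated input change stays below threshold."""
    outputs, computed_steps = [], 0
    cached_residual, previous, accumulated = None, None, 0.0
    for x in inputs:
        if previous is not None:
            accumulated += rel_l1(x, previous)
        if cached_residual is None or accumulated >= threshold:
            # Inputs drifted too far (or nothing cached yet): recompute.
            y = expensive_block(x)
            cached_residual = [yi - xi for yi, xi in zip(y, x)]
            accumulated = 0.0
            computed_steps += 1
        else:
            # Inputs barely changed: approximate with the cached residual.
            y = [xi + ri for xi, ri in zip(x, cached_residual)]
        outputs.append(y)
        previous = x
    return outputs, computed_steps


# Toy stand-in for a costly model block (hypothetical): doubles each value.
steps = [[1.0, 1.0], [1.01, 1.01], [1.02, 1.02], [2.0, 2.0]]
outputs, computed_steps = run_with_step_skipping(
    steps, lambda x: [2.0 * v for v in x], threshold=0.1)
```

In this toy run the two middle steps reuse the cached residual and only the first and last steps call the expensive block. A higher threshold skips more steps at the cost of larger approximation error.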