OpenAI and Broadcom launch Jalapeño inference ASIC: 9-month tapeout, 2027 ramp, targets GPU replacement
Summary
Key Takeaways
OpenAI and Broadcom jointly announced the first custom inference chip Jalapeño, with OpenAI handling architecture design, Broadcom silicon and networking, Celestica board and rack integration, and TSMC fabrication. Leveraging in-house LLMs for design optimization, the chip went from architecture to tapeout in just 9 months. Early tests show significant performance-per-watt advantages over existing solutions.
Physical samples delivered June 24, 2026; small-scale deployment planned end of 2026, rapid ramp in 2027, full mass production by H1 2028. Broadcom's AI semiconductor revenue hit $10.8B in Q2 FY2026, up 143% YoY. CEO Hock Tan cited 'near-infinite' compute demand from six core customers.
Industry analysts estimate ASIC TCO advantage over general-purpose GPUs at 40–65% for large-scale inference. TrendForce forecasts 44.6% growth in custom AI chip shipments in 2026 vs. 16.1% for commercial GPUs.
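The arithmetic behind these figures can be sketched as follows. This is a minimal illustration, not an analyst model: it assumes a "TCO advantage" of 40–65% means the ASIC fleet costs that fraction less than an equivalent GPU fleet, and the baseline of 100 cost units and both function names are hypothetical.

```python
def asic_tco(gpu_tco: float, advantage: float) -> float:
    """ASIC fleet TCO, assuming `advantage` is a fractional cost
    reduction relative to the GPU baseline (an interpretation, not
    something the analyst estimates spell out)."""
    if not 0.0 <= advantage < 1.0:
        raise ValueError("advantage must be in [0, 1)")
    return gpu_tco * (1.0 - advantage)


def relative_growth(custom_growth: float, gpu_growth: float) -> float:
    """How much faster custom-chip shipments grow than GPU shipments,
    expressed as a ratio of the two year-over-year multipliers."""
    return (1.0 + custom_growth) / (1.0 + gpu_growth)


GPU_TCO = 100.0  # hypothetical baseline, arbitrary cost units

low_end = asic_tco(GPU_TCO, 0.40)   # ASIC cost at the 40% advantage estimate
high_end = asic_tco(GPU_TCO, 0.65)  # ASIC cost at the 65% advantage estimate
ratio = relative_growth(0.446, 0.161)  # TrendForce 2026 forecasts

print(f"ASIC TCO range: {high_end:.0f} to {low_end:.0f} per {GPU_TCO:.0f} GPU units")
print(f"Custom-chip shipments grow {ratio:.2f}x as fast as GPU shipments")
```

Under that reading, every 100 units of GPU spend would correspond to roughly 35 to 60 units on ASICs, and custom-chip shipments would grow about 1.25 times as fast as commercial GPU shipments in 2026.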
Why It Matters
OpenAI's move is a strategic defense against NVIDIA's GPU hegemony, aiming to lock users into an OpenAI-optimized silicon-software stack. Jalapeño is tailored for current OpenAI inference workloads; future model changes could degrade tail latency and batch efficiency, whereas NVIDIA's CUDA ecosystem offers flexible adaptation.
Tying the chip to Broadcom's networking hardware (likely Tomahawk/Jericho Ethernet switches) pushes users toward a RoCEv2-based fabric and away from alternatives such as InfiniBand, reducing network flexibility. Celestica rack integration adds supply chain concentration risk. The 9-month tapeout may have compromised power optimization and thermal design, which could lead to higher-than-expected rack power density in production.
PRO Decision
[Vendors (Competitors)]
NVIDIA should accelerate inference-specific GPU optimizations (e.g., L40S, B200) and reinforce CUDA tooling (TensorRT-LLM, Triton) to highlight ASIC inflexibility. AMD can partner with Broadcom rivals (Marvell, Intel) to offer open ASIC reference designs. Google TPU and AWS Trainium should emphasize their mature custom chip ecosystems and cross-model compatibility.
[Enterprises (CIOs/Architects)]
Conduct a zero-trust technical audit: demand independent benchmarks of Jalapeño vs. H100/B200 on identical models, covering tail latency, throughput per watt, and memory bandwidth utilization. Assess Broadcom networking lock-in: does the platform support standard RoCEv2 and open switches? Secure contractual guarantees on the chip upgrade path and model migration costs. Maintain multi-cloud/multi-hardware optionality.
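The three audit metrics can be computed from raw benchmark logs with a few lines of Python. The sketch below is illustrative only: the function names, the nearest-rank percentile method, and the sample numbers are assumptions, not part of any vendor benchmark suite.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile of per-request latencies (e.g. pct=99 for p99)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Multiply before dividing to keep the rank computation exact for integer pct.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def tokens_per_joule(tokens: int, seconds: float, avg_watts: float) -> float:
    """Throughput per watt: (tokens / second) / watts, i.e. tokens per joule."""
    return (tokens / seconds) / avg_watts


def bandwidth_utilization(achieved_gb_s: float, peak_gb_s: float) -> float:
    """Fraction of peak memory bandwidth actually sustained during the run."""
    return achieved_gb_s / peak_gb_s


# Hypothetical measurements from one run; replace with audited data.
latencies_ms = [float(x) for x in range(1, 101)]
p99 = percentile(latencies_ms, 99)
efficiency = tokens_per_joule(tokens=12_000, seconds=10.0, avg_watts=600.0)
utilization = bandwidth_utilization(achieved_gb_s=2_400.0, peak_gb_s=3_200.0)
```

Running the same script against each accelerator, with the same model, batch size, and prompt set, keeps the comparison like-for-like; power should be measured at the rack or node level so that networking and cooling overheads are not hidden.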
[Investors]
See through the PR: Jalapeño's TCO advantage is model-specific, and models iterate faster than the 3–4 year ASIC depreciation cycle. Monitor Broadcom's revenue concentration: OpenAI's success could cost Broadcom a key customer. TSMC advanced-process capacity constraints threaten the ramp. The long-term custom ASIC trend is valid, but excessive vertical integration risks ecosystem fragmentation.