What is the impact level of this intelligence?

This intelligence is assessed as having an impact level of Important for enterprise technology decisions.

MediaTek 2026-06-23

Technology Integration | Impact: Important | Confidence: 85%

MediaTek Lands Exclusive Role on Triggerfish, Google's TPU v9 Inference Upgrade with 2-3x SRAM

Q: Why is this MediaTek update important for enterprises?

On the surface this is a performance boost, but it also reads as Google **defending** against NVIDIA's CUDA ecosystem: **SRAM scaling** and **HBM4E** tie inference workloads more tightly to custom TPUs, reducing users' flexibility to switch to general-purpose GPUs. Vendor lock-in: the **simulation die** hints that Google may embed control-plane functions (training/inference switching, RL) at the chip level, which could make third-party orchestration tools (e.g., the NVIDIA GPU Operator for Kubernetes) redundant for these workloads. Engineering limitations: **tail latency** remains unaddressed, and **PFC/ECN congestion control** bottlenecks persist in distributed inference. Keeping on-chip SRAM and HBM4E coherent (**cache coherency**) may introduce **head-of-line blocking**. In addition, NVIDIA's **Rubin architecture** could surpass TPU v9 in performance by 2028.

Summary

Google plans a TPU v9 inference upgrade, Triggerfish, with MediaTek as the exclusive partner. It features 2-3x the on-chip SRAM of its predecessor Humufish, HBM4E DRAM, and a simulation die for local management. Production is targeted for late 2027, with lifecycle shipments of 1-2 million units and a unit price about 30% higher than Humufish.

Key Takeaways

According to Ming-Chi Kuo, Google will launch Triggerfish, an inference-optimized upgrade in the TPU v9 generation. There are two key improvements. First, on-chip SRAM is 2-3x larger than in the predecessor, Humufish, which keeps larger working sets local, reduces data movement, and improves decoding efficiency. Second, off-chip DRAM is upgraded from HBM4 to HBM4E. Triggerfish also introduces a simulation die, possibly for local TPU management, training/inference switching, RL, and AI agent coordination.
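The link between larger SRAM and less data movement can be illustrated with a rough back-of-the-envelope sketch. It assumes a deliberately simple model that is not from the source: each decode step reads the whole working set once, and whatever does not fit in on-chip SRAM must come from off-chip DRAM. The function names and all sizes below are hypothetical placeholders, not Humufish or Triggerfish specifications.

```python
MIB = 1024 * 1024


def offchip_bytes_per_step(working_set_bytes: int, sram_bytes: int) -> int:
    """Bytes fetched from off-chip DRAM in one decode step.

    Simplifying assumption: the working set is read once per step and
    SRAM holds as much of it as fits; the remainder comes from DRAM.
    """
    return max(0, working_set_bytes - sram_bytes)


def relative_offchip_traffic(
    working_set_bytes: int, base_sram_bytes: int, sram_multiplier: float
) -> float:
    """Off-chip traffic with scaled SRAM, relative to the baseline SRAM size."""
    base = offchip_bytes_per_step(working_set_bytes, base_sram_bytes)
    if base == 0:
        # The working set already fits on chip, so there is nothing to save.
        return 0.0
    scaled_sram = int(base_sram_bytes * sram_multiplier)
    return offchip_bytes_per_step(working_set_bytes, scaled_sram) / base


# Placeholder numbers chosen only to make the arithmetic easy to follow.
WORKING_SET = 600 * MIB
BASE_SRAM = 150 * MIB

for multiplier in (1, 2, 3):
    rel = relative_offchip_traffic(WORKING_SET, BASE_SRAM, multiplier)
    print(f"{multiplier}x SRAM -> relative off-chip traffic {rel:.2f}")
```

With these placeholder numbers, off-chip traffic falls to about two thirds of the baseline at 2x SRAM and about one third at 3x. Real gains depend on the actual working-set size, access pattern, and cache policy, none of which the report specifies.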

MediaTek, initially involved in TPU v7e, now exclusively leads the TPU v9 series, signaling a strategic pivot from mobile chips to AI datacenter chips. Triggerfish targets production in late 2027, lifecycle shipments of 1-2 million units, and a unit price about 30% higher than Humufish.

Why It Matters

On the surface this is a performance boost, but it also reads as Google defending against NVIDIA's CUDA ecosystem: SRAM scaling and HBM4E tie inference workloads more tightly to custom TPUs, reducing users' flexibility to switch to general-purpose GPUs.

Vendor lock-in: The simulation die hints that Google may embed control-plane functions (training/inference switching, RL) at the chip level, which could make third-party orchestration tools (e.g., the NVIDIA GPU Operator for Kubernetes) redundant for these workloads.

Engineering limitations: Tail latency remains unaddressed, and PFC/ECN congestion-control bottlenecks persist in distributed inference. Keeping on-chip SRAM coherent with HBM4E may introduce head-of-line blocking. In addition, NVIDIA's Rubin architecture could surpass TPU v9 in performance by 2028.

PRO Decision

【Vendors】Competitors (Broadcom, NVIDIA, AMD): Pitch open-standard alternatives to Google's TPU customers and highlight the lock-in risk of the simulation die. Broadcom should accelerate UCIe-based chiplet solutions with pluggable SRAM/HBM4E to counter Google's proprietary packaging.
【Enterprises】CIOs: Conduct zero-trust audits. Demand full API specifications for the simulation die and assess its compatibility with Kubernetes and Prometheus. Beware the single-source risk from MediaTek. Reserve about 30% of inference capacity on NVIDIA H200/B200 or AMD MI350 as a hedge.
【Investors】MediaTek's exclusive win is positive, but volume arrives only in 2028, by which time NVIDIA Rubin and AMD MI400 will be out. Watch MediaTek's HBM4E packaging yield and who owns the simulation die IP. Long term, Google's TPU remains a captive product, unable to challenge NVIDIA's dominance in third-party clouds.
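The 30% hedge suggested for enterprises is simple arithmetic, sketched below for capacity planning. The function name and the unit of capacity (accelerators, replicas, or any other integer count) are assumptions made for illustration; only the 30% figure comes from the recommendation above.

```python
def split_inference_capacity(
    total_units: int, hedge_percent: int = 30
) -> tuple[int, int]:
    """Split capacity into (primary, hedge) units.

    hedge_percent is the share reserved on alternative hardware. The hedge
    is rounded up so the reserved share never falls below the target.
    """
    if total_units < 0:
        raise ValueError("total_units must be non-negative")
    if not 0 <= hedge_percent <= 100:
        raise ValueError("hedge_percent must be between 0 and 100")
    # Integer ceiling division avoids floating-point rounding surprises.
    hedge = (total_units * hedge_percent + 99) // 100
    return total_units - hedge, hedge


primary, hedge = split_inference_capacity(1000)
print(f"primary={primary}, hedge={hedge}")
```

Rounding the hedge up is a design choice: for small fleets it keeps the reserved share at or above the target rather than slightly under it.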

Source: IT之家
