Reports
AI-generated structured vendor updates
台积电首次公开CoWoS玻璃基板开发计划
...
Google TPU 8th Gen Splits Training and Inference Chips, Inflection Point in AI Infra TCO
Google Cloud unveils 8th-gen TPU with separate training (TPU8t) and inference (TPU8i) chips, delivering 3x training pod performance and 80% inference dollar-performance improvement. Vertex AI evolves into Gemini Enterprise Agent Platform, while the Smals sovereign cloud contract validates public sector AI adoption under strict compliance.
Qualcomm AI200 on AWS: Inference Chip Ecosystem Shifts from Nvidia Singularity to Multi-Alliance
Qualcomm's AI200 inference chip (768GB memory) is slated for broad AWS deployment by 2026, aiming to reduce cloud AI inference costs. This marks Qualcomm's strategic pivot from mobile to cloud, leveraging AWS's custom silicon initiative to challenge Nvidia's inference monopoly and restructure the cloud inference chip ecosystem.
Microsoft Maia 200 Mass-Produced, Cobalt 200 Previewed: AI Inference Control Shifts to Azure
At Build 2026, Microsoft announced mass production of Maia 200 AI inference chips, preview of Cobalt 200 ARM processors, and the MAI-Thinking-1 reasoning model (35B params). This signals a full-stack vertical integration to reduce NVIDIA dependency and lock Azure AI workloads.
AWS Launches Inferentia2 Chip for Generative AI Infrastructure Optimization
AWS launched second-gen Inferentia2 AI inference chip, designed for Transformer models with 4x performance boost and support for 175B parameter models. Integrated into EC2 Inf2 instances with UltraClusters architecture for large-scale deployment, offering 40% better cost-performance and 50% lower power consumption than GPU instances.
NVIDIA Acquires Groq LPU: Inference Architecture Shift from HBM to On-Chip SRAM
NVIDIA signs ~$20B licensing deal with Groq for LPU tech, featuring 230MB on-chip SRAM at 80TB/s bandwidth. This targets Transformer inference decode, replacing HBM bottlenecks with ultra-low latency on-chip storage, potentially reshaping the AI inference chip landscape.