Keyword: inference (推理)
147 total reports
Page 1 of 8
Intel Other 2026-12-30

Intel at Computex 2026: CPU as Agentic AI Orchestrator, x86 Reclaims Inference Control

At Computex 2026, Intel unveiled the 288-core Xeon 6+ (Intel 18A) and the 3rd-gen Core Ultra, claiming that agentic AI shifts the CPU:GPU ratio from 1:8 to 1:1. Partnering with SambaNova and Foxconn on rack-scale inference systems, Intel repositions the CPU as the orchestrator for multi-step AI reasoning, aiming to reclaim control from GPU-centric architectures.

Google Cloud Other 2026-06-21

Google Trillium TPU: 4.7x Training Boost Masks Vendor Lock-in and Ecosystem Risks

Google Cloud unveils its 6th-gen TPU, Trillium, built on a 3nm process, delivering 4.7x training and 2.5x inference performance gains, with 2x the energy efficiency of the NVIDIA H100. However, Trillium is exclusive to Google Cloud TPU v6p instances and deeply integrated into the AI Hypercomputer architecture, creating full-stack lock-in from silicon to networking.

NVIDIA Other 2026-06-21

NVIDIA Blackwell Ultra: AI Factory Ecosystem Lock-in via Omniverse

NVIDIA unveils Blackwell Ultra with 4x inference performance, along with the DGX B200, and partners with Foxconn on the world's largest AI factory (2027). Omniverse now has 700+ customers and is positioned as the standard for industrial digital twins, as NVIDIA aims to reshape global compute into AI factories.

Fortinet Other 2026-06-19

Fortinet FortiAIGate with NVIDIA Shifts AI Security Control to GPU-Accelerated Inline

Fortinet launches FortiAIGate, which integrates NVIDIA Blackwell GPUs and the Dynamo inference framework for inline AI workload protection across data center, cloud, and edge. It promises ultra-low latency, multi-tenancy, and data sovereignty compliance.

Anthropic Other 2026-06-18

Claude Fable 5: 50M Lines Migrated in One Day, AI Code Refactoring Hits Inflection

Anthropic releases Claude Fable 5, excelling in long-horizon tasks. Stripe migrates 50M lines of Ruby code in one day using the model, demonstrating practical AI-driven code refactoring. A report claims Claude now writes 80%+ of Anthropic's code, with a call for verifiable pause mechanisms.

Google Other 2026-06-17

Google Rolls Out Multiple AI Features for Android 17 in Phases

...

TSMC Other 2026-06-17

TSMC Publicly Discloses Its CoWoS Glass Substrate Development Plan for the First Time

...

NVIDIA Other 2026-06-17

NVIDIA and Coherent Scale InP Optical Interconnects for AI Rack-Scale Photonic Fabric

NVIDIA invests in Coherent to build a 6-inch InP wafer fab in Texas, dedicated to optical interconnects for AI racks. The Vera Rubin Ultra NVL576 cluster (576 GPUs across 8 racks) demands photonic links; copper is no longer viable. This signals a fundamental shift from copper to photonics in AI fabric.

ARM Other 2026-06-16

ARM AGI CPU Enters Mass Production with $2B Pre-Orders, Shifting AI Inference to ARM

ARM's self-developed AGI CPU has entered mass production with TSMC, securing $2B in pre-orders. Partnering with Red Hat, ARM aims to bring enterprise software stacks to its CPU, signaling a strategic shift from IP licensing to chip manufacturing and challenging x86 in AI inference.

Google Cloud Other 2026-06-15

Google TPU 8th Gen Splits Training and Inference Chips, Inflection Point in AI Infra TCO

Google Cloud unveils its 8th-gen TPU with separate training (TPU8t) and inference (TPU8i) chips, delivering 3x training pod performance and an 80% improvement in inference performance per dollar. Vertex AI evolves into the Gemini Enterprise Agent Platform, while the Smals sovereign cloud contract validates public sector AI adoption under strict compliance.

Qualcomm Other 2026-06-14

Qualcomm AI200 on AWS: Inference Chip Ecosystem Shifts from Nvidia Single-Vendor Dominance to Multi-Alliance

Qualcomm's AI200 inference chip (768GB memory) is slated for broad AWS deployment by 2026, aiming to reduce cloud AI inference costs. This marks Qualcomm's strategic pivot from mobile to cloud, leveraging AWS's custom silicon initiative to challenge Nvidia's inference monopoly and restructure the cloud inference chip ecosystem.

NVIDIA Other 2026-06-12

NVIDIA and SK Hynix Lock Down HBM4/5 Roadmap, Cementing Vera Rubin Supply Chain

NVIDIA and SK Hynix sign a multi-year agreement to co-define HBM4 production and early HBM5 research for Vera Rubin GPUs. Samsung also enters HBM4 supply as a second source. The deal elevates SK Hynix from vendor to co-developer, potentially creating a de facto memory standard that acts as a barrier and marginalizes Micron and others.

AMD Other 2026-06-12

AMD Zen 6 Venice 256-Core EPYC Claims 3.3x Rack Performance Over NVIDIA Vera, But Estimates Raise Questions

AMD unveils the first performance estimates for Zen 6 Venice EPYC (2nm, 256 cores), claiming 3.3x rack-level integer throughput over NVIDIA Vera at 100kW total power. The claim is a direct counter to NVIDIA's Arm push, but it is based on projected estimates, not silicon.

AMD Other 2026-06-12

AMD Backs All-Instinct GPU Cloud: TensorWave's $350M Series B Signals NVIDIA Ecosystem Breakout

TensorWave closes a $350M Series B led by Magnetar and AMD Ventures at a $1.55B valuation. The cloud is built exclusively on AMD Instinct GPUs (MI300X to MI455X) and targets memory-intensive AI workloads, aiming to offer a viable alternative to NVIDIA CUDA lock-in and to validate the maturity of the ROCm software stack in production.

Intel Other 2026-06-06

Intel Unveils Decoupled Inference Architecture and Xeon 6+, Partners with SambaNova and Foxconn for Rack-Scale AI Infrastructure

At Computex 2026, Intel unveiled three innovations: 1) rack-scale AI infrastructure with SambaNova and Foxconn (production-ready); 2) the world's first decoupled inference demo, in which Xeon 6 orchestrates, the SN40 RDU decodes, and a Blackwell GPU handles prefill, with Together.ai achieving the fastest enterprise inference with MiniMax 2.5; 3) Xeon 6+, the first Intel 18A data center CPU, with a 32U rack delivering 36,864 cores at ~100kW. Agentic inference shifts the CPU:GPU ratio from 1:4 toward 1:1.

Huawei Product Launch 2026-06-05

Huawei Cloud Launches AICS: Control Plane Shift in the Token Industrialization Era

Huawei Cloud unveils four Agentic Infra products, led by the AICS cluster (100K cards/200 EFLOPS). It integrates NPU-direct CMS memory, CCE VolcanoNext unified scheduling, and AgentSphere security sandbox to create a unified control plane for LLM training and Agent inference, aiming to lock in the full-stack AI infrastructure.

Microsoft Azure Product Launch 2026-06-03

Microsoft Maia 200 Mass-Produced, Cobalt 200 Previewed: AI Inference Control Shifts to Azure

At Build 2026, Microsoft announced mass production of Maia 200 AI inference chips, a preview of Cobalt 200 ARM processors, and the MAI-Thinking-1 reasoning model (35B params). This signals full-stack vertical integration to reduce NVIDIA dependency and lock in Azure AI workloads.

Meta Other High Signal 2026-06-02

Build 2026: Project Polaris Replaces GPT-4 Turbo, GitHub Copilot Decouples from OpenAI

Microsoft unveiled its in-house coding model, Project Polaris, at Build 2026, planning to replace OpenAI GPT-4 Turbo as GitHub Copilot's default inference engine starting August 2026, with a 3-month transition period. This marks Microsoft's first formal decoupling from OpenAI at the model layer. Anthropic Claude has been integrated into Copilot, supporting multi-model draft+review collaborative workflows. Microsoft publicly named Claude as a primary target for the first time. Strategic signal: model self-reliance, distribution, and runtime are durable moats.

Intel Other 2026-06-02

Intel Unveils Rack-Scale AI Inference with Xeon 6+ and SambaNova RDU, Targeting Agentic Workloads

Intel announces rack-scale AI infrastructure combining Xeon 6+ (288 cores, Intel 18A) and the SambaNova SN-50 RDU for agentic inference. It also launches the Vector Core Compute cloud, with decoupled prefill/decode using Xeon, SambaNova, and NVIDIA Blackwell. Intel aims to disrupt GPU-centric inference by offering lower TCO and higher density.

NVIDIA Other 2026-06-01

NVIDIA RTX Spark: SoC Seizes PC Control, AI Compute Revolution with Ecosystem Lock-in

NVIDIA launches the RTX Spark SoC, integrating a Blackwell GPU with a 20-core Grace CPU (co-designed with MediaTek), NVLink-C2C at 600GB/s, up to 128GB of unified memory, 1 petaflop of FP4 AI compute, and local support for 120B-parameter LLMs. This marks a shift from GPU vendor to platform provider, directly challenging Apple M, Qualcomm, and x86 incumbents.