RAG - AI Infrastructure Intelligence Search

Qualcomm Other 2026-07-26

Qualcomm's Sixth-Generation Snapdragon 8 Series to Rise in Price as 2nm Process Pushes Up Costs

...

Cisco Other 2026-07-24

Cisco Proposes Logically Air-Gapped Model with eBPF, Shifting Security to Kernel

Cisco introduces a logically air-gapped governance model using eBPF and Cilium to create a software-defined cryptographic perimeter at the kernel level. Integrating Cisco Secure Workload with Isovalent, it aims to provide data residency and regulatory compliance for containerized, virtualized, and bare-metal environments without sacrificing cloud agility.

Microsoft Other 2026-07-24

Microsoft launches MAI-Image-2.5-Pro and MAI-Voice-2-Flash, deepening in-house AI ecosystem

Microsoft introduces MAI-Image-2.5-Pro and MAI-Voice-2-Flash, proprietary models now powering Bing, PowerPoint, OneDrive, and Dynamics 365 in place of third-party models. The company claims up to an 84% GPU cost reduction and 2x faster voice; the move signals a strategic shift to in-house AI.

Qualcomm Other 2026-07-22

Qualcomm Dragonfly C1000 CPU Wins Multi-Generation Agreement with Meta

...

Hewlett Packard Enterprise Other 2026-07-22

HPE Expands Private Cloud AI Product Line, Integrating NVIDIA Vera Rubin NVL72 Platform

...

Hewlett Packard Enterprise Other 2026-07-20

HPE Expands Private Cloud AI with NVIDIA Vera Rubin, Enabling Agent-Native AI Factory

HPE expands its Private Cloud AI line with NVIDIA Vera Rubin NVL72 and HGX Rubin NVL8, introducing Compute XD700, Cray GX240 blade with Vera CPU, and Quantum-X800 InfiniBand. New software includes Agent Toolkit and NemoClaw for agent-native AI, with Alletra Storage MP X10000 in Q4 2026.

NVIDIA Other 2026-07-16

NVIDIA CUDA 13.3 Introduces clmad for Hardware-Accelerated Carryless Multiplication on GPUs

NVIDIA CUDA 13.3 adds the clmad hardware instruction for carryless multiply-accumulate on Ampere+ GPUs. GHASH throughput reaches 6.3 TB/s on B200, up to 18.8x faster than bitsliced. Sum-check protocol accelerates 3-13x. The instruction also benefits CRC, Reed-Solomon, and post-quantum cryptography.
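For readers unfamiliar with the operation, carryless multiplication is ordinary binary long multiplication with the additions replaced by XOR, which is the same as multiplying polynomials over GF(2). The following is a minimal software sketch of that arithmetic in Python, not the CUDA instruction itself: the function names `clmul` and `clmul_acc` are hypothetical, and the operand widths and exact semantics of `clmad` are not given in the summary above, so arbitrary-size integers are assumed here.

```python
def clmul(a: int, b: int) -> int:
    """Carryless product of two non-negative integers.

    Each integer is read as a polynomial over GF(2); partial products
    are combined with XOR instead of addition, so no carries propagate.
    """
    result = 0
    while b:
        if b & 1:        # this bit of b selects a shifted copy of a
            result ^= a  # XOR is addition without carry
        a <<= 1
        b >>= 1
    return result


def clmul_acc(a: int, b: int, acc: int) -> int:
    """Carryless multiply-accumulate: (a * b) XOR acc (hypothetical helper)."""
    return clmul(a, b) ^ acc


# (x + 1) * (x + 1) = x^2 + 1 over GF(2), i.e. 0b11 * 0b11 = 0b101
print(bin(clmul(0b11, 0b11)))
```

GHASH and CRC both reduce to chains of such products followed by a reduction modulo a fixed polynomial, which is why a hardware multiply-accumulate for this operation helps them.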

Qualcomm Other 2026-07-15

Qualcomm Negotiates Custom Chips with ByteDance, Shifts to Data Center Ecosystem

Qualcomm is in talks with ByteDance to develop custom chips, including VPU, AI components, and CPUs, leveraging AlphaWave Semi's interconnect tech. This marks Qualcomm's strategic shift from smartphones to data center custom silicon, with its Dragonfly portfolio featuring C1000 CPU, HBC, and AI300 accelerators.

Google Other 2026-07-15

Google Deeply Integrates Gemini Enterprise Telemetry with BigQuery for AI Governance

Google Cloud enables streaming Gemini Enterprise app telemetry (prompts, responses, activity logs) into BigQuery for real-time analysis. Leveraging BigQuery's AI capabilities (Conversational Analytics, auto-schema), it automates auditing, compliance, and insights for large-scale AI deployments, driving data-driven AI observability.

Samsung Electronics Other 2026-07-10

Samsung GAIA AI PC Chip Samples with Memory-Centric NPU, Targeting 50 TOPS

Samsung launches GAIA AI PC processor with 4nm process and memory-centric NPU, integrating LPDDR5X controller with NPU for near-memory computing, achieving 40% energy efficiency improvement and 50 TOPS. Certified for Microsoft Copilot+ PC, Lenovo to adopt in Q4 2026.

AMD Other 2026-07-10

Towards Feature Complete Triton Support in JAX-Triton - ROCm Blogs

...

NVIDIA Other 2026-07-07

NVIDIA Vera CPU: Max Single-Threaded Performance at Scale for Agentic AI

NVIDIA launches Vera CPU, a max single-threaded CPU at scale for agentic AI. With Olympus cores delivering 1.8x sustained per-core performance over x86, 1.2TB/s LPDDR5X bandwidth, and 3.4TB/s core-to-core bandwidth, Vera integrates into NVIDIA's unified AI factory architecture, aiming to lock users into its ecosystem.

NVIDIA Other 2026-07-07

AI Innovators Adopt NVIDIA Vera — Why Max Single-Threaded CPU at Scale Matters

...

OpenAI Other 2026-07-05

OpenAI Winds Down Fine-Tuning API: A Strategic Shift in AI Customization Landscape

OpenAI plans to phase out its fine-tuning API by 2027, stopping new task creation but allowing inference on existing models. This forces startups relying on fine-tuning for differentiation to migrate to open-source models or RAG, reshaping the AI customization ecosystem.

Qualcomm Other 2026-07-02

Qualcomm Enters AI Inference with Dragonfly C1000 CPU and HBC Near-Memory Compute

Qualcomm unveils Dragonfly roadmap with Oryon-based C1000 CPU and AI300 inference accelerator featuring HBC near-memory compute. Meta and Microsoft are early adopters. The strategy targets AI inference TCO reduction and memory wall breakthrough, bypassing Nvidia's training dominance.

TSMC Other 2026-07-01

Etched Unveils Sohu Transformer ASIC: Claims 20x H100 Inference Throughput, Challenging NVIDIA's Grip

AI chip startup Etched emerges from stealth with Sohu, a Transformer-specific ASIC on TSMC N4P with 144GB HBM3E. By hardwiring attention mechanisms, it claims 20x throughput and 140x price-performance vs. H100 on Llama 70B. With $800M total funding and first racks shipping this summer, it directly challenges NVIDIA's inference dominance.

OpenAI Other 2026-06-26

Making private MCP servers reachable without making them public | OpenAI Developers

...

Qualcomm Other 2026-06-25

Qualcomm Enters AI Datacenter with Dragonfly ARM CPU, Meta Signs Multi-Generation Deal

Qualcomm unveils Dragonfly C1000 ARM-based datacenter CPU, AI300 accelerator, and interconnect. Meta commits to multi-generation CPU supply, Microsoft Azure to deploy HBC chips. Qualcomm targets $15B+ datacenter revenue by FY2029, acquires Modular for software stack.

Huawei Other 2026-06-25

Huawei Pushes Token-Based Billing at MWC Shanghai 2026: Shifting Carrier Monetization from Bytes to AI Inference Value

At MWC Shanghai 2026, Huawei urged carriers to shift from byte-based to token-based billing for AI workloads, showcasing a 372% token throughput improvement in long-sequence inference via its AI Inference Acceleration Solution. It also highlighted the Upper-6 GHz band as critical for AI wearables requiring 20 Mbps uplink, aiming to reposition 5G-A networks as AI compute delivery infrastructure.

Qualcomm Other 2026-06-25

Qualcomm HBC Gen 1 Stacks LPDDR to 133 TB/s, Challenging HBM Dominance

Qualcomm announces HBC Gen 1, a 3D-stacked LPDDR memory with integrated compute die, achieving 133 TB/s bandwidth and 6x energy efficiency over HBM. Aimed at replacing HBM in AI accelerators, shipping with AI250 in mid-2027, but supply chain and feasibility remain uncertain.
