🔧 Hardware

9 articles

🟢 🔧 Hardware April 25, 2026 · 3 min read

AMD Primus Projection: Tool for Predicting LLM Training Memory and Speed Before Running on Instinct GPU Clusters

Editorial illustration: AMD Primus Projection — LLM training prediction

AMD Primus Projection is a tool that predicts memory requirements and throughput for LLM training on Instinct GPU clusters before a run begins. It uses analytical formulas combined with real GPU benchmarking, and projections fall within ~10% of measured results on MI325X and MI355X accelerators for Llama and Mixtral models.
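Primus Projection's actual formulas aren't reproduced here, but the kind of first-order analytical memory estimate such a tool automates can be sketched as follows. The 16-bytes-per-parameter breakdown for Adam mixed-precision training and the ZeRO-style sharding factor are standard rules of thumb, not AMD's model:

```python
def training_memory_gb(n_params_b: float, zero_shards: int = 1) -> float:
    """First-order per-GPU memory estimate for Adam mixed-precision
    LLM training (activations and fragmentation excluded).

    Per parameter: 2 B bf16 weights + 2 B bf16 grads
                 + 4 B fp32 master copy + 8 B Adam moments = 16 B.
    ZeRO-style sharding splits the master/optimizer states
    (12 of the 16 bytes) across `zero_shards` GPUs.
    """
    n = n_params_b * 1e9
    unsharded = 4 * n                    # bf16 weights + grads, replicated
    sharded = 12 * n / zero_shards       # fp32 master + Adam moments
    return (unsharded + sharded) / 1024**3

# Hypothetical 70B-parameter model with 8-way optimizer sharding:
print(round(training_memory_gb(70, zero_shards=8), 1))  # ≈ 358.6 GB
```

A real predictor like Primus Projection additionally models activations, parallelism strategy, and kernel-level throughput from benchmarks, which is what closes the gap to the reported ~10% accuracy.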

🟢 🔧 Hardware April 24, 2026 · 3 min read

Google at Cloud Next '26 unveils TPU 8i and TPU 8t: specialized chips for agentic AI computing

Editorial illustration: Google TPU 8i and 8t — specialized AI chips

Google at Cloud Next '26 unveiled two new generations of TPU chips: TPU 8i for AI agent inference and TPU 8t for training the most complex models. The move formalizes the split of Google's TPU line into two specialized branches within the 'agentic era' of computing.

🟡 🔧 Hardware April 23, 2026 · 2 min read

NVIDIA and Google Cloud announce collaboration for agentic AI and physical AI on shared infrastructure

Editorial illustration: AI chip — hardware

NVIDIA and Google Cloud announced a joint collaboration to accelerate agentic AI and physical AI workloads, combining NVIDIA GPU infrastructure with the Google Cloud platform for robotics, autonomous systems, and agents.

🟢 🔧 Hardware April 23, 2026 · 3 min read

Gemma 4 runs as a Vision Language Agent locally on Jetson Orin Nano Super

Editorial illustration: AI chip — hardware

NVIDIA and HuggingFace demonstrated Gemma 4 as a Vision Language Agent that autonomously decides when to use the camera and runs the entire pipeline, including speech-to-text and TTS, locally on an NVIDIA Jetson Orin Nano Super with 8 GB of memory, with no cloud dependency.

🔴 🔧 Hardware April 22, 2026 · 3 min read

Google unveils 8th-generation TPU chips: two specialized variants for the agentic AI era

Editorial illustration: Two specialized 8th-generation TPU chips for training and inference of agentic AI workloads

At Cloud Next '26, Google introduced the eighth generation of its TPU chips in two specialized variants — TPU 8t for model training and TPU 8i for agentic inference. This is the first generation purpose-built for autonomous AI agents and multi-step reasoning.

🟡 🔧 Hardware April 21, 2026 · 3 min read

AWS G7e Blackwell Instances: Qwen3-32B on SageMaker for $0.41 per Million Tokens — 4× Cheaper Inference

Editorial illustration of a data center with NVIDIA Blackwell GPUs and GDDR7 memory modules

AWS G7e instances are new SageMaker GPU instances with the NVIDIA RTX PRO 6000 Blackwell chip and 96 GB GDDR7 memory, delivering up to 2.3× higher inference throughput than G6e. The cost for Qwen3-32B drops from $2.06 to $0.79 per million output tokens, and with EAGLE speculative decoding down to $0.41.

🟡 🔧 Hardware April 16, 2026 · 2 min read

AWS: Speculative Decoding on Trainium Chips Accelerates LLM Inference Up to 3×

Amazon Web Services has published a detailed implementation of speculative decoding on AWS Trainium chips in combination with the vLLM framework, achieving up to 3× faster token generation for decode-heavy workloads. The technique uses a smaller draft model to predict the next N tokens, with a larger target model verifying them in a single pass, eliminating the bottleneck of sequential generation.
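The draft-and-verify loop described above can be sketched in a toy greedy form. This illustrates the general technique, not the AWS Trainium/vLLM implementation; the toy "models" are plain functions, and the inner verification loop stands in for the target model's single batched forward pass:

```python
def speculative_decode(draft, target, prompt, n_draft=4, max_new=16):
    """Toy greedy speculative decoding.

    `draft` and `target` each map a token sequence to the next token.
    The draft proposes n_draft tokens sequentially; the target checks
    them (here a loop standing in for one batched pass) and keeps the
    longest verified prefix, plus its own token at the first mismatch.
    """
    seq = list(prompt)
    while len(seq) - len(prompt) < max_new:
        proposal, ctx = [], list(seq)
        for _ in range(n_draft):            # cheap sequential drafting
            t = draft(ctx)
            proposal.append(t)
            ctx.append(t)
        for i, t in enumerate(proposal):    # target-side verification
            expected = target(seq + proposal[:i])
            if t != expected:
                seq.append(expected)        # correct and stop this round
                break
            seq.append(t)
    return seq[:len(prompt) + max_new]

# Toy models: the target counts up mod 10; the draft agrees except
# after a 7, so most drafted tokens are accepted in bulk.
target = lambda s: (s[-1] + 1) % 10
draft = lambda s: 0 if s[-1] == 7 else (s[-1] + 1) % 10
print(speculative_decode(draft, target, [0], max_new=12))
```

Because at least one token is committed per round even on a mismatch, the loop always makes progress; the speedup comes from rounds where all drafted tokens are accepted at once.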

🟢 🔧 Hardware April 16, 2026 · 2 min read

NVIDIA: Blackwell Generates Tokens 35× Cheaper Than Hopper — Cost per Token Is the Only Metric

NVIDIA has published an analysis arguing that cost per token is the only relevant metric for AI infrastructure. A comparison of the Blackwell and Hopper generations shows that Blackwell costs twice as much per GPU hour but generates 65× more tokens per second, resulting in a 35× lower cost per million tokens — $0.12 versus $4.20 for Hopper.
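The metric itself reduces to a one-line formula: hourly GPU price divided by tokens generated per hour. A minimal sketch (the function and its inputs are illustrative, not NVIDIA's published cost model):

```python
def cost_per_million_tokens(gpu_hour_usd: float, tokens_per_sec: float) -> float:
    """USD per 1M generated tokens for one GPU at full utilization."""
    tokens_per_hour = tokens_per_sec * 3600
    return gpu_hour_usd / tokens_per_hour * 1e6

# Sanity check: $3.60/hr at 1000 tok/s works out to $1 per million tokens.
print(cost_per_million_tokens(3.60, 1000))

# The quoted per-million figures imply the headline gap:
print(round(4.20 / 0.12, 1))  # 35.0
```

Note that a 2× price increase combined with 65× throughput yields a 32.5× ratio by this formula alone; the quoted 35× ($4.20 vs $0.12) presumably reflects NVIDIA's exact price and throughput inputs rather than the rounded multiples.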

🟡 🔧 Hardware April 10, 2026 · 2 min read

NVIDIA unveils RoboLab benchmark and a new wave of physical AI projects at National Robotics Week

As part of National Robotics Week 2026, NVIDIA has presented a series of new physical AI projects, including RoboLab — a benchmark for simulation-to-reality transfer, collaborations with Toyota Research Institute, Mimic Robotics and Doosan Robotics, and open resources for robot policy evaluation such as Isaac Lab-Arena.