<?xml version='1.0' encoding='utf-8'?>
<rss version="2.0"><channel><title>hazemawadalla.com / blog</title><link>https://hazemawadalla.com/blog/</link><description>Writing on storage, AI inference, and systems engineering.</description><language>en-us</language><lastBuildDate>Tue, 02 Jun 2026 15:31:43 +0000</lastBuildDate><item><title>TurboQuant KV-Cache Quantization on a Consumer-Class GPU: An Empirical Evaluation</title><link>https://hazemawadalla.com/blog/1-turboquant-kv-cache-quantization-on-a-consumer-class-gpu-an-empirical-evaluation/</link><guid>https://hazemawadalla.com/blog/1-turboquant-kv-cache-quantization-on-a-consumer-class-gpu-an-empirical-evaluation/</guid><pubDate>Tue, 02 Jun 2026 15:31:43 +0000</pubDate><description>## **TurboQuant KV-Cache Quantization on a Consumer-Class GPU: An Empirical Evaluation**

**Hazem Awadallah — Senior Systems Engineer, Kingston Technology** · Independent evaluation · NVIDIA RTX A6000 · June 2026

**Abstract.** A transformer's **KV cache** is its scratchpad: ever</description></item></channel></rss>