This model has shaken the NASDAQ, so it is no surprise that people quickly began adapting it. The hardware requirements are steep for home or hobby use, but there is already a version that runs on a single RTX 4090 with 24 GB of VRAM, and it should also work on Apple Silicon machines with enough unified memory. Reported throughput ranges from 2 to 14 tokens/s, which is respectable for a free, GPT-4o-grade AI running locally.