
The first MoE expert surgery tool. Compress any LLM to fit your hardware.
MoE detected (384 experts, top-32)
Pruning experts: 384 → 96
Done! 2,080 GB → 145 GB (14.3x)
88.4% quality | 6.2 tok/s
Cut inactive experts from MoE models. First production tool — zero competitors.
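
As a rough illustration of what cutting inactive experts can mean, here is a minimal sketch that ranks experts by how often a router selected them on sample data and keeps the most-used ones. The frequency-based criterion, the `routing_log` format, and the function names are assumptions made for illustration; the page does not describe the selection method the tool actually uses.

```python
from collections import Counter


def select_experts_to_keep(routing_log, num_experts, keep):
    """Return sorted indices of the `keep` most frequently routed experts.

    routing_log: iterable of lists, each holding the expert indices the
    router chose for one token (hypothetical format, assumed here).
    """
    counts = Counter()
    for chosen in routing_log:
        counts.update(chosen)
    # Rank all experts, including never-routed ones (count 0).
    # Ties are broken in favour of the lower expert index.
    ranked = sorted(range(num_experts), key=lambda e: (-counts[e], e))
    return sorted(ranked[:keep])


def compression_ratio(before_gb, after_gb):
    """Size reduction factor, e.g. 2,080 GB -> 145 GB."""
    return before_gb / after_gb


# Toy example: 6 experts, 2 chosen per token, keep the 2 most-used experts.
log = [[0, 2], [2, 3], [2, 0], [5, 2]]
kept = select_experts_to_keep(log, num_experts=6, keep=2)
ratio = round(compression_ratio(2080, 145), 1)
```

With the figures from the demo output above, `compression_ratio(2080, 145)` rounds to 14.3, matching the reported 14.3x.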

GGUF Q2 through Q8. One command. Any model. Ready for Ollama.

AI analyzes your model + hardware, picks the optimal compression recipe.
Select your hardware and task. See which models fit.
Llama 4 Scout 109B compressed to a 16 GB GGUF with 82% quality retained.

Model frontier
Kimi K2.5 hit 1 trillion parameters.
GPU ceiling
Consumer VRAM since 2022.
Models supported
From 8 countries. All families.
Run uncompressed
Tooling too fragmented.
