Llama 4 Scout just dropped. Run it locally today.
The first MoE expert surgery tool. Compress any LLM to fit your hardware.
Profiling model... MoE detected (16 experts, top-2)
AI Route: expert_prune(8/16) + Q4_K_M
Pruning experts: 16 → 8 (removing lowest-impact)
Quantizing: FP16 → Q4_K_M
Converting to GGUF...
Done! 218GB → 15.8GB (13.8x compression)
Quality retained: 82.1% (MMLU) | 28.3 tok/s on RTX 4090
→ ollama run smelt/llama4-scout-109b-q4km
Parameter frontier
Consumer GPU VRAM
Fragmented ecosystems
Running uncompressed
The Problem
Frontier models scale from 70B to 671B parameters. Consumer GPUs are stuck at 24GB VRAM. The tools between them are fragmented across six incompatible ecosystems.
70B → 405B → 671B. Each generation demands more RAM than any consumer card provides.
RTX 4090 shipped at 24GB in 2022. RTX 5090 ships at 32GB. Models grew 9x in the same window.
GGUF, GPTQ, AWQ, EXL2, ONNX, TensorRT. Each has its own tools, formats, and quality tradeoffs.
How It Works
Tell Smelt your model and target hardware. It analyzes architecture, parameter count, and MoE topology.
AI selects the optimal compression pipeline — quantization level, pruning targets, expert cuts — calibrated to your VRAM.
One command. Output: a GGUF file that runs on Ollama, llama.cpp, or LM Studio. Quality report included.
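The pipeline selection above boils down to a memory-budget calculation: pick the highest-quality quantization whose weights fit your card, with headroom for KV cache and activations. A minimal sketch of that idea (illustrative only, not Smelt's actual heuristics; the bytes-per-weight figures are rough approximations):

```python
# Approximate bytes per weight for common GGUF quantization levels,
# ordered from highest quality to smallest footprint.
BYTES_PER_WEIGHT = {
    "FP16": 2.0,
    "Q8_0": 1.06,
    "Q6_K": 0.82,
    "Q5_K_M": 0.71,
    "Q4_K_M": 0.60,
    "Q3_K_M": 0.49,
}

def pick_quant(params_b: float, vram_gb: float, overhead_gb: float = 2.0):
    """Return the highest-quality quant level whose weights fit in VRAM,
    reserving `overhead_gb` for KV cache and activations."""
    budget_bytes = (vram_gb - overhead_gb) * 1e9
    for level, bpw in BYTES_PER_WEIGHT.items():
        if params_b * 1e9 * bpw <= budget_bytes:
            return level
    return None  # nothing fits: prune experts or pick a smaller model

print(pick_quant(8, 8))    # an 8B model on an 8GB card -> Q5_K_M
print(pick_quant(70, 24))  # a dense 70B on 24GB -> None, even at Q3
```

When nothing fits, that's exactly where expert pruning enters the pipeline: shrink the parameter count first, then quantize what remains.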
Try it now
Select your hardware and use case. We'll show you the best open-source models that fit — ranked by quality.
No models fit 16GB for this task. Try selecting more RAM.
Features
Six compression techniques in one pipeline. From basic quantization to advanced MoE surgery.
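The most involved of these techniques, MoE expert pruning, reduces to a simple idea: rank experts by how often the router actually selects them on calibration data, then drop the least-used ones. A toy sketch (not Smelt's implementation; the routing log below is made up for illustration):

```python
from collections import Counter

def prune_experts(routing_log, n_experts, keep):
    """Given a log of expert indices chosen by the router on calibration
    data, return the `keep` most-used experts; the rest are candidates
    for removal."""
    counts = Counter(routing_log)
    # Experts the router never selected still need a count of zero.
    for e in range(n_experts):
        counts.setdefault(e, 0)
    ranked = sorted(counts, key=lambda e: counts[e], reverse=True)
    return sorted(ranked[:keep])

# Toy routing log for 4 experts with top-2 routing over 6 tokens:
log = [0, 3, 0, 1, 3, 0, 3, 1, 0, 3, 2, 0]
print(prune_experts(log, 4, 2))  # [0, 3] -> experts 0 and 3 dominate
```

Production pruning also measures downstream quality impact, not just routing frequency, but frequency is the intuition behind "removing lowest-impact" experts in the demo above.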
Proof
Every compression job produces a quality report with before/after benchmarks. No guessing — you see exactly what you get.
Llama 4 Scout 109B compressed to 16GB GGUF with 82% quality retained
| Metric | Value |
| --- | --- |
| Input | Llama 4 Scout 109B |
| Architecture | MoE (16 experts, top-2) |
| Original Size | 218 GB |
| Pipeline | expert_prune(8/16) + Q4_K_M |
| Output Size | 15.8 GB |
| Format | GGUF (Q4_K_M) |
| Compression | 13.8x |
| Quality (MMLU) | 82.1% retained |
| Perplexity | 5.42 → 6.18 (+14%) |
| Throughput | 28.3 tok/s on RTX 4090 |
Pricing
Why Smelt
A 109B model needs 218GB of RAM in FP16; you have 16GB
Zero production tools exist for MoE expert pruning
Trial-and-error quantization wastes hours for every new model
The first MoE expert surgery tool. Compress any LLM to fit your hardware.