Documentation

What is Smelt?

Last updated: April 9, 2026

Smelt is a Python CLI and cloud platform that takes any open-source large language model, dense or Mixture-of-Experts (MoE), from 7 billion to 1 trillion parameters, and produces smaller, optimized versions that fit your target hardware. You specify your constraints (available RAM, target task, quality threshold) and Smelt handles the rest: quantization to GGUF at any level from Q2 through Q8, MoE expert surgery to prune inactive experts, depth and width pruning, and an AI route selector that picks the optimal compression recipe for your model and hardware. All of this runs from a single command: smelt run.

Smelt exists to close three gaps. The independent LLM compression market has been gutted by acquisitions (Neural Magic, OctoAI, Deci AI, and Predibase have all been absorbed). MoE models now power over 60% of frontier architectures, yet no production compression tools are available for them. And more than 50% of production deployments still run uncompressed because the tooling is too fragmented.
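To make the "available RAM" constraint concrete, here is a rough sizing sketch. It is illustrative arithmetic only, not Smelt's sizing logic. It assumes llama.cpp's legacy block layouts (Q4_0, Q5_0, Q8_0), where each block stores 32 weights plus one 16-bit scale, and it counts weights only, ignoring KV cache and runtime overhead. The function names are hypothetical.

```python
from typing import Optional, Sequence

# Assumption: Q{n}_0-style blocks of 32 weights sharing one 16-bit scale.
BLOCK_WEIGHTS = 32
SCALE_BITS = 16


def bits_per_weight(quant_bits: int) -> float:
    """Effective bits per weight once the per-block scale is included."""
    return (BLOCK_WEIGHTS * quant_bits + SCALE_BITS) / BLOCK_WEIGHTS


def estimated_weight_gib(n_params: float, quant_bits: int) -> float:
    """Approximate size of the quantized weights in GiB (weights only)."""
    return n_params * bits_per_weight(quant_bits) / 8 / 2**30


def widest_fit(
    n_params: float, ram_gib: float, candidates: Sequence[int] = (8, 5, 4)
) -> Optional[int]:
    """Highest candidate bit-width whose weights fit in ram_gib, or None."""
    for bits in sorted(candidates, reverse=True):
        if estimated_weight_gib(n_params, bits) <= ram_gib:
            return bits
    return None


if __name__ == "__main__":
    # A 7-billion-parameter model checked against a 6 GiB budget.
    for bits in (8, 5, 4):
        print(bits, round(estimated_weight_gib(7e9, bits), 2), "GiB")
    print("widest fit in 6 GiB:", widest_fit(7e9, 6.0))
```

Under these assumptions a 7-billion-parameter model at 4.5 effective bits per weight comes to roughly 3.7 GiB of weights. Real GGUF files also carry metadata and often mix quantization types across tensors, so treat the result as a lower-bound estimate.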

Core Capabilities

  • Quantization — GGUF Q2 through Q8. One command. Any model. Ready for Ollama, llama.cpp, LM Studio, and any GGUF-compatible runtime.
  • MoE Expert Surgery — Identify and remove inactive experts from Mixture-of-Experts models (Llama 4, DeepSeek V3, Qwen3, Mixtral, DBRX). First production tool — zero competitors.
  • Depth & Width Pruning — Remove entire layers or shrink hidden dimensions to hit aggressive size targets while preserving quality.
  • AI Route Selector — AI analyzes your model architecture and hardware constraints, then picks the optimal compression recipe automatically.
  • Cloud Compression — Run compression jobs on Smelt cloud infrastructure. No GPU required on your end.
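The idea behind identifying inactive experts can be sketched in a few lines. This is a simplified illustration, not Smelt's implementation: it assumes you already have a log of which experts the router selected for each token on a calibration set, and it flags experts whose share of routed tokens falls below a threshold. The function names and the threshold value are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def expert_usage(routing_log: Iterable[Sequence[int]], num_experts: int) -> List[float]:
    """Share of all routing decisions that went to each expert.

    routing_log holds one entry per token: the expert indices the router
    picked for that token (for example, two indices under top-2 routing).
    """
    counts = Counter()
    total = 0
    for chosen in routing_log:
        for expert in chosen:
            counts[expert] += 1
            total += 1
    if total == 0:
        return [0.0] * num_experts
    return [counts[e] / total for e in range(num_experts)]


def prune_candidates(usage: Sequence[float], min_share: float) -> List[int]:
    """Indices of experts whose usage share is below min_share."""
    return [e for e, share in enumerate(usage) if share < min_share]


if __name__ == "__main__":
    # Toy log: four tokens, top-2 routing over four experts.
    log = [(0, 1), (0, 2), (0, 1), (1, 2)]
    usage = expert_usage(log, num_experts=4)
    print(usage)
    print(prune_candidates(usage, min_share=0.05))
```

In practice usage depends heavily on the calibration data, so an expert that looks idle on one task may matter on another. That is one reason the target task is part of the constraints you give Smelt.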

Supported Models

Smelt supports any open-source LLM on Hugging Face, including Llama, Qwen, DeepSeek, Mistral, Gemma, Command R, DBRX, and Mixtral. Dense and Mixture-of-Experts architectures are both fully supported.

Output Formats

Smelt outputs standard GGUF files compatible with Ollama, llama.cpp, LM Studio, vLLM, and any other GGUF-compatible runtime. No lock-in, no proprietary formats.
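Because the output is plain GGUF, you can sanity-check a file without any Smelt tooling. The sketch below reads the fixed-size start of a GGUF header using only the Python standard library. It assumes a little-endian file at GGUF version 2 or later, where the tensor and metadata counts are 64-bit. The function name is hypothetical and is not part of Smelt.

```python
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def read_gguf_header(path):
    """Return the version, tensor count, and metadata key/value count."""
    with open(path, "rb") as f:
        head = f.read(24)
    if head[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not start with the GGUF magic bytes")
    if len(head) < 24:
        raise ValueError(f"{path} is too short to hold a GGUF header")
    # Assumption: little-endian layout with 64-bit counts (version 2+).
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", head, 4)
    if version < 2:
        raise ValueError(f"GGUF version {version} is not handled by this sketch")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Calling read_gguf_header on a model file returns a small dictionary, and a ValueError signals that the file is not a GGUF file this sketch understands. It does not parse the metadata or tensor data that follow the header.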