
Intelligent LLM routing. One config file. Zero lock-in.

llmsoup is a lightweight, high-performance LLM routing proxy built in Rust. Route requests to the optimal model based on cost, quality, and latency — with a single YAML config and zero code changes.
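To make the "one config file" idea concrete, here is a minimal sketch of what a routing config could look like. The key names below (providers, routes, strategy, min_quality, and so on) are illustrative assumptions, not the documented llmsoup schema — check the project's reference config for the real format.

```yaml
# Hypothetical config sketch — key names are illustrative,
# not the documented llmsoup schema.
listen: 0.0.0.0:8080

providers:
  - name: openai
    api_key_env: OPENAI_API_KEY
  - name: local
    base_url: http://localhost:11434/v1   # e.g. a local Ollama-style server

routes:
  - match: { domain: code }
    strategy: cheapest      # pick the lowest-cost model...
    min_quality: 0.8        # ...that still meets this quality floor
  - match: { pii: detected }
    provider: local         # PII never leaves your infrastructure
```

The point of the sketch is the shape, not the keys: routing decisions (cost, quality, privacy) live in declarative config rather than application code.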

Get started in seconds

curl -fsSL https://llmsoup.insideapp.fr/install.sh | sh

Why llmsoup?

Cost-Aware Routing

Route requests to the cheapest model that meets quality thresholds. Track per-request costs with Prometheus metrics.

Privacy First

Route to local models, and detect and redact PII before a request reaches any provider. Your data stays on your infrastructure.

LLM Observability

See which tasks hit which models, track domain classification, and understand your LLM traffic with Prometheus metrics.

Single Binary

One Rust binary, one YAML config. No runtime dependencies, no containers required. Idle memory footprint under 500 MB.

OpenAI-Compatible

Drop-in replacement for the OpenAI API. Zero code changes in your existing applications.
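Drop-in compatibility means an existing OpenAI-style client only needs a new base URL. A minimal sketch using only the Python standard library, assuming llmsoup listens on localhost:8080 (an assumed address, not a documented default):

```python
import json
import urllib.request

# Point the usual OpenAI-style request at the llmsoup proxy instead of
# api.openai.com. localhost:8080 is an assumption, not a documented default.
LLMSOUP_BASE = "http://localhost:8080/v1"

payload = {
    "model": "gpt-4o-mini",  # llmsoup may reroute this per its routing rules
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = urllib.request.Request(
    f"{LLMSOUP_BASE}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <llmsoup-token>",  # token-based auth
    },
)
# urllib.request.urlopen(req) would send the request once the proxy is running.
```

The request body and endpoint path are exactly what an OpenAI client already sends; only the host changes, which is why no application code needs to be rewritten.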

Local-First Security

Runs on your infrastructure. Token-based auth on all endpoints. No external dependencies required.