Cost-Aware Routing
Route requests to the cheapest model that meets quality thresholds. Track per-request costs with Prometheus metrics.
curl -fsSL https://llmsoup.insideapp.fr/install.sh | sh Cost-Aware Routing
Route requests to the cheapest model that meets quality thresholds. Track per-request costs with Prometheus metrics.
Privacy First
Use local models, detect and redact PII before it reaches any provider. Your data stays on your infrastructure.
LLM Observability
See which tasks hit which models, track domain classification, and understand your LLM traffic with Prometheus metrics.
Single Binary
One Rust binary, one YAML config. No runtime dependencies, no containers required. Idle memory under 500MB.
OpenAI-Compatible
Drop-in replacement for the OpenAI API. Zero code changes in your existing applications.
Local-First Security
Runs on your infrastructure. Token-based auth on all endpoints. No external dependencies required.