Plugins
Plugins add security, caching, and observability capabilities to routing decisions. They execute as hooks around model calls — some run before the request reaches the model, others inspect the response afterward. This page documents the plugin system and every built-in plugin.
For the YAML field reference and rule syntax, see the Configuration Reference. For signal evaluation and rule matching, see Signals & Routing.
How plugins work
Section titled “How plugins work”Plugins are scoped to routing rules (decisions), not global. Each rule can define its own set of plugins that execute only when that rule matches a request. This gives fine-grained control — you might enable jailbreak detection on public-facing rules while skipping it for internal traffic.
Lifecycle
Section titled “Lifecycle”Plugins hook into two phases of the request lifecycle:
Incoming request │ ▼Signal evaluation & rule matching │ ▼┌─────────────────────────────────┐│ Plugin request hooks (pre) │ ← semantic-cache, jailbreak, pii, system_prompt, header_mutation│ Execute in declaration order │└─────────────────────────────────┘ │ ▼Model HTTP call │ ▼┌─────────────────────────────────┐│ Plugin response hooks (post) │ ← hallucination, semantic-cache (store), header_mutation, router_replay│ Execute in declaration order │└─────────────────────────────────┘ │ ▼Response returned to clientPre-request plugins can inspect or modify the request before it reaches the model. Some can short-circuit the pipeline entirely:
- semantic-cache returns a cached response if a similar request was seen recently, skipping the model call.
- jailbreak and pii can block the request and return an error response.
- system_prompt injects or replaces the system message in the conversation.
- header_mutation adds, sets, or removes headers on the outbound model request.
Post-response plugins inspect or modify the model response before it reaches the client:
- hallucination evaluates response quality and can add warning headers, modify the body, or block the response.
- semantic-cache stores the response for future cache hits.
- header_mutation can also modify response headers sent back to the client.
- router_replay captures the full routing decision and optional request/response data for debugging.
Execution order
Section titled “Execution order”Plugins execute in the order they appear in the plugins array of a rule. This order is deterministic and matters — for example, placing jailbreak before system_prompt ensures malicious prompts are caught before any system prompt injection occurs.
Graceful degradation
Section titled “Graceful degradation”Plugin failures do not block routing. If a plugin encounters an error during execution, the error is logged and metrics are recorded, but the request continues through the remaining plugins and model call. This ensures that a misconfigured or failing plugin never takes down the routing pipeline. For broader context on how llmsoup handles component failures, see Graceful degradation in the Signals & Routing guide.
Plugin configuration
Section titled “Plugin configuration”Plugins are configured within the plugins array of a routing rule. Each plugin entry requires a type field identifying the plugin and a configuration object with plugin-specific settings.
decisions: - name: general-routing priority: 50 conditions: [] action: model: gpt-5-mini strategy: default plugins: - type: jailbreak configuration: enabled: true threshold: 0.7 - type: system_prompt configuration: enabled: true system_prompt: "You are a helpful assistant." mode: replace - type: hallucination configuration: enabled: true threshold: 0.7 action: headerCommon fields
Section titled “Common fields”Every plugin supports these fields in its configuration:
| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | varies | Whether the plugin is active. Disabled plugins are skipped entirely. |
Plugin-specific fields are documented in each plugin section below.
semantic-cache
Section titled “semantic-cache”Caches model responses based on semantic similarity of the input. When a new request is similar enough to a previously cached request, the cached response is returned directly — skipping the model call and saving latency and cost.
How it works
Section titled “How it works”- On request, the plugin computes an embedding of the last user message.
- It searches the cache for entries with the same context hash (conversation history minus the last message) and a cosine similarity above the threshold.
- On a cache hit, the cached response is injected into the plugin context and the model call is skipped.
- On a cache miss, after the model responds, the plugin stores the response keyed by context hash and embedding.
The cache is model-aware — responses are partitioned by the target model name, so a cached GPT-5 response is never returned for a Claude request.
Configuration
Section titled “Configuration”| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | true | Enable or disable the cache. |
similarity_threshold | float | 0.85 | Minimum cosine similarity (0.0–1.0) for a cache hit. Higher values require closer semantic matches. |
ttl_seconds | integer | 3600 | Time-to-live in seconds. Cached entries expire after this duration. |
The cache holds a fixed maximum of 10,000 entries (not configurable via YAML) and uses FIFO eviction when full.
Example
Section titled “Example”- type: semantic-cache configuration: enabled: true similarity_threshold: 0.9 ttl_seconds: 1800jailbreak
Section titled “jailbreak”Detects jailbreak attempts in user messages and blocks them before they reach the model. Uses a hybrid approach combining keyword pattern matching with embedding-based semantic analysis.
How it works
Section titled “How it works”- The plugin scans the last user message against a built-in list of jailbreak keyword patterns (phrases like “ignore previous instructions”, “DAN mode”, etc.).
- If an embedding model is available, it also computes a semantic similarity score against known jailbreak patterns.
- The combined score is calculated as 40% keyword match + 60% embedding similarity (or keyword-only if no embedding model is loaded).
- If the combined score exceeds the threshold, the request is blocked with an OpenAI-compatible error response using
finish_reason: content_filter.
Configuration
Section titled “Configuration”| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | true | Enable or disable jailbreak detection. |
threshold | float | 0.7 | Detection threshold (0.0–1.0). Lower values are more aggressive. |
Example
Section titled “Example”- type: jailbreak configuration: enabled: true threshold: 0.65Detects personally identifiable information (PII) in user messages and blocks requests containing sensitive data. Uses hybrid regex pattern matching combined with embedding-based detection.
How it works
Section titled “How it works”- The plugin runs regex patterns for 14 PII types against the last user message.
- If an embedding model is available, it also computes semantic similarity for PII-related content.
- The combined score is calculated as 70% regex match + 30% embedding similarity. When regex patterns match, the regex component contributes a high confidence score (0.95).
- If PII is detected above the threshold, the request is blocked with an OpenAI-compatible error response using
finish_reason: content_filter.
Detected PII types
Section titled “Detected PII types”EMAIL_ADDRESS, PHONE_NUMBER, US_SSN, CREDIT_CARD, IP_ADDRESS, IBAN_CODE, STREET_ADDRESS, PERSON, DOMAIN_NAME, DATE_TIME, AGE, US_DRIVER_LICENSE, ZIP_CODE, ORGANIZATION
Configuration
Section titled “Configuration”| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | true | Enable or disable PII detection. |
threshold | float | 0.7 | Detection threshold (0.0–1.0). Lower values are more sensitive. |
pii_types_allowed | list | [] | PII types to allow through without blocking. Use type names from the list above. |
Example
Section titled “Example”- type: pii configuration: enabled: true threshold: 0.7 pii_types_allowed: - DOMAIN_NAME - DATE_TIMEsystem_prompt
Section titled “system_prompt”Injects or replaces the system message in the conversation before it reaches the model. Useful for enforcing consistent behavior, adding safety instructions, or customizing model personality per routing rule.
How it works
Section titled “How it works”The plugin operates in one of two modes:
- replace (default) — Replaces the first system message in the conversation. If no system message exists, the configured prompt is prepended as a new system message.
- insert — Always prepends a new system message at position 0, regardless of existing system messages.
This plugin only runs on request (pre-model). It has no post-response behavior.
Configuration
Section titled “Configuration”| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | true | Enable or disable prompt injection. |
system_prompt | string | required | The system prompt text to inject. |
mode | string | "replace" | Injection mode: replace or insert. |
Example
Section titled “Example”- type: system_prompt configuration: enabled: true system_prompt: "You are a helpful coding assistant. Always include code examples." mode: replaceheader_mutation
Section titled “header_mutation”Manipulates HTTP headers on requests sent to models and/or responses returned to clients. Supports adding, setting, and removing headers across request and response phases.
How it works
Section titled “How it works”Each mutation specifies an operation, a header name, an optional value, and a phase:
-
Operations:
add— Adds the header only if it is not already present.set— Sets the header unconditionally, overwriting any existing value.remove— Removes the header if present.
-
Phases:
request— Applies to the outbound request sent to the model.response— Applies to the response returned to the client.both— Applies to both request and response.
Certain restricted headers cannot be modified: authorization, content-type, content-length, and host.
Configuration
Section titled “Configuration”| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | true | Enable or disable header mutation. |
mutations | list | [] | Array of mutation objects (see below). |
Each mutation object:
| Field | Type | Description |
|---|---|---|
operation | string | add, set, or remove. |
header | string | Header name. |
value | string | Header value (not required for remove). |
phase | string | request, response, or both. |
Example
Section titled “Example”- type: header_mutation configuration: enabled: true mutations: - operation: set header: x-routed-by value: llmsoup phase: response - operation: add header: x-request-source value: internal phase: request - operation: remove header: x-debug-trace phase: bothhallucination
Section titled “hallucination”Detects potential hallucinations in model responses using heuristic analysis. Checks for hedging language, fabrication indicators, and self-contradictions. Runs post-response only.
How it works
Section titled “How it works”- The plugin analyzes the model response text for three hallucination signals:
- Hedging (weight: 30%) — phrases indicating uncertainty (“I think”, “it’s possible”, etc.). Produces a continuous score (0.0–1.0) based on pattern density.
- Fabrication indicators (weight: 40%) — patterns suggesting made-up information. Produces a continuous score (0.0–1.0) based on pattern density.
- Self-contradictions (weight: 30%) — conflicting statements within the response. Binary detection: contributes a fixed penalty of 0.5 when contradictions are found, 0.0 otherwise.
- The hallucination score is calculated as:
(hedging × 0.3 + fabrication × 0.4 + contradiction_penalty × 0.3) × (0.5 + sensitivity), wherecontradiction_penaltyis 0.5 when contradictions are detected or 0.0 otherwise. - If the score exceeds the threshold, the configured action is taken.
Actions
Section titled “Actions”| Action | Behavior |
|---|---|
header | Adds x-llmsoup-hallucination-score and x-llmsoup-hallucination-detected response headers. The original response is returned unchanged. |
body | Injects _llmsoup_warnings into the response body alongside the original content. |
block | Replaces the entire response with an error message indicating hallucination was detected. |
log | Logs the detection. No modification to the response. |
Configuration
Section titled “Configuration”| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | true | Enable or disable hallucination detection. |
threshold | float | 0.7 | Score threshold (0.0–1.0) to trigger the action. |
action | string | "header" | Action on detection: header, body, block, or log. |
heuristic_sensitivity | float | 0.5 | Sensitivity multiplier (0.0–1.0). Higher values produce higher scores. |
Example
Section titled “Example”- type: hallucination configuration: enabled: true threshold: 0.6 action: header heuristic_sensitivity: 0.7router_replay
Section titled “router_replay”Captures routing decisions and optional request/response data for debugging and replay. This is primarily a development and troubleshooting tool — it is disabled by default.
How it works
Section titled “How it works”- On every request that passes through a rule with this plugin enabled, the plugin records the routing decision: matched rule name, priority, selected model, strategy, and algorithm details.
- Optionally, it captures truncated snippets of the request body and response body.
- Records are stored in a bounded in-memory store with FIFO eviction when
max_recordsis reached.
Configuration
Section titled “Configuration”| Field | Type | Default | Description |
|---|---|---|---|
enabled | boolean | false | Enable or disable replay capture. Disabled by default. |
max_records | integer | 200 | Maximum number of records to keep in the replay store. |
capture_request_body | boolean | false | Whether to capture a snippet of the request body. |
capture_response_body | boolean | false | Whether to capture a snippet of the response body. |
max_body_bytes | integer | 4096 | Maximum bytes to capture per request/response body. |
Example
Section titled “Example”- type: router_replay configuration: enabled: true max_records: 500 capture_request_body: true capture_response_body: true max_body_bytes: 2048Observability
Section titled “Observability”All plugins emit Prometheus metrics for monitoring execution health and performance. These metrics use the standard llmsoup_ prefix.
| Metric | Type | Labels | Description |
|---|---|---|---|
llmsoup_plugin_execution_total | Counter | plugin_type, decision_name, status, user_id | Total plugin executions. status is success or error. |
llmsoup_plugin_execution_duration_seconds | Histogram | plugin_type, user_id | Plugin execution latency distribution. |
llmsoup_plugin_errors_total | Counter | plugin_type, error_reason, user_id | Plugin failures. error_reason is one of: execution_failed, configuration_error, timeout, internal_error. |
The PII plugin also emits:
| Metric | Type | Labels | Description |
|---|---|---|---|
llmsoup_pii_violations_total | Counter | model, pii_type, user_id | PII violations detected, broken down by PII type. |
For the complete metrics catalog, see the Metrics Reference.
Custom plugins
Section titled “Custom plugins”The plugin system is built-in only. llmsoup does not expose a custom plugin API — all available plugins are listed on this page. To add new plugin behavior, modifications to the llmsoup source code are required.