Plugins

Plugins add security, caching, and observability capabilities to routing decisions. They execute as hooks around model calls — some run before the request reaches the model, others inspect the response afterward. This page documents the plugin system and every built-in plugin.

For the YAML field reference and rule syntax, see the Configuration Reference. For signal evaluation and rule matching, see Signals & Routing.

How plugins work

Plugins are scoped to routing rules (decisions), not global. Each rule can define its own set of plugins that execute only when that rule matches a request. This gives fine-grained control — you might enable jailbreak detection on public-facing rules while skipping it for internal traffic.

Plugins hook into two phases of the request lifecycle:

Incoming request
        │
        ▼
Signal evaluation & rule matching
        │
        ▼
┌──────────────────────────────────┐
│ Plugin request hooks (pre)       │ ← semantic-cache, jailbreak, pii,
│ Execute in declaration order     │   system_prompt, header_mutation
└──────────────────────────────────┘
        │
        ▼
Model HTTP call
        │
        ▼
┌──────────────────────────────────┐
│ Plugin response hooks (post)     │ ← hallucination, semantic-cache (store),
│ Execute in declaration order     │   header_mutation, router_replay
└──────────────────────────────────┘
        │
        ▼
Response returned to client

Pre-request plugins can inspect or modify the request before it reaches the model. Some can short-circuit the pipeline entirely:

  • semantic-cache returns a cached response if a similar request was seen recently, skipping the model call.
  • jailbreak and pii can block the request and return an error response.
  • system_prompt injects or replaces the system message in the conversation.
  • header_mutation adds, sets, or removes headers on the outbound model request.

Post-response plugins inspect or modify the model response before it reaches the client:

  • hallucination evaluates response quality and can add warning headers, modify the body, or block the response.
  • semantic-cache stores the response for future cache hits.
  • header_mutation can also modify response headers sent back to the client.
  • router_replay captures the full routing decision and optional request/response data for debugging.

Plugins execute in the order they appear in the plugins array of a rule. This order is deterministic and matters — for example, placing jailbreak before system_prompt ensures malicious prompts are caught before any system prompt injection occurs.

Plugin failures do not block routing. If a plugin encounters an error during execution, the error is logged and metrics are recorded, but the request continues through the remaining plugins and model call. This ensures that a misconfigured or failing plugin never takes down the routing pipeline. For broader context on how llmsoup handles component failures, see Graceful degradation in the Signals & Routing guide.
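
This isolation behavior can be sketched as a guarded loop. The sketch below is a minimal illustration, not llmsoup's actual hook runner; the plugin callables and names are hypothetical:

```python
import logging

logger = logging.getLogger("llmsoup.plugins")

def run_request_hooks(plugins, request):
    """Run each plugin's request hook in declaration order.

    A plugin error is logged and counted, but never aborts the chain:
    the request always proceeds to the remaining plugins and the model.
    """
    errors = 0
    for plugin in plugins:
        try:
            result = plugin(request)
            if result is not None:      # a plugin may rewrite the request
                request = result
        except Exception:
            errors += 1                 # a metrics counter in the real system
            logger.exception("plugin failed; continuing")
    return request, errors

# A failing plugin does not stop the pipeline:
def broken(_req):
    raise RuntimeError("misconfigured")

def tag(req):
    return {**req, "tagged": True}

req, errs = run_request_hooks([broken, tag], {"body": "hi"})
```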

Configuration

Plugins are configured within the plugins array of a routing rule. Each plugin entry requires a type field identifying the plugin and a configuration object with plugin-specific settings.

decisions:
  - name: general-routing
    priority: 50
    conditions: []
    action:
      model: gpt-5-mini
      strategy: default
    plugins:
      - type: jailbreak
        configuration:
          enabled: true
          threshold: 0.7
      - type: system_prompt
        configuration:
          enabled: true
          system_prompt: "You are a helpful assistant."
          mode: replace
      - type: hallucination
        configuration:
          enabled: true
          threshold: 0.7
          action: header

Every plugin supports these fields in its configuration:

Field     Type      Default   Description
enabled   boolean   varies    Whether the plugin is active. Disabled plugins are skipped entirely.

Plugin-specific fields are documented in each plugin section below.

semantic-cache

Caches model responses based on semantic similarity of the input. When a new request is similar enough to a previously cached request, the cached response is returned directly — skipping the model call and saving latency and cost.

  1. On request, the plugin computes an embedding of the last user message.
  2. It searches the cache for entries with the same context hash (conversation history minus the last message) and a cosine similarity above the threshold.
  3. On a cache hit, the cached response is injected into the plugin context and the model call is skipped.
  4. On a cache miss, after the model responds, the plugin stores the response keyed by context hash and embedding.

The cache is model-aware — responses are partitioned by the target model name, so a cached GPT-5 response is never returned for a Claude request.

Field                  Type      Default   Description
enabled                boolean   true      Enable or disable the cache.
similarity_threshold   float     0.85      Minimum cosine similarity (0.0–1.0) for a cache hit. Higher values require closer semantic matches.
ttl_seconds            integer   3600      Time-to-live in seconds. Cached entries expire after this duration.

The cache holds a fixed maximum of 10,000 entries (not configurable via YAML) and uses FIFO eviction when full.
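
The lookup, partitioning, and FIFO eviction described above can be sketched roughly as follows. This is a simplified illustration, not llmsoup's implementation — the cosine helper, hashing scheme, and storage layout are all assumptions:

```python
import hashlib
import math
import time
from collections import deque

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """FIFO-bounded cache keyed by (model, context hash), matched by embedding."""

    def __init__(self, threshold=0.85, ttl=3600, max_entries=10_000):
        self.threshold = threshold
        self.ttl = ttl
        self.max_entries = max_entries
        self.entries = deque()  # append order doubles as FIFO eviction order

    @staticmethod
    def context_hash(model, history):
        # history = conversation minus the last user message; keyed per model,
        # so a cached response for one model is never served for another
        raw = model + "\x00" + "\x00".join(history)
        return hashlib.sha256(raw.encode()).hexdigest()

    def get(self, model, history, embedding):
        key = self.context_hash(model, history)
        now = time.time()
        for entry_key, emb, response, ts in self.entries:
            if entry_key != key or now - ts > self.ttl:
                continue  # wrong context/model, or expired
            if cosine(emb, embedding) >= self.threshold:
                return response  # cache hit: the model call is skipped
        return None

    def put(self, model, history, embedding, response):
        if len(self.entries) >= self.max_entries:
            self.entries.popleft()  # FIFO: evict the oldest entry
        self.entries.append(
            (self.context_hash(model, history), embedding, response, time.time())
        )
```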

- type: semantic-cache
  configuration:
    enabled: true
    similarity_threshold: 0.9
    ttl_seconds: 1800

jailbreak

Detects jailbreak attempts in user messages and blocks them before they reach the model. Uses a hybrid approach combining keyword pattern matching with embedding-based semantic analysis.

  1. The plugin scans the last user message against a built-in list of jailbreak keyword patterns (phrases like “ignore previous instructions”, “DAN mode”, etc.).
  2. If an embedding model is available, it also computes a semantic similarity score against known jailbreak patterns.
  3. The combined score is calculated as 40% keyword match + 60% embedding similarity (or keyword-only if no embedding model is loaded).
  4. If the combined score exceeds the threshold, the request is blocked with an OpenAI-compatible error response using finish_reason: content_filter.
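
The weighting in step 3 is simple to express directly. The functions below are an illustrative sketch of that arithmetic, not the plugin's actual code:

```python
def jailbreak_score(keyword_score, embedding_score=None):
    """Combine the two signals: 40% keyword + 60% embedding.

    Both inputs are in [0.0, 1.0]; embedding_score is None when no
    embedding model is loaded, in which case keywords decide alone.
    """
    if embedding_score is None:
        return keyword_score
    return 0.4 * keyword_score + 0.6 * embedding_score

def should_block(keyword_score, embedding_score=None, threshold=0.7):
    """Block the request when the combined score exceeds the threshold."""
    return jailbreak_score(keyword_score, embedding_score) > threshold
```

Note that a strong embedding signal can push a borderline keyword match over the threshold, which is the point of the hybrid approach.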

Field       Type      Default   Description
enabled     boolean   true      Enable or disable jailbreak detection.
threshold   float     0.7       Detection threshold (0.0–1.0). Lower values are more aggressive.

- type: jailbreak
  configuration:
    enabled: true
    threshold: 0.65

pii

Detects personally identifiable information (PII) in user messages and blocks requests containing sensitive data. Uses hybrid regex pattern matching combined with embedding-based detection.

  1. The plugin runs regex patterns for 14 PII types against the last user message.
  2. If an embedding model is available, it also computes semantic similarity for PII-related content.
  3. The combined score is calculated as 70% regex match + 30% embedding similarity. When regex patterns match, the regex component contributes a high confidence score (0.95).
  4. If PII is detected above the threshold, the request is blocked with an OpenAI-compatible error response using finish_reason: content_filter.

The 14 built-in PII types are: EMAIL_ADDRESS, PHONE_NUMBER, US_SSN, CREDIT_CARD, IP_ADDRESS, IBAN_CODE, STREET_ADDRESS, PERSON, DOMAIN_NAME, DATE_TIME, AGE, US_DRIVER_LICENSE, ZIP_CODE, ORGANIZATION.
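
The scoring and allow-list behavior can be sketched as follows. This is an illustration only — the two regex patterns stand in for the 14 real ones, and the function shape is an assumption, not llmsoup's API:

```python
import re

# Illustrative patterns for two of the built-in PII types.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def detect_pii(text, allowed=(), threshold=0.7, embedding_score=0.0):
    """Return (blocked, detected_types) for the last user message.

    A regex match contributes a fixed 0.95 confidence, combined
    70/30 with the optional embedding score, as described above.
    Types listed in `allowed` never trigger blocking.
    """
    detected = [t for t, p in PATTERNS.items()
                if p.search(text) and t not in allowed]
    regex_score = 0.95 if detected else 0.0
    score = 0.7 * regex_score + 0.3 * embedding_score
    return score > threshold, detected
```

Listing a type in pii_types_allowed removes its matches before scoring, so an allowed type alone never blocks a request.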

Field               Type      Default   Description
enabled             boolean   true      Enable or disable PII detection.
threshold           float     0.7       Detection threshold (0.0–1.0). Lower values are more sensitive.
pii_types_allowed   list      []        PII types to allow through without blocking. Use type names from the list above.

- type: pii
  configuration:
    enabled: true
    threshold: 0.7
    pii_types_allowed:
      - DOMAIN_NAME
      - DATE_TIME

system_prompt

Injects or replaces the system message in the conversation before it reaches the model. Useful for enforcing consistent behavior, adding safety instructions, or customizing model personality per routing rule.

The plugin operates in one of two modes:

  • replace (default) — Replaces the first system message in the conversation. If no system message exists, the configured prompt is prepended as a new system message.
  • insert — Always prepends a new system message at position 0, regardless of existing system messages.

This plugin only runs on request (pre-model). It has no post-response behavior.
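
The two modes can be sketched on an OpenAI-style message list. A minimal illustration of the behavior described above, not llmsoup's implementation:

```python
def apply_system_prompt(messages, system_prompt, mode="replace"):
    """Apply replace/insert injection to a list of chat messages."""
    messages = list(messages)  # do not mutate the caller's list
    if mode == "insert":
        # insert: always prepend, regardless of existing system messages
        messages.insert(0, {"role": "system", "content": system_prompt})
        return messages
    # replace: rewrite the first system message, or prepend if none exists
    for i, m in enumerate(messages):
        if m["role"] == "system":
            messages[i] = {"role": "system", "content": system_prompt}
            return messages
    messages.insert(0, {"role": "system", "content": system_prompt})
    return messages
```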

Field           Type      Default     Description
enabled         boolean   true        Enable or disable prompt injection.
system_prompt   string    required    The system prompt text to inject.
mode            string    "replace"   Injection mode: replace or insert.

- type: system_prompt
  configuration:
    enabled: true
    system_prompt: "You are a helpful coding assistant. Always include code examples."
    mode: replace

header_mutation

Manipulates HTTP headers on requests sent to models and/or responses returned to clients. Supports adding, setting, and removing headers across request and response phases.

Each mutation specifies an operation, a header name, an optional value, and a phase:

  • Operations:

    • add — Adds the header only if it is not already present.
    • set — Sets the header unconditionally, overwriting any existing value.
    • remove — Removes the header if present.
  • Phases:

    • request — Applies to the outbound request sent to the model.
    • response — Applies to the response returned to the client.
    • both — Applies to both request and response.

Certain restricted headers cannot be modified: authorization, content-type, content-length, and host.
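
The operation, phase, and restricted-header rules above can be sketched as a single pass over the mutation list. An illustrative sketch only; the function shape and header-dict representation are assumptions:

```python
def apply_mutations(headers, mutations, phase):
    """Apply add/set/remove mutations for one phase ("request" or "response")."""
    RESTRICTED = {"authorization", "content-type", "content-length", "host"}
    headers = dict(headers)  # work on a copy
    for m in mutations:
        if m["phase"] not in (phase, "both"):
            continue  # mutation targets the other phase
        name = m["header"].lower()
        if name in RESTRICTED:
            continue  # restricted headers are never modified
        if m["operation"] == "add" and name not in headers:
            headers[name] = m["value"]   # add: only if absent
        elif m["operation"] == "set":
            headers[name] = m["value"]   # set: unconditional overwrite
        elif m["operation"] == "remove":
            headers.pop(name, None)      # remove: drop if present
    return headers
```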

Field       Type      Default   Description
enabled     boolean   true      Enable or disable header mutation.
mutations   list      []        Array of mutation objects (see below).

Each mutation object:

Field       Type     Description
operation   string   add, set, or remove.
header      string   Header name.
value       string   Header value (not required for remove).
phase       string   request, response, or both.

- type: header_mutation
  configuration:
    enabled: true
    mutations:
      - operation: set
        header: x-routed-by
        value: llmsoup
        phase: response
      - operation: add
        header: x-request-source
        value: internal
        phase: request
      - operation: remove
        header: x-debug-trace
        phase: both

hallucination

Detects potential hallucinations in model responses using heuristic analysis. Checks for hedging language, fabrication indicators, and self-contradictions. Runs post-response only.

  1. The plugin analyzes the model response text for three hallucination signals:
    • Hedging (weight: 30%) — phrases indicating uncertainty (“I think”, “it’s possible”, etc.). Produces a continuous score (0.0–1.0) based on pattern density.
    • Fabrication indicators (weight: 40%) — patterns suggesting made-up information. Produces a continuous score (0.0–1.0) based on pattern density.
    • Self-contradictions (weight: 30%) — conflicting statements within the response. Binary detection: contributes a fixed penalty of 0.5 when contradictions are found, 0.0 otherwise.
  2. The hallucination score is calculated as: (hedging × 0.3 + fabrication × 0.4 + contradiction_penalty × 0.3) × (0.5 + sensitivity), where contradiction_penalty is 0.5 when contradictions are detected or 0.0 otherwise.
  3. If the score exceeds the threshold, the configured action is taken.
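
The formula in step 2 can be written out directly. An illustrative sketch of the arithmetic, not the plugin's code:

```python
def hallucination_score(hedging, fabrication, contradictions_found, sensitivity=0.5):
    """Weighted score per the formula above.

    hedging and fabrication are continuous densities in [0.0, 1.0];
    contradictions_found is a binary detection flag.
    """
    contradiction_penalty = 0.5 if contradictions_found else 0.0
    base = hedging * 0.3 + fabrication * 0.4 + contradiction_penalty * 0.3
    return base * (0.5 + sensitivity)
```

Worked example: hedging 0.6, fabrication 0.5, contradictions detected, sensitivity 0.5 gives (0.18 + 0.20 + 0.15) × 1.0 = 0.53 — below the default 0.7 threshold, so no action fires; raising heuristic_sensitivity to 1.0 scales the same inputs to 0.795, which does.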

Action   Behavior
header   Adds x-llmsoup-hallucination-score and x-llmsoup-hallucination-detected response headers. The original response is returned unchanged.
body     Injects _llmsoup_warnings into the response body alongside the original content.
block    Replaces the entire response with an error message indicating hallucination was detected.
log      Logs the detection. No modification to the response.

Field                   Type      Default    Description
enabled                 boolean   true       Enable or disable hallucination detection.
threshold               float     0.7        Score threshold (0.0–1.0) to trigger the action.
action                  string    "header"   Action on detection: header, body, block, or log.
heuristic_sensitivity   float     0.5        Sensitivity multiplier (0.0–1.0). Higher values produce higher scores.

- type: hallucination
  configuration:
    enabled: true
    threshold: 0.6
    action: header
    heuristic_sensitivity: 0.7

router_replay

Captures routing decisions and optional request/response data for debugging and replay. This is primarily a development and troubleshooting tool — it is disabled by default.

  1. On every request that passes through a rule with this plugin enabled, the plugin records the routing decision: matched rule name, priority, selected model, strategy, and algorithm details.
  2. Optionally, it captures truncated snippets of the request body and response body.
  3. Records are stored in a bounded in-memory store with FIFO eviction when max_records is reached.
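
The bounded store with FIFO eviction and body truncation can be sketched with a fixed-length deque. A simplified illustration under assumed record and parameter shapes, not llmsoup's implementation:

```python
from collections import deque

class ReplayStore:
    """Bounded in-memory record store; deque(maxlen=...) gives FIFO eviction."""

    def __init__(self, max_records=200, max_body_bytes=4096):
        self.records = deque(maxlen=max_records)
        self.max_body_bytes = max_body_bytes

    def capture(self, decision, request_body=None, response_body=None):
        # decision carries rule name, priority, model, strategy, etc.
        record = dict(decision)
        if request_body is not None:
            record["request_body"] = request_body[: self.max_body_bytes]
        if response_body is not None:
            record["response_body"] = response_body[: self.max_body_bytes]
        self.records.append(record)  # oldest record drops out when full
```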

Field                   Type      Default   Description
enabled                 boolean   false     Enable or disable replay capture. Disabled by default.
max_records             integer   200       Maximum number of records to keep in the replay store.
capture_request_body    boolean   false     Whether to capture a snippet of the request body.
capture_response_body   boolean   false     Whether to capture a snippet of the response body.
max_body_bytes          integer   4096      Maximum bytes to capture per request/response body.

- type: router_replay
  configuration:
    enabled: true
    max_records: 500
    capture_request_body: true
    capture_response_body: true
    max_body_bytes: 2048

Metrics

All plugins emit Prometheus metrics for monitoring execution health and performance. These metrics use the standard llmsoup_ prefix.

Metric                                      Type        Labels                                        Description
llmsoup_plugin_execution_total              Counter     plugin_type, decision_name, status, user_id   Total plugin executions. status is success or error.
llmsoup_plugin_execution_duration_seconds   Histogram   plugin_type, user_id                          Plugin execution latency distribution.
llmsoup_plugin_errors_total                 Counter     plugin_type, error_reason, user_id            Plugin failures. error_reason is one of: execution_failed, configuration_error, timeout, internal_error.

The PII plugin also emits:

Metric                         Type      Labels                     Description
llmsoup_pii_violations_total   Counter   model, pii_type, user_id   PII violations detected, broken down by PII type.

For the complete metrics catalog, see the Metrics Reference.

Custom plugins

The plugin system is built-in only. llmsoup does not expose a custom plugin API — all available plugins are listed on this page. To add new plugin behavior, you must modify the llmsoup source code.