AI Response Integrity

Since 0.9.1, every user-facing Molly answer (interactive chat, proactive triage, and digest notifications) passes through the AI Response Integrity pipeline before delivery. The pipeline verifies that every number, metric, currency value, and date in the answer can be traced to a real source from the same turn (a tool result or an injected context block).

What it does

| Behaviour | Detail |
| --- | --- |
| Source verification | Every numeric token in the final answer is matched against the tool results and injected context collected during that turn. |
| Structured attribution | When the model cooperates (cloud providers with structured-output support), each fact is cited with its tool-call ID and the exact path in the result (e.g. `rows[0].disk_pct`). |
| Post-hoc scan fallback | When the model returns plain prose (common with weaker local Ollama models), a regex-based post-hoc scan flags unsourced figures with ⚠️. |
| Re-prompt on failure | If verification fails, the pipeline issues one corrective re-prompt asking the model to re-cite each figure. The budget is capped at one re-prompt to limit latency. |
| Fail-safe retries | Transient provider errors (timeouts, HTTP 429, 5xx, connection resets) trigger retries with exponential backoff (delays of 1 → 2 → 4 → 8 s, up to 4 retry attempts). |
| Graceful degradation | Integrity failures never hard-fail the user’s answer. If re-prompting cannot resolve unsourced figures, the answer is delivered with ⚠️ markers. |
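
As an illustration of how source verification, the post-hoc scan, the single re-prompt, and graceful degradation fit together, here is a minimal Python sketch. It is not Molly's implementation: the names `scan`, `finalize`, and `reprompt`, the regular expression, and the placement of the ⚠️ marker are all assumptions, and the sketch handles only plain numbers (not dates or currency formats).

```python
import re
from decimal import Decimal
from typing import Callable

# Integers and decimals, with optional thousands separators (e.g. 1,024 or 93.5).
NUMBER_RE = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def source_numbers(value) -> set[Decimal]:
    """Collect every number found in a (possibly nested) tool result or context block."""
    found: set[Decimal] = set()
    if isinstance(value, bool):
        return found  # bool is a subclass of int; do not treat True/False as figures
    if isinstance(value, (int, float)):
        found.add(Decimal(str(value)))
    elif isinstance(value, str):
        found.update(Decimal(m.group().replace(",", "")) for m in NUMBER_RE.finditer(value))
    elif isinstance(value, dict):
        for item in value.values():
            found |= source_numbers(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            found |= source_numbers(item)
    return found


def scan(answer: str, sources) -> tuple[str, int]:
    """Return the answer with unsourced figures marked, plus the unsourced count."""
    known = source_numbers(sources)
    unsourced = 0

    def flag(match: re.Match) -> str:
        nonlocal unsourced
        if Decimal(match.group().replace(",", "")) in known:
            return match.group()
        unsourced += 1
        return match.group() + " ⚠️"  # hypothetical marker placement

    return NUMBER_RE.sub(flag, answer), unsourced


def finalize(answer: str, sources, reprompt: Callable[[str], str]) -> str:
    """Verify, spend at most one corrective re-prompt, then degrade gracefully."""
    _, unsourced = scan(answer, sources)
    if unsourced == 0:
        return answer
    retry = reprompt(answer)  # the single re-prompt in the budget
    marked_retry, unsourced_retry = scan(retry, sources)
    if unsourced_retry == 0:
        return retry
    return marked_retry  # never hard-fail: deliver with markers instead
```

In this sketch `reprompt` stands in for the corrective gateway call; passing a function keeps the verification logic testable without a model.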

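The retry schedule for transient provider errors can be sketched in a few lines of Python. This is an illustration only: `TransientProviderError` and `call_with_backoff` are hypothetical names, and the sketch assumes the four delays correspond to one initial call followed by up to four retries.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Delays between attempts, in seconds, as listed in the table above.
BACKOFF_DELAYS_S = (1, 2, 4, 8)


class TransientProviderError(Exception):
    """Hypothetical stand-in for timeouts, HTTP 429/5xx, and connection resets."""


def call_with_backoff(call: Callable[[], T],
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying transient failures with exponential backoff."""
    for delay in BACKOFF_DELAYS_S:
        try:
            return call()
        except TransientProviderError:
            sleep(delay)  # wait 1, 2, 4, then 8 seconds before the next try
    # Last try after the final delay; a failure here propagates to the caller.
    return call()
```

The `sleep` parameter is injectable so the schedule can be exercised in tests without real waiting. Non-transient errors are not caught and surface immediately.
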
Kill-switch environment variable

The integrity pipeline is on by default. It is intended as a permanent quality feature, not a rollout toggle. An emergency kill-switch is provided for situations where the pipeline causes unexpected issues:


# Disable AI Response Integrity (default: on)
QUAZZAR_AI_INTEGRITY=off

Set this in `/etc/quazzar/quazzar.env` and restart the service:


sudo systemctl restart quazzar.service

When the variable is set to `off`, `Finalize` becomes a pass-through: the model’s answer is delivered verbatim with no verification, re-prompting, or ⚠️ marking. Prompt augmentation (Rule #-1 and the envelope schema block) is also skipped.

Only disable integrity as a temporary emergency measure. With integrity off, Molly may silently fabricate figures without any indication to the user. Re-enable as soon as the underlying issue is resolved.
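
The kill-switch behaviour described above can be pictured as a guard in front of verification. The following Python sketch is an assumption-laden illustration, not the service's code: the function names are hypothetical, and it assumes that only the literal value `off` (compared case-insensitively) disables the pipeline and that an unset variable means on.

```python
import os
from typing import Callable, Mapping


def integrity_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Integrity is on unless QUAZZAR_AI_INTEGRITY is explicitly 'off' (assumed parsing)."""
    return env.get("QUAZZAR_AI_INTEGRITY", "on").strip().lower() != "off"


def finalize(answer: str,
             verify: Callable[[str], str],
             env: Mapping[str, str] = os.environ) -> str:
    """Pass the answer through untouched when the kill-switch is set."""
    if not integrity_enabled(env):
        return answer  # verbatim: no verification, re-prompting, or markers
    return verify(answer)
```

Defaulting to on when the variable is missing or unrecognised matches the intent that the kill-switch is an explicit emergency action rather than something that can be triggered by a typo.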

Observability

The following zerolog fields are emitted on every Molly answer turn when integrity is enabled:

| Field | Values | Meaning |
| --- | --- | --- |
| `integrity_mode` | `envelope` \| `fallback_scan` | Whether structured attribution or the post-hoc scan was used |
| `integrity_unsourced` | integer | Count of figures that could not be sourced |
| `integrity_reprompt_used` | `true` \| `false` | Whether a corrective re-prompt was issued |
| `integrity_parse_ok` | `true` \| `false` | Whether the model returned a valid envelope JSON |

These fields appear at debug level in `journalctl -u quazzar.service`. Use them to monitor how often local models fail to produce structured attribution (a high share of turns with `integrity_mode` set to `fallback_scan`) and whether re-prompts are helping (`integrity_reprompt_used`, read together with `integrity_unsourced`).
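
To make that monitoring concrete, here is a small Python sketch that summarises these fields from log lines. It assumes each log line is a single JSON object (zerolog's default output format) and that the lines are obtained with something like `journalctl -u quazzar.service -o cat`; the function name `summarize` and the shape of the summary are hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def summarize(lines: Iterable[str]) -> dict:
    """Aggregate integrity fields across log lines, skipping unrelated entries."""
    modes: Counter = Counter()
    reprompts = 0
    unsourced = 0
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON log line
        if not isinstance(event, dict) or "integrity_mode" not in event:
            continue  # a log entry without integrity fields
        modes[event["integrity_mode"]] += 1
        reprompts += bool(event.get("integrity_reprompt_used"))
        unsourced += int(event.get("integrity_unsourced", 0))
    turns = sum(modes.values())
    return {
        "turns": turns,
        "fallback_scan_rate": modes["fallback_scan"] / turns if turns else 0.0,
        "reprompt_rate": reprompts / turns if turns else 0.0,
        "unsourced_total": unsourced,
    }
```

Calling `summarize(sys.stdin)` in a script fed by the `journalctl` pipe would print one summary per run. If the service is configured with a non-JSON console writer, the lines would need a different parser.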

Performance impact

The pipeline adds latency only to the final answer turn — tool-call progress still streams immediately, so the perceived “thinking” phase is unchanged. Typical overhead:

No figures in the answer — near-zero overhead (empty facts list, no verification).
Verified answer, no re-prompt — one synchronous verification pass over the tool results (sub-millisecond).
Re-prompt issued — one additional gateway call (same latency as a regular chat turn).

Surfaces covered

| Surface | Integrity applies |
| --- | --- |
| Molly interactive chat (server and personal personas) | Yes |
| Proactive triage notifications | Yes |
| Proactive digest notifications | Yes |
| Control Panel AI chat | Via upstream OS node; see note below |

Control Panel chat relay

The Control Panel AI chat relays answers from the OS node (Molly) rather than generating them independently. Integrity runs on the OS node before the answer is sent to the CP relay. The CP relay does not re-run integrity, since doing so would be redundant and would add unnecessary latency. The guarantee therefore holds for all CP chat answers, but the ⚠️ markers and Sources footer are generated on the OS node, not by the relay.

Provider compatibility

| Provider | Attribution method |
| --- | --- |
| OpenAI-compatible | `response_format: {type:"json_schema", ...}` for structured output |
| Anthropic | Forced tool-use with the envelope as the tool input schema |
| Google | `responseSchema` + `responseMimeType: "application/json"` |
| Ollama (local) | `format: "json"` (best-effort; frequently falls back to the post-hoc scan) |

Providers without structured-output support receive the Rule #-1 system-prompt block and rely on the post-hoc scanner as the safety net.
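
For orientation, the per-provider request fragments in the table might be assembled along the following lines. This Python sketch is illustrative only: the envelope schema shown here is hypothetical (the source states only that facts carry a tool-call ID and a result path), the function and provider keys are invented names, and the schema is passed to every provider unchanged for brevity, although Google's `responseSchema` accepts only a subset of JSON Schema and would likely need a translated version.

```python
# Hypothetical envelope: an answer plus a list of cited facts.
ENVELOPE_SCHEMA = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "facts": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "value": {"type": "string"},
                    "tool_call_id": {"type": "string"},
                    "path": {"type": "string"},  # e.g. rows[0].disk_pct
                },
                "required": ["value", "tool_call_id", "path"],
            },
        },
    },
    "required": ["answer", "facts"],
}


def structured_output_params(provider: str) -> dict:
    """Return the extra request fields that ask a provider for the envelope."""
    if provider == "openai":
        return {"response_format": {
            "type": "json_schema",
            "json_schema": {"name": "envelope", "schema": ENVELOPE_SCHEMA},
        }}
    if provider == "anthropic":
        # Forcing the tool makes the model emit the envelope as the tool input.
        return {
            "tools": [{
                "name": "envelope",
                "description": "Return the answer with cited facts.",
                "input_schema": ENVELOPE_SCHEMA,
            }],
            "tool_choice": {"type": "tool", "name": "envelope"},
        }
    if provider == "google":
        return {"generationConfig": {
            "responseMimeType": "application/json",
            "responseSchema": ENVELOPE_SCHEMA,  # assumed compatible; see lead-in
        }}
    if provider == "ollama":
        return {"format": "json"}  # best-effort JSON, no schema enforcement
    return {}  # no structured-output support: rely on prompt rules and the scan
```

Returning an empty dictionary for unknown providers mirrors the fallback described above, where the system-prompt block and the post-hoc scanner carry the guarantee.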