Predictive Insights

Quazzar Cloud OS continuously monitors your node’s health and uses classical statistics to surface forward-looking signals — anomaly alerts, capacity forecasts, backup health scores, and app stability ratings — without requiring any ML runtime or cloud dependency.

Everything runs inside the Go process on your node. Data never leaves the node.

What you get

Four predictors run on a 1-hour scoring cycle and write calibrated results to a local predictions SQLite table. Results are available via:

Molly — ask “What does my server predict?” and Molly queries the table.
MCP tool — the get_predictions tool surfaces predictions to any MCP client (Claude Desktop, Claude Code, Cursor, …).
REST endpoint — GET /api/insights/predictions returns the latest scored row per predictor in JSON.
Dashboard card — the Predictive Insights card shows the four predictors with risk levels and summaries at a glance.

The four predictors

resource_anomaly

Detects when CPU, RAM, disk I/O, or network metrics fall outside their historical normal range for the same hour-of-day and day-of-week.

The detector keeps an exponential moving average (EMA) baseline for each (hour, weekday) slot and flags a metric when its z-score against that baseline exceeds 3 in absolute value (|z| > 3). Baselines are built from your node’s own system_metrics table, so no external service is involved.

The detector needs at least two weeks of data for a given (hour, weekday) slot before it will emit an anomaly score. Before that threshold is reached, the predictor returns insufficient_data rather than a fabricated number.

Output fields: metric, z_score, baseline_mean, baseline_stddev, risk_level (low / medium / high), detected_at.

capacity_forecast

Projects how many days remain before disk (or RAM/CPU, when trend data is sufficient) reaches saturation, using Ordinary Least Squares regression over the 30-day daily rollup.

Disk capacity — always available. Generalises the live OLS model in monitoring/disk_health.go to a longer horizon using the metrics_rollup_daily table.
RAM / CPU saturation trend — enabled when 30+ days of daily rollup data are present. It is marked as a follow-up capability for v1 and may surface as insufficient_data on fresh nodes.

Honest-refusal: a monotonically decreasing or flat trend with R² < 0.3 returns insufficient_data rather than a speculative forecast.

Output fields: resource, days_until_full, current_pct, trend_slope, r_squared, risk_level, forecast_date.
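As a rough illustration of the forecast, the sketch below fits an OLS line to daily disk-usage percentages and projects the day the line reaches 100%. It is not the code in monitoring/disk_health.go. The names `ols` and `daysUntilFull` are hypothetical, `minRows = 14` is borrowed from the example refusal message later on this page, and the refusal rule is one reading of the honest-refusal condition: refuse when the slope is not positive or when R² < 0.3.

```go
package main

import (
	"fmt"
	"math"
)

// Assumed guard values; see the lead-in for where they come from.
const (
	minRows = 14
	minR2   = 0.3
)

// fit is a least-squares line y = Intercept + Slope*x with its R².
type fit struct{ Slope, Intercept, R2 float64 }

// ols fits a line to ys sampled at x = 0, 1, 2, ... (one point per day).
// Callers must pass at least two points.
func ols(ys []float64) fit {
	n := float64(len(ys))
	var sx, sy, sxx, sxy float64
	for i, y := range ys {
		x := float64(i)
		sx += x
		sy += y
		sxx += x * x
		sxy += x * y
	}
	slope := (n*sxy - sx*sy) / (n*sxx - sx*sx)
	intercept := (sy - slope*sx) / n
	mean := sy / n
	var ssRes, ssTot float64
	for i, y := range ys {
		pred := intercept + slope*float64(i)
		ssRes += (y - pred) * (y - pred)
		ssTot += (y - mean) * (y - mean)
	}
	r2 := 0.0
	if ssTot > 0 {
		r2 = 1 - ssRes/ssTot
	}
	return fit{slope, intercept, r2}
}

// daysUntilFull projects when usage reaches 100%, starting from the most
// recent reading. ok is false when a forecast would be speculative.
func daysUntilFull(dailyPct []float64) (days int, ok bool) {
	if len(dailyPct) < minRows {
		return 0, false
	}
	f := ols(dailyPct)
	if f.Slope <= 0 || f.R2 < minR2 {
		return 0, false
	}
	current := dailyPct[len(dailyPct)-1]
	return int(math.Ceil((100 - current) / f.Slope)), true
}

func main() {
	usage := []float64{60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73}
	if days, ok := daysUntilFull(usage); ok {
		fmt.Println("days_until_full:", days)
	} else {
		fmt.Println("insufficient_data")
	}
}
```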

backup_health

Scores your backup reliability using three signals:

Failure rate — percentage of backup runs that failed in the last 30 days.
Consecutive failures — how many of the most recent runs in a row have failed.
Duration/size trend — whether backup duration or size is growing abnormally (early signal of bloat or infrastructure problems).

Requires at least 5 backup runs in the window to emit a score.

Output fields: failure_rate_pct, consecutive_failures, last_success_at, risk_level, summary.

app_health

Scores each running app’s stability using:

Restart frequency — restarts per hour over the last 24h and 7d.
Lifecycle events — crash, OOM-kill, and abnormal-exit events captured from Docker event streams and app_events.
OOM / crash flag — any OOM-kill or crash in the last 24h bumps risk to high regardless of restart count.

Requires at least 24 hours of app telemetry before a stable score is emitted.

Output fields: app_name, restarts_24h, restarts_7d, oom_kills_24h, crashes_24h, risk_level, summary.

Honest-refusal

Every predictor is guarded against fabricating numbers when data is insufficient. When a predictor cannot produce a reliable result it returns:


{
  "predictor": "capacity_forecast",
  "status": "insufficient_data",
  "reason": "fewer than 14 daily rollup rows available"
}

This means you will never see an invented “days until full” on a fresh node. A prediction appears only once its predictor has enough data to meet its minimum requirement.

The `get_predictions` MCP tool

Any MCP client connected to your node can call:


get_predictions

Optional parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| `predictor` | string | Filter to one predictor: `resource_anomaly`, `capacity_forecast`, `backup_health`, `app_health`. Omit for all. |
| `min_risk` | string | Only return predictions at or above this risk level: `low`, `medium`, `high`. |
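The "at or above" semantics of `min_risk` amount to an ordered comparison of risk levels. A minimal sketch of such a filter follows. The function and type names are hypothetical, and dropping rows that have no risk level when `min_risk` is set is an assumption, since the page does not say how insufficient_data rows are treated.

```go
package main

import "fmt"

type prediction struct{ Predictor, RiskLevel string }

// riskRank orders the three levels; unknown or empty levels rank 0.
var riskRank = map[string]int{"low": 1, "medium": 2, "high": 3}

// filterPredictions applies the two optional parameters.
// An empty string means "no filter" for either one.
func filterPredictions(all []prediction, predictor, minRisk string) []prediction {
	var out []prediction
	for _, p := range all {
		if predictor != "" && p.Predictor != predictor {
			continue
		}
		if minRisk != "" && riskRank[p.RiskLevel] < riskRank[minRisk] {
			continue
		}
		out = append(out, p)
	}
	return out
}

func main() {
	all := []prediction{
		{"resource_anomaly", "low"},
		{"capacity_forecast", "high"},
		{"backup_health", "medium"},
		{"app_health", "low"},
	}
	for _, p := range filterPredictions(all, "", "medium") {
		fmt.Println(p.Predictor, p.RiskLevel)
	}
}
```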

Example (Claude Desktop):

“Show me any high-risk predictions for my server.”

Molly and Claude will call get_predictions automatically when the query is about server health, disk space, or app stability.

REST endpoint


GET /api/insights/predictions

Query parameters mirror the MCP tool:

| Parameter | Example | Description |
| --- | --- | --- |
| `predictor` | `capacity_forecast` | Filter to a single predictor. |
| `min_risk` | `medium` | Minimum risk level to return. |

Response:


{
  "scored_at": "2026-05-22T14:00:00Z",
  "predictions": [
    {
      "predictor": "capacity_forecast",
      "status": "ok",
      "risk_level": "high",
      "days_until_full": 12,
      "resource": "disk",
      "current_pct": 87.4
    }
  ]
}

Proactive alerts (gated)

Molly can send a push notification as soon as a high-risk prediction is scored, without waiting for you to ask. This feature is off by default; opt in by setting an environment variable:


QUAZZAR_INSIGHTS_ALERTS=true

See the admin guide on Predictive Insights alerts for dedup window, notification cap, and how to enable alerts per-predictor.

When enabled, Molly sends at most one alert per predictor per hour (dedup-keyed by predictor + risk_level). A high-risk disk forecast alert at 2 pm will not re-fire at 3 pm if the risk level is unchanged.

Dashboard card

Open the Dashboard. The Predictive Insights card appears below the system metrics section and shows:

Current risk level (colour-coded: green / amber / red) for each predictor.
A one-line summary from the most recent scored row.
A “See details” link that opens the full prediction in the Molly chat sidebar.

The card refreshes every 5 minutes (on the same poll cycle as the monitoring widgets) and shows a “Collecting data…” placeholder while a predictor is still in insufficient_data state.