Fleet Management

Fleet Management is the core feature of the Control Panel. It lets you register Cloud OS instances, monitor their health in real time, organize them with tags and groups, and view aggregated metrics across your entire fleet.

Instance Registration

Cloud OS instances connect to the Control Panel through an agent that uses mutual TLS (mTLS). The agent authenticates with a certificate signed by the Control Panel’s CA, so only trusted instances can join your fleet.

Registration Flow

Generate a registration token in the Control Panel under Fleet > Registration Tokens
On the Cloud OS instance, configure the agent with the token and the Control Panel’s agent server address (port 8443)
The agent presents the token to the Control Panel
The Control Panel validates the token, issues a signed certificate, and registers the instance
The instance appears on the Fleet Dashboard as a new server


┌──────────────┐   1. Token    ┌──────────────────┐
│  Control     │◄──────────────│  Cloud OS        │
│  Panel       │               │  Instance        │
│              │  2. mTLS cert │                  │
│  (CA)        │──────────────►│  (Agent)         │
│              │               │                  │
│              │  3. Heartbeat │                  │
│              │◄──────────────│                  │
└──────────────┘   (ongoing)   └──────────────────┘

Registration Tokens

Registration tokens are one-time or multi-use tokens that authorize new instances to join the fleet. You can manage tokens from the Registration Tokens page:

Create a new token with an optional expiry date and usage limit
Revoke a token to prevent further registrations
View token usage history

In managed service provider (MSP) environments, each token can optionally be scoped to a specific tenant, so that instances registered with that token are automatically assigned to the correct client.
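The token rules above (optional expiry, optional usage limit, revocation, optional tenant scope) can be sketched as a small data model. This is a minimal illustration, not the Control Panel's implementation: the class name, field names, and error messages are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class RegistrationToken:
    """Hypothetical model of a registration token; all names are illustrative."""
    value: str
    expires_at: Optional[datetime] = None  # None means the token never expires
    max_uses: Optional[int] = None         # None means unlimited; 1 means one-time
    tenant_id: Optional[str] = None        # set only for tenant-scoped tokens
    uses: int = 0
    revoked: bool = False

    def redeem(self, now: datetime) -> Optional[str]:
        """Consume one use and return the tenant to assign (None if unscoped).

        Raises ValueError when the token can no longer authorize a registration.
        """
        if self.revoked:
            raise ValueError("token revoked")
        if self.expires_at is not None and now >= self.expires_at:
            raise ValueError("token expired")
        if self.max_uses is not None and self.uses >= self.max_uses:
            raise ValueError("usage limit reached")
        self.uses += 1
        return self.tenant_id
```

A one-time token is simply a token with `max_uses=1`; revoking it sets `revoked` so that later calls to `redeem` fail regardless of expiry or remaining uses.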

Heartbeat Monitoring

Once registered, each instance sends periodic heartbeat signals to the Control Panel. The heartbeat includes system metrics such as CPU usage, memory usage, disk utilization, and uptime.

Staleness Detection

The Control Panel treats a heartbeat as stale after 60 seconds. If no heartbeat arrives within that window, the instance status changes according to how long the instance has been silent:

Time Since Last Heartbeat	Status
Less than 60 seconds	Online
60 seconds to 5 minutes	Warning
More than 5 minutes	Offline

The Fleet Dashboard displays these statuses with visual indicators so you can spot connectivity issues at a glance.
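The status table can be read as a simple threshold function. The sketch below assumes that the boundaries fall exactly as the table words them (exactly 60 seconds is Warning, exactly 5 minutes is still Warning); the function and constant names are illustrative.

```python
ONLINE_WINDOW_S = 60       # "Less than 60 seconds" -> Online
OFFLINE_AFTER_S = 5 * 60   # "More than 5 minutes"  -> Offline


def classify_status(seconds_since_heartbeat: float) -> str:
    """Map the time since the last heartbeat to a status label."""
    if seconds_since_heartbeat < ONLINE_WINDOW_S:
        return "online"
    if seconds_since_heartbeat <= OFFLINE_AFTER_S:
        return "warning"
    return "offline"
```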

Fleet Dashboard

The Fleet Dashboard is the main landing page after login. It shows:

Fleet Summary

Total number of registered instances
Online / Warning / Offline counts
Fleet health percentage (online instances divided by total)
Aggregated resource metrics (total CPU, RAM, disk across all instances)
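As a worked example of the summary figures, the sketch below counts statuses and computes the health percentage as online instances divided by total. Reporting 0% for an empty fleet and rounding to one decimal place are assumptions made here, not documented behavior.

```python
from collections import Counter


def fleet_summary(statuses):
    """Summarize a list of status strings ("online", "warning", "offline")."""
    counts = Counter(statuses)
    total = len(statuses)
    # Health is online / total; an empty fleet is reported as 0% (assumption).
    health = 100.0 * counts["online"] / total if total else 0.0
    return {
        "total": total,
        "online": counts["online"],
        "warning": counts["warning"],
        "offline": counts["offline"],
        "health_pct": round(health, 1),
    }
```

For a fleet of four instances with three online and one in warning, this yields a health percentage of 75.0.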

Instance List

Each instance is displayed as a card or row showing:

Hostname — the server’s hostname
Status — online, warning, or offline indicator
IP address — the server’s public or private IP
Uptime — how long the instance has been running
Resource usage — CPU, RAM, and disk as progress bars
Tags — assigned tags for filtering
Last seen — timestamp of the most recent heartbeat

Filtering and Search

Filter the instance list by:

Status (online, warning, offline)
Tags
Groups
Free-text search across hostname and IP
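To make the filter semantics concrete, here is a minimal sketch of how such filters might combine. It assumes that all active filters are combined with AND and that free-text search is a case-insensitive substring match; the `Instance` type and `filter_instances` function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical instance record; field names are illustrative."""
    hostname: str
    ip: str
    status: str
    tags: dict = field(default_factory=dict)
    groups: set = field(default_factory=set)


def filter_instances(instances, status=None, tags=None, group=None, query=None):
    """Return instances matching every filter that is set (AND semantics)."""
    q = query.lower() if query else None
    result = []
    for inst in instances:
        if status is not None and inst.status != status:
            continue
        if tags and any(inst.tags.get(k) != v for k, v in tags.items()):
            continue
        if group is not None and group not in inst.groups:
            continue
        # Free-text search looks at hostname and IP only.
        if q and q not in inst.hostname.lower() and q not in inst.ip.lower():
            continue
        result.append(inst)
    return result
```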

Tags and Groups

Organize your fleet with tags and groups to make filtering and bulk operations easier.

Tag Key	Example Values	Purpose
`env`	production, staging, dev	Environment classification
`region`	us-east, eu-west, ap-south	Geographic location
`team`	backend, data, infra	Team ownership
`tier`	critical, standard	Service tier

Groups

Groups are named collections of instances. Unlike tags, which are key-value pairs, a group is a single label that you can use to target bulk operations. For example, a group called “database-servers” might contain all instances running PostgreSQL.

Metrics Aggregation

The Control Panel aggregates metrics from all registered instances and provides fleet-wide views:

CPU utilization — average and peak CPU usage across the fleet
Memory usage — total and per-instance memory consumption
Disk usage — storage utilization with capacity planning indicators
Network throughput — aggregate inbound and outbound traffic

These metrics are available on the Fleet Dashboard and can be filtered by tag or group.
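The average and peak figures, filtered by a tag, can be illustrated as follows. The sample shape (a dict with `cpu_pct` and `tags` keys) and the function name are assumptions for the sake of the example.

```python
from statistics import mean


def cpu_summary(samples, tag=None):
    """Average and peak CPU across samples, optionally limited to one tag.

    samples: list of dicts like {"cpu_pct": 40, "tags": {"env": "production"}}
    tag:     optional (key, value) pair, e.g. ("env", "production")
    """
    selected = [
        s for s in samples
        if tag is None or s["tags"].get(tag[0]) == tag[1]
    ]
    if not selected:
        return None  # nothing matched the filter
    values = [s["cpu_pct"] for s in selected]
    return {"avg": mean(values), "peak": max(values)}
```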

Server Detail

Click any instance on the Fleet Dashboard to open its detail page. The server detail view includes:

System info — hostname, OS, architecture, Cloud OS version, IP addresses
Live metrics — real-time CPU, RAM, disk, and network graphs
Historical charts — area charts showing CPU, Memory, Disk, and Network over time (see below)
Installed apps — list of applications running on the instance
Command history — recent commands sent to this instance
Policy compliance — whether the instance meets fleet policies
Tags and groups — manage the instance’s organizational metadata

Historical Metric Charts

The node detail page (the server detail view described above) shows area charts for CPU, Memory, Disk I/O, and Network traffic. Use the time-range pills to zoom in or out:

Range	Data source	Retention
1 h	Raw heartbeat data	7 days
6 h	Raw heartbeat data	7 days
24 h	Hourly rollup	30 days
7 d	Hourly rollup	30 days
30 d	Daily rollup	92 days (~3 months)

A background janitor runs once per hour. It rolls raw data into the `instance_metrics_1h` and `instance_metrics_1d` rollup tables and automatically prunes data outside the retention window. No manual configuration is required.
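The rollup idea can be sketched in a few lines. The source does not say how samples are combined within a bucket, so averaging is an assumption here, as are the function name and the `(unix_ts, value)` sample shape. The `RANGE_SOURCES` mapping restates the table above.

```python
from collections import defaultdict
from statistics import mean

HOUR_S = 3600

# Time-range pill -> (data source, retention in days), as listed in the table.
RANGE_SOURCES = {
    "1h": ("raw", 7),
    "6h": ("raw", 7),
    "24h": ("hourly", 30),
    "7d": ("hourly", 30),
    "30d": ("daily", 92),
}


def hourly_rollup(raw_samples):
    """Group (unix_ts, value) samples by hour and average each bucket.

    Averaging is an assumed aggregation; the real janitor may store
    other statistics (min, max, percentiles).
    """
    buckets = defaultdict(list)
    for ts, value in raw_samples:
        hour_start = ts - ts % HOUR_S  # floor the timestamp to the hour
        buckets[hour_start].append(value)
    return {hour: mean(vals) for hour, vals in sorted(buckets.items())}
```

A daily rollup would follow the same pattern with an 86,400-second bucket, typically fed from the hourly rows rather than the raw data.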

Open in Node (Impersonate)

The Open in node button on the node detail page opens an authenticated Cloud OS session in a new browser tab, with no password prompt.

How it works:

Click Open in node on any fleet node’s detail page.
The Control Panel mints a short-lived, HMAC-signed token bound to that node.
The Cloud OS instance validates the token via a secure backend endpoint, creates a session, and redirects you to the OS dashboard.
The token is single-use (JTI replay defense) and tied to the specific node (audience pinning), so it cannot be replayed or used against a different node.

Open in node (impersonation) requires the node to have been registered with a shared secret. Nodes registered before Control Panel (CP) 0.8.0 will receive the secret automatically on their next heartbeat.
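The three properties described above (HMAC signature, single use via a JTI, and audience pinning) can be demonstrated with the Python standard library. This is a sketch under stated assumptions, not the actual token format: the claim names (`aud`, `jti`, `exp`), the 30-second lifetime, the `payload.signature` layout, and the `mint_token` and `NodeVerifier` names are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time


def mint_token(secret, node_id, ttl_s=30, now=None):
    """Control Panel side: sign short-lived claims bound to one node."""
    now = time.time() if now is None else now
    claims = {"aud": node_id, "jti": secrets.token_hex(16), "exp": now + ttl_s}
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode()
    ).decode("ascii")
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


class NodeVerifier:
    """Node side: check signature, audience, expiry, and replay."""

    def __init__(self, secret, node_id):
        self.secret = secret
        self.node_id = node_id
        self.seen_jtis = set()  # a real store would evict expired entries

    def verify(self, token, now=None):
        now = time.time() if now is None else now
        payload, _, sig = token.partition(".")
        expected = hmac.new(self.secret, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            return False  # forged or corrupted token
        claims = json.loads(base64.urlsafe_b64decode(payload))
        if claims["aud"] != self.node_id or now >= claims["exp"]:
            return False  # wrong node, or token expired
        if claims["jti"] in self.seen_jtis:
            return False  # replay of an already-used token
        self.seen_jtis.add(claims["jti"])
        return True
```

The signature is checked before the payload is decoded, so a tampered payload is rejected without being parsed. `hmac.compare_digest` is used so that the comparison runs in constant time.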

Fleet Health

The fleet health section provides an at-a-glance summary of your fleet’s operational state:

Health score — percentage of instances that are online
Alerts — instances that have transitioned to warning or offline status
Trends — historical uptime data showing fleet reliability over time

If an instance goes offline, the Control Panel retains its last known state. Commands queued for offline instances will be delivered when the instance reconnects.
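The queue-and-deliver behavior can be pictured with a small in-memory model. Delivery in first-in, first-out order is an assumption, since the source does not state the ordering, and the class and method names are illustrative.

```python
from collections import defaultdict, deque


class CommandQueue:
    """Hypothetical per-instance command queue with delivery on reconnect."""

    def __init__(self):
        self._pending = defaultdict(deque)  # instance_id -> queued commands
        self._online = set()
        self.delivered = []                 # (instance_id, command) in delivery order

    def send(self, instance_id, command):
        """Deliver immediately if the instance is online, otherwise queue."""
        if instance_id in self._online:
            self.delivered.append((instance_id, command))
        else:
            self._pending[instance_id].append(command)

    def on_offline(self, instance_id):
        self._online.discard(instance_id)

    def on_reconnect(self, instance_id):
        """Mark the instance online and flush its queue in FIFO order (assumed)."""
        self._online.add(instance_id)
        queue = self._pending.pop(instance_id, deque())
        while queue:
            self.delivered.append((instance_id, queue.popleft()))
```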

Next Steps

Fleet Operations — issue commands to your registered instances
Governance — define and enforce policies across your fleet
Billing — manage your subscription and invoices