Fleet Management
Fleet Management is the core feature of the Control Panel. It lets you register Cloud OS instances, monitor their health in real time, organize them with tags and groups, and view aggregated metrics across your entire fleet.
Instance Registration
Cloud OS instances connect to the Control Panel via an mTLS agent. The agent authenticates using a CA-signed certificate, ensuring that only trusted instances can join your fleet.
Registration Flow
- Generate a registration token in the Control Panel under Fleet > Registration Tokens
- On the Cloud OS instance, configure the agent with the token and the Control Panel’s agent server address (port 8443)
- The agent presents the token to the Control Panel
- The Control Panel validates the token, issues a signed certificate, and registers the instance
- The instance appears on the Fleet Dashboard as a new server
┌──────────────┐   1. Token    ┌──────────────────┐
│   Control    │◄──────────────│     Cloud OS     │
│   Panel      │               │     Instance     │
│              │  2. mTLS cert │                  │
│   (CA)       │──────────────►│     (Agent)      │
│              │               │                  │
│              │  3. Heartbeat │                  │
│              │◄──────────────│                  │
└──────────────┘   (ongoing)   └──────────────────┘
Registration Tokens
Registration tokens are one-time or multi-use tokens that authorize new instances to join the fleet. You can manage tokens from the Registration Tokens page:
- Create a new token with an optional expiry date and usage limit
- Revoke a token to prevent further registrations
- View token usage history
In MSP environments, a token can optionally be scoped to a specific tenant, so that instances registered with that token are automatically assigned to the correct client.
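The token rules described above (optional expiry, usage limit, revocation, and tenant scope) can be sketched as a small data model. This is an illustration only: the `RegistrationToken` class and its field names are hypothetical and do not reflect the Control Panel's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class RegistrationToken:
    """Hypothetical token record; field names are illustrative only."""
    value: str
    expires_at: Optional[datetime] = None  # None means the token never expires
    max_uses: Optional[int] = None         # None means unlimited; 1 means one-time
    tenant: Optional[str] = None           # optional tenant scope (MSP environments)
    uses: int = 0
    revoked: bool = False

    def is_usable(self, now: datetime) -> bool:
        """Return True if the token may still register a new instance."""
        if self.revoked:
            return False
        if self.expires_at is not None and now >= self.expires_at:
            return False
        if self.max_uses is not None and self.uses >= self.max_uses:
            return False
        return True

    def redeem(self, now: datetime) -> Optional[str]:
        """Consume one use and return the tenant for the new instance.

        Raises ValueError if the token is revoked, expired, or exhausted.
        """
        if not self.is_usable(now):
            raise ValueError("token is revoked, expired, or exhausted")
        self.uses += 1
        return self.tenant


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
token = RegistrationToken(
    "example-token",
    expires_at=now + timedelta(days=7),
    max_uses=1,
    tenant="client-a",
)
assigned_tenant = token.redeem(now)  # the one-time token is now exhausted
```

In this sketch a one-time token is simply a token with `max_uses=1`, so one-time and multi-use tokens share the same validation path.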
Heartbeat Monitoring
Once registered, each instance sends periodic heartbeat signals to the Control Panel. The heartbeat includes system metrics such as CPU usage, memory usage, disk utilization, and uptime.
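As a rough illustration of what a heartbeat might carry, the sketch below serializes the metrics listed above to JSON. The `Heartbeat` class, its field names, and the JSON encoding are assumptions made for this example; the actual wire format used by the agent is not specified here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Heartbeat:
    """Hypothetical heartbeat payload; field names are illustrative only."""
    hostname: str
    cpu_percent: float
    memory_percent: float
    disk_percent: float
    uptime_seconds: int


def encode(heartbeat: Heartbeat) -> str:
    """Serialize the heartbeat to a JSON string with stable key order."""
    return json.dumps(asdict(heartbeat), sort_keys=True)


payload = encode(Heartbeat("web-01", 12.5, 43.0, 71.2, 86400))
```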
Staleness Detection
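The Control Panel derives each instance's status from the time elapsed since its last heartbeat. An instance stays Online while heartbeats arrive within 60 seconds, moves to Warning once that window is missed, and becomes Offline after 5 minutes of silence: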
| Time Since Last Heartbeat | Status |
|---|---|
| Less than 60 seconds | Online |
| 60 seconds to 5 minutes | Warning |
| More than 5 minutes | Offline |
The Fleet Dashboard displays these statuses with visual indicators so you can spot connectivity issues at a glance.
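The thresholds in the table can be expressed as a small classification function. This is a minimal sketch: the function name `classify` and the constant names are hypothetical, and treating exactly 5 minutes as Warning is one reading of the table's boundaries.

```python
from datetime import datetime, timedelta, timezone

WARNING_AFTER = timedelta(seconds=60)  # heartbeat age at which Online becomes Warning
OFFLINE_AFTER = timedelta(minutes=5)   # heartbeat age beyond which the instance is Offline


def classify(last_heartbeat: datetime, now: datetime) -> str:
    """Map the time since the last heartbeat to a status string."""
    age = now - last_heartbeat
    if age < WARNING_AFTER:
        return "online"
    if age <= OFFLINE_AFTER:
        return "warning"
    return "offline"


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
status = classify(now - timedelta(seconds=90), now)
```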
Fleet Dashboard
The Fleet Dashboard is the main landing page after login. It shows:
Fleet Summary
- Total number of registered instances
- Online / Warning / Offline counts
- Fleet health percentage (online instances divided by total)
- Aggregated resource metrics (total CPU, RAM, disk across all instances)
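The status counts and health percentage can be computed as follows. This is a sketch, not the dashboard's implementation: `fleet_summary` is a hypothetical helper, and reporting 0.0 for an empty fleet is an assumption.

```python
from collections import Counter
from typing import Dict, List, Union


def fleet_summary(statuses: List[str]) -> Dict[str, Union[int, float]]:
    """Count instances per status and compute health as online / total."""
    counts = Counter(statuses)
    total = len(statuses)
    # Assumption: an empty fleet is reported as 0.0 percent healthy.
    health = 100.0 * counts["online"] / total if total else 0.0
    return {
        "total": total,
        "online": counts["online"],
        "warning": counts["warning"],
        "offline": counts["offline"],
        "health_percent": health,
    }


summary = fleet_summary(["online", "online", "warning", "offline"])
```

With two of four instances online, `summary["health_percent"]` is 50.0. Warning instances do not count toward the health percentage under this definition.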
Instance List
Each instance is displayed as a card or row showing:
- Hostname — the server’s hostname
- Status — online, warning, or offline indicator
- IP address — the server’s public or private IP
- Uptime — how long the instance has been running
- Resource usage — CPU, RAM, and disk as progress bars
- Tags — assigned tags for filtering
- Last seen — timestamp of the most recent heartbeat
Filtering and Search
Filter the instance list by:
- Status (online, warning, offline)
- Tags
- Groups
- Free-text search across hostname and IP
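The filters above combine naturally as a conjunction: an instance is shown only if it passes every filter that is set. The sketch below illustrates that logic; the `Instance` class and `filter_instances` function are hypothetical, and case-insensitive substring matching for the free-text search is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set


@dataclass
class Instance:
    """Hypothetical instance record used only for this example."""
    hostname: str
    ip: str
    status: str
    tags: Dict[str, str] = field(default_factory=dict)
    groups: Set[str] = field(default_factory=set)


def filter_instances(
    instances: Iterable[Instance],
    status: Optional[str] = None,
    tags: Optional[Dict[str, str]] = None,
    group: Optional[str] = None,
    query: Optional[str] = None,
) -> List[Instance]:
    """Return the instances that match every filter that was provided."""
    result = []
    for inst in instances:
        if status is not None and inst.status != status:
            continue
        # Every requested tag key must be present with the requested value.
        if tags and any(inst.tags.get(k) != v for k, v in tags.items()):
            continue
        if group is not None and group not in inst.groups:
            continue
        # Free-text search: substring match on hostname or IP.
        if query and query.lower() not in inst.hostname.lower() and query not in inst.ip:
            continue
        result.append(inst)
    return result


fleet = [
    Instance("db-01", "10.0.0.5", "online", {"env": "production"}, {"database-servers"}),
    Instance("web-01", "10.0.1.7", "warning", {"env": "staging"}),
    Instance("db-02", "10.0.0.6", "offline", {"env": "production"}, {"database-servers"}),
]
online_prod_dbs = filter_instances(
    fleet, status="online", tags={"env": "production"}, group="database-servers"
)
```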
Tags and Groups
Organize your fleet with tags and groups to make filtering and bulk operations easier.
Tags
Tags are key-value pairs assigned to instances. Common tagging strategies include:
| Tag Key | Example Values | Purpose |
|---|---|---|
| env | production, staging, dev | Environment classification |
| region | us-east, eu-west, ap-south | Geographic location |
| team | backend, data, infra | Team ownership |
| tier | critical, standard | Service tier |
You can assign tags when registering an instance or add them later from the server detail page.
Groups
Groups are named collections of instances. Unlike tags, which are key-value pairs, a group is a single label that you can use to target bulk operations. For example, a group called “database-servers” might contain all instances running PostgreSQL.
Metrics Aggregation
The Control Panel aggregates metrics from all registered instances and provides fleet-wide views:
- CPU utilization — average and peak CPU usage across the fleet
- Memory usage — total and per-instance memory consumption
- Disk usage — storage utilization with capacity planning indicators
- Network throughput — aggregate inbound and outbound traffic
These metrics are available on the Fleet Dashboard and can be filtered by tag or group.
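A fleet-wide aggregate with an optional tag filter could look like the sketch below. The sample structure, the key names (`cpu_percent`, `memory_used_gb`, `tags`), and the `aggregate_metrics` function are assumptions for illustration, not the Control Panel's data model.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple


def aggregate_metrics(
    samples: List[dict], tag: Optional[Tuple[str, str]] = None
) -> Dict[str, float]:
    """Compute average and peak CPU plus total memory, optionally for one tag."""
    if tag is not None:
        key, value = tag
        samples = [s for s in samples if s["tags"].get(key) == value]
    if not samples:
        # Assumption: an empty selection reports zeros rather than raising.
        return {"instances": 0, "cpu_avg": 0.0, "cpu_peak": 0.0, "memory_total_gb": 0.0}
    cpu = [s["cpu_percent"] for s in samples]
    return {
        "instances": len(samples),
        "cpu_avg": mean(cpu),
        "cpu_peak": max(cpu),
        "memory_total_gb": sum(s["memory_used_gb"] for s in samples),
    }


samples = [
    {"cpu_percent": 20.0, "memory_used_gb": 4.0, "tags": {"env": "production"}},
    {"cpu_percent": 40.0, "memory_used_gb": 8.0, "tags": {"env": "production"}},
    {"cpu_percent": 90.0, "memory_used_gb": 2.0, "tags": {"env": "staging"}},
]
production = aggregate_metrics(samples, tag=("env", "production"))
```

Reporting the peak alongside the average matters because a single saturated instance can be hidden by a low fleet-wide mean.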
Server Detail
Click any instance on the Fleet Dashboard to open its detail page. The server detail view includes:
- System info — hostname, OS, architecture, Cloud OS version, IP addresses
- Live metrics — real-time CPU, RAM, disk, and network graphs
- Installed apps — list of applications running on the instance
- Command history — recent commands sent to this instance
- Policy compliance — whether the instance meets fleet policies
- Tags and groups — manage the instance’s organizational metadata
Fleet Health
The fleet health section provides an at-a-glance summary of your fleet’s operational state:
- Health score — percentage of instances that are online
- Alerts — instances that have transitioned to warning or offline status
- Trends — historical uptime data showing fleet reliability over time
If an instance goes offline, the Control Panel retains its last known state. Commands queued for offline instances will be delivered when the instance reconnects.
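The queue-and-deliver behavior for offline instances can be modeled with a per-instance queue that is flushed on reconnect. This is a sketch under stated assumptions: the `CommandQueue` class is hypothetical, and first-in, first-out delivery order is assumed rather than documented.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Set, Tuple


class CommandQueue:
    """Hypothetical model of command delivery to online and offline instances."""

    def __init__(self) -> None:
        self._pending: Dict[str, Deque[str]] = defaultdict(deque)
        self._online: Set[str] = set()
        self.delivered: List[Tuple[str, str]] = []  # (instance_id, command)

    def set_online(self, instance_id: str) -> None:
        """Mark the instance online and flush its queued commands in order."""
        self._online.add(instance_id)
        queue = self._pending.pop(instance_id, deque())
        while queue:
            self.delivered.append((instance_id, queue.popleft()))

    def set_offline(self, instance_id: str) -> None:
        """Mark the instance offline; later commands will be queued."""
        self._online.discard(instance_id)

    def send(self, instance_id: str, command: str) -> None:
        """Deliver immediately if the instance is online, otherwise queue."""
        if instance_id in self._online:
            self.delivered.append((instance_id, command))
        else:
            self._pending[instance_id].append(command)

    def pending_count(self, instance_id: str) -> int:
        """Number of commands waiting for the given instance."""
        return len(self._pending.get(instance_id, ()))


commands = CommandQueue()
commands.send("db-01", "restart-agent")    # db-01 is offline, so this is queued
commands.send("db-01", "apply-policy")
queued_before = commands.pending_count("db-01")
commands.set_online("db-01")               # reconnect flushes the queue
```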
Next Steps
- Fleet Operations — issue commands to your registered instances
- Governance — define and enforce policies across your fleet