AI Orchestration

The AI orchestration system coordinates AI workloads across multiple Cloud OS instances in your fleet, automatically routing requests to the best available instance based on capacity and health.

AI orchestration requires the `ai_orchestration` license feature (Enterprise plan).

How It Works

When you submit an orchestrated AI task, the system:

1. Evaluates capacity: checks GPU and CPU utilization across all instances with AI capabilities
2. Selects instances: routes the task to the least-loaded instance (or distributes it across multiple instances for team tasks)
3. Handles failover: if the selected instance becomes unavailable, the task is re-routed to a healthy instance
4. Aggregates results: for multi-instance tasks, collects and merges outputs from all participating instances
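
The selection and failover steps above can be sketched roughly as follows. This is a minimal illustration, not the product's actual scheduler: the `Instance` fields, the 70/30 GPU/CPU weighting in `load_score`, and the function names are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    # All field names are hypothetical, chosen for illustration only.
    name: str
    gpu_util: float  # 0.0 (idle) to 1.0 (saturated)
    cpu_util: float  # 0.0 (idle) to 1.0 (saturated)
    healthy: bool = True


def load_score(inst: Instance) -> float:
    # Assumed weighting: GPU load counts more than CPU load.
    return 0.7 * inst.gpu_util + 0.3 * inst.cpu_util


def select_instance(
    fleet: list[Instance], exclude: frozenset[str] = frozenset()
) -> Optional[Instance]:
    """Return the least-loaded healthy instance, or None if none qualifies."""
    candidates = [i for i in fleet if i.healthy and i.name not in exclude]
    return min(candidates, key=load_score, default=None)


def reroute(fleet: list[Instance], failed: Instance) -> Optional[Instance]:
    """Failover: select again, skipping the instance that became unavailable."""
    return select_instance(fleet, exclude=frozenset({failed.name}))


fleet = [
    Instance("node-a", gpu_util=0.80, cpu_util=0.40),
    Instance("node-b", gpu_util=0.20, cpu_util=0.50),
    Instance("node-c", gpu_util=0.10, cpu_util=0.30, healthy=False),
]

first = select_instance(fleet)    # least-loaded healthy instance
fallback = reroute(fleet, first)  # next choice if `first` goes down
print(first.name, fallback.name)  # prints "node-b node-a"
```

Here `node-c` has the lowest utilization but is skipped because it is unhealthy, so the task goes to `node-b` and falls back to `node-a`.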

Submitting an Orchestrated Task

1. Navigate to AI > Orchestration from the sidebar
2. Click New Task
3. Describe the task or select a pre-configured workflow
4. The system automatically selects the best instance(s)
5. Monitor progress on the task detail page

Cluster AI Capacity

The capacity overview shows real-time AI resource availability across your fleet:

- GPU utilization per instance
- CPU availability per instance
- Number of active AI tasks
- Available model capacity

Orchestration API

| Endpoint | Method | Description |
|---|---|---|
| `/api/ai/orchestrate` | POST | Submit an orchestrated AI task |
| `/api/ai/orchestrate/{id}` | GET | Get task status with per-instance results |
| `/api/ai/capacity` | GET | Cluster AI capacity overview |
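
As a rough illustration of calling these endpoints, the sketch below builds the three requests with Python's standard library. Only the paths and HTTP methods come from the table above. The host in `BASE_URL`, the `description` field in the request body, and the Bearer token header are assumptions for the example, so check your instance's API reference for the actual schema and authentication scheme.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://cloudos.example.com"  # hypothetical host


def _headers(token: str) -> dict:
    # Bearer authentication is an assumption, not confirmed by this page.
    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}


def build_submit_request(description: str, token: str) -> urllib.request.Request:
    # The "description" body field is a hypothetical payload shape.
    body = json.dumps({"description": description}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/ai/orchestrate",
        data=body,
        method="POST",
        headers=_headers(token),
    )


def build_status_request(task_id: str, token: str) -> urllib.request.Request:
    # Percent-encode the ID so it is safe to embed in the URL path.
    safe_id = urllib.parse.quote(task_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/api/ai/orchestrate/{safe_id}",
        method="GET",
        headers=_headers(token),
    )


def build_capacity_request(token: str) -> urllib.request.Request:
    return urllib.request.Request(
        f"{BASE_URL}/api/ai/capacity",
        method="GET",
        headers=_headers(token),
    )


def send(req: urllib.request.Request) -> dict:
    """Send a request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a real host and token, `send(build_capacity_request(token))` would fetch the capacity overview, and `send(build_status_request(task_id, token))` could be polled until the task finishes.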