
AI Orchestration

The AI orchestration system coordinates AI workloads across multiple Cloud OS instances in your fleet, automatically routing requests to the best available instance based on capacity and health.

AI orchestration requires the ai_orchestration license feature (Enterprise plan).

How It Works

When you submit an orchestrated AI task, the system:

  1. Evaluates capacity — checks GPU and CPU utilization across all instances with AI capabilities
  2. Selects instances — routes the task to the least-loaded instance (or distributes across multiple instances for team tasks)
  3. Handles failover — if the selected instance becomes unavailable, the task is re-routed to a healthy instance
  4. Aggregates results — for multi-instance tasks, collects and merges outputs from all participating instances
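
The capacity evaluation, instance selection, and failover steps above can be sketched in a few lines of Python. This is a minimal illustration, not the product's actual scheduler: the Instance fields, the 0.7/0.3 weighting in load_score, and the use of ConnectionError to signal an unavailable instance are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Instance:
    """Hypothetical view of one Cloud OS instance with AI capabilities."""
    name: str
    gpu_util: float  # fraction in use, 0.0 to 1.0 (assumed metric)
    cpu_util: float  # fraction in use, 0.0 to 1.0 (assumed metric)
    healthy: bool = True


def load_score(inst: Instance) -> float:
    # Assumed weighting: GPU load counts more than CPU load.
    return 0.7 * inst.gpu_util + 0.3 * inst.cpu_util


def rank_instances(instances: List[Instance]) -> List[Instance]:
    """Return healthy instances ordered from least to most loaded."""
    return sorted((i for i in instances if i.healthy), key=load_score)


def run_with_failover(
    task: str,
    instances: List[Instance],
    run: Callable[[Instance, str], Any],
) -> Tuple[str, Any]:
    """Try the least-loaded instance first; on failure, move to the next one."""
    for inst in rank_instances(instances):
        try:
            return inst.name, run(inst, task)
        except ConnectionError:
            # Mark the instance unhealthy so later rankings skip it.
            inst.healthy = False
    raise RuntimeError("no healthy instance available")
```

Here run is whatever callable actually executes the task on an instance. If it raises ConnectionError for the first choice, the task is retried on the next least-loaded healthy instance, which mirrors the re-routing behavior described in step 3.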

Submitting an Orchestrated Task

  1. Navigate to AI > Orchestration from the sidebar
  2. Click New Task
  3. Describe the task or select a pre-configured workflow
  4. The system automatically selects the best instance(s)
  5. Monitor progress on the task detail page

Cluster AI Capacity

The capacity overview shows real-time AI resource availability across your fleet:

  • GPU utilization per instance
  • CPU availability per instance
  • Number of active AI tasks
  • Available model capacity

Orchestration API

Endpoint                   Method   Description
/api/ai/orchestrate        POST     Submit an orchestrated AI task
/api/ai/orchestrate/{id}   GET      Get task status with per-instance results
/api/ai/capacity           GET      Cluster AI capacity overview
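
As a rough sketch of how a client might call these endpoints, the Python below builds the three requests with the standard library. Only the paths and HTTP methods come from the table above. The host name, the bearer-token header, and the "description" field in the request body are assumptions for illustration; check your instance's API reference for the actual authentication scheme and payload schema.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://cloud-os.example.com"  # hypothetical host


def _headers(token: str) -> dict:
    # Assumption: bearer-token authentication; the real scheme may differ.
    return {"Authorization": f"Bearer {token}", "Accept": "application/json"}


def build_submit_request(description: str, token: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """POST /api/ai/orchestrate with an assumed JSON body."""
    body = json.dumps({"description": description}).encode("utf-8")
    headers = _headers(token)
    headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{base_url}/api/ai/orchestrate",
        data=body, headers=headers, method="POST",
    )


def build_status_request(task_id: str, token: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """GET /api/ai/orchestrate/{id}; the id is percent-encoded for safety."""
    safe_id = urllib.parse.quote(task_id, safe="")
    return urllib.request.Request(
        f"{base_url}/api/ai/orchestrate/{safe_id}",
        headers=_headers(token), method="GET",
    )


def build_capacity_request(token: str,
                           base_url: str = BASE_URL) -> urllib.request.Request:
    """GET /api/ai/capacity."""
    return urllib.request.Request(
        f"{base_url}/api/ai/capacity",
        headers=_headers(token), method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

Separating request construction from send keeps the network call in one place, so the request shape can be inspected or logged before anything is sent.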