Internals

How Xybrid works under the hood

This section explains Xybrid's internal architecture for researchers, contributors, and anyone who wants to understand how the system works.

Overview

Xybrid is a hybrid cloud-edge ML inference orchestrator. It routes ML workloads between on-device execution and cloud APIs based on model availability, device capabilities, and user policies.
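The routing decision described above can be sketched in Rust. The types and policy logic below are invented for illustration (Xybrid's actual routing internals are not shown here); the point is only that the target is chosen from model availability, device capabilities, and user policy.

```rust
// Hypothetical sketch of a routing decision; all names and fields here
// are illustrative, not Xybrid's actual internal API.

#[derive(Debug, PartialEq)]
enum Target {
    Device,
    Cloud,
}

struct DeviceCaps {
    has_model: bool,  // is the model bundle present on-device?
    free_ram_mb: u64, // memory available for inference
}

struct Policy {
    prefer_on_device: bool,
    min_ram_mb: u64, // minimum free RAM required to run locally
}

/// Pick an execution target from model availability, device
/// capabilities, and the user's policy (illustrative only).
fn route(caps: &DeviceCaps, policy: &Policy) -> Target {
    if policy.prefer_on_device && caps.has_model && caps.free_ram_mb >= policy.min_ram_mb {
        Target::Device
    } else {
        Target::Cloud
    }
}

fn main() {
    let caps = DeviceCaps { has_model: true, free_ram_mb: 4096 };
    let policy = Policy { prefer_on_device: true, min_ram_mb: 1024 };
    assert_eq!(route(&caps, &policy), Target::Device);
}
```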

Core Architecture

The system is built around these key components:

Component       Purpose
Envelope        Universal data container for audio, text, embeddings
Orchestrator    Routes requests to appropriate execution targets
Pipeline        Chains multiple stages (ASR → LLM → TTS)
Executor        Runs models via ONNX or Candle runtime
StreamSession   Real-time streaming inference
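To make the Envelope's role concrete, here is a rough sketch of what a universal data container for audio, text, and embeddings could look like. The variant names, metadata field, and constructor are assumptions for illustration, not Xybrid's real definition.

```rust
use std::collections::HashMap;

// Illustrative sketch of a universal data container; the real
// Envelope type is internal to Xybrid and may differ.

#[derive(Debug, Clone)]
enum Payload {
    Audio(Vec<u8>),      // raw audio bytes
    Text(String),        // plain text
    Embedding(Vec<f32>), // vector embedding
}

#[derive(Debug, Clone)]
struct Envelope {
    payload: Payload,
    // Metadata later stages may need, e.g. sample rate or language tag.
    metadata: HashMap<String, String>,
}

impl Envelope {
    // Hypothetical convenience constructor for a text payload.
    fn text(s: &str) -> Self {
        Envelope {
            payload: Payload::Text(s.to_string()),
            metadata: HashMap::new(),
        }
    }
}

fn main() {
    let env = Envelope::text("hello");
    assert!(matches!(env.payload, Payload::Text(ref t) if t == "hello"));
}
```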

Data Flow

  1. Input arrives as an Envelope (audio bytes, text, or embeddings)
  2. Orchestrator determines execution target (device, cloud, or fallback)
  3. Executor runs the model and produces output
  4. Output is wrapped in a new Envelope for the next stage
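The four steps above can be sketched as a chain of stages, each consuming one Envelope and producing the next. Everything here (the simplified Envelope, the stub ASR stage, run_pipeline) is a hypothetical illustration of the flow, not Xybrid's actual code.

```rust
// Minimal sketch of the data flow above; names are illustrative.

#[derive(Debug, Clone)]
enum Envelope {
    Audio(Vec<u8>),
    Text(String),
}

// A stage maps one Envelope to the next (e.g. ASR: audio -> text).
type Stage = fn(Envelope) -> Envelope;

// Stub ASR stage: consumes audio bytes, emits a transcript.
fn asr(input: Envelope) -> Envelope {
    match input {
        Envelope::Audio(_) => Envelope::Text("transcript".into()),
        other => other,
    }
}

// Each stage's output becomes the next stage's input (steps 3-4).
fn run_pipeline(stages: &[Stage], input: Envelope) -> Envelope {
    stages.iter().fold(input, |env, stage| stage(env))
}

fn main() {
    let out = run_pipeline(&[asr], Envelope::Audio(vec![0u8; 16]));
    assert!(matches!(out, Envelope::Text(_)));
}
```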

See Data Flow & Execution for details.

Infrastructure

Component      Purpose
Registry       Bundle distribution server
Bundles        Packaged models (.xyb format)
Pipeline DSL   YAML pipeline definitions
Streaming      Real-time inference
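Since pipelines are defined in YAML, a definition might look roughly like the following. The keys, stage ids, model names, and fallback syntax here are all invented for illustration and may not match the actual Pipeline DSL.

```yaml
# Hypothetical pipeline definition; the real DSL schema may differ.
pipeline:
  name: voice-assistant
  stages:
    - id: asr
      model: whisper-small
      target: device    # prefer on-device execution
      fallback: cloud   # fall back to a cloud API
    - id: llm
      model: llama-3-8b
      target: cloud
    - id: tts
      model: piper-en
      target: device
```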
