Xybrid

What is Xybrid?

Hybrid AI orchestration that runs inference anywhere

Xybrid is a hybrid AI orchestration runtime that lets developers run inference anywhere—device, edge, or cloud—without rewriting code.

The Problem

AI inference is fragmenting across compute surfaces:

  • Mobile NPUs now rival cloud GPUs in capability
  • Privacy regulations demand localized processing
  • Cloud costs escalate with scale
  • Latency requirements push computation to the edge

Yet tooling remains siloed. Developers must manually handle device detection, model routing, format conversion, and fallback logic.
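To make the problem concrete, here is a rough sketch of the glue code developers end up writing by hand today. Every name in it (detect_npu, run_local, run_cloud, transcribe) is illustrative, not a real API; the point is the manual detection-and-fallback chain that each app must reimplement.

```python
def detect_npu() -> bool:
    # Placeholder: real code would probe CoreML, NNAPI, DirectML, etc.
    return False

def run_cloud(model: str, audio: bytes) -> str:
    # Placeholder for a cloud inference call.
    return f"cloud:{model}"

def run_local(model: str, audio: bytes) -> str:
    # Placeholder for on-device inference; fails without an accelerator.
    if not detect_npu():
        raise RuntimeError("no accelerator available")
    return f"local:{model}"

def transcribe(audio: bytes) -> str:
    # The manual fallback logic: try on-device first, then fall back to cloud.
    try:
        return run_local("whisper-tiny", audio)
    except RuntimeError:
        return run_cloud("whisper-tiny", audio)
```

Multiply this by every model, every format, and every target platform, and the appeal of a single runtime that owns this logic becomes clear.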

The Solution

Xybrid provides a unified control plane:

name: voice-assistant
stages:
  - whisper-tiny@1.0      # ASR: runs on device (privacy)
  - target: integration
    provider: openai
    model: gpt-4o-mini    # LLM: routes to cloud (capability)
  - kokoro-82m@0.1        # TTS: runs on device (latency)

Three models, two execution locations, one pipeline. Xybrid handles the routing.
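The routing rule the example relies on can be sketched as a small resolver. This is not Xybrid's actual implementation, just an illustration of the convention visible in the YAML above: a bare `name@version` string stays on device, while a mapping with `target: integration` routes to a cloud provider.

```python
# The pipeline stages from the YAML above, expressed as Python data.
PIPELINE = [
    "whisper-tiny@1.0",
    {"target": "integration", "provider": "openai", "model": "gpt-4o-mini"},
    "kokoro-82m@0.1",
]

def resolve(stage):
    # Shorthand string: an on-device model pinned to a version.
    if isinstance(stage, str):
        name, _, version = stage.partition("@")
        return {"model": name, "version": version, "location": "device"}
    # Mapping stage: routed to an external provider in the cloud.
    return {"model": stage["model"], "provider": stage["provider"], "location": "cloud"}

plan = [resolve(s) for s in PIPELINE]
# plan resolves to device -> cloud -> device, matching the comments in the YAML.
```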

Core Principles

  • Privacy-first: Policies govern what data may leave the device
  • Adaptive compute: Routing adapts to latency, RTT, and power budgets
  • Real-time ready: Streaming pipelines emit incremental results
  • Extensible: .xyb bundles enable adding models without rewrites
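The adaptive-compute principle can be illustrated with a toy routing decision. The thresholds and signals below are invented for this sketch; Xybrid's actual policies are richer, but the shape is the same: weigh measured RTT against the latency budget, then factor in the power budget.

```python
def route(rtt_ms: float, battery_pct: float, latency_budget_ms: float) -> str:
    # Cloud only pays off if the network round trip fits the latency budget.
    if rtt_ms > latency_budget_ms:
        return "device"
    # Illustrative power policy: offload to preserve a low battery
    # when the network is fast enough.
    if battery_pct < 20.0:
        return "cloud"
    return "device"
```

A slow network (300 ms RTT against a 200 ms budget) forces on-device execution; a fast network with a nearly drained battery prefers the cloud.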
