
Gateway API

OpenAI-compatible LLM routing API

The Xybrid Gateway provides an OpenAI-compatible REST API that routes requests to multiple LLM providers (OpenAI, Anthropic, Groq).

Base URL

http://localhost:3000/v1  (local standalone gateway)
http://localhost:8000/v1  (local platform backend)
https://api.xybrid.dev/v1 (production)

All API endpoints are prefixed with /v1 for OpenAI compatibility.

Authentication

All requests require a Bearer token in the Authorization header:

Authorization: Bearer your-api-key
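
For example, an authenticated request from JavaScript (the helper below is illustrative; `fetch` is built into modern browsers and Node 18+, and `your-api-key` is a placeholder):

```javascript
// Build the headers every Gateway request needs.
function authHeaders(apiKey) {
  return {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${apiKey}`,
  };
}

// Example: list models with an authenticated request.
async function listModels(baseUrl, apiKey) {
  const res = await fetch(`${baseUrl}/models`, { headers: authHeaders(apiKey) });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.json();
}
```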

Endpoints

Health Check

Check gateway status.

GET /health

Response:

{
  "status": "ok",
  "version": "0.1.0",
  "service": "xybrid-gateway"
}

List Models

Get available models from all providers.

GET /v1/models
Authorization: Bearer your-api-key

Response:

{
  "object": "list",
  "data": [
    { "id": "gpt-4o", "object": "model", "created": 1734134400, "owned_by": "openai" },
    { "id": "gpt-4o-mini", "object": "model", "created": 1734134400, "owned_by": "openai" },
    { "id": "claude-3-5-sonnet", "object": "model", "created": 1734134400, "owned_by": "anthropic" },
    { "id": "llama-3.1-70b-versatile", "object": "model", "created": 1734134400, "owned_by": "groq" }
  ]
}

Chat Completions

Generate chat completions (OpenAI-compatible format).

POST /v1/chat/completions
Content-Type: application/json
Authorization: Bearer your-api-key

Request Body:

{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "Hello, how are you?" }
  ],
  "max_tokens": 150,
  "temperature": 0.7
}

Response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1734134400,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 15,
    "total_tokens": 35
  }
}

Request Parameters

Parameter   | Type    | Required | Description
model       | string  | Yes      | Model identifier
messages    | array   | Yes      | Conversation messages
max_tokens  | integer | No       | Maximum tokens to generate
temperature | float   | No       | Sampling temperature (0-2)
top_p       | float   | No       | Nucleus sampling parameter
stream      | boolean | No       | Enable streaming responses
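
When `stream` is true, OpenAI-compatible APIs conventionally return Server-Sent Events, where each event line is `data: {json}` carrying a `choices[0].delta` fragment and the stream ends with `data: [DONE]`. A minimal sketch of extracting text deltas from one such chunk, assuming the Gateway follows this convention:

```javascript
// Extract content deltas from one SSE chunk.
// Assumes the OpenAI streaming format: lines of "data: {...}",
// terminated by "data: [DONE]".
function parseSseChunk(chunk) {
  const deltas = [];
  for (const line of chunk.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue;
    const payload = trimmed.slice(5).trim();
    if (payload === '[DONE]') break;
    const json = JSON.parse(payload);
    const content = json.choices?.[0]?.delta?.content;
    if (content) deltas.push(content);
  }
  return deltas;
}
```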

Message Format

{
  "role": "user|assistant|system",
  "content": "Message text"
}

Streaming ASR (WebSocket)

Real-time speech-to-text via WebSocket.

WS /v1/audio/transcriptions/stream

Audio Format

  • Format: PCM 16-bit signed little-endian
  • Sample Rate: 16kHz
  • Channels: Mono
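
Web Audio APIs produce Float32 samples in [-1, 1], so clients must convert to the 16-bit PCM format above before sending. A small conversion sketch (the function name is illustrative):

```javascript
// Convert Float32 samples in [-1, 1] (Web Audio's native format)
// to 16-bit signed PCM, as the stream endpoint expects.
// Typed arrays are little-endian on virtually all platforms.
function floatToPcm16(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp out-of-range input
    pcm[i] = Math.round(s * 32767);
  }
  return pcm;
}
```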

Client Messages

Config:

{
  "type": "config",
  "model": "whisper-tiny",
  "language": "en",
  "vad": true,
  "model_dir": "/path/to/model"
}

Flush: Request final transcription

{ "type": "flush" }

Reset: Clear buffer, start new utterance

{ "type": "reset" }

Close: End session

{ "type": "close" }

Server Messages

Ready:

{
  "type": "ready",
  "session_id": "stream_1734134400000",
  "model": "whisper-tiny-candle",
  "sample_rate": 16000
}

Partial:

{
  "type": "partial",
  "text": "hello wo",
  "is_stable": false,
  "chunk_index": 3
}

Final:

{
  "type": "final",
  "text": "hello world",
  "duration_ms": 2340,
  "chunks_processed": 5
}

Error:

{
  "type": "error",
  "message": "Failed to load model",
  "code": "asr_error"
}

Closed:

{
  "type": "closed",
  "reason": "Client requested close"
}

JavaScript Example

const ws = new WebSocket('ws://localhost:3000/v1/audio/transcriptions/stream');

ws.onopen = () => {
  // Configure session
  ws.send(JSON.stringify({
    type: 'config',
    model_dir: '/path/to/whisper-model',
    language: 'en',
    vad: true
  }));
};

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);

  switch (msg.type) {
    case 'ready':
      console.log(`Session ${msg.session_id} ready with ${msg.model}`);
      break;
    case 'partial':
      console.log(`Partial: ${msg.text}`);
      break;
    case 'final':
      console.log(`Final: ${msg.text} (${msg.duration_ms}ms)`);
      break;
    case 'error':
      console.error(`Error: ${msg.message}`);
      break;
  }
};

// Stream audio from microphone
navigator.mediaDevices.getUserMedia({ audio: true })
  .then(stream => {
    const audioContext = new AudioContext({ sampleRate: 16000 });
    const source = audioContext.createMediaStreamSource(stream);
    // ScriptProcessorNode is deprecated (AudioWorklet is the modern
    // replacement) but keeps this example short and widely compatible.
    const processor = audioContext.createScriptProcessor(4096, 1, 1);

    processor.onaudioprocess = (e) => {
      const samples = e.inputBuffer.getChannelData(0);
      // Convert to 16-bit PCM
      const pcm = new Int16Array(samples.length);
      for (let i = 0; i < samples.length; i++) {
        pcm[i] = Math.max(-32768, Math.min(32767, samples[i] * 32768));
      }
      ws.send(pcm.buffer);
    };

    source.connect(processor);
    processor.connect(audioContext.destination);
  });

// Get final transcription
function flush() {
  ws.send(JSON.stringify({ type: 'flush' }));
}

Supported Models

OpenAI

Model       | Context | Use Case
gpt-4o      | 128k    | Most capable
gpt-4o-mini | 128k    | Fast, cost-effective
gpt-4-turbo | 128k    | Previous generation

Anthropic

Model             | Context | Use Case
claude-3-5-sonnet | 200k    | Most capable
claude-3-opus     | 200k    | Advanced reasoning
claude-3-haiku    | 200k    | Fast, efficient

Groq (Fast Inference)

Model                   | Context | Use Case
llama-3.1-70b-versatile | 32k     | High quality
llama-3.1-8b-instant    | 32k     | Ultra-fast
mixtral-8x7b-32768      | 32k     | MoE model

Configuration

Environment Variables

Variable          | Description
PORT              | Server port (default: 3000)
OPENAI_API_KEY    | OpenAI API key
ANTHROPIC_API_KEY | Anthropic API key
GROQ_API_KEY      | Groq API key

Running the Gateway

# Set provider keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GROQ_API_KEY="gsk_..."

# Run via cargo
cargo run -p xybrid-gateway

# Or via just
just gateway

Error Responses

All errors return JSON with this format:

{
  "error": {
    "message": "Error description",
    "type": "error_type",
    "code": "error_code"
  }
}

Error Codes

Code                 | HTTP Status | Description
authentication_error | 401         | Invalid or missing API key
model_not_found      | 404         | Requested model not available
provider_error       | 502         | Upstream provider error
rate_limit_exceeded  | 429         | Too many requests
invalid_request      | 400         | Malformed request
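
Client-side handling for these codes might look like the following sketch (the helper names and the retry classification are illustrative, not prescribed by the Gateway):

```javascript
// Decide whether a failed request is worth retrying,
// based on the Gateway's documented error codes.
// Provider hiccups and rate limits are transient;
// auth and validation errors are not.
function isRetryable(code) {
  return code === 'provider_error' || code === 'rate_limit_exceeded';
}

// Unwrap the Gateway's error envelope:
// { "error": { "message": ..., "type": ..., "code": ... } }
function parseGatewayError(body) {
  const err = (body && body.error) || {};
  return {
    message: err.message || 'Unknown error',
    code: err.code || 'unknown',
  };
}
```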

cURL Examples

Chat Completion

curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test-key" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "max_tokens": 100
  }'

Using Anthropic

curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test-key" \
  -d '{
    "model": "claude-3-5-sonnet",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum computing briefly."}
    ],
    "max_tokens": 200
  }'

Using Groq (Fast)

curl http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test-key" \
  -d '{
    "model": "llama-3.1-8b-instant",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "max_tokens": 50
  }'

Integration with Pipelines

The Gateway integrates with Xybrid pipelines via the integration target:

name: voice-assistant
stages:
  - whisper-tiny@1.0

  - target: integration
    provider: openai
    model: gpt-4o-mini
    options:
      system_prompt: "You are a helpful voice assistant."
      max_tokens: 150

  - kokoro-82m@0.1

When a pipeline stage has target: integration, the Orchestrator routes the request through the Gateway (or directly to the provider if API keys are configured).

Gateway URL Configuration

The SDK determines the gateway URL using this priority:

  1. Per-pipeline override: Set gateway_url in stage options
  2. Environment variable: XYBRID_GATEWAY_URL (explicit override with full path)
  3. Platform URL: XYBRID_PLATFORM_URL + /v1 suffix
  4. Default: https://api.xybrid.dev/v1
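
The priority above can be sketched as follows (the function shape is illustrative, not the SDK's actual implementation):

```javascript
// Resolve the gateway URL following the documented priority:
// stage option > XYBRID_GATEWAY_URL > XYBRID_PLATFORM_URL + /v1 > default.
function resolveGatewayUrl(stageOptions = {}, env = {}) {
  if (stageOptions.gateway_url) return stageOptions.gateway_url;       // 1. per-pipeline override
  if (env.XYBRID_GATEWAY_URL) return env.XYBRID_GATEWAY_URL;           // 2. explicit URL (includes /v1)
  if (env.XYBRID_PLATFORM_URL) return `${env.XYBRID_PLATFORM_URL}/v1`; // 3. platform URL + /v1
  return 'https://api.xybrid.dev/v1';                                  // 4. default
}
```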

Per-Pipeline Override

Override the gateway URL for a specific LLM stage:

stages:
  - id: llm
    model: gpt-4o-mini
    target: integration
    provider: openai
    options:
      gateway_url: "http://localhost:8000/v1"  # Custom gateway
      system_prompt: "You are helpful."

Environment Variables

# Explicit gateway URL (must include /v1)
export XYBRID_GATEWAY_URL="http://localhost:3000/v1"

# Or use platform URL (SDK appends /v1 automatically)
export XYBRID_PLATFORM_URL="http://localhost:8000"

Flutter SDK

void main() async {
  await Xybrid.init();

  // Configure gateway URL (SDK appends /v1 automatically)
  Xybrid.setGatewayUrl('http://localhost:8000');

  // Set API key for authentication
  Xybrid.setApiKey('your-api-key');

  runApp(MyApp());
}
