Flutter

On-device ML inference for Flutter applications

The Flutter SDK (xybrid_flutter) provides Dart bindings to the Xybrid Rust core via FFI, enabling on-device ML inference in mobile and desktop applications.

Installation

Add to your pubspec.yaml:

dependencies:
  xybrid_flutter: ^0.1.0-beta5

Initialization

Initialize the SDK before using any Xybrid APIs:

import 'package:xybrid_flutter/xybrid_flutter.dart';

void main() async {
  WidgetsFlutterBinding.ensureInitialized();
  await Xybrid.init();
  runApp(MyApp());
}

Core Pattern: Load → Run

All inference follows the same pattern:

  1. Create a loader — specify the model by registry ID or local bundle
  2. Load — download (if needed) and initialize the model
  3. Run — execute inference with an input envelope

final model = await Xybrid.model('kokoro-82m').load();
final result = await model.run(XybridEnvelope.text('Hello world'));
final wavBytes = result.audioAsWav(); // kokoro-82m is a TTS model, so the result carries audio, not text

Text-to-Speech

final model = await Xybrid.model('kokoro-82m').load();
final envelope = XybridEnvelope.text('Hello, how are you?', voiceId: 'af_heart');
final result = await model.run(envelope);

// Get WAV-wrapped audio bytes for playback
final wavBytes = result.audioAsWav(sampleRate: 24000);
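To play the synthesized speech, the WAV bytes can be handed to any audio player. A minimal sketch that writes them to disk first (the file path and saveSpeech helper are illustrative, not part of the SDK):

import 'dart:io';

// Sketch: persist the WAV bytes so any audio player can play them.
// On mobile, resolve a writable directory (e.g. via path_provider)
// instead of using a hard-coded relative path.
Future<void> saveSpeech(XybridResult result) async {
  final wavBytes = result.audioAsWav(sampleRate: 24000);
  if (wavBytes == null) return; // No audio in this result
  await File('speech.wav').writeAsBytes(wavBytes);
}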

Speech-to-Text

final model = await Xybrid.model('whisper-tiny').load();
final envelope = XybridEnvelope.audio(bytes: audioBytes, sampleRate: 16000);
final result = await model.run(envelope);
print(result.text); // "Hello, how are you?"
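The envelope expects raw audio samples at the stated sample rate. A minimal sketch that reads bytes from a file, assuming the file already contains 16 kHz mono PCM (transcribeFile is an illustrative helper, not an SDK API):

import 'dart:io';

// Illustrative helper: read raw 16 kHz mono PCM from disk and transcribe it.
Future<String?> transcribeFile(String path) async {
  final audioBytes = await File(path).readAsBytes();
  final model = await Xybrid.model('whisper-tiny').load();
  final result = await model.run(
    XybridEnvelope.audio(bytes: audioBytes, sampleRate: 16000),
  );
  return result.text;
}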

LLM Chat with Streaming

import 'dart:io'; // for stdout

final model = await Xybrid.model('gemma-3-1b').load();

final context = ConversationContext();
context.setSystem('You are a helpful assistant.');
context.pushText('Tell me a joke', MessageRole.user);

final envelope = XybridEnvelope.text('Tell me a joke');

await for (final token in model.runStreamingWithContext(envelope, context)) {
  stdout.write(token.token); // Print each token as it arrives
}
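To continue the conversation, capture the full reply and push it back as an assistant turn before the next user message. A sketch building on the snippet above (the prompt text is illustrative):

// Accumulate the streamed reply, then record it as an assistant turn.
// cumulativeText on the last token holds the complete response.
String reply = '';
await for (final token in model.runStreamingWithContext(envelope, context)) {
  reply = token.cumulativeText;
}
context.pushText(reply, MessageRole.assistant);
context.pushText('Another one, please', MessageRole.user); // Next user turn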

Loading with Progress

final loader = XybridModelLoader.fromRegistry('kokoro-82m');

await for (final event in loader.loadWithProgress()) {
  switch (event) {
    case LoadProgress(:final percentage):
      print('Loading: $percentage%');
    case LoadComplete():
      print('Model ready!');
    case LoadError(:final message):
      print('Error: $message');
  }
}
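In a Flutter UI, the same event stream can drive a progress widget. A minimal sketch (the widget and field names are illustrative; only XybridModelLoader and the LoadEvent variants come from the SDK):

import 'package:flutter/material.dart';
import 'package:xybrid_flutter/xybrid_flutter.dart';

// Sketch: surface download/load progress in the UI.
class ModelLoadIndicator extends StatefulWidget {
  const ModelLoadIndicator({super.key});

  @override
  State<ModelLoadIndicator> createState() => _ModelLoadIndicatorState();
}

class _ModelLoadIndicatorState extends State<ModelLoadIndicator> {
  // Start loading once; avoids re-subscribing on every rebuild.
  late final Stream<LoadEvent> _events =
      XybridModelLoader.fromRegistry('kokoro-82m').loadWithProgress();

  @override
  Widget build(BuildContext context) {
    return StreamBuilder<LoadEvent>(
      stream: _events,
      builder: (context, snapshot) => switch (snapshot.data) {
        LoadProgress(:final percentage) => Text('Loading: $percentage%'),
        LoadComplete() => const Text('Model ready!'),
        LoadError(:final message) => Text('Error: $message'),
        _ => const Text('Starting...'),
      },
    );
  }
}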

API Reference

Xybrid

Main entry point for the SDK.

class Xybrid {
  static Future<void> init();
  static bool get isInitialized;
  static void setApiKey(String apiKey);

  static XybridModelLoader model(String modelId);
  static XybridPipeline pipeline({String? yaml, String? filePath});
}
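If your registry access requires authentication, the key can be set right after initialization. Whether a key is needed depends on your deployment; the value below is a placeholder:

await Xybrid.init();
Xybrid.setApiKey('YOUR_API_KEY'); // Placeholder; substitute your real key
assert(Xybrid.isInitialized);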

XybridModelLoader

Creates and loads models from different sources.

class XybridModelLoader {
  factory XybridModelLoader.fromRegistry(String modelId);
  factory XybridModelLoader.fromBundle(String path);

  Future<XybridModel> load();
  Stream<LoadEvent> loadWithProgress();
}

Load events:

Event          Properties                               Description
LoadProgress   progress (0.0–1.0), percentage (0–100)   Download/load progress
LoadComplete   (none)                                   Model is ready
LoadError      message                                  Loading failed
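Loading from a local bundle on disk instead of the registry (the bundle path is illustrative):

final loader = XybridModelLoader.fromBundle('/path/to/model-bundle');
final model = await loader.load();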

XybridModel

A loaded model ready for inference.

class XybridModel {
  Future<XybridResult> run(XybridEnvelope envelope, {GenerationConfig? config});
  Future<XybridResult> runWithContext(XybridEnvelope envelope, ConversationContext context, {GenerationConfig? config});
  Stream<StreamToken> runStreaming(XybridEnvelope envelope, {GenerationConfig? config});
  Stream<StreamToken> runStreamingWithContext(XybridEnvelope envelope, ConversationContext context, {GenerationConfig? config});
}
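A non-streaming multi-turn call using runWithContext (the prompt text is illustrative):

final context = ConversationContext();
context.setSystem('You are a concise assistant.');

final result = await model.runWithContext(
  XybridEnvelope.text('Explain FFI in one sentence.'),
  context,
);
print(result.text);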

XybridEnvelope

Input container for model inference.

class XybridEnvelope {
  factory XybridEnvelope.audio({
    required List<int> bytes,
    required int sampleRate,
    int channels = 1,
  });

  factory XybridEnvelope.text(String text, {String? voiceId, double? speed});
  factory XybridEnvelope.embedding(List<double> data);
  factory XybridEnvelope.textWithRole(String text, MessageRole role);

  XybridEnvelope withRole(MessageRole role);
}
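Both forms below produce a text envelope tagged with the user role:

final a = XybridEnvelope.textWithRole('What is FFI?', MessageRole.user);
final b = XybridEnvelope.text('What is FFI?').withRole(MessageRole.user);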

XybridResult

Output from model inference.

class XybridResult {
  bool get success;
  String? get text;                  // ASR / LLM output
  Uint8List? get audioBytes;         // TTS output (raw PCM)
  List<double>? get embedding;       // Embedding output
  int get latencyMs;

  Uint8List? audioAsWav({int sampleRate = 24000, int channels = 1});
}
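Which output field is populated depends on the model type. A sketch that branches on the documented fields:

void handle(XybridResult result) {
  if (!result.success) return;
  if (result.text != null) {
    print('Text: ${result.text} (${result.latencyMs} ms)');
  } else if (result.audioBytes != null) {
    print('PCM audio: ${result.audioBytes!.length} bytes');
  } else if (result.embedding != null) {
    print('Embedding dims: ${result.embedding!.length}');
  }
}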

StreamToken

Token from streaming LLM inference.

class StreamToken {
  final String token;            // Generated token text
  final int index;               // Token position
  final String cumulativeText;   // All text so far
  final bool isFinal;            // Last token in sequence
  final String? finishReason;    // "stop", "length", "error"
}
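A sketch that uses cumulativeText to update UI state and inspects finishReason on the final token (render is an illustrative callback, not an SDK API):

await for (final token in model.runStreaming(envelope)) {
  render(token.cumulativeText); // Illustrative UI callback
  if (token.isFinal) {
    print('Finished after ${token.index + 1} tokens: ${token.finishReason}');
  }
}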

ConversationContext

Multi-turn conversation management for LLM models.

class ConversationContext {
  ConversationContext();
  ConversationContext.withId(String id);

  String get id;
  int get historyLength;
  bool get hasSystem;

  void setSystem(String text);
  void setMaxHistoryLength(int length);
  void push(XybridEnvelope envelope);
  void pushText(String text, MessageRole role);
  void clear();  // Clears history, keeps system prompt
}

enum MessageRole { system, user, assistant }
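Capping the history keeps long conversations within the model's context window, and clear() resets a session without losing the system prompt. A short sketch using the documented API:

final context = ConversationContext.withId('support-session-1');
context.setSystem('You are a helpful assistant.');
context.setMaxHistoryLength(20); // Keep at most 20 turns

context.pushText('Hello!', MessageRole.user);
print(context.historyLength); // 1

context.clear(); // History gone, system prompt retained
print(context.hasSystem); // true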

XybridPipeline

Multi-stage pipeline for chaining models.

class XybridPipeline {
  factory XybridPipeline.fromYaml(String yaml);
  factory XybridPipeline.fromFile(String path);

  String? get name;
  int get stageCount;
  List<String> get stageNames;

  Future<XybridResult> run(XybridEnvelope envelope);
}

Pipeline Example

final yaml = '''
name: voice-assistant
stages:
  - whisper-tiny
  - gemma-3-1b
  - kokoro-82m
''';

final pipeline = XybridPipeline.fromYaml(yaml);
final result = await pipeline.run(
  XybridEnvelope.audio(bytes: audioBytes, sampleRate: 16000),
);
// Result is TTS audio from the final stage
final wav = result.audioAsWav();

XybridException

Error type thrown by SDK operations.

class XybridException implements Exception {
  final String message;
}

All SDK errors throw XybridException. Loading errors are surfaced via LoadError events when using loadWithProgress().
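A typical guard around load and run calls:

try {
  final model = await Xybrid.model('kokoro-82m').load();
  final result = await model.run(XybridEnvelope.text('Hello'));
  print(result.text);
} on XybridException catch (e) {
  print('Xybrid error: ${e.message}');
}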


Platform Support

Platform   Status      Accelerators
macOS      Supported   Metal, CoreML
iOS        Supported   Metal, CoreML
Android    Supported   NNAPI
Linux      Supported   CPU
Windows    Planned     DirectML
