WASM API Reference

Complete API reference for the Blazen WebAssembly SDK

init()

Initialize the WASM module. Must be called once before using any other export.

import init from '@blazen/sdk';

await init();

Returns a Promise<void>. Subsequent calls are no-ops.

CompletionModel

A chat completion model. Created via static factory methods for each provider.

Provider Factory Methods

Most factory methods take no arguments: API keys are read from environment variables at runtime (OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY, FAL_KEY, OPENROUTER_API_KEY, etc.). To override the default model, chain .withModel(...) on the returned instance.

Azure additionally requires resourceName and deploymentName as arguments (there’s no single global endpoint). Bedrock requires region.

Method	Signature
`CompletionModel.openai`	`()`
`CompletionModel.anthropic`	`()`
`CompletionModel.gemini`	`()`
`CompletionModel.azure`	`(resourceName: string, deploymentName: string)`
`CompletionModel.fal`	`()`
`CompletionModel.openrouter`	`()`
`CompletionModel.groq`	`()`
`CompletionModel.together`	`()`
`CompletionModel.mistral`	`()`
`CompletionModel.deepseek`	`()`
`CompletionModel.fireworks`	`()`
`CompletionModel.perplexity`	`()`
`CompletionModel.xai`	`()`
`CompletionModel.cohere`	`()`
`CompletionModel.bedrock`	`(region: string)`

const model = CompletionModel.openai();
const claude = CompletionModel.anthropic();
const gemini = CompletionModel.gemini();
const azure = CompletionModel.azure('my-resource', 'my-deployment');
const fal = CompletionModel.fal();
const groq = CompletionModel.groq().withModel('llama-3.3-70b-versatile');
const bedrock = CompletionModel.bedrock('us-east-1');

`model.withModel(modelId: string): CompletionModel`

Override the default model ID for this provider instance. Returns a new CompletionModel (WASM does not mutate in place).

const model = CompletionModel.openai().withModel('gpt-4o-mini');

Properties

Property	Type	Description
`.modelId`	`string`	The model identifier string

`await model.complete(messages: ChatMessage[]): CompletionResponse`

Perform a chat completion.

const response = await model.complete([
  ChatMessage.system('You are a helpful assistant.'),
  ChatMessage.user('What is 2 + 2?'),
]);
console.log(response.content);

`await model.completeWithOptions(messages: ChatMessage[], options: CompletionOptions): CompletionResponse`

Perform a chat completion with additional options.

const response = await model.completeWithOptions(
  [ChatMessage.user('Write a haiku about WASM.')],
  { temperature: 0.7, maxTokens: 100 }
);

`await model.stream(messages: ChatMessage[], onChunk: (chunk) => void): void`

Stream a chat completion. The callback receives each chunk as it arrives.

await model.stream(
  [ChatMessage.user('Tell me a story')],
  (chunk) => {
    if (chunk.delta) process.stdout.write(chunk.delta);
  }
);

Each chunk has the shape:

{
  delta?: string;              // Text content delta
  finishReason?: string;       // Set on the final chunk
  toolCalls: ToolCall[];       // Tool calls, if any
}
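
The chunk shape above can be folded into a single result once the stream ends. The following is a minimal sketch, not part of the SDK: `collectChunks` and `StreamSummary` are hypothetical names, and it assumes each chunk's `toolCalls` holds complete entries rather than partial fragments.

```typescript
// Shapes copied from this reference; ToolCall is reduced to its documented properties.
interface ToolCall { id: string; name: string; arguments: object; }
interface StreamChunk { delta?: string; finishReason?: string; toolCalls: ToolCall[]; }

// Hypothetical summary type used only by this sketch.
interface StreamSummary { text: string; finishReason?: string; toolCalls: ToolCall[]; }

// Fold a list of chunks into the full text, the final finish reason, and all tool calls.
function collectChunks(chunks: StreamChunk[]): StreamSummary {
  const summary: StreamSummary = { text: '', toolCalls: [] };
  for (const chunk of chunks) {
    if (chunk.delta) summary.text += chunk.delta;       // append text deltas in arrival order
    summary.toolCalls.push(...chunk.toolCalls);         // assumption: entries are complete
    if (chunk.finishReason) summary.finishReason = chunk.finishReason;
  }
  return summary;
}
```

In practice the `onChunk` callback would push each chunk into an array, and `collectChunks` would run after `model.stream(...)` resolves.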

Middleware Decorators

Each decorator returns a new CompletionModel wrapping the original.

`model.withRetry(maxRetries?: number): CompletionModel`

Automatic retry with exponential backoff on transient failures. Defaults to 3 retries.

const resilient = model.withRetry(5);

`model.withCache(ttlSeconds?: number, maxEntries?: number): CompletionModel`

In-memory response cache. Streaming requests bypass the cache.

const cached = model.withCache(600, 500);

Parameter	Default	Description
`ttlSeconds`	`300`	Cache entry TTL in seconds.
`maxEntries`	`1000`	Maximum entries before eviction.
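
To make the two parameters concrete, here is a rough sketch of how a TTL-plus-capacity cache can behave. It is not the SDK's cache: the class name `TtlCache`, the string keys, the injected clock, and the oldest-insertion-first eviction order are all assumptions made for illustration.

```typescript
// Illustrative only. The eviction order used here (oldest insertion first) is an assumption.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private maxEntries: number;
  private now: () => number;

  constructor(ttlSeconds = 300, maxEntries = 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired entries are dropped on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key); // re-inserting moves the key to the end of insertion order
    if (this.entries.size >= this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

With `ttlSeconds = 10` and `maxEntries = 2`, a third `set` evicts the first key, and any surviving entry stops being returned once ten seconds have passed.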

`CompletionModel.withFallback(models: CompletionModel[]): CompletionModel`

Static method. Tries providers in order; falls back on transient errors.

const model = CompletionModel.withFallback([
  CompletionModel.openai(),
  CompletionModel.groq(),
]);

ChatMessage

A class for building typed chat messages.

Static Factory Methods

Method	Description
`ChatMessage.system(content: string)`	Create a system message
`ChatMessage.user(content: string)`	Create a user message
`ChatMessage.assistant(content: string)`	Create an assistant message
`ChatMessage.tool(content: string)`	Create a tool result message
`ChatMessage.toolResultMessage(callId: string, name: string, content: string)`	Create a tool result message with a tool-call ID and function name. (Named `toolResultMessage` to avoid colliding with the `.toolResult` instance getter that surfaces the structured payload of an existing message.)
`ChatMessage.userImageUrl(text: string, url: string, mediaType?: string)`	User message with text and an image URL
`ChatMessage.userImageBase64(text: string, data: string, mediaType: string)`	User message with text and a base64-encoded image

const msg = ChatMessage.user('Hello');
const sys = ChatMessage.system('You are a helpful assistant.');
const img = ChatMessage.userImageUrl('Describe this:', 'https://example.com/photo.jpg');

Constructor

new ChatMessage({ role?: string, content?: string, parts?: ContentPart[] })

Properties

Property	Type	Description
`.role`	`string`	`"system"`, `"user"`, `"assistant"`, or `"tool"`
`.content`	`string \| undefined`	The text content of the message
`.toolCallId`	`string \| undefined`	The tool-call ID this message is responding to (only set for tool-result messages)
`.name`	`string \| undefined`	The function name of the tool that produced this result (only set for tool-result messages)

JSON Shape

ChatMessage.toJSON() (and the entries in the messages array returned by runAgent) match the tsify-generated ChatMessage interface:

interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: MessageContent;
  tool_call_id?: string;
  name?: string;
  tool_calls?: ToolCall[];
  tool_result?: ToolOutput;  // structured tool-result payload (see below)
}

The tool_result field is populated when a tool handler returns a non-string value or supplies an llm_override. Plain-string tool results live in content as MessageContent::Text instead. The field name is tool_result (snake_case) because tsify preserves Rust field naming.

Tip. The tsify-generated interface lives in crates/blazen-wasm-sdk/pkg/blazen_wasm_sdk.d.ts and is regenerated on every pnpm build (or wasm-pack build --target bundler) inside crates/blazen-wasm-sdk/. See the WASM Quickstart for the build flow.

ToolOutput

The two-channel tool result emitted by JS tool handlers and surfaced on ChatMessage.tool_result.

interface ToolOutput {
  /**
   * The full structured payload the caller sees programmatically. Any
   * JSON-serializable value (object, array, string, number, etc).
   */
  data: any;

  /**
   * Optional override for the body sent back to the model on the next
   * turn. When `null` / absent, each provider applies its default
   * conversion from `data`.
   */
  llm_override?: LlmPayload | null;
}

The two channels exist because what the rest of your application wants to consume from a tool (full structured data, large blobs, internal IDs) is rarely the best thing to feed back to the LLM (token-heavy, leaks internal shape). Set data to the rich payload your code consumes, and use llm_override when you want to send the model a trimmed summary or a provider-specific shape instead.

The WASM dispatcher accepts either llm_override (snake_case) or llmOverride (camelCase) when a JS handler returns a structured object. Both spellings are normalized before the value is parsed, so either one works in JS.

Tool handler return shapes

The WASM tool dispatcher (js_to_tool_output in crates/blazen-wasm-sdk/src/agent.rs) accepts two shapes from a handler:

// 1. Bare value: wrapped automatically as { data: <value>, llm_override: null }.
const tool = {
  name: 'getWeather',
  description: 'Get the current weather for a city',
  parameters: { type: 'object', properties: { city: { type: 'string' } }, required: ['city'] },
  handler: async (args) => ({ temp: 22, condition: 'cloudy', city: args.city }),
};

// 2. Structured ToolOutput: object literal with a `data` key.
const structuredTool = {
  name: 'fetchProfile',
  description: 'Fetch a full user profile',
  parameters: { type: 'object', properties: { userId: { type: 'string' } }, required: ['userId'] },
  handler: async (args) => {
    const profile = await db.users.findById(args.userId);  // huge blob
    return {
      data: profile,                                         // caller sees full record
      llmOverride: {                                         // model sees compact summary
        kind: 'text',
        text: `User ${profile.name} (id=${profile.id})`,
      },
    };
  },
};

If a handler returns a string and that string happens to parse as JSON describing a ToolOutput, the dispatcher unpacks it. Otherwise the string is preserved as plain text. If ToolOutput deserialization fails (for instance, a malformed llm_override), the dispatcher silently falls back to wrapping the raw value as { data, llm_override: null } rather than throwing.
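
The rules above can be restated as a small TypeScript function. This is a sketch of the described behavior, not a port of js_to_tool_output: the names `toToolOutput` and `isLlmPayload` are hypothetical, and the validity check for an override is deliberately minimal.

```typescript
type LlmPayload =
  | { kind: 'text'; text: string }
  | { kind: 'json'; value: unknown }
  | { kind: 'parts'; parts: unknown[] }
  | { kind: 'provider_raw'; provider: string; value: unknown };

interface ToolOutput { data: unknown; llm_override: LlmPayload | null; }

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

// Minimal shape check; the real parser may accept or reject more than this.
function isLlmPayload(v: unknown): v is LlmPayload {
  if (!isRecord(v)) return false;
  switch (v.kind) {
    case 'text': return typeof v.text === 'string';
    case 'json': return 'value' in v;
    case 'parts': return Array.isArray(v.parts);
    case 'provider_raw': return typeof v.provider === 'string' && 'value' in v;
    default: return false;
  }
}

function toToolOutput(raw: unknown): ToolOutput {
  const wrapped: ToolOutput = { data: raw, llm_override: null };
  let candidate: unknown = raw;
  if (typeof raw === 'string') {
    // A string is unpacked only if it parses as JSON describing a ToolOutput.
    try { candidate = JSON.parse(raw); } catch { return wrapped; }
  }
  if (!isRecord(candidate) || !('data' in candidate)) return wrapped; // bare value
  // Accept either spelling of the override key.
  const override = candidate.llm_override ?? candidate.llmOverride ?? null;
  if (override === null) return { data: candidate.data, llm_override: null };
  if (isLlmPayload(override)) return { data: candidate.data, llm_override: override };
  return wrapped; // malformed override: fall back to wrapping the raw value
}
```

Note the last branch: a handler that returns a malformed override does not raise an error, so the whole returned object ends up in `data`.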

LlmPayload

The override sent to the LLM on the next turn (the optional llm_override field of ToolOutput). Discriminated union keyed by kind:

`kind`	Shape	Description
`"text"`	`{ kind: "text"; text: string }`	Plain text. Works on every provider universally.
`"json"`	`{ kind: "json"; value: any }`	Structured JSON. Anthropic and Gemini consume it natively; OpenAI / Responses / Azure / Fal stringify at the wire boundary.
`"parts"`	`{ kind: "parts"; parts: ContentPart[] }`	Multimodal content blocks. Anthropic supports natively as `tool_result.content` blocks; OpenAI falls back to text; Gemini falls back to a JSON object.
`"provider_raw"`	`{ kind: "provider_raw"; provider: ProviderId; value: any }`	Provider-specific escape hatch. Only the named provider sees `value`; every other provider falls back to the default conversion from `ToolOutput.data`.

ProviderId is the snake_case enum:

type ProviderId =
  | "openai"
  | "openai_compat"
  | "azure"
  | "anthropic"
  | "gemini"
  | "responses"
  | "fal";
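
As a reading aid, the table above can be expressed as a lookup over the discriminated union. The function `handlingFor` and the `Handling` labels are hypothetical and exist only in this sketch; combinations the table does not mention are reported as `not_documented` rather than guessed.

```typescript
type ProviderId =
  | 'openai' | 'openai_compat' | 'azure' | 'anthropic' | 'gemini' | 'responses' | 'fal';

type LlmPayload =
  | { kind: 'text'; text: string }
  | { kind: 'json'; value: unknown }
  | { kind: 'parts'; parts: unknown[] }
  | { kind: 'provider_raw'; provider: ProviderId; value: unknown };

type Handling =
  | 'native' | 'stringified' | 'text_fallback'
  | 'json_object_fallback' | 'default_from_data' | 'not_documented';

// Restates the LlmPayload table: how a given provider treats a given override.
function handlingFor(payload: LlmPayload, provider: ProviderId): Handling {
  switch (payload.kind) {
    case 'text':
      return 'native'; // plain text works on every provider
    case 'json':
      if (provider === 'anthropic' || provider === 'gemini') return 'native';
      if (provider === 'openai_compat') return 'not_documented';
      return 'stringified'; // OpenAI / Responses / Azure / Fal
    case 'parts':
      if (provider === 'anthropic') return 'native';
      if (provider === 'openai') return 'text_fallback';
      if (provider === 'gemini') return 'json_object_fallback';
      return 'not_documented';
    case 'provider_raw':
      // Only the named provider sees `value`; others use the default conversion of `data`.
      return payload.provider === provider ? 'native' : 'default_from_data';
  }
}
```

Because the switch covers every `kind`, the TypeScript compiler flags any new variant added to the union that the function does not handle.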

Per-provider behavior for `data` (when `llm_override` is null)

When no override is set, each provider applies its default conversion to ToolOutput.data:

OpenAI / OpenAI-compat / Azure / Responses / Fal: the value is JSON-stringified once and sent as the tool-result string.
Anthropic: structured data becomes [{ type: "text", text: <stringified> }] so it lands inside tool_result.content blocks.
Gemini: a structured object passes through as the response field of the function-response part; scalar values (numbers, booleans, strings) are wrapped as { result: scalar } to satisfy Gemini’s object-only contract.
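
The three default conversions can be sketched as one function. This is an illustration of the rules listed above, not SDK code: `defaultToolResultBody` is a hypothetical name, `data` is assumed to be JSON-serializable, and wrapping arrays and `null` as `{ result: ... }` for Gemini is an assumption, since the text only names numbers, booleans, and strings.

```typescript
type ProviderId =
  | 'openai' | 'openai_compat' | 'azure' | 'anthropic' | 'gemini' | 'responses' | 'fal';

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

// Default body sent to the model when llm_override is null.
function defaultToolResultBody(provider: ProviderId, data: unknown): unknown {
  switch (provider) {
    case 'anthropic':
      // Lands inside tool_result.content as a single text block.
      return [{ type: 'text', text: JSON.stringify(data) }];
    case 'gemini':
      // Objects pass through; everything else is wrapped to satisfy the object-only contract.
      // Assumption: arrays and null are wrapped the same way as scalars.
      return isPlainObject(data) ? data : { result: data };
    default:
      // OpenAI / OpenAI-compat / Azure / Responses / Fal: stringified once.
      return JSON.stringify(data);
  }
}
```

For `{ ok: true }` this yields the string `{"ok":true}` on the OpenAI family, a one-element text-block array on Anthropic, and the unchanged object on Gemini.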

Examples

// Send a compact text summary while keeping the full record in `data`.
return {
  data: { id: 42, items: bigArray, internal: '...' },
  llmOverride: { kind: 'text', text: '42 items processed' },
};

// Force JSON semantics regardless of provider default.
return {
  data: { ok: true, count: 17 },
  llmOverride: { kind: 'json', value: { ok: true, count: 17 } },
};

// Multimodal: send an image back to the model (Anthropic native; falls
// back to text on OpenAI; falls back to a JSON object on Gemini).
return {
  data: { url: 'https://example.com/photo.png' },
  llmOverride: {
    kind: 'parts',
    parts: [
      { type: 'text', text: 'Here is the photo:' },
      { type: 'image', source: { url: 'https://example.com/photo.png' } },
    ],
  },
};

// Anthropic-only escape hatch: send a raw provider blob.
return {
  data: { ok: true },
  llmOverride: {
    kind: 'provider_raw',
    provider: 'anthropic',
    value: { type: 'text', text: 'anthropic-only payload' },
  },
};

CompletionResponse

Returned by model.complete() and model.completeWithOptions().

interface CompletionResponse {
  content?: string;                // The generated text
  toolCalls: ToolCall[];           // Tool calls requested by the model
  usage?: TokenUsage;              // Token usage statistics
  model: string;                   // Model name used
  finishReason?: string;           // "stop", "tool_calls", etc.
  cost?: number;                   // Cost in USD
  timing?: RequestTiming;          // Request timing breakdown
  metadata: object;                // Raw provider-specific metadata
}

CompletionOptions

Options for completeWithOptions().

interface CompletionOptions {
  temperature?: number;        // Sampling temperature (0.0 - 2.0)
  maxTokens?: number;          // Maximum tokens to generate
  topP?: number;               // Nucleus sampling parameter
  model?: string;              // Override the default model ID
  tools?: ToolDefinition[];    // Tool definitions for function calling
}

ToolCall

A tool invocation requested by the model.

Property	Type	Description
`.id`	`string`	Unique identifier for the tool call
`.name`	`string`	Name of the tool to invoke
`.arguments`	`object`	Parsed JSON arguments

ToolDefinition

Describes a tool that the model may invoke.

interface ToolDefinition {
  name: string;            // Unique tool name
  description: string;     // Human-readable description
  parameters: object;      // JSON Schema for the tool's parameters
}

Content Subsystem

Browser/edge-friendly subset of Blazen’s multimodal content store. The WASM surface omits filesystem-bound APIs available in the native crate — there is no localFile() factory, and ContentStore does not expose a metadata() method.

ContentKind

String union describing the taxonomy of stored content. Treat as #[non_exhaustive] — new kinds may be added in future releases.

Value	Description
`"image"`	Raster or vector image
`"audio"`	Audio clip
`"video"`	Video clip
`"document"`	Document (PDF, text, etc.)
`"three_d_model"`	3D model (glTF, OBJ, etc.)
`"cad"`	CAD file (STEP, IGES, etc.)
`"archive"`	Archive (zip, tar, etc.)
`"font"`	Font file
`"code"`	Source code
`"data"`	Structured data (CSV, JSON, etc.)
`"other"`	Catch-all for anything else

ContentHandle

Opaque, store-issued reference to a piece of content. Field names mirror the wire format (snake_case).

interface ContentHandle {
  id: string;              // Opaque store-defined identifier
  kind: ContentKind;       // Used for type-checking at the tool-input boundary
  mime_type?: string;      // MIME type if known
  byte_size?: number;      // Byte size if known
  display_name?: string;   // Human-readable name (e.g. original filename)
}

ImageSource

Discriminated union of every way an image can be supplied to a ChatMessage.

type ImageSource =
  | { type: "url"; url: string }
  | { type: "base64"; data: string }
  | { type: "file"; path: string }
  | { type: "provider_file"; provider: ProviderId; id: string }
  | { type: "handle"; handle: ContentHandle };

The handle variant defers resolution to the active ContentStore, which substitutes one of the other variants at request-build time.

ContentStore

Abstract handle-issuing store. Build built-in instances via the static factories below, supply user-defined backends via ContentStore.custom({...}), or extend ContentStore from JS / TypeScript and override the methods you need. Resources are released by free() or via the explicit-resource-management protocol ([Symbol.dispose]).

class ContentStore {
  // Subclass-friendly base constructor (call from `super()`)
  constructor();

  // Factories
  static inMemory(): ContentStore;
  static openaiFiles(apiKey: string): ContentStore;
  static anthropicFiles(apiKey: string): ContentStore;
  static geminiFiles(apiKey: string): ContentStore;
  static falStorage(apiKey: string): ContentStore;
  static custom(options: CustomContentStoreOptions): ContentStore;

  // Instance methods
  put(
    body: Uint8Array | string,
    kindHint?: string | null,
    mimeType?: string | null,
    displayName?: string | null,
  ): Promise<ContentHandle>;
  resolve(handle: ContentHandle): Promise<unknown>;   // MediaSource-shaped JS object
  fetchBytes(handle: ContentHandle): Promise<Uint8Array>;
  delete(handle: ContentHandle): Promise<void>;

  // Lifecycle
  free(): void;
  [Symbol.dispose](): void;
}

put accepts either a Uint8Array of bytes or a string URL (URL support depends on the backing store). kindHint is the wire string — for example "image" or "three_d_model" — and overrides auto-detection.

import { ContentStore } from '@blazen/sdk';

// Explicit-resource-management form (preferred)
using store = ContentStore.inMemory();
const handle = await store.put(bytes, 'image', 'image/png', 'logo.png');
const media = await store.resolve(handle);

// Manual lifecycle
const fallback = ContentStore.openaiFiles(apiKey);
try {
  const h = await fallback.put(bytes, 'document', 'application/pdf');
} finally {
  fallback.free();
}

Subclassing `ContentStore`

ContentStore is subclassable from JavaScript / TypeScript via wasm-bindgen. Override the methods your backend needs; the SDK wraps your subclass in a Rust adapter that dispatches into your JS async functions via js_sys::Function::call + wasm_bindgen_futures::JsFuture.

import { ContentStore } from "@blazen/sdk";
import type { ContentHandle } from "@blazen/sdk";

class IndexedDBContentStore extends ContentStore {
  constructor() {
    super();
  }

  async put(body, hint) {
    // ... persist to IndexedDB / OPFS / fetch+rehost ...
    return { id: "...", kind: "image" };
  }

  async resolve(handle) {
    return { sourceType: "url", url: "..." };
  }

  async fetchBytes(handle) {
    return new Uint8Array([/* ...bytes... */]);
  }

  // Optional:
  async fetchStream(handle) { return new Uint8Array([/* ...bytes... */]); }
  async delete(handle) { /* no-op */ }
}

Subclasses MUST override put, resolve, and fetchBytes. The base-class default implementations throw a JsError, so a missing override fails clearly rather than silently recursing via super().

`ContentStore.custom({...})`

Callback-based factory. Direct JS mirror of Rust CustomContentStore::builder.

ContentStore.custom(options: {
  put: (body: any, hint: any) => Promise<ContentHandle>;
  resolve: (handle: ContentHandle) => Promise<any>;            // serialized MediaSource
  fetchBytes: (handle: ContentHandle) => Promise<Uint8Array>;
  fetchStream?: (handle: ContentHandle) => Promise<Uint8Array>; // single-chunk for now
  delete?: (handle: ContentHandle) => Promise<void>;
}): ContentStore

put, resolve, and fetchBytes are required; fetchStream and delete are optional. The body arrives as a JS object shaped like {type: "bytes", data: [...]} / {type: "url", url} / {type: "provider_file", provider, id} / {type: "stream", stream: ReadableStream<Uint8Array>, sizeHint: number | null} (no local_path in WASM since there’s no filesystem).

resolve returns a serialized MediaSource JS object. fetchBytes returns a Uint8Array. fetchStream may return either a Uint8Array / number[] (legacy, single-chunk) or a ReadableStream<Uint8Array> for true chunk-by-chunk streaming.

Built-in stores

Factory	Purpose
`ContentStore.inMemory()`	Ephemeral in-WASM-memory store. Bytes live in WASM heap; URL/provider-file inputs are recorded by reference.
`ContentStore.openaiFiles(apiKey)`	Uploads to the OpenAI Files API. `apiKey` is sent as `Bearer <apiKey>`.
`ContentStore.anthropicFiles(apiKey)`	Uploads to the Anthropic Files API. `apiKey` is sent as `x-api-key`.
`ContentStore.geminiFiles(apiKey)`	Uploads to the Google AI / Gemini Files API.
`ContentStore.falStorage(apiKey)`	Uploads to fal.ai storage.
`ContentStore.custom({...})`	User-defined backend via async callbacks (see above).

Tool-input schema helpers

Each helper returns a JSON Schema fragment with a x-blazen-content-ref extension that tells Blazen’s resolver which content kind the model is expected to pass.

Helper	Kind
`imageInput(name, description)`	`image`
`audioInput(name, description)`	`audio`
`videoInput(name, description)`	`video`
`fileInput(name, description)`	`document`
`threeDInput(name, description)`	`three_d_model`
`cadInput(name, description)`	`cad`

import { imageInput } from '@blazen/sdk';

const params = imageInput('photo', 'The user-supplied photograph to analyze');
// {
//   type: "object",
//   properties: {
//     photo: {
//       type: "string",
//       description: "The user-supplied photograph to analyze",
//       "x-blazen-content-ref": { kind: "image" }
//     }
//   },
//   required: ["photo"]
// }

How resolution works

The model only ever sees the schema’s string type and passes the handle id back as a plain string. The x-blazen-content-ref extension is invisible to providers. Before invoking the tool handler, Blazen’s resolver looks up the id in the active ContentStore, fetches a typed content shape (e.g. an ImageSource), and substitutes it into the tool arguments. Handlers therefore receive resolved content rather than raw ids.
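
The resolution step can be sketched end to end with plain data structures. Everything here is a stand-in: `contentRefInput` mimics what a helper such as imageInput returns, `StoreLookup` replaces a real ContentStore with a Map, and `resolveArgs` together with its error messages is hypothetical.

```typescript
interface ParamSchema {
  type: 'object';
  properties: Record<string, {
    type: string;
    description?: string;
    'x-blazen-content-ref'?: { kind: string };
  }>;
  required?: string[];
}

// Stand-in for a schema helper: one string property tagged with a content kind.
function contentRefInput(name: string, description: string, kind: string): ParamSchema {
  return {
    type: 'object',
    properties: { [name]: { type: 'string', description, 'x-blazen-content-ref': { kind } } },
    required: [name],
  };
}

// Stand-in for the active store: handle id -> kind plus resolved content.
type StoreLookup = Map<string, { kind: string; source: { type: 'url'; url: string } }>;

// Replace handle ids in `args` with resolved content before the handler runs.
function resolveArgs(
  schema: ParamSchema,
  args: Record<string, unknown>,
  store: StoreLookup,
): Record<string, unknown> {
  const resolved: Record<string, unknown> = { ...args };
  for (const [name, prop] of Object.entries(schema.properties)) {
    const ref = prop['x-blazen-content-ref'];
    const id = args[name];
    if (!ref || typeof id !== 'string') continue; // not a content reference
    const entry = store.get(id);
    if (!entry) throw new Error(`unknown content handle: ${id}`);
    if (entry.kind !== ref.kind) throw new Error(`expected ${ref.kind}, got ${entry.kind}`);
    resolved[name] = entry.source; // handler sees content, not the raw id
  }
  return resolved;
}
```

Arguments without the extension are copied through untouched, which matches the description that only tagged properties are substituted.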

TokenUsage

Property	Type	Description
`.promptTokens`	`number`	Tokens in the prompt
`.completionTokens`	`number`	Tokens in the completion
`.totalTokens`	`number`	Total tokens used

RequestTiming

Property	Type	Description
`.queueMs`	`number \| undefined`	Time in queue (ms)
`.executionMs`	`number \| undefined`	Execution time (ms)
`.totalMs`	`number \| undefined`	Total wall-clock time (ms)

runAgent

Run an agentic tool-calling loop.

const result = await runAgent(model, messages, tools, options?);

Parameters

Parameter	Type	Description
`model`	`CompletionModel`	The completion model to use
`messages`	`ChatMessage[]`	Initial conversation messages
`tools`	`ToolDef[]`	Tool definitions. Each tool object has `name`, `description`, `parameters` (JSON Schema), and a `handler(args) => any \| Promise<any>` that returns either a bare JSON-serializable value or a structured `ToolOutput` (an object with a `data` key, plus an optional `llm_override` / `llmOverride`). See the tool handler return shapes above.
`options`	`AgentRunOptions?`	Optional configuration

AgentRunOptions

interface AgentRunOptions {
  toolConcurrency?: number;  // Max concurrent tool calls per round (default: 0 = unlimited)
  maxIterations?: number;    // Max tool-calling iterations (default: 10)
  systemPrompt?: string;     // System prompt prepended to conversation
  temperature?: number;      // Sampling temperature
  maxTokens?: number;        // Max tokens per call
  addFinishTool?: boolean;   // Add a built-in "finish" tool
}

AgentResult

interface AgentResult {
  content?: string;                // Final text response
  messages: ChatMessage[];         // Full message history (each entry matches the tsify ChatMessage interface)
  iterations: number;              // Number of iterations
  totalUsage?: TokenUsage;         // Aggregated token usage
  totalCost?: number;              // Aggregated cost in USD
}

Tool-result messages in messages carry a tool_result?: ToolOutput field whenever the handler returned non-string data or supplied an llm_override. Plain string returns appear as content: { Text: "..." } on a role: "tool" message with no tool_result field.

Workflow

`new Workflow(name: string)`

Create a new workflow instance.

`.addStep(name: string, eventTypes: string[], handler: StepHandler)`

wf.addStep('process', ['MyEvent'], async (event, ctx) => {
  return { type: 'blazen::StopEvent', result: { done: true } };
});

`await wf.run(input: object): any`

Run the workflow to completion. The input is passed as the data field of a synthetic StartEvent. Returns a Promise that resolves to the result field of the final StopEvent.

`await wf.runStreaming(input: any, callback: (event: any) => void): Promise<any>`

Run the workflow and forward each lifecycle event to a JS callback as it occurs, resolving with the terminal payload once the workflow completes. Mirrors the Node binding’s runStreaming(input, onEvent). Stream events are subscribed before the engine begins dispatching, so no events are missed between dispatch and subscription.

await wf.runStreaming({ topic: 'TS' }, (event) => {
  console.log(event.event_type, event.data);
});

The callback is invoked with { event_type, data } per event. Errors raised synchronously by the listener are swallowed so a misbehaving callback does not abort the run.
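
The swallowing behavior amounts to wrapping the listener in a try/catch. The sketch below illustrates that idea only; `safeListener` is a hypothetical name and the SDK's internals may differ.

```typescript
interface StreamEvent { event_type: string; data: unknown; }

// Wrap a listener so that synchronous exceptions are caught and discarded.
function safeListener(
  listener: (event: StreamEvent) => void,
): (event: StreamEvent) => void {
  return (event) => {
    try {
      listener(event);
    } catch {
      // Swallowed: a misbehaving callback must not abort the run.
    }
  };
}
```

A consequence worth planning for: because such errors never surface, a callback that needs to report its own failures should catch them itself and record them somewhere visible.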

`await wf.runWithHandler(input: any): Promise<WorkflowHandler>`

Build and dispatch the workflow, returning the live WorkflowHandler instead of awaiting the terminal event. Use this when you want to drive the run yourself: call awaitResult() for the final payload, pause() / snapshot() for mid-flight state, nextEvent() / streamEvents() for events, or cancel() / abort() to tear it down. Functionally equivalent to runHandler(input) — the JS-side name runWithHandler exists for parity with the Node binding.

const handler = await wf.runWithHandler({ topic: 'TS' });
await handler.streamEvents((ev) => console.log(ev));
const result = await handler.awaitResult();

`wf.setSessionPausePolicy(policy: string): void`

Configure how live session refs are treated when the workflow is paused or snapshotted. Mirrors the Node binding’s setSessionPausePolicy. The policy is applied when the workflow is dispatched via run / runHandler / runStreaming / runWithHandler / resumeFromSnapshot / resumeWithSerializableRefs.

Policy	Behavior
`"pickle_or_error"` (default)	Pickle live refs if the binding supports it; error otherwise.
`"pickle_or_serialize"`	Pickle if possible; fall back to user-supplied byte serialization.
`"warn_drop"`	Log a warning and drop live refs from the snapshot.
`"hard_error"`	Always error if any live refs are present.

PascalCase spellings (PickleOrError, PickleOrSerialize, WarnDrop, HardError) are also accepted.

WorkflowBuilder exposes the same method (chainable, returns WorkflowBuilder).
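
If application code stores the policy in configuration, it can help to normalize both accepted spellings to one form before comparing values. A minimal sketch, assuming the eight spellings listed above are the only accepted ones; `normalizePausePolicy` is a hypothetical helper, and throwing on unknown input is this sketch's choice rather than documented SDK behavior.

```typescript
type SessionPausePolicy =
  | 'pickle_or_error' | 'pickle_or_serialize' | 'warn_drop' | 'hard_error';

// Both snake_case and PascalCase spellings map to the snake_case form.
const POLICY_ALIASES = new Map<string, SessionPausePolicy>([
  ['pickle_or_error', 'pickle_or_error'],
  ['PickleOrError', 'pickle_or_error'],
  ['pickle_or_serialize', 'pickle_or_serialize'],
  ['PickleOrSerialize', 'pickle_or_serialize'],
  ['warn_drop', 'warn_drop'],
  ['WarnDrop', 'warn_drop'],
  ['hard_error', 'hard_error'],
  ['HardError', 'hard_error'],
]);

function normalizePausePolicy(policy: string): SessionPausePolicy {
  const normalized = POLICY_ALIASES.get(policy);
  if (normalized === undefined) {
    throw new Error(`unknown session pause policy: ${policy}`);
  }
  return normalized;
}
```

The result can be passed straight to `wf.setSessionPausePolicy(...)`, since the snake_case form is one of the accepted inputs.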

`await wf.resumeWithSerializableRefs(snapshot: any, deserializers: Record<string, (bytes: Uint8Array) => unknown>): Promise<WorkflowHandler>`

Resume a workflow from a snapshot whose __blazen_serialized_session_refs sidecar carries JS-serialized session refs. The deserializers object maps type_tag strings to (bytes: Uint8Array) => unknown callbacks. For every entry in the sidecar whose tag appears in deserializers, the callback is invoked synchronously with the captured bytes; the return value is ignored (callbacks should populate any application state they need).

The snapshot’s bytes are also exposed inside step handlers via ctx.getSessionRefSerializable(key) after resume, mirroring the Node binding’s path.

const handler = await wf.resumeWithSerializableRefs(snapshot, {
  'app::EmbeddingHandle': (bytes) => {
    myStore.rehydrate(bytes); // populate user-side state
  },
});
const result = await handler.awaitResult();

Snapshots without serialized session refs work fine with resumeFromSnapshot(); this method is only required when the original pause used SessionPausePolicy::PickleOrSerialize.
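
The dispatch rule for deserializers (call the callback for every sidecar entry whose tag is registered, ignore the return value) can be sketched as follows. The sidecar layout used here, a record of `{ type_tag, bytes }` entries, is a guess made for illustration; the actual __blazen_serialized_session_refs format is not specified in this reference. `applyDeserializers` is hypothetical and returns the handled keys only so the behavior can be observed.

```typescript
type Deserializers = Record<string, (bytes: Uint8Array) => unknown>;

// Hypothetical sidecar entry shape, assumed for this sketch.
interface SerializedRef { type_tag: string; bytes: Uint8Array; }

function applyDeserializers(
  sidecar: Record<string, SerializedRef>,
  deserializers: Deserializers,
): string[] {
  const handled: string[] = [];
  for (const [key, ref] of Object.entries(sidecar)) {
    const known = Object.prototype.hasOwnProperty.call(deserializers, ref.type_tag);
    if (!known) continue;                   // tags without a deserializer are skipped
    deserializers[ref.type_tag](ref.bytes); // invoked synchronously; return value ignored
    handled.push(key);
  }
  return handled;
}
```

Since return values are discarded, each callback has to write into application state itself, as `myStore.rehydrate(bytes)` does in the example above.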

WorkflowHandler

Live handle to an in-flight workflow run, returned by runHandler() / runWithHandler() / resumeFromSnapshot() / resumeWithSerializableRefs(). The handler lets JS callers drive a workflow run beyond the simple “fire and forget” pattern of run().

Methods

Method	Signature	Description
`awaitResult`	`() => Promise<any>`	Await the workflow’s terminal payload. Consumes the inner handler.
`pause`	`() => Promise<any>`	Park the event loop and capture a quiescent snapshot.
`snapshot`	`() => Promise<any>`	Capture the current snapshot without halting the loop. Use this for logging or telemetry; pair `pause()` -> `snapshot()` -> `resumeInPlace()` for a quiescent view.
`resumeInPlace`	`() => void`	Resume a paused event loop. The same handler instance remains valid for `awaitResult` / `nextEvent` etc.
`cancel`	`() => void`	Tear down the event loop. Best-effort; errors if the loop has already exited.
`abort`	`() => void`	Pure alias for `cancel()`. Matches `JsWorkflowHandler::abort` in the Node bindings — use whichever name reads better.
`runId`	`() => Promise<any>`	Return the run’s UUID as a string. First call captures a snapshot to read it, then caches.
`nextEvent`	`() => Promise<any>`	Pull the next event from the broadcast stream. Resolves with `null` when the stream closes.
`streamEvents`	`(callback: (event: any) => void) => Promise<void>`	Subscribe to the broadcast stream and forward each event to a JS callback until the stream closes. Mirrors the Node binding. Single Promise drives the subscription — no need to wrap repeated `nextEvent()` calls.
`respondToInput`	`(requestId: string, response: any) => void`	Deliver a human-in-the-loop response to a workflow that auto-parked on an `InputRequestEvent`. Pass the matching `request_id` and a JSON-serializable response value.

Streaming events

const handler = await wf.runWithHandler({ topic: 'WASM' });
await handler.streamEvents((event) => {
  console.log(event.event_type, event.data);
});
const result = await handler.awaitResult();

Events emitted before streamEvents() is called are not replayed — call it before awaitResult() to avoid races.

Pause and snapshot

const handler = await wf.runWithHandler({ topic: 'WASM' });
await handler.pause();
const snap = await handler.snapshot();
localStorage.setItem('snap', JSON.stringify(snap));
handler.resumeInPlace();
const result = await handler.awaitResult();

Human-in-the-loop

const handler = await wf.runWithHandler({});
const event = await handler.nextEvent();
if (event?.event_type === 'InputRequestEvent') {
  const userInput = await prompt(event.data.prompt);
  handler.respondToInput(event.data.request_id, { answer: userInput });
}
const result = await handler.awaitResult();

Pipeline

Pipelines compose multiple Workflows into a sequential or parallel chain. Each Stage wraps one workflow plus optional input-mapping and conditional-execution callbacks; the resulting Pipeline runs the stages in order, threading the previous stage’s output into the next stage’s input.

`new PipelineBuilder(name: string)`

Construct a new builder with the given pipeline name.

Methods

Method	Signature	Description
`pipelineBuilder.stage(stage)`	`(stage: Stage) => void`	Append a sequential stage. Consumes the stage — the same `Stage` instance cannot be added to two pipelines.
`pipelineBuilder.parallel(parallel)`	`(parallel: ParallelStage) => void`	Append a parallel stage that fans out across multiple branches.
`pipelineBuilder.timeoutPerStage(seconds)`	`(seconds: number) => void`	Set a per-stage timeout. Exceeding it surfaces as a stage failure with `WorkflowError::Timeout`.
`pipelineBuilder.onPersist(callback)`	`(callback: (snapshot: any) => Promise<void>) => void`	Persist callback that receives a typed `PipelineSnapshot` (serialized to a JS object via `serde-wasm-bindgen`) after each stage completes. The engine awaits the returned `Promise` before continuing.
`pipelineBuilder.onPersistJson(callback)`	`(callback: (json: string) => Promise<void>) => void`	Persist callback that receives the snapshot serialized as a JSON string. The engine awaits the returned `Promise` before continuing.
`pipelineBuilder.build()`	`() => Pipeline`	Finalize and return a runnable `Pipeline`.

IndexedDB persistence example

const builder = new PipelineBuilder('research');
builder.stage(new Stage('outline', outlineWf));
builder.stage(new Stage('draft', draftWf, (state) => ({ outline: state.outline })));
builder.onPersistJson(async (json) => {
  const db = await openDb();
  await db.put('snapshots', { id: 'research', json });
});
const pipeline = builder.build();

Stage

new Stage(
  name: string,
  workflow: Workflow,
  input_mapper?: ((state: BlazenState) => unknown) | null,
  condition?: ((state: BlazenState) => boolean) | null,
);

input_mapper is an optional (state: BlazenState) => unknown JS callable invoked before the stage runs. Its return value becomes the workflow’s input. When null / undefined, the previous stage’s output (or the pipeline input for the first stage) is passed through directly.

condition is an optional (state: BlazenState) => boolean JS callable that decides whether the stage runs. When null / undefined the stage always runs; when the callable returns false the stage is skipped (its StageResult.skipped is true and output is null).

const stage = new Stage(
  'summarize',
  summarizeWf,
  (state) => ({ text: state.draft, maxWords: 100 }),
  (state) => state.draft != null && state.draft.length > 200,
);

ParallelStage

new ParallelStage(name: string, branches: Stage[], join_strategy?: JoinStrategy | null);

Each branch is a Stage; branches execute concurrently and are joined according to JoinStrategy.WaitAll (default) or JoinStrategy.FirstCompletes. Branch Stage instances are consumed when the parallel stage is constructed.

import { ParallelStage, Stage, JoinStrategy } from '@blazen/sdk';

const fanOut = new ParallelStage(
  'fan-out',
  [new Stage('a', wfA), new Stage('b', wfB)],
  JoinStrategy.WaitAll,
);

Context (WasmContext)

Shared workflow context accessible by all steps. Unlike the Node.js SDK, all methods are synchronous — no await needed.

StateValue

Values stored in the context can be any StateValue:

type StateValue = string | number | boolean | null | Uint8Array | StateValue[] | { [key: string]: StateValue };
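
A runtime check can catch values that fall outside this union before they reach the context. The following is a minimal sketch; `isStateValue` is a hypothetical helper, not an SDK export, and it assumes only plain objects should count as the map form (class instances, `Map`, `Date` and similar are rejected). It does not guard against cyclic structures.

```typescript
// Mirrors the StateValue union shown above.
type StateValue =
  | string
  | number
  | boolean
  | null
  | Uint8Array
  | StateValue[]
  | { [key: string]: StateValue };

// Hypothetical helper: returns true when `value` fits the union.
function isStateValue(value: unknown): value is StateValue {
  if (value === null) return true;
  const t = typeof value;
  if (t === 'string' || t === 'number' || t === 'boolean') return true;
  if (value instanceof Uint8Array) return true;
  if (Array.isArray(value)) return value.every((item) => isStateValue(item));
  if (t === 'object') {
    // Assumption of this sketch: only plain objects qualify.
    const proto = Object.getPrototypeOf(value);
    if (proto !== Object.prototype && proto !== null) return false;
    return Object.values(value as Record<string, unknown>).every((item) => isStateValue(item));
  }
  return false; // undefined, functions, symbols, bigint
}
```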

Methods

Method	Signature	Description
`ctx.set(key, value)`	`(key: string, value: StateValue) => void`	Store a value. Auto-detects `Uint8Array` and stores it as binary; everything else is stored as-is.
`ctx.get(key)`	`(key: string) => StateValue \| null`	Retrieve a value. Returns `Uint8Array` for binary data, the original `JsValue` for everything else, or `null` if the key is missing.
`ctx.setBytes(key, data)`	`(key: string, data: Uint8Array) => void`	Explicitly store binary data.
`ctx.getBytes(key)`	`(key: string) => Uint8Array \| null`	Retrieve binary data. Returns `null` if the key is missing.
`ctx.sendEvent(event)`	`(event: object) => void`	Queue an event into the workflow event loop.
`ctx.writeEventToStream(event)`	`(event: object) => void`	No-op in WASM. Present for API compatibility with the Node.js and Rust SDKs.
`ctx.runId()`	`() => string`	Returns the unique UUID v4 for the current workflow run.
`ctx.insertSessionRefSerializable(typeName, bytes)`	`(typeName: string, bytes: Uint8Array) => string`	Store an opaque, user-serialized payload in the session-ref registry under a fresh registry key. `typeName` is a stable identifier the caller chooses (e.g. `"app::EmbeddingHandle"`); it is captured into snapshot metadata along with the bytes when the workflow is paused under `SessionPausePolicy::PickleOrSerialize`. Returns the registry key as a string. JS code must serialize the value itself (typically into a `Uint8Array`) before calling and deserialize on retrieval.
`ctx.getSessionRefSerializable(key)`	`(key: string) => { typeName: string; bytes: Uint8Array } \| null`	Retrieve a previously inserted opaque payload. Returns `null` if the registry has no entry under `key`, or if the entry exists but was inserted via the non-serializable path (`set` / `setBytes` / language-specific live refs).

The session-ref-serializable wire format is cross-binding compatible with the Node binding’s NodeSessionRefSerializable: the same typeName / bytes pair round-trips through a snapshot taken on one binding and resumed on the other.

// Inside a step handler.
const key = ctx.insertSessionRefSerializable(
  'app::EmbeddingHandle',
  new TextEncoder().encode(JSON.stringify({ id: 42 })),
);
ctx.state.set('embedding_key', key);

// In a later step (or after resume).
const stored = ctx.getSessionRefSerializable(ctx.state.get('embedding_key') as string);
if (stored) {
  const obj = JSON.parse(new TextDecoder().decode(stored.bytes));
}
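
Because the caller owns serialization, it can help to keep the encoder and a table of decoders keyed by `typeName` in one place. The sketch below is one way to do that under stated assumptions: `EmbeddingHandle`, `encodeHandle`, `deserializers`, and `decodeStored` are hypothetical names, and JSON over UTF-8 is only one possible byte format.

```typescript
// Hypothetical payload type used for illustration only.
interface EmbeddingHandle {
  id: number;
}

const TYPE_NAME = 'app::EmbeddingHandle';

// Serialize to bytes; the result is what would be passed as `bytes`
// to ctx.insertSessionRefSerializable(TYPE_NAME, bytes).
function encodeHandle(handle: EmbeddingHandle): Uint8Array {
  return new TextEncoder().encode(JSON.stringify(handle));
}

// Decoders keyed by typeName, one entry per serialized type.
const deserializers: Record<string, (bytes: Uint8Array) => unknown> = {
  [TYPE_NAME]: (bytes) => JSON.parse(new TextDecoder().decode(bytes)) as EmbeddingHandle,
};

// Decode the { typeName, bytes } pair returned by ctx.getSessionRefSerializable(key).
function decodeStored(stored: { typeName: string; bytes: Uint8Array }): unknown {
  const decode = deserializers[stored.typeName];
  if (!decode) throw new Error(`no deserializer registered for ${stored.typeName}`);
  return decode(stored.bytes);
}
```

Failing loudly on an unknown `typeName` is a design choice of this sketch; it surfaces version skew between the code that wrote a snapshot and the code that reads it.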

Properties

Property	Type	Description
`ctx.workflowName`	`string`	Getter property returning the workflow name.
`ctx.state`	`StateNamespace`	Getter returning the persistable workflow state namespace. Survives snapshotting when the WASM runner gains snapshot support. Routes through the same JS / bytes dispatch as `ctx.set` / `ctx.get`.
`ctx.session`	`SessionNamespace`	Getter returning the live in-process JS reference namespace. Identity IS preserved within a single workflow run (unlike the Node bindings). Excluded from snapshots.

StateNamespace

Persistable workflow state, accessed via ctx.state. Routes values through the same set / get / setBytes / getBytes dispatch as the legacy ctx.set / ctx.get, so anything stored here will survive snapshotting once the WASM runner gains snapshot support.

All methods are synchronous — the WASM runtime has no tokio.

Methods

Method	Signature	Description
`state.set(key, value)`	`(key: string, value: unknown) => void`	Store a value. Auto-detects `Uint8Array` and stores it as binary; everything else is stored as-is.
`state.get(key)`	`(key: string) => unknown`	Retrieve a value. Returns `Uint8Array` for binary data, the original `JsValue` for everything else, or `null` if the key is missing.
`state.setBytes(key, data)`	`(key: string, data: Uint8Array) => void`	Explicitly store binary data.
`state.getBytes(key)`	`(key: string) => Uint8Array \| null`	Retrieve binary data. Returns `null` if the key is missing.

ctx.state.set("counter", 5);
const count = ctx.state.get("counter");

SessionNamespace

Live in-process JS references, accessed via ctx.session. Values are stored as raw JsValue in a separate map and are excluded from any snapshot.

Identity IS preserved within a run on WASM. Because the WASM runtime is single-threaded, session values are stored as raw JsValue and ctx.session.get(key) === obj holds after ctx.session.set(key, obj). This is a meaningful differentiator from the Node bindings, where identity is not preserved due to napi-rs threading constraints (values are round-tripped through serde_json::Value).

All methods are synchronous.

Methods

Method	Signature	Description
`session.set(key, value)`	`(key: string, value: unknown) => void`	Store a live JS reference under `key`. The value is kept as-is.
`session.get(key)`	`(key: string) => unknown`	Retrieve the value previously stored under `key`. Returns `null` if missing.
`session.has(key)`	`(key: string) => boolean`	Check whether a value exists under `key`.
`session.remove(key)`	`(key: string) => void`	Remove the value stored under `key`.

const conn = openConnection();
ctx.session.set("conn", conn);
console.log(ctx.session.get("conn") === conn);   // true

WASM does not currently support cross-process snapshot/resume of session entries. Session values exist only within a single workflow run.
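
The identity difference described above can be illustrated without the SDK. This is an analogy in plain TypeScript, not the SDK's implementation: a `Map` stands in for a store that keeps raw references, and a JSON round trip stands in for a store that serializes values.

```typescript
// A store that keeps references as-is (analogous to ctx.session on WASM).
const liveStore = new Map<string, unknown>();
const connection = { url: 'ws://example.invalid', open: true };

liveStore.set('conn', connection);
const sameReference = liveStore.get('conn') === connection; // true: same object

// A serializing store returns a structurally equal copy instead.
const roundTrippedCopy = JSON.parse(JSON.stringify(connection));
const copyIsSameReference = roundTrippedCopy === connection; // false: new object
```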

BlazenState

A protocol for structured per-field state storage in the WASM context. Objects carrying the __blazen_state__: true marker are automatically decomposed by ctx.set() and reconstructed by ctx.get().

BlazenStateMeta

Configuration is read from the object’s constructor via a static meta property.

interface BlazenStateMeta {
  /** Field names excluded from serialization (recreated via restore). */
  transient?: string[];
  /** Name of the method to call after reconstruction. */
  restore?: string;
}

Detection

Any JS object with the property __blazen_state__ set to a truthy value is treated as a BlazenState:

const state = new MyState();
state.__blazen_state__ = true;  // enables the protocol

Per-field storage

When ctx.set(key, state) receives a BlazenState object:

1. Each enumerable field is stored individually at {key}.{fieldName} (skipping the marker and transient fields).
2. A metadata entry is written at {key}.__blazen_meta__ recording the field list, class name, transient array, and restore method name.

When ctx.get(key) finds a {key}.__blazen_meta__ entry:

1. Each recorded field is loaded individually.
2. The fields are assembled into a new plain object.
3. The __blazen_state__ marker is set on the result.
4. If a restore method name was recorded, that method is called on the reconstructed object.
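
The key layout described above can be modelled with a plain `Map`. This is a rough sketch for intuition, not the SDK's implementation: `saveState`, `loadState`, `Store`, and `StoredMeta` are hypothetical names, and the restore call is left out because how the method is resolved on the reconstructed object is not specified here.

```typescript
type Store = Map<string, unknown>;

interface StateMeta {
  transient?: string[];
  restore?: string;
}

interface StoredMeta {
  fields: string[];
  className: string;
  transient: string[];
  restore: string | null;
}

const MARKER = '__blazen_state__';

// Write each non-transient field under `${key}.${field}` plus a metadata entry.
function saveState(store: Store, key: string, state: object): void {
  const ctor = state.constructor as { name: string; meta?: StateMeta };
  const transient = ctor.meta?.transient ?? [];
  const values = state as Record<string, unknown>;
  const fields = Object.keys(values).filter((f) => f !== MARKER && !transient.includes(f));
  for (const f of fields) store.set(`${key}.${f}`, values[f]);
  const meta: StoredMeta = {
    fields,
    className: ctor.name,
    transient,
    restore: ctor.meta?.restore ?? null,
  };
  store.set(`${key}.__blazen_meta__`, meta);
}

// Reassemble a plain object from the recorded field list and set the marker.
function loadState(store: Store, key: string): Record<string, unknown> | null {
  const meta = store.get(`${key}.__blazen_meta__`) as StoredMeta | undefined;
  if (!meta) return null;
  const out: Record<string, unknown> = {};
  for (const f of meta.fields) out[f] = store.get(`${key}.${f}`);
  out[MARKER] = true;
  return out;
}

// Example: `conn` is transient, so only `dbPath` is stored.
class ExampleState {
  dbPath = '/tmp/app.db';
  conn: unknown = { live: true };
  __blazen_state__ = true;
  static meta: StateMeta = { transient: ['conn'], restore: 'reconnect' };
}

const exampleStore: Store = new Map();
saveState(exampleStore, 'app', new ExampleState());
const restoredState = loadState(exampleStore, 'app');
```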

restore()

The restore entry in meta is a string naming the method on the instance to call after reconstruction. This method receives no arguments and is called synchronously:

class MyState {
  dbPath = '';
  conn = null;

  static meta = {
    transient: ['conn'],
    restore: 'reconnect',
  };

  reconnect() {
    this.conn = openDb(this.dbPath);
  }
}

Synchronous execution

All BlazenState operations in WASM are synchronous. Unlike the Node.js SDK (where saveTo() / loadFrom() return Promises), the WASM context processes BlazenState objects inline during ctx.set() and ctx.get() — no await needed.

Events

Events are plain objects with a type field.

Start Event

{ type: 'blazen::StartEvent', ...input }

Stop Event

{ type: 'blazen::StopEvent', result: { ... } }
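
Small typed helpers can keep the `type` strings in one place. The names below (`startEvent`, `stopEvent`, `isStopEvent`) are hypothetical and not SDK exports; this is only a sketch of the two shapes shown above.

```typescript
interface StartEvent {
  type: 'blazen::StartEvent';
  [key: string]: unknown;
}

interface StopEvent {
  type: 'blazen::StopEvent';
  result: unknown;
}

// `type` is written last so a stray `type` key in the input cannot override it.
function startEvent(input: Record<string, unknown>): StartEvent {
  return { ...input, type: 'blazen::StartEvent' as const };
}

function stopEvent(result: unknown): StopEvent {
  return { type: 'blazen::StopEvent' as const, result };
}

function isStopEvent(event: { type: string }): event is StopEvent {
  return event.type === 'blazen::StopEvent';
}
```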

EmbeddingModel

Generate vector embeddings from text. Created via static factory methods. All factory methods take no arguments — API keys are read from environment variables (OPENAI_API_KEY, TOGETHER_API_KEY, COHERE_API_KEY, FIREWORKS_API_KEY).

import { EmbeddingModel } from '@blazen/sdk';

const model = EmbeddingModel.openai();
const together = EmbeddingModel.together();
const cohere = EmbeddingModel.cohere();
const fireworks = EmbeddingModel.fireworks();

Provider Factory Methods

Method	Default Model	Default Dimensions
`EmbeddingModel.openai()`	`text-embedding-3-small`	1536
`EmbeddingModel.together()`	`togethercomputer/m2-bert-80M-8k-retrieval`	768
`EmbeddingModel.cohere()`	`embed-v4.0`	1024
`EmbeddingModel.fireworks()`	`nomic-ai/nomic-embed-text-v1.5`	768

Properties

Property	Type	Description
`.modelId`	`string`	The model identifier.
`.dimensions`	`number`	Output vector dimensionality.

`await model.embed(texts: string[]): Promise<number[][]>`

Embed one or more texts. Returns an array containing one number[] vector per input text.

const result = await model.embed(['Hello', 'World']);
console.log(result.length);       // 2
console.log(result[0].length);    // 1536
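
Embedding vectors are usually compared with cosine similarity. The helper below is a generic sketch, not part of the SDK; `cosineSimilarity` is a hypothetical name, and it accepts any `ArrayLike<number>` so it works with both `number[]` and `Float32Array` vectors.

```typescript
// Cosine similarity of two equal-length vectors, in [-1, 1].
function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) throw new Error('vectors must have equal length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Convention chosen for this sketch: a zero vector is similar to nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

With the `result` from the call above, `cosineSimilarity(result[0], result[1])` would score how close the two texts are.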

`EmbeddingModel.tract(modelUrl: string, tokenizerUrl: string, options?: TractOptions | null): Promise<EmbeddingModel>`

Local embedding via tract-onnx — pure-Rust ONNX inference that runs entirely inside the WASM module with no JS libraries required. Both URLs are fetched via web_sys::fetch because the hf-hub crate is not available on wasm32; both endpoints must respond with CORS headers permitting the calling origin.

import { EmbeddingModel, TractOptions } from '@blazen/sdk';

const opts = new TractOptions();
opts.modelName = 'BGESmallENV15';
const embedder = await EmbeddingModel.tract(
  'https://huggingface.co/Xenova/bge-small-en-v1.5/resolve/main/onnx/model.onnx',
  'https://huggingface.co/Xenova/bge-small-en-v1.5/resolve/main/tokenizer.json',
  opts,
);
const vecs = await embedder.embed(['Hello world']);

TractEmbedModel

Standalone wasm-only ONNX embedding model. The same backend powers EmbeddingModel.tract(...), but TractEmbedModel is exposed directly for callers who want the typed class without going through the generic EmbeddingModel factory.

`TractEmbedModel.create(modelUrl: string, tokenizerUrl: string, options?: TractOptions | null): Promise<TractEmbedModel>`

Async constructor. The ONNX weights and tokenizer.json are fetched over HTTP via web_sys::fetch (the hf-hub crate doesn’t compile to wasm32). modelUrl should point to a raw ONNX protobuf; tokenizerUrl to a HuggingFace-format tokenizer.json. Both URLs must be CORS-enabled.

import { TractEmbedModel, TractOptions } from '@blazen/sdk';

const opts = new TractOptions();
opts.modelName = 'BGESmallENV15';
opts.maxBatchSize = 32;

const model = await TractEmbedModel.create(
  'https://huggingface.co/Xenova/bge-small-en-v1.5/resolve/main/onnx/model.onnx',
  'https://huggingface.co/Xenova/bge-small-en-v1.5/resolve/main/tokenizer.json',
  opts,
);

console.log(model.modelId, model.dimensions);
const vectors = await model.embed(['hello', 'world']);

Properties

Property	Type	Description
`.modelId`	`string`	The Hugging Face model id this instance was loaded from.
`.dimensions`	`number`	Output embedding dimensionality.

`await model.embed(texts: string[]): Promise<Float32Array[]>`

Embed one or more texts. Returns an array containing one Float32Array vector per input text.

MediaSource

Top-level type alias re-exporting ImageSource so the same Url / Base64 shape is reused across image, audio, video, and file modalities:

type MediaSource = ImageSource;

Use MediaSource in your own type annotations whenever the modality is generic; the runtime shape is identical to ImageSource.

Token Estimation

Lightweight token counting functions available without external data files.

`estimateTokens(text: string, contextSize?: number): number`

Estimate token count for a string (~3.5 characters per token).

import { estimateTokens } from '@blazen/sdk';

const count = estimateTokens('Hello, world!');  // 4
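
The heuristic behind this estimate can be approximated in a few lines. This is a sketch under assumptions, not the SDK's code: `approxTokens` is a hypothetical name, it rounds up, it counts UTF-16 code units via `String.length`, and it ignores `contextSize` because that parameter's effect is not described here.

```typescript
const CHARS_PER_TOKEN = 3.5;

// Rough token estimate: characters divided by 3.5, rounded up (assumed rounding).
function approxTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

const helloEstimate = approxTokens('Hello, world!'); // 13 chars / 3.5 ≈ 3.71, rounds up to 4
```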

`countMessageTokens(messages: object[], contextSize?: number): number`

Estimate total tokens for an array of chat messages (plain objects with role and content fields). Includes per-message overhead.

import { countMessageTokens } from '@blazen/sdk';

const count = countMessageTokens([
  { role: 'system', content: 'You are helpful.' },
  { role: 'user', content: 'Hello!' },
]);

contextSize defaults to 128000 if omitted.
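
A common use of a message-token estimate is trimming chat history to a budget. The sketch below takes the counting function as a parameter so that `countMessageTokens` could be passed in; `trimToBudget` and `roughCount` are hypothetical names, and the per-message overhead of 4 in `roughCount` is an arbitrary stand-in rather than the SDK's value.

```typescript
interface Message {
  role: string;
  content: string;
}

type Counter = (messages: Message[]) => number;

// Drop the oldest non-system messages until the estimate fits the budget.
function trimToBudget(messages: Message[], budget: number, count: Counter): Message[] {
  const kept = messages.slice();
  while (kept.length > 0 && count(kept) > budget) {
    const idx = kept.findIndex((m) => m.role !== 'system');
    if (idx === -1) break; // only system messages remain; nothing more to drop
    kept.splice(idx, 1);
  }
  return kept;
}

// Stand-in counter so the sketch runs without the SDK.
const roughCount: Counter = (msgs) =>
  msgs.reduce((n, m) => n + Math.ceil(m.content.length / 3.5) + 4, 0);

const trimmed = trimToBudget(
  [
    { role: 'system', content: 'You are helpful.' },
    { role: 'user', content: 'a'.repeat(70) },
    { role: 'user', content: 'Hello!' },
  ],
  20,
  roughCount,
);
```

Keeping system messages is a design choice of this sketch; other policies (summarizing, dropping from the middle) fit the same function shape.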

Custom Providers via JS Handlers

CompletionModel and EmbeddingModel can be created from JavaScript handler functions using static factory methods. This lets you implement custom providers without subclassing.

CompletionModel.fromJsHandler

const model = CompletionModel.fromJsHandler("my-llm", async (request) => {
  // request contains messages, tools, temperature, etc.
  return {
    content: "Hello from my custom model",
    model: "my-llm",
  };
});

const response = await model.complete([ChatMessage.user("Hi")]);

The handler receives a request object and should return a CompletionResponse-shaped object.

EmbeddingModel.fromJsHandler

const embedder = EmbeddingModel.fromJsHandler("my-embedder", 128, async (texts) => {
  return texts.map(() => new Array(128).fill(0.1));
});

const result = await embedder.embed(["Hello", "World"]);

The handler receives a string[] and should return a number[][] of embeddings.
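
For tests it can be handy to have a handler that is deterministic and needs no network. The following is a toy sketch, not a meaningful embedding: `hashEmbed` and `toyHandler` are hypothetical names, and the bucketing scheme is arbitrary.

```typescript
// Deterministic toy embedding: bucket character codes into `dims` slots,
// then L2-normalize so non-empty texts produce unit-length vectors.
function hashEmbed(text: string, dims: number): number[] {
  const vec = new Array<number>(dims).fill(0);
  for (let i = 0; i < text.length; i++) {
    vec[(text.charCodeAt(i) * 31 + i) % dims] += 1;
  }
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? vec : vec.map((x) => x / norm);
}

// Handler with the shape described above: string[] in, number[][] out.
const toyHandler = async (texts: string[]): Promise<number[][]> =>
  texts.map((t) => hashEmbed(t, 128));
```

`toyHandler` could then be passed as the third argument to `EmbeddingModel.fromJsHandler("my-embedder", 128, toyHandler)`.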

Per-Capability Provider Classes

Seven provider classes let you implement a single compute capability by passing handler functions to the constructor.

Class	Constructor Handler	Description
`TTSProvider`	`(request) => Promise<any>`	Text-to-speech synthesis
`MusicProvider`	`{ generateMusic, generateSfx }`	Music and sound effect generation
`ImageProvider`	`{ generateImage, upscaleImage }`	Image generation and upscaling
`VideoProvider`	`{ textToVideo, imageToVideo }`	Video generation
`ThreeDProvider`	`(request) => Promise<any>`	3D model generation
`BackgroundRemovalProvider`	`(request) => Promise<any>`	Background removal
`VoiceProvider`	`{ cloneVoice, listVoices, deleteVoice }`	Voice cloning and management

Constructor

Single-method providers take a provider ID and a handler function:

const tts = new TTSProvider("elevenlabs", async (request) => {
  const audio = await elevenlabs.textToSpeech(request);
  return { audioData: audio, format: "mp3" };
});

const result = await tts.textToSpeech({ text: "Hello world", voice: "alice" });

Multi-method providers take a provider ID and a handlers object:

const music = new MusicProvider("suno", {
  generateMusic: async (request) => { /* ... */ },
  generateSfx: async (request) => { /* ... */ },
});

const image = new ImageProvider("dalle", {
  generateImage: async (request) => { /* ... */ },
  upscaleImage: async (request) => { /* ... */ },
});

const video = new VideoProvider("runway", {
  textToVideo: async (request) => { /* ... */ },
  imageToVideo: async (request) => { /* ... */ },
});

const voice = new VoiceProvider("elevenlabs", {
  cloneVoice: async (request) => { /* ... */ },
  listVoices: async () => { /* ... */ },
  deleteVoice: async (voiceId) => { /* ... */ },
});

MemoryBackend

Custom memory storage backends are created by passing handler functions to the MemoryBackend constructor.

const backend = new MemoryBackend({
  put: async (entry) => { /* store entry */ },
  get: async (id) => { /* retrieve by id, return null if missing */ },
  delete: async (id) => { /* delete by id, return true if existed */ },
  list: async () => { /* return all entries */ },
  len: async () => { /* return entry count */ },
  searchByBands: async (bands, limit) => { /* return candidates */ },
});

Handler Methods

Method	Signature	Description
`put`	`(entry: any) => Promise<void>`	Insert or update a stored entry.
`get`	`(id: string) => Promise<any \| null>`	Retrieve a stored entry by id.
`delete`	`(id: string) => Promise<boolean>`	Delete an entry by id. Returns `true` if it existed.
`list`	`() => Promise<any[]>`	Return all stored entries.
`len`	`() => Promise<number>`	Return the number of stored entries.
`searchByBands`	`(bands: any, limit: number) => Promise<any[]>`	Return candidate entries sharing at least one LSH band.
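
As a starting point, the six handlers can be backed by a `Map`. This is a sketch under assumptions: the reference types entries as `any`, so the `Entry` shape here (an `id` plus a `bands` string array) is hypothetical, as are the names `MapStore` and `toHandlers`.

```typescript
// Assumed entry shape for illustration; the real stored-entry shape may differ.
interface Entry {
  id: string;
  bands: string[];
  [key: string]: unknown;
}

class MapStore {
  private entries = new Map<string, Entry>();

  put(entry: Entry): void {
    this.entries.set(entry.id, entry);
  }
  get(id: string): Entry | null {
    return this.entries.get(id) ?? null;
  }
  delete(id: string): boolean {
    return this.entries.delete(id); // true if the id existed
  }
  list(): Entry[] {
    return Array.from(this.entries.values());
  }
  len(): number {
    return this.entries.size;
  }
  // Entries sharing at least one band with the query, capped at `limit`.
  searchByBands(bands: string[], limit: number): Entry[] {
    const wanted = new Set(bands);
    return this.list()
      .filter((e) => e.bands.some((b) => wanted.has(b)))
      .slice(0, limit);
  }
}

// Async wrappers matching the handler signatures in the table above.
function toHandlers(store: MapStore) {
  return {
    put: async (entry: Entry) => store.put(entry),
    get: async (id: string) => store.get(id),
    delete: async (id: string) => store.delete(id),
    list: async () => store.list(),
    len: async () => store.len(),
    searchByBands: async (bands: string[], limit: number) => store.searchByBands(bands, limit),
  };
}
```

The object returned by `toHandlers(new MapStore())` has the same keys as the constructor argument shown earlier.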

InMemoryBackend

A typed, Rust-native in-memory MemoryBackend implementation. Unlike MemoryBackend (which round-trips every call through user-supplied JS callbacks), InMemoryBackend keeps reads and writes inside the WASM linear memory — no JS overhead per call.

import { InMemoryBackend, Memory, EmbeddingModel } from '@blazen/sdk';

const backend = new InMemoryBackend();
const embedder = EmbeddingModel.openai();
const memory = Memory.fromBackend(embedder, backend);
await memory.add('doc1', 'hello world', null);

Methods

Method	Signature	Description
`put`	`(entry: WasmStoredEntry) => Promise<void>`	Insert or update a stored entry.
`get`	`(id: string) => Promise<WasmStoredEntry \| null>`	Retrieve a stored entry by id.
`delete`	`(id: string) => Promise<boolean>`	Delete an entry by id. Returns `true` if it existed.
`list`	`() => Promise<WasmStoredEntry[]>`	Return all stored entries.
`len`	`() => Promise<number>`	Return the number of stored entries.
`isEmpty`	`() => Promise<boolean>`	`true` if the backend contains no entries.
`searchByBands`	`(bands: string[], limit: number) => Promise<WasmStoredEntry[]>`	Return candidate entries sharing at least one LSH band.

Memory factory methods

Factory	Signature	Description
`Memory.fromBackend`	`(embedder: EmbeddingModel, backend: InMemoryBackend) => Memory`	Full-mode memory (embedding-based search) backed by a typed `InMemoryBackend`.
`Memory.localFromBackend`	`(backend: InMemoryBackend) => Memory`	Local-only mode (`SimHash` only, `searchLocal()` available; `search()` rejects) backed by a typed `InMemoryBackend`.

const localMem = Memory.localFromBackend(new InMemoryBackend());
await localMem.add('doc', 'hello world', null);
const hits = await localMem.searchLocal('hello', 5, null);

MemoryResult

Standalone class representing a single result returned by Memory.search() / Memory.searchLocal(). Exposed primarily as a typed return value for downstream code that wants to construct MemoryResults from JS (e.g. when implementing a custom MemoryStore).

Constructor

new MemoryResult(id: string, text: string, score: number, metadata: any);

Properties

Property	Type	Description
`.id`	`string`	The entry identifier.
`.text`	`string`	The stored text content.
`.score`	`number`	Similarity score in `[0, 1]`, higher means more similar.
`.metadata`	`any`	Arbitrary metadata, decoded from JSON to a JS value.

ModelManager

A model manager that enforces a memory budget per pool and evicts least-recently-used (LRU) models when a load would exceed it. It is rarely needed in WASM, where GPU model loading is uncommon, but it is available for tracking model state across CPU and GPU pools.

ModelManager is a memory budget bookkeeper, not a performance scheduler. It answers “will this fit?” — not “will this run fast?”. Whether a 70B model loaded on CPU is useful at 1–3 tok/s is a workload-choice question the manager intentionally does not answer.
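
The bookkeeping idea (a byte budget with least-recently-used eviction) can be sketched for a single pool as follows. This is an illustration of the general technique, not the `blazen_manager` implementation: `LruBudget` is a hypothetical class, and details such as rejecting a model larger than the whole budget are assumptions.

```typescript
interface LoadedModel {
  id: string;
  bytes: number;
}

class LruBudget {
  private loaded: LoadedModel[] = []; // least recently used first
  private budgetBytes: number;

  constructor(budgetBytes: number) {
    this.budgetBytes = budgetBytes;
  }

  usedBytes(): number {
    return this.loaded.reduce((n, m) => n + m.bytes, 0);
  }

  // Loads (or re-touches) a model and returns the ids evicted to make room.
  load(id: string, bytes: number): string[] {
    if (bytes > this.budgetBytes) throw new Error(`${id} exceeds the pool budget`);
    // Re-loading an already loaded model moves it to the most-recent position.
    this.loaded = this.loaded.filter((m) => m.id !== id);
    const evicted: string[] = [];
    while (this.usedBytes() + bytes > this.budgetBytes) {
      const oldest = this.loaded.shift();
      if (!oldest) break; // cannot happen while bytes <= budgetBytes
      evicted.push(oldest.id);
    }
    this.loaded.push({ id, bytes });
    return evicted;
  }
}
```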

Backed by the real blazen_manager::ModelManager. Method names match the native and Node bindings; the WASM constructor takes one or two positional number arguments in gigabytes (no options object). Unlike the Node binding, WASM byte-quantity getters return plain number (f64) — JS doubles carry 53 bits of mantissa, more than enough for any realistic memory budget — so there is no BigInt migration on this surface.

Constructor

const manager = new ModelManager(8);              // 8 GB CPU pool budget; no GPU pool
const managerWithGpu = new ModelManager(8, 24);   // 8 GB CPU pool + 24 GB GPU pool (rare for WASM, but supported)

Argument	Type	Description
`cpuRamGb`	`number`	Host RAM budget in gigabytes for the `"cpu"` pool. Converted to bytes internally (`cpuRamGb * 1_073_741_824`).
`gpuVramGb`	`number?`	Optional VRAM budget in gigabytes for the `"gpu:0"` pool. Omit if your app only loads CPU models (the common case in browsers).
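
The gigabyte-to-byte conversion and the claim that plain `number` is sufficient can be checked directly. `gbToBytes` below is a hypothetical helper that mirrors the stated formula; it is not an SDK export.

```typescript
const BYTES_PER_GB = 1_073_741_824; // 2^30

function gbToBytes(gb: number): number {
  return gb * BYTES_PER_GB;
}

const cpuBudgetBytes = gbToBytes(8);  // 8589934592
const gpuBudgetBytes = gbToBytes(24); // 25769803776

// Both values are far below Number.MAX_SAFE_INTEGER, so they are exact.
const budgetsAreExact =
  Number.isSafeInteger(cpuBudgetBytes) && Number.isSafeInteger(gpuBudgetBytes);
```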

Methods

Method	Signature	Description
`register`	`await manager.register(id, model, memoryEstimateBytes, lifecycle)`	Register a model with its estimated memory footprint (`memoryEstimateBytes: number` bytes — host RAM if on CPU, GPU VRAM otherwise) and a JS lifecycle object. The lifecycle accepts async `load()` / `unload()` / `isLoaded()` plus optional `memoryBytes()` (async) and `device()` (sync — returns the pool string, defaults to `"cpu"`).
`load`	`await manager.load(id)`	Load a model, evicting LRU models in the same pool if needed.
`unload`	`await manager.unload(id)`	Unload a model and free its memory.
`isLoaded`	`await manager.isLoaded(id): boolean`	Check if a model is currently loaded.
`ensureLoaded`	`await manager.ensureLoaded(id)`	Alias for `load()`.
`usedBytes`	`await manager.usedBytes(pool?: string): number`	Bytes currently used by loaded models in the given pool. `pool` defaults to `"cpu"`. Invalid pool labels reject with `invalid pool label '<x>': expected 'cpu', 'gpu', or 'gpu:N' where N is a non-negative integer`.
`availableBytes`	`await manager.availableBytes(pool?: string): number`	Bytes still available within the given pool’s budget. Same default and validation as `usedBytes`.
`pools`	`manager.pools(): Array<{ pool: string; budgetBytes: number }>`	Sync. List every configured pool and its byte budget.
`status`	`await manager.status(): { id: string; loaded: boolean; memoryEstimateBytes: number; pool: string }[]`	Status of all registered models.
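
Pool labels can be checked on the JS side before calling `usedBytes` or `availableBytes`. The sketch below reproduces the documented label grammar and error text; `validatePoolLabel` is a hypothetical helper, and edge cases such as leading zeros in `gpu:N` may be treated differently by the real validator.

```typescript
// Accepts 'cpu', 'gpu', or 'gpu:N' with N a non-negative integer.
function validatePoolLabel(label: string): string {
  if (label === 'cpu' || label === 'gpu' || /^gpu:\d+$/.test(label)) return label;
  throw new Error(
    `invalid pool label '${label}': expected 'cpu', 'gpu', or 'gpu:N' where N is a non-negative integer`,
  );
}
```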

Properties

Property	Type	Description
`.budgetBytes`	`number`	Read-only getter returning the configured CPU pool budget in bytes (`cpuRamGb * 1_073_741_824`). For GPU pool budgets, use `pools()`.

ModelRegistry

JS-callback abstract base class (ABC) for advertising a model catalog. Wraps a JS object implementing listModels() and getModel(modelId) so browser code can plug a custom registry into Blazen’s model-info lookup surface. Mirrors the trait at blazen_llm::traits::ModelRegistry and reaches parity with PyModelRegistry (Python) / JsModelRegistry (Node).

Constructor

import init, { ModelRegistry } from "@blazen/sdk";
import type { ModelInfo } from "@blazen/sdk";

await init();

const registry = new ModelRegistry({
  async listModels(): Promise<ModelInfo[]> {
    const res = await fetch("/api/models");
    return res.json();
  },
  async getModel(modelId: string): Promise<ModelInfo | null> {
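    // Model ids may contain "/" (e.g. "nomic-ai/nomic-embed-text-v1.5"), so encode them.
    const res = await fetch(`/api/models/${encodeURIComponent(modelId)}`);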
    return res.ok ? res.json() : null;
  },
});

const models = await registry.listModels();

The constructor argument must implement the ModelRegistryImpl interface (auto-emitted into crates/blazen-wasm-sdk/pkg/blazen_wasm_sdk.d.ts):

export interface ModelRegistryImpl {
  listModels(): Promise<ModelInfo[]> | ModelInfo[];
  getModel(modelId: string): Promise<ModelInfo | null> | ModelInfo | null;
}

Both methods may return either a Promise or a synchronous value; the binding awaits whichever is produced.

Methods

Method	Signature	Description
`listModels`	`await registry.listModels(): Promise<ModelInfo[]>`	Returns whatever the JS `listModels()` callback resolved to.
`getModel`	`await registry.getModel(modelId: string): Promise<ModelInfo \| null>`	Returns whatever the JS `getModel()` callback resolved to, or `null` if the model is unknown.

The registry returns plain ModelInfo objects — the same tsify-generated shape produced elsewhere on the WASM surface and documented in crates/blazen-wasm-sdk/pkg/blazen_wasm_sdk.d.ts.

Pricing Functions

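registerPricing()

Register pricing for a model by ID. Prices are given per million input tokens and per million output tokens.
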
import { registerPricing } from "@blazen/sdk";

registerPricing("my-model", 1.0, 2.0);
// Arguments: modelId, inputPerMillion, outputPerMillion

lookupPricing()

Look up pricing for a model by ID. Returns null if the model is unknown.

import { lookupPricing } from "@blazen/sdk";

const pricing = lookupPricing("gpt-4o");
if (pricing) {
  console.log(`Input: $${pricing.inputPerMillion}/M tokens`);
}

The returned object has the shape:

{
  inputPerMillion: number;
  outputPerMillion: number;
}
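
Given this shape, the cost of a request follows from its token counts. The helper below is a sketch; `estimateCost` and the `Pricing` interface name are hypothetical, and the result is in whatever currency unit the registered prices use.

```typescript
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// cost = inputTokens / 1e6 * inputPerMillion + outputTokens / 1e6 * outputPerMillion
function estimateCost(pricing: Pricing, inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMillion +
    (outputTokens / 1_000_000) * pricing.outputPerMillion
  );
}

// With the rates registered above (1.0 in, 2.0 out):
const exampleCost = estimateCost({ inputPerMillion: 1.0, outputPerMillion: 2.0 }, 500_000, 250_000); // 1
```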

OTLP Telemetry

OpenTelemetry trace export over HTTP/protobuf. Behind the otlp-http Cargo feature on the blazen-wasm-sdk crate (the default opentelemetry-otlp/grpc-tonic transport is wasm-incompatible because tonic requires tokio networking that does not exist on wasm32). The WASM build instead routes spans through a custom WasmFetchHttpClient that posts protobuf bodies via web_sys::fetch.

`new OtlpConfig(endpoint: string, serviceName: string)`

Argument	Type	Description
`endpoint`	`string`	Full HTTP/protobuf traces endpoint, e.g. `"http://localhost:4318/v1/traces"`.
`serviceName`	`string`	Reported to the backend as the `service.name` resource attribute.

Read-only getters: .endpoint, .serviceName.

`initOtlp(config: OtlpConfig): void`

Install the global OTLP exporter and a tracing-subscriber stack with an OpenTelemetry layer. Must be called once at startup; subsequent calls fail because the global subscriber can only be installed a single time.

import init, { OtlpConfig, initOtlp } from '@blazen/sdk';

await init();
const cfg = new OtlpConfig('http://localhost:4318/v1/traces', 'my-wasm-app');
initOtlp(cfg);
// All subsequent workflow / pipeline / completion spans are exported.

If the collector is unreachable the export simply drops spans; it never blocks the calling workflow.

Error Handling

All errors are thrown as JavaScript Error objects. The message format indicates the category:

Error Pattern	Description
`"authentication failed: ..."`	Invalid or expired API key
`"rate limited"`	Provider rate limit hit
`"timed out after {ms}ms"`	Request timed out
`"{provider} error: ..."`	Provider-specific error
`"invalid input: ..."`	Validation error
`"unsupported: ..."`	Feature not supported by provider

try {
  const response = await model.complete([ChatMessage.user('Hello')]);
} catch (e) {
  // Caught values are not guaranteed to be Error instances, so narrow first.
  if (e instanceof Error && e.message.startsWith('rate limited')) {
    // Back off and retry
  }
}
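
Since the category is encoded in the message prefix, a small classifier keeps the string matching in one place. This is a sketch based only on the patterns in the table above; `classifyError`, `isRetryable`, and `backoffMs` are hypothetical helpers, and treating rate limits and timeouts as the retryable categories is an assumption.

```typescript
type ErrorKind =
  | 'auth'
  | 'rate_limit'
  | 'timeout'
  | 'invalid_input'
  | 'unsupported'
  | 'provider'
  | 'unknown';

function classifyError(e: unknown): ErrorKind {
  const msg = e instanceof Error ? e.message : String(e);
  if (msg.startsWith('authentication failed')) return 'auth';
  if (msg.startsWith('rate limited')) return 'rate_limit';
  if (msg.startsWith('timed out after')) return 'timeout';
  if (msg.startsWith('invalid input')) return 'invalid_input';
  if (msg.startsWith('unsupported')) return 'unsupported';
  if (/^\S+ error: /.test(msg)) return 'provider'; // "{provider} error: ..."
  return 'unknown';
}

// Assumption: only transient categories are worth retrying.
function isRetryable(kind: ErrorKind): boolean {
  return kind === 'rate_limit' || kind === 'timeout';
}

// Exponential backoff: baseMs * 2^attempt, capped at capMs.
function backoffMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```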
